Custom VB.Net scripting in SQL Server Integration Services
Robert Sheldon, Contributor
SQL Server Integration Services (SSIS) provides a powerful environment for extracting,
transforming and loading data, and most of the time you'll find that the built-in components in
SSIS are more than adequate for managing data. However, in some cases, you might need to extend
SSIS to satisfy specific requirements, and that's where the Script component comes in handy. The
Script component lets you add custom Visual Basic.Net code to your data flow, offering the full
power of .Net within any SSIS package.
In this article, I demonstrate how to add a Script component to an SSIS package. The example
package I use contains a single Data Flow task in the control flow. The Data Flow task is
configured with three components: OLE DB Source, Script component and Flat File Destination.
Premium Access
Register now for unlimited access to our premium content across our network of over 70 information Technology web sites.
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States.
Privacy
Dig Deeper
-
People who read this also read...
-
This was first published in December 2007
Figure
1 shows these components as they appear on the Data Flow tab in Visual Studio.
Figure 1: View of Data Flow tab in Visual Studio.
(Click on image for enlarged view.)
The Script component will generate user names based on employees' first, middle and last names
and the year they were hired. I extract the employee information from the AdventureWorks sample
database that ships with SQL Server 2005. Note that SSIS also supports a Script task that you can
use in the control flow; however, this article is concerned only with the Script component as it is
used in the data flow.
Although my focus is the Script component, I will also give you basic information about how to
configure the other components in case you want to create this solution yourself. If you need help
setting up an SSIS package or configuring its components, see SQL Server 2005 Books Online.
Configuring the OLE DB Source component
The OLE DB Source component uses a native OLE DB connection manager to
connect to the AdventureWorks database. In addition, the component specifies the following
Transact-SQL command to retrieve data from the Person.Contact table and the HumanResources.Employee
table:
SELECT c.FirstName, c.MiddleName,
c.LastName, e.HireDate
FROM Person.Contact c
INNER JOIN HumanResources.Employee e
ON c.ContactID = e.ContactID
As you can see, I create a simple join based on the ContactID columns in both tables. The
information I retrieve will be used to create the custom user names.
Configuring the Script component
After you configure the OLE DB Source component, you're ready to configure the Script component.
When adding a Script component, you must specify whether to use the component as a source,
transformation or destination. For this example, I use the Script component as a transformation
because I want to intercept data within the data flow and from that data generate an additional
column that contains the user names. In other words, I plan to transform the data as it passes from
a source to a destination.
After you add the Script component, connect the data flow path from the OLE DB Source component
to the Script component. Then, double click on the Script component to open the Script
Transformation editor. Figure 2 shows the editor's Input Columns page. On this page, you specify
which columns to make available as input to the Script component by selecting the check box next to
each column name. For this example, I checked all four columns.
Figure 2: The editor's Input Columns page.
(Click on image for enlarged view.)
Next, select the Inputs and Outputs page of the Script Transformation editor (shown in Figure 3)
and expand the Input 0 and Output 0 nodes. Notice that the input columns you selected in the Input
Columns page show up here as well.
You must add an output column to hold the user names your script will generate. To add a column,
select the Output Columns subnode under the Output 0 node, click Add Column and then type
UserName as the column name. In the Data Type Properties section on the right side of the
Inputs and Outputs page, specify String as the data type and 8 as the length.
Figure 3: The Inputs and Outputs page of Script Transformation editor.
(Click on image for enlarged view.)
That's all you need to configure in the Script component for this example. If you plan to use
system or package variables in your script, you need to add those on the editor's Script page;
however, this solution does not require those types of variables.
Adding the Visual Basic.Net code
Now you're ready to add the actual code to the Script component. To add the code, go to the
Script page of the Script Transformation editor and click Design Script. The Microsoft Visual
Studio for Applications window appears and includes the initial Visual Basic.Net code you need to
get started, as shown in Figure 4.
Figure 4: Visual Studio for Applications window with VB.Net code.
(Click on image for enlarged view.)
The code begins with several comments about the Script component. Comments are preceded by an
apostrophe. Once you read the comments, you can delete them and then add your own
 |
| Optimizing SQL Server Integration Services: |
|
|
|
|
 |
 |
comments. Note that the Visual Basic generated here is
specific to the Script Transformation component. The code is different from the code generated for
the component when it is used as a source or destination.
The auto-generated code begins with a series of Imports statements that define the initial
namespaces you're likely to use for your script. You can add more namespaces as necessary. The
ScriptMain class follows the Imports statements, and within the class is the
Input0_ProcessInputRow() method. You should start with this class and method and add your custom
code to the method where the comment says "Add your code here."
As you can see, SSIS makes it simple to get started. Just start typing your code in the
designated spot. For the solution shown in this article, I first define a set of variables and then
concatenate the variables to create the user names. You can download the entire Visual Basic.Net code for this solution by clicking
here.
I begin the custom code by creating variables for the first, middle and last names. The
variables are named First, Middle and Last, respectively. In the first line, I declare the First
variable as a string and then, in the second line, I set its value to the first letter of the first
name:
Dim First As String
First = LCase(Row.FirstName.Substring(0, 1))
Notice that I use the Row variable -- defined as part of the Input0_ProcessInputRow() method --
to retrieve the FirstName column from the input. When I retrieve the name, I use the Substring()
method to limit results to the first letter only. Then I wrap the results in an LCase() method,
which converts the letter to lower case.
For the Middle variable, I do something similar, but I first check whether the MiddleName value
is null:
Dim Middle As String
If Row.MiddleName_IsNull Then
Middle = ""
Else
Middle = LCase(Row.MiddleName.Substring(0, 1))
End If
As you can see, I use an If-Else-End If construction to determine whether the value is null. If
it is null, then I set the value of Middle to an empty string. Otherwise, I retrieve the first
letter as I do for the first name.
For the Last variable, I want to select the first four letters from the LastName column.
However, I need to account for names that contain fewer than four letters and that might include an
apostrophe:
Dim Last As String
If Row.LastName.Length < 5 Then
If Row.LastName.Contains("'") Then
Last = LCase(Row.LastName.Replace("'", ""))
Else
Last = LCase(Row.LastName)
End If
Else
If Row.LastName.Contains("'") Then
Last = LCase(Row.LastName.Replace("'", "").Substring(0, 4))
Else
Last = LCase(Row.LastName.Substring(0, 4))
End If
End If
First I set up an outer If-Else-End If condition to determine whether the name contains fewer
than five letters. If it does, I use the entire last name. If it doesn't, I use only the first four
letters.
To handle the possibility that a name contains an apostrophe, I embed If-Else-End If conditions
in the outer If and Else statements. If the name contains an apostrophe, then I replace it with an
empty string. Otherwise I retrieve the name as is.
Finally, I declare and set the YearHired variable, which extracts the year from the HireDate
column:
Dim YearHired As String
YearHired = CStr(Row.HireDate.Year).Substring(2, 2)
Notice that I first use the Row variable to retrieve the Year property of the HireDate column. I
then use the CStr() method to convert the date to a string and use the Substring() method to
retrieve the last two digits to the year.
After I define the variables, I create the user name by concatenating the variables:
If Row.HireDate < CDate("2000-01-01") Then
Row.UserName = First + Last + YearHired
Else
Row.UserName = First + Middle + Last + YearHired
End If
Once again, I use an If-Else-End If condition to determine which logic to use. For those hired
prior to 2000, I create a user name based on the first initial, first four letters of the last name
and the last two digit of the year hired. For everyone else, I use the same logic, except I also
include the middle initial. I assign the user name to the UserName column, which, like the other
columns, I access through the Row variable.
That's all there is to adding a Script component to your data flow. After you've added your
code, close the Visual Studio for Applications window and the Script Transformation editor. You're
now ready to add your destination.
Configuring the Flat File Destination component
After you add the Flat
File Destination component, connect the data flow path from the Script component to the
destination, and open the Flat File destination editor. Now add a Flat File connection manager that
points to a text file. For this example, I use c:\employees.txt and map the original four columns
and the new UserName column to the destination. I use the default settings for everything else.
After you finish, close any open editors and save your SSIS package. Now run the package. The
employee information -- along with the new user names -- will be added to the text file.
The solution I show here is, of course, a very basic example. However, it does demonstrate all
the components necessary to add custom code to your SSIS packages. You should also try to create
other types of scripts and try adding script sources and destinations. As you'll discover, the
Script component provides access to the rich Visual Basic.Net development environment, allowing you
to extend your SSIS packages and accommodate any of your extraction, transformation and load (ETL)
needs.
ABOUT THE AUTHOR
Robert Sheldon is a technical consultant and the author of numerous books,
articles, and training material related to Microsoft Windows, various relational database
management systems, and business intelligence design and implementation. You can find more
information at www.rhsheldon.com.
Disclaimer:
Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.
Join the conversationComment
Share
Comments
Results
Contribute to the conversation