SQL Server Integration Services (SSIS) provides a powerful environment for extracting, transforming and loading data, and most of the time you'll find that the built-in components in SSIS are more than adequate for managing data. However, in some cases, you might need to extend SSIS to satisfy specific requirements, and that's where the Script component comes in handy. The Script component lets you add custom Visual Basic.Net code to your data flow, offering the full power of .Net within any SSIS package.
In this article, I demonstrate how to add a Script component to an SSIS package. The example package I use contains a single Data Flow task in the control flow. The Data Flow task is configured with three components: OLE DB Source, Script component and Flat File Destination. Figure 1 shows these components as they appear on the Data Flow tab in Visual Studio.
The Script component will generate user names based on employees' first, middle and last names and the year they were hired. I extract the employee information from the AdventureWorks sample database that ships with SQL Server 2005. Note that SSIS also supports a Script task that you can use in the control flow; however, this article is concerned only with the Script component as it is used in the data flow.
Although my focus is the Script component, I will also give you basic information about how to configure the other components in case you want to create this solution yourself. If you need help setting up an SSIS package or configuring its components, see SQL Server 2005 Books Online.
Configuring the OLE DB Source component
The OLE DB Source component uses a native OLE DB connection manager to connect to the AdventureWorks database. In addition, the component specifies the following Transact-SQL command to retrieve data from the Person.Contact table and the HumanResources.Employee table:
SELECT c.FirstName, c.MiddleName,
FROM Person.Contact c
INNER JOIN HumanResources.Employee e
ON c.ContactID = e.ContactID
As you can see, I create a simple join based on the ContactID columns in both tables. The information I retrieve will be used to create the custom user names.
Configuring the Script component
After you configure the OLE DB Source component, you're ready to configure the Script component. When adding a Script component, you must specify whether to use the component as a source, transformation or destination. For this example, I use the Script component as a transformation because I want to intercept data within the data flow and from that data generate an additional column that contains the user names. In other words, I plan to transform the data as it passes from a source to a destination.
After you add the Script component, connect the data flow path from the OLE DB Source component to the Script component. Then, double click on the Script component to open the Script Transformation editor. Figure 2 shows the editor's Input Columns page. On this page, you specify which columns to make available as input to the Script component by selecting the check box next to each column name. For this example, I checked all four columns.
Next, select the Inputs and Outputs page of the Script Transformation editor (shown in Figure 3) and expand the Input 0 and Output 0 nodes. Notice that the input columns you selected in the Input Columns page show up here as well.
You must add an output column to hold the user names your script will generate. To add a column, select the Output Columns subnode under the Output 0 node, click Add Column and then type UserName as the column name. In the Data Type Properties section on the right side of the Inputs and Outputs page, specify String as the data type and 8 as the length.
That's all you need to configure in the Script component for this example. If you plan to use system or package variables in your script, you need to add those on the editor's Script page; however, this solution does not require those types of variables.
Adding the Visual Basic.Net code
Now you're ready to add the actual code to the Script component. To add the code, go to the Script page of the Script Transformation editor and click Design Script. The Microsoft Visual Studio for Applications window appears and includes the initial Visual Basic.Net code you need to get started, as shown in Figure 4.
The code begins with several comments about the Script component. Comments are preceded by an apostrophe. Once you read the comments, you can delete them and then add your own
Optimizing SQL Server Integration Services:
comments. Note that the Visual Basic generated here is specific to the Script Transformation component. The code is different from the code generated for the component when it is used as a source or destination.
The auto-generated code begins with a series of Imports statements that define the initial namespaces you're likely to use for your script. You can add more namespaces as necessary. The ScriptMain class follows the Imports statements, and within the class is the Input0_ProcessInputRow() method. You should start with this class and method and add your custom code to the method where the comment says "Add your code here."
As you can see, SSIS makes it simple to get started. Just start typing your code in the designated spot. For the solution shown in this article, I first define a set of variables and then concatenate the variables to create the user names. You can download the entire Visual Basic.Net code for this solution by clicking here.
I begin the custom code by creating variables for the first, middle and last names. The variables are named First, Middle and Last, respectively. In the first line, I declare the First variable as a string and then, in the second line, I set its value to the first letter of the first name:
Dim First As String
First = LCase(Row.FirstName.Substring(0, 1))
Notice that I use the Row variable -- defined as part of the Input0_ProcessInputRow() method -- to retrieve the FirstName column from the input. When I retrieve the name, I use the Substring() method to limit results to the first letter only. Then I wrap the results in an LCase() method, which converts the letter to lower case.
For the Middle variable, I do something similar, but I first check whether the MiddleName value is null:
Dim Middle As String
If Row.MiddleName_IsNull Then
Middle = ""
Middle = LCase(Row.MiddleName.Substring(0, 1))
As you can see, I use an If-Else-End If construction to determine whether the value is null. If it is null, then I set the value of Middle to an empty string. Otherwise, I retrieve the first letter as I do for the first name.
For the Last variable, I want to select the first four letters from the LastName column. However, I need to account for names that contain fewer than four letters and that might include an apostrophe:
Dim Last As String
If Row.LastName.Length < 5 Then
If Row.LastName.Contains("'") Then
Last = LCase(Row.LastName.Replace("'", ""))
Last = LCase(Row.LastName)
If Row.LastName.Contains("'") Then
Last = LCase(Row.LastName.Replace("'", "").Substring(0, 4))
Last = LCase(Row.LastName.Substring(0, 4))
First I set up an outer If-Else-End If condition to determine whether the name contains fewer than five letters. If it does, I use the entire last name. If it doesn't, I use only the first four letters.
To handle the possibility that a name contains an apostrophe, I embed If-Else-End If conditions in the outer If and Else statements. If the name contains an apostrophe, then I replace it with an empty string. Otherwise I retrieve the name as is.
Finally, I declare and set the YearHired variable, which extracts the year from the HireDate column:
Dim YearHired As String
YearHired = CStr(Row.HireDate.Year).Substring(2, 2)
Notice that I first use the Row variable to retrieve the Year property of the HireDate column. I then use the CStr() method to convert the date to a string and use the Substring() method to retrieve the last two digits to the year.
After I define the variables, I create the user name by concatenating the variables:
If Row.HireDate < CDate("2000-01-01") Then
Row.UserName = First + Last + YearHired
Row.UserName = First + Middle + Last + YearHired
Once again, I use an If-Else-End If condition to determine which logic to use. For those hired prior to 2000, I create a user name based on the first initial, first four letters of the last name and the last two digit of the year hired. For everyone else, I use the same logic, except I also include the middle initial. I assign the user name to the UserName column, which, like the other columns, I access through the Row variable.
That's all there is to adding a Script component to your data flow. After you've added your code, close the Visual Studio for Applications window and the Script Transformation editor. You're now ready to add your destination.
Configuring the Flat File Destination component
After you add the Flat File Destination component, connect the data flow path from the Script component to the destination, and open the Flat File destination editor. Now add a Flat File connection manager that points to a text file. For this example, I use c:\employees.txt and map the original four columns and the new UserName column to the destination. I use the default settings for everything else. After you finish, close any open editors and save your SSIS package. Now run the package. The employee information -- along with the new user names -- will be added to the text file.
The solution I show here is, of course, a very basic example. However, it does demonstrate all the components necessary to add custom code to your SSIS packages. You should also try to create other types of scripts and try adding script sources and destinations. As you'll discover, the Script component provides access to the rich Visual Basic.Net development environment, allowing you to extend your SSIS packages and accommodate any of your extraction, transformation and load (ETL) needs.
ABOUT THE AUTHOR
Robert Sheldon is a technical consultant and the author of numerous books, articles, and training material related to Microsoft Windows, various relational database management systems, and business intelligence design and implementation. You can find more information at www.rhsheldon.com.