SQL Server Integration Services programming basics

Serdar Yegulalp, Contributor

If you've ever moved data into or out of SQL Server, chances are you've done it through Microsoft's DTS or Microsoft's Data Transformation Services tools. DTS is an ETL tool (short for extraction, transforming and loading), which allows database administrators to create sets of steps that describe how data is to be transformed. These "packages," as they're called, can then be executed via a number of different programming languages or directly within SQL Server itself.

The idea was and still is a good one, but ultimately too limited -- it remained a glorified if convenient version of the bulk-copy system used in earlier versions of SQL Server. With SQL Server 2005, Microsoft decided to turn DTS into a whole new kind of ETL tool — SQL Server Integration Services (SSIS). Moving data from one database to another or exporting a database in another format are only the most basic things you can do with SSIS. Here area additional reasons to use SSIS:

Data cleansing.
If you have a data set that isn't "clean," i.e., it hasn't been normalized, SSIS has a broad tool set for cleaning and normalizing data. It includes functions like "fuzzy" matching and grouping, so data that is meant to be the same (but has been input or expressed differently) can be filtered and normalized. One good example of this would be data from different geographic regions that has different numeric formats: The same ideas that are expressed differently, need to be normalized before

    Requires Free Membership to View

they can be crunched.

Connectivity to Analysis Services cubes.
You can use SSIS to pipe data directly into a cube for data mining. Instead of importing the data into a native database and then mining it there, an SSIS package itself can be used as the data source.

Integration with SQL Server Reporting Services.
As with analysis cubes, SSIS packages can be used as direct data sources for creating reports.

Full .NET support.
This is probably a given, but SSIS programming can be done either natively or through COM object interfaces in any of the .NET languages — or, for that matter, any language that supports COM objects.

The end result is a system where the data sets you import and the packages you create can work more like programmatic objects than static data. This is precisely the idea, and it

Take a look at these handy SSIS migration tools:

 allows the user to programmatically integrate the way the reporting services, the analysis services, the data and the database itself all talk to each other. For instance, archived legacy data can continue to be used for trend analysis or data mining even if it's technically offline. Another example is live integration of data from multiple static sets, such as the case I mentioned earlier about data from multiple geographic locations.

Because SSIS is programmable, its functions can be invoked in a broad range of circumstances. For instance, an ASP.NET-driven Web site could be a management front end to run SSIS packages, or one could write a standalone application to connect to SSIS and obtain and refine data for its own use. The idea of being able to drive SQL Server programmatically has been around for a long time, but SSIS makes it possible to talk to everything else that SQL Server uses as well -- all at once and in a unified way. A programmer can also create custom SSIS extensions such as custom data transforms or enumerators. You can reuse them in other applications or other environments entirely.

Packages in SSIS can be written using the Business Intelligence Design Studio, which ships with SQL Server 2005. The resulting files are mostly XML, so they can be inspected and edited by hand if need be. Note that some of the functions in SSIS — data mining connectivity and fuzzy grouping and lookup, to name two — are only available in the Enterprise Edition of SQL Server 2005. SSIS isn't available at all in SQL Server 2005 Express Edition, so if you create SSIS packages, you can only use them in the full edition of SQL Server. To that end, when you're creating programmatic solutions in SSIS, they're really designed to work in other environments that fully support SSIS as well.

Serdar Yegulalp has been writing about computers and information technology for more than 15 years for a variety of publications, including InformationWeek and Windows Magazine.

This was first published in November 2006

There are Comments. Add yours.

TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

Disclaimer: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.