Q
Problem solve Get help with specific problems with your technologies, process and projects.

Performing incremental loads and extracting data

SQL Server Security Expert Joe Toscano shares source data tables designed to easily perform an incremental load / extract.

I am writing a package using SQL Server Integration Services (SSIS). After the first full run of Extract, Transform, Load (ETL), I don't want to pick up all the data again from my different data sources. Is there a way of performing an incremental load using SSIS packages, so next time I run the package it picks up only changed data from the source and appends it in the destination?
I've had to perform incremental loads when extracting data to populate a data mart. To do this I wrote extraction stored procedures that were executed as a SQL task within my Data Transformation Service (DTS) packages. Our source data tables were designed to allow this to happen quite easily. Here are the columns in each source table that allowed us to easily perform an incremental load/extract:

Identity Column as PK or Unique Constraint (int column)

This column allowed us to see if a row in the source was new or already existed by comparing the source ID value with those that exist in the destination.

LastUpdateDate (DateTime Column)

This column was automatically updated by our application. If we found that our row was an existing row (based on the identity column value above) we were able to compare the source LastUpdateDate column with the DestinationUpdateDate column to see if the row's data changed since the last extraction.

To make things easier for us, we created a table that held the current Id value and current update date values for each table we extracted. This allowed for easy comparison during each load and it turned out that this table was also heavily reported against. (Reports contained the extract datetime so report viewers could clearly see how current the data was.)

If you do not have the luxury of having these columns present in your source you may have to key off of the source table's primary key. You may also have to perform comparisons of the columns that can potentially change.

This was last published in May 2006

Dig Deeper on SQL Server Interoperability

Have a question for an expert?

Please add a title for your question

Get answers from a TechTarget expert on whatever's puzzling you.

You will be able to add details on the next page.

Start the conversation

Send me notifications when other members comment.

By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy

Please create a username to comment.

-ADS BY GOOGLE

SearchBusinessAnalytics

SearchDataCenter

SearchDataManagement

SearchAWS

SearchOracle

SearchContentManagement

SearchWindowsServer

Close