. The main difference between these two functions is that the "get all changes" function will return every change that occurred per row of the corresponding table in the log sequence number (LSN) range. The "get net changes" function will return one row that may reflect multiple changes for the data, so you don't receive a distinct row for every change that occurred since the last query. In addition to these new functions, you must first decipher the proper LSN for your query by mapping the LSNs to the time parameters with this CDC system function: sys.fn_cdc_map_time_to_lsn.
To use these functions, replace with the value of the capture instance you configured to capture source table changes. The usage would look something like this:
SELECT * FROM
cdc.fn_cdc_get_all_changes_HR_Department(@from_lsn, @to_lsn,
'all');
In this example, I have chosen to bring back "all" records with the option in the functional so all changes within that LSN range are included. Your other option is "all update old" which will return all changes. It also returns both the row containing the column values before the update and the row containing the column values after the update. To get the from and to LSN values, you should first find the proper LSNs from the map functions described above, which would be:
DECLARE @from_lsn binary(10), @to_lsn binary(10);
SELECT @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', GETDATE()-1);
SELECT @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', GETDATE());
Note the argument in single tick quotes. This tells SQL Server which greater than or less than method to use and to include or not include records from the time query that matches the specified value. Here are the options: largest less than, largest less than or equal, smallest greater than or smallest greater than or equal.
In the end, there really aren't many steps to enable change data capture from your SQL Server 2008 databases and to add this capability into your data collection for your data warehouse. In fact, they are summarized very well in Microsoft SQL Server Books Online.
Choosing the proper data source candidates for enabling CDC is a critical decision when optimizing your
data warehouse or BI solution from a data collection perspective. Other CDC tools have their own nuances of setup and configuration. But as you migrate SQL Server sources to the SQL Server 2008 version, this capability can be added natively. Then you can collect just the changed data from your ETL process using built-in functions from SQL Server and SSIS. Collecting only changed data on shorter intervals is becoming more and more common and will increase data availability to your reporting solution business users.
| ABOUT THE AUTHOR: |
|
Mark Kromer is an internationally recognized leader in the business intelligence and database communities, having authored articles and blogs for TDWI, Microsoft, MSDN and Technet. Mark has more than 15 years' experience in IT, focusing on database technologies and solutions, including spending the past 2.5 years with Microsoft as the lead product manager for business intelligence customer solutions. In the past, he was responsible for training classes, partner readiness and product development of BI initiatives at Microsoft, Agilent Technologies and Cingular Wireless. He is currently the lead product manager for enterprise BI reporting solutions at Primavera Systems. Check out his blog at http://blogs.msdn.com/makrom
|
|
MEMBER FEEDBACK TO THIS TIP
Do you have comments on this tip? Let us know.