Home > SQL Server Tips > Data Warehousing and Business Intelligence > SQL Server 2008 Integration Services delivers new features
SQL Server Tips:
EMAIL THIS
 TIPS & NEWSLETTERS TOPICS 

DATA WAREHOUSING AND BUSINESS INTELLIGENCE

SQL Server 2008 Integration Services delivers new features


Mark Kromer
10.14.2008
Rating: -4.00- (out of 5)


Expert advice on database administration
Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google


Integration Services in SQL Server 2008 provides three new features for better performance and ease of use for administrators and developers. In this tip, SQL Server BI expert Mark Kromer discusses how data integration has been kicked up a notch with improved scripting capabilities, data profiling and data flow.

SQL Server has become a very large and all-encompassing enterprise database package over the years, particularly with the addition of the business intelligence features: Analysis Services, DTS and now SSIS (SQL Server Integration Services). With SQL Server 2008, you get off-the-shelf developer tools (BIDS), administrator tools (SSMS), performance tools, systems for data warehousing, business intelligence, report-building services and so on.

That's good and bad news for those whose everyday task is to administer and manage a SQL Server environment. In this tip, I'll focus on the data integration tool, SQL Server Integration Services (SSIS), that ships with SQL Server 2008 and three new exciting features. These new features in SSIS 2008 are improved scripting, data profiling and data flow. There are other improvements in SSIS 2008, including change data capture (CDC) and merge capabilities in SQL Server 2008, that I will not have the space to detail here. To find out more about CDC, visit my previous tip on how the change data capture feature improves BI reporting accuracy.

Let's start with the new C# scripting capabilities in SSIS 2008. In the SQL Server 2005 release of SSIS, scripting tasks were only available in Visual Basic as part of the VSA (Visual Studio for Applications) component in SSIS. After a lot of customer requests for making C# an optional task scripting language, Microsoft now makes that available by fully incorporating Visual Studio Tools for Applications (VSTA) into SSIS as a script task (see Figure 1). Now you have access to the full .NET Framework and can perform much more complex actions through the scripting task. Drop the scripting task onto the Control Flow design surface and you will be prompted to select either Visual C# or Visual Basic as your language of choice. So now SSIS programmers and administrators are able to move away from VB scripts like this:

'Set new internal variable to DTS/SSIS package variable
Dim fromDate As Date = Me.Variables.FromDate
Do While fromDate <= Me.Variables.ToDate
fromDate = DateAdd(DateInverval.Day, 1, fromDate)
Loop

…to the somewhat more elegant new C# variant:

DateTime fromDate = this.Variables.FromDate; // From SSIS
Variable
While (fromDate <= this.Variables.ToDate)
{
fromDate=fromDate.AddDays(1);
}

[IMAGE]
Figure 1: New Script ...


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google



RELATED CONTENT
Data Warehousing and Business Intelligence
Recommended practices for SQL Server Analysis Services aggregations
Creating and managing SQL Server Analysis Services partitions
Sharing SSAS cube data in SharePoint with PerformancePoint Server 2007
New data profiling tools in SQL Server 2008
Utilize SSAS for data predictions and classification using Excel
Parent-child dimensions in SQL Server 2005 with Analysis Services MDX
Change data capture in SQL Server 2008 improves BI reporting accuracy
Manage traces in SQL Server 2005 Analysis Services with XMLA commands
Tutorial: SQL Server 2005 Analysis Services
Open SSIS packages without validation using these SQL properties

SQL Server Business Intelligence (BI) and Data Warehousing
DBA career paths could lead to business intelligence
Are data warehouses made for the cloud?
Q&A: Business intelligence gets a facelift in SQL Server 2008 R2
Project Gemini gets a new name, Madison earns buzz
Speed up reports in SQL Server Reporting Services with caching
Data Transformation Services vs. SSIS: The key differences
Using package configurations in SQL Server Integration Services (SSIS)
How SQL Server 2008 components impact SharePoint implementations
Achieving high availability and disaster recovery with SharePoint databases
Recommended practices for SQL Server Analysis Services aggregations

Microsoft SQL Server 2008
SQL Server 2008 R2 CTP set for November
Q&A: SQL Server 2008 a better fit for consolidation
End of life comes for SQL Server 2005 SP2, 2008
Microsoft releases SQL Server 2008 R2 CTP
What's new for installation with SQL Server 2008?
SQL Server Reporting Services 2008 offers faster speeds, new variations
Microsoft SQL Server 2008 Learning Guide
A first look at Microsoft SQL Server 2008 R2
Understanding transparent data encryption in SQL Server 2008
Working with sparse columns in SQL Server 2008

RELATED GLOSSARY TERMS
Terms from Whatis.com − the technology online dictionary
data aggregation  (SearchSQLServer.com)
data preprocessing  (SearchSQLServer.com)
data warehouse  (SearchSQLServer.com)
FileMaker  (SearchSQLServer.com)
GIS  (SearchSQLServer.com)
MOLAP  (SearchSQLServer.com)
pivot table  (SearchSQLServer.com)
Quiz: SQL Server 2000  (SearchSQLServer.com)
SQL  (SearchSQLServer.com)
T-SQL  (SearchSQLServer.com)

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary


Task in SSIS 2008 is VSTA and includes both Visual Basic as well as Visual C# scripting.

One of the most exciting new components in SSIS 2008 is the data profile task. With this new feature, you can sample SQL Server data and create profiles of that data to look for patterns, look for NULLs and duplicate data, among other capabilities. A major drawback of this first data quality tool to be made available by Microsoft is that the only data source it works with is SQL Server -- at this time.

There are many uses for data profiling, including analyzing statistics from the output and identifying candidate keys, but for this tip, let's focus on data movement in business intelligence scenarios. A data profiler is something that is very helpful in the time-consuming and tedious exercise of mapping data from one SQL Server database to another or when looking for the primary candidate for fields such as address, name or product in building dimensions for your data warehouse.

Many standalone ETL products on the market today include similar functionality and it is almost a necessity in the toolbox of data integration specialists. It is good to see Microsoft include data profiling in SSIS. But do not look for SSIS to become a standalone powerhouse ETL tool on its own anytime soon. This is likely to remain a tool attached to SQL Server and part of the overall SQL Server package value.

What you see in Figure 2 is the result of a data profiler task that has collected data samples from a database. When the task has completed in SSIS, you open up this viewer to display the results. The output from the package will be an XML file, and the best way to view and make sense of the output is to execute this viewer. The viewer is accessed outside of Visual Studio as a standalone application that is installed as a client tool with Integration Services. This sample output in Figure 2 is a look at the column length distribution profile.

[IMAGE]
Figure 2: Here is sample output from the SSIS package. (Click on image for enlarged view.)

Next is an enhancement to existing functionality. Microsoft has made buffer and caching improvements to its SSIS data flow components. Additional asynchronous capabilities in data flow components will easily provide low-hanging fruit performance gains in your SSIS pipeline. If you're familiar with SSIS in 2005 and have used data flow components in large, complex packages, then you've undoubtedly run into situations where performance quickly degrades when piping data. Most likely you were forced to make design decisions on your pipeline based on emphasizing parallelism. Well, the good news with SSIS 2008 is you don't need to worry as much about these issues and you can begin eliminating intricately designed performance-gain tricks.

Most of the bottlenecks I've seen in past SSIS routines occur when it's performing too many row-level operations or blocking components such as aggregates and sorting or when data needs to be copied to a new buffer instead of passing the same buffer onto the next transformation. In SSIS 2008, there is now the cache transform, which you can use to persist data throughout your dataflow. This allows you, for example, to reuse this cache in multiple lookups without requiring the SSIS engine to repopulate the cache each time. Figure 3 shows setup screens for the cache transform, which I have attached to my data source:

[IMAGE]
Figure 3: Create a new cache manager and give it a descriptive name. (Click on image for enlarged view.)

[IMAGE]
Figure 4: Now configure the metadata about each column you wish to store in cache. (Click on image for enlarged view.)

This technique can be used as a nice timesaver and the improvements in data flow components are significant, but you still must test and benchmark your packages for performance.

SQL Server 2008 and SSIS 2008 are not revolutionary releases for Microsoft. Instead, they expand upon the breakthrough foundation of SQL Server 2005 and continue to move the product toward becoming a more complete enterprise-ready, industrial-strength database option for businesses. As Microsoft moves further away from the old Sybase/SQL Server 7.0 years, SQL Server perceptions continue to improve, and those in the data warehouse and business intelligence areas will benefit from the increasing functionality, ease of use and performance that SSIS 2008 has for data integration. SSIS 2008 is still not a standalone ETL option because it does not include critical functionality such as managing metadata or high-speed connectors for the plethora of heterogeneous data sources in the market today.

But another real advantage that SQL Server administrators and developers should be aware of is that migrating from SQL Server 2005 SSIS packages to SQL Server 2008 packages is going to be nowhere near the pain involved for those that went through the fun of migrating from DTS in SQL Server 2000 to SSIS 2005. The fundamentals of the server and engine are, thankfully, the same this time around.


ABOUT THE AUTHOR:   

[IMAGE]Mark Kromer is an internationally recognized leader in the business intelligence and database communities, having authored articles and blogs for TDWI, Microsoft, MSDN and Technet. Mark has more than 15 years' experience in IT, focusing on database technologies and solutions, including spending the past 2.5 years with Microsoft as the lead product manager for business intelligence customer solutions. In the past, he was responsible for training classes, partner readiness and product development of BI initiatives at Microsoft, Agilent Technologies and Cingular Wireless. He is currently the lead product manager for enterprise BI reporting solutions at Primavera Systems. Check out his blog at http://blogs.msdn.com/makrom


MEMBER FEEDBACK TO THIS TIP

Do you have comments on this tip? Let us know.


Rate this Tip
To rate tips, you must be a member of SearchSQLServer.com.
Register now to start rating these tips. Log in if you are already a member.


Submit a Tip




DISCLAIMER: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.



SQL Server Development - .NET, C#, T-SQL, Visual Basic
HomeNewsTopicsITKnowledge ExchangeTipsAsk the ExpertsMultimediaWhite PapersIT Downloads
About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
SEARCH 
TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

TechTarget Corporate Web Site  |  Media Kits  |  Site Map




All Rights Reserved, Copyright 2005 - 2009, TechTarget | Read our Privacy Policy
  TechTarget - The IT Media ROI Experts