Q&A: Exploring what’s hot: SQL Server virtualization, PowerPivot, PDW

Want the inside scoop on SQL Server virtualization, PowerPivot, parallel data warehouse and the upcoming Denali? Microsoft database expert Mark Kromer discuss these year-driving trends and more.

SQL Server’s path in 2011 is being paved as surely as the days are getting longer. And we are familiar with the stones. There are the cost-cutting trends of SQL Server virtualization and the cloud, and there is Microsoft’s high-performance parallel data warehouse (PDW), released at the Professional Association for SQL Server (PASS) conference in Seattle in November. And then there are innovations in the upcoming SQL Server ’11, known by its code-name Denali, and also announced with great fanfare at PASS 2010. And no doubt we’ll hear much more about hot new R2 items like PowerPivot. SearchSQLServer.com explores these trends and more with Microsoft data platform technology specialist Mark Kromer in this month’s “SQL in Five.”

SQL Server virtualization is seen as becoming the norm in many corporations, but it used to have a reputation as having bad performance. In what ways has Microsoft improved how SQL Server is virtualized? Where does it go from here?

Mark Kromer: Virtualizing SQL Server 2000 and even SQL Server 2005 had difficulties keeping up with traditional direct SQL Server installations on the host OS [operating system]. Mostly, this was due to the virtualization hypervisors of the time, from Microsoft's Virtual PC and Virtual Server to earlier versions of VMware's ESX. And some of it was that SQL Server prior to SQL Server 2008 was not architected with virtualization in mind. But that has changed considerably in the past three years with SQL Server 2008, Microsoft Hyper-V and newer versions of VMware's ESX. There are other hypervisors and virtualization technologies out there today, but my anecdotal evidence in talking to customers over the past six to eight months is that 99% of virtualization of SQL Server is one of those two platforms. With new hypervisor technologies together with processor technologies like SLAT [Second Level Address Translation] and TCP Offloading, we see virtualization in all sizes of data centers and organizations taking off in a big way. These advances in hypervisors make SQL Server an even more popular target for businesses looking to virtualize their infrastructure. With virtualized SQL Server 2008 R2, Microsoft supports advanced capabilities like guest and host clustering for high availability, and you can even leverage the Microsoft Hyper-V Live Migration function that allows you to move a running instance of SQL Server from one VM [virtual machine] server to another with no downtime. With high-availability capabilities that come with SQL Server and virtualization combined with the increase in hypervisor performance, particularly with pass-through disk configurations, virtualizing SQL Server is becoming commonplace. What you should look for in SQL Server’s future is more and more investment in virtualization to improve instance and database density and to implement SQL Server “private cloud” infrastructures.

Much like SQL Server virtualization lets operating systems move among physical hosts, Data-Tier Applications let database administrators (DBAs) to move databases around from server to server.  Is SQL Server moving in this direction -- making large numbers of databases very small and making it easier on DBAs -- in 2011 and beyond?

Kromer: It’s hard for me to say that SQL Server is moving in one direction over the other. What your question brings to mind for me is a debate that I see often in organizations between loading-up, scaling-up applications on big hardware versus scaling-out and distributing applications over smaller commodity hardware. This is a classic debate in application development, enterprise architecture and database administration. Rather than go into the pros and cons, which leads to a lengthy debate, let’s just say that there are pluses and minuses that need to be weighed with either architecture. Specific to SQL Server direction and capabilities, I would fully agree that you are seeing in R2, and will see in Denali, many new capabilities to enable what Microsoft refers to as “multi-server management.” These are very helpful in enabling both developers as well as database administrators to monitor, manage and control large numbers of SQL Server instances and databases from within SSMS [SQL Server Management Studio] and from a Central Management Server (CMS). Besides CMS, there is also policy-based management (PBM), Utility Control Point (UCP) and Management Data Warehouse (MDW). And as you mentioned in the question, Data-Tier Applications (DAC) is a very important new capability in R2 that allows developers and DBAs to select the database objects specific to an application, package those together and migrate the DAC PAC through a development cycle, for example, a classic dev, test, stage, prod cycle. And when you distribute your databases over virtual machines, VM managers like Microsoft Virtual Machine Manager (VMM) make management of distributed SQL Server applications much simpler, and you can still utilize all of those previously mentioned features even in VM environments. What the future holds is advances of the MPP distributed data warehousing model with SQL Server Parallel Data Warehouse, which includes its own PDW monitoring and management tools to give DBAs complete visibility into active nodes and the distributed environment. And the smaller distributed sharded database application architecture is being exploited by Microsoft’s cloud database, SQL Azure. SQL Azure is key to Microsoft’s distributed database future. And since Azure is using SQL Server technology, you can already use common SQL Server tools like CMS, SSMS and DAC with Azure today.

Much has been made about Microsoft’s high-performance parallel data warehouse, but much less of its lower-cost cousin SQL Server Fast Track. How does it differ from PDW and would certain types of businesses benefit from Fast Track, as opposed to PDW?

Kromer: Both PDW and Fast Track are designed specifically for data warehouse loads. But as mentioned above in the previous question, PDW is a massively parallel processing (MPP) version of SQL Server. It is a brand new edition of SQL Server which is based on the technology that Microsoft acquired from DatAllegro a few years ago. Fast Track, on the other hand, ships with a single commodity-style server from Dell, HP [Hewlett-Packard Co.] or Bull running SQL Server 2008 R2 Enterprise Edition with low-cost SAN [storage area network] enclosures. They are both “appliances” in that they include the hardware you need to quickly stand-up a SQL Server highly scalable data warehouse. In both cases, the SAN is intended solely for the purpose of serving SQL Server and the architectures are tuned for large data warehouse workloads. But with Fast Track, you receive a “reference architecture” with a bill of materials that you assemble, while PDW delivers a preconfigured set of racks with distributed servers and SAN that you simply install in your data center, already assembled in the factory for you. PDW includes several data warehouse specific updates to T-SQL [Transact-SQL], DMVs [dynamic management views] and tools that help you to manage a massively parallel system.

PowerPivotis a hugely popular Excel add-in requiring SQL Server 2008 R2. But since its rollout, many people have asked, “When should I use PowerPivot? When should I use SQL Server Analysis Services (SSAS)?” What guidelines would you offer?

Kromer: If you are an SSAS developer or have complex data integration, ETL and analysis needs, you may find PowerPivot will only get you part of the way there. PowerPivot, as it exists today, is geared much more toward business power users who are comfortable building analysis reports and dashboards in Excel, while SSAS is based on BIDS [Business Intelligence Development Studio] (Visual Studio), which is not an environment that business users would commonly utilize. The technologies that come together to make-up PowerPivot such as VertiPaq and DAX [Data Analysis Expressions], enable users familiar with Excel-based analysis to have some additional powers including data integration from many different various sources and analysis without maximum row limitations. If I were developing a complex UDM [unified dimensional model] that needs to service many different front-end providers (SharePoint, Hyperion, etc.), I would likely do that in SSAS. But if I needed to provide an Excel workbook with a specific set of reports where the data sources are highly mutable and not shared in a data warehouse, then I would turn to PowerPivot.

What else can people expect to see in SQL Server in 2011?

Kromer: In terms of our discussion here this month, look for new “AlwaysOn” which is a new way to manage high availability in SQL Server that greatly simplifies clustering and mirroring with new concepts such as availability groups and multiple mirrors. PowerPivot’s strengths around DAX Excel-like language and in-memory cubes are being integrated into the BIDS environment to provide those powerful capabilities to the BI developer. New Reporting Services tools are coming out in CTP [community technology preview] over the coming months including Project Crescent, a Silverlight version of Report Builder, and Microsoft’s Cloud BI with Reporting Services for Azure. There are many, many more new exciting areas to look for in Denali (SQL 11) and you should definitely have a look through all of these by downloading and playing with the CTP1 of Denali (SQL 11). I know that PL/SQL Oracle developers are happy to see SQL Server sequence objects being added to the language. Two others areas of improvement that I would like to mention that I am personally excited about are the column store indexes and the new additions being made to data integration including impact analysis and data quality.

Editor’s note: For more from Mark Kromer, check out his blog at MSSQLDUDE.

Mark Kromer has over 16 years experience in IT and software engineering and is a leader in the business intelligence, data warehouse and database communities. Currently, he is the Microsoft data platform technology specialist for the mid-Atlantic region, having previously been the Microsoft Services senior product manager for business intelligence and the Oracle principal product manager for project performance analytics.

Dig Deeper on SQL Server Business Intelligence Strategies