Microsoft and other technology vendors are readying themselves and their customers for the onslaught of “big data.” And SQL Server 2012 is no exception. The new database platform will have a host of big data capabilities, including connectors with Apache Hadoop, an open source distributed computing framework that can store and process massive amounts of structured and unstructured information.
In this Q&A, Microsoft database platform specialist Mark Kromer talks about how SQL Server 2012 shops might actually use the big data capabilities, and about Microsoft's partnership with Hadoop developer Hortonworks. Kromer also goes into detail on SQL Server plans for open source and the cloud in 2012.
Last year Microsoft reached out to the open source community by releasing the ODBC Linux driver, which helps people move Linux-based applications to SQL Server, and Microsoft plans to include support for Linux connections to SQL Azure in an upcoming release. What does Microsoft stand to gain from doing this? Are there additional plans to integrate with Linux and other open source technologies?
Kromer: Aside from the Linux ODBC driver, perhaps the biggest open source news out of the SQL Server product team is the recent announcement of Microsoft's adoption of the Hadoop framework.
Microsoft is also including Hadoop connectors with SQL Server 2012. Does Microsoft expect SQL Server shops to begin using the technology this year, and what do you expect they’ll do with it?
Kromer: The Hadoop adapters are currently available for download for SQL Server 2008 R2. These adapters provide Hive and Sqoop services that allow you to use the rich SQL Server data warehouse and BI solution stack with Hadoop. You will be able to build a big data analytical solution by distributing data across Hadoop nodes on Windows Server or Windows Azure and then using SQL Server tools like Power View, PowerPivot or even Excel to analyze data that is sitting in your environment today but cannot be processed by your data warehouse or other reporting systems.
What will Microsoft be delivering as part of its partnership with Hortonworks in 2012 and how will that partnership help customers take advantage of SQL Server’s Hadoop functionality?
Kromer: The partnership with Hortonworks is producing Hadoop distributions that can run on Windows Server for on-premises big data solutions or in the cloud with Hadoop on our Windows Azure platform.
Microsoft continues to beef up its cloud offerings, tripling SQL Azure’s database size and recently announcing the cloud service code named SQL Azure Compatibility Assessment, to help DBAs and developers determine how easy or hard a migration to SQL Azure would be. What levels of conversion from SQL Server to SQL Azure does Microsoft expect as we approach SQL Azure’s two-year anniversary?
Kromer: The SQL Server customers that I work with seem to be fairly indicative of the general SQL Server community in terms of migrating to SQL Azure databases. That is, I see SQL Azure used most often in three cases: (1) using SQL Azure as a dev/test environment so that DBAs and IT do not need to set up, configure, provision and maintain multiple dev instances; (2) running short-lived database applications that may see large spikes in activity, so that you can offload the capacity from on-premises to the cloud. This is a benefit because you would otherwise have to engineer your SQL Server data center infrastructure to the worst-case scenario in terms of traffic. SQL Azure makes it much easier to scale and handle large volume spikes; and (3) moving tier 2 and tier 3 applications or older SQL Server databases to SQL Azure to offload costs for maintenance, hardware and licensing. This works only if the applications that use the SQL Server database are not third-party apps that do not support SQL Azure. But since SQL Azure uses the same connections and tools as SQL Server on-premises, these sorts of migrations are fairly straightforward. Once Microsoft releases the GA [general availability] versions of SQL Federations and SQL Azure Reporting Services, this will open a whole new set of exciting use cases that I am eagerly anticipating.
Aside from the release of SQL Server 2012, what else does 2012 have in store for SQL Server users?
Kromer: What? A new database platform release isn't enough? I'm looking forward to the general availability of several new Azure offerings that are currently in trial: SQL Azure Reporting Services, SQL Federations for scale-out queries, Azure Connect for VPN connections to the cloud and SQL Azure Data Sync, which is essentially replication for SQL Azure, including local database agents for cloud-to-on-premises synchronization.
Mark Kromer has more than 16 years of experience in IT and software engineering and is well known in the business intelligence (BI), data warehouse and database communities. He is the Microsoft data platform technology specialist for the mid-Atlantic region. Check out his blog, MSSQLDUDE.