This article is part of an Essential Guide, our editor-selected collection of our best articles, videos and other content on this topic. Explore more in this guide:
1. - Becoming Windows Azure SQL Database: Read more in this section
- Windows Azure SQL Database: Examining the positives
- On second thought: Looking at Azure SQL Database's downsides
- How to achieve high scalability with Windows Azure SQL Database
- Using T-SQL in SQL Azure Federations
- Understanding split root tables in SQL Azure Federations
- Comparing SQL Server and SQL Azure
- Microsoft SQL Azure: Expedite execution times
Explore other sections in this guide:
One of the biggest differences between using SQL Server on your network and using Microsoft SQL Azure is the geographical distance between your application servers or Web servers and the databases used. This can cause a lag in performance, but there are ways to mitigate time-cost issues with SQL Azure.
SQL Azure is a database in the cloud. The information sits in a Microsoft data center, most likely a good distance away from your servers, so querying and updating Azure data takes more time than it would when working with on-premises SQL Server.
The total cost of executing queries in the SQL Azure database and returning data is generally higher than when calling SQL Server instances on the local network. To find out what that additional cost of executing queries was in the SQL Azure database, I created a test application with very simple code -- I called a stored procedure and returned data from a small table with five rows, repeated 100 times. When executing the code against a SQL Azure database, it took six to seven seconds. But occasionally it would climb to 10 or 15 seconds. This radical difference in execution time can be attributed to greater variance in connectivity to SQL Azure over the Internet.
I tested the execution time of database calls against SQL Server instances on the local network in two parts. In the first part, the application was in the same data center as the SQL Server database I was using. The execution time was somewhere between 100 and 200 milliseconds for the same 100 database calls. When the application was placed in a different data center on the same network -- roughly 200 miles away with fast pipe in between -- the execution time climbed to three to four seconds. I executed the code at different times during the day and did not get the occasional slowdown I did when calling SQL Azure. Looking at these results, we can come to several conclusions.
As expected, accessing SQL Server locally is exponentially faster than accessing SQL Azure. The difference is less radical if you make database calls within your company’s local area network across long distances -- for example, a Web server in a New York office and database server in a Boston office. In this case, the speed of database access for SQL Azure was about 40% to 50% slower. Keep in mind that I was running a very fast query -- executing within a few milliseconds -- so the difference in execution time can be attributed to the cost of a network call. If I were calling a stored procedure that takes 20 or 30 seconds to run, the difference in execution times would drop to below 1% or 2%.
The results confirmed what performance-tuning experts have suggested all along -- It takes less time for an application to collect multiple result sets in a single database call than to make many separate calls, and that goes for both SQL Server and SQL Azure. For example, a Web recipe application might make four separate stored procedure calls to get all the details it needs to display a recipe: GetRecipeDetails, GetRecipeSteps, GetRecipeIngredients and GetRecipeNutritionalInfo. Instead, you could write just one stored procedure that returns all the data in a single call:
CREATE PROCEDURE GetRecipe ( @RecipeID INT )
EXEC GetRecipeDetails @RecipeID ;
EXEC GetRecipeSteps @RecipeID ;
EXEC GetRecipeIngredients @RecipeID ;
EXEC GetRecipeNutritionalInfo @RecipeID ;
In ADO.NET multiple result sets can be returned as a data set object. In this case, the data set object returned by the database would contain four data table objects, and you would get all the data with a single call to the server as opposed to four separate calls. To test this, I added another benchmark to the test application -- I wrote a stored procedure that would return the same data twice, and then I coded the application to call the database 50 times instead of 100. It took half the time to return the same amount of data. The difference in returning more data in a single call was negligible; it’s the cost of a network call that really matters.
Connection pooling is extremely important in SQL Azure. The cloud version of the database supports connection pooling the same way it does in SQL Server. After a connection is released by an application, ADO.NET keeps it open for about six minutes in case there’s another database call sometime soon after that. This minimizes the overhead of network calls, since you pay for establishing a SQL Server connection once and then use the connection for multiple calls. When connection pooling for SQL Azure calls is disabled, the performance suffers. The same 100 calls that ran in five to six seconds with connection pooling took around 90 seconds. This shows that the time it takes to establish a connection is the most expensive part of SQL Azure data access. So keep a pool of open connections.
If you use this application all the time, you shouldn’t have an issue. But if you have periods of inactivity, start a process that opens 10 or 20 threads concurrently and executes a dummy query every five minutes. That should keep a pool of active connections available for quicker execution time.
ABOUT THE AUTHOR
Roman Rehak is principal database architect at MyWebGrocer in Colchester, Vt. He specializes in SQL Server development, database performance tuning, ADO.NET and writing database tools. He contributes to Visual Studio Magazine, SQL Server Magazine and other publications and presents at user groups and conferences in the U.S. and Canada.