This article can also be found in the Premium Editorial Download "SQL Server Insider: Azure Cloud testing tips for SQL Server shops."
Download it now to read this article plus other related content.
SQL Server 2008 introduced a slew of features that offer a significant boost in stored-procedure performance over its earlier counterpart, SQL Server 2005. Almost three years later, 2008 R2 was rolled out with just a few minor tweaks to stored procedures. SQL Server 2012, which will be released sometime this year and is now available in a release candidate version, packs a number of its own performance-boosting additions.
Stored procedures today
One of the biggest stored-procedure improvements in SQL Server 2008 and 2008 R2 is table-valued parameters. This parameter type groups multiple rows of data into sets and then sends them all at once to the database server, reducing round trips between client and server, and it doesn’t create temporary tables or numerous parameters. Table-valued parameters don’t require any locking to initially populate client data. They also enable the client to specify sort order.
The table-valued structure is easy to understand and use, yet it can accommodate complex business logic. You use table-valued parameters the same way you use classic stored-procedure parameters. First, declare a user-defined table type and create a stored procedure with the input parameter of that table type. Next, declare a variable of the table type and reference it. Use an INSERT statement to populate the table variable instead of a SET or SELECT statement. Finally, plug the filled table variable into the stored procedure as an input parameter.
SQL Server 2008 also introduced a MERGE statement that allows for multiple data manipulation language (DML) operations in a single T-SQL statement. With the appropriate indexes on joined tables, you’ll gain increased query performance. That’s because a single MERGE statement reduces the number of times the data in source and target tables is processed. In SQL Server 2005 and earlier versions, each INSERT, DELETE, and UPDATE statement had to process the data one time. MERGE is useful, say, when you are synchronizing two tables (see “Sidebar title TK”).
In SQL Server 2008 and 2008 R2 stored procedures, grouping operations have been enhanced by a new GROUPING SETS syntax. A simple GROUP BY clause returns one row for each combination of all column values, whereas GROUPING SETS returns one row for each unique value in each column. Without this new feature, the latter would take multiple GROUP BY statements for each column combined in a UNION structure. That would inevitably require more resources, like disk I/O operations, memory and runtime.
SQL Server 2008 and 2008 R2 also use a new row constructor to insert multiple rows in a single INSERT statement in stored-procedure programming. Database administrators may already be familiar with this syntax: “INSERT INTO TABLE (COLa, COLb, COLc) … VALUES (V1a, V1b, V1c …), (V2a, V2b, V2c …), … (V1000a, V1000b, V1000c …)”. Here the number 1000 indicates the maximum allowed rows in one INSERT statement. The new INSERT syntax is cleaner because it removes duplicate strings. It also allows for faster round trips to and from the server by reducing network traffic.
In my triple-trial test, I ran 10 batches of 1,000 new-style inserts per batch and 10,000 individual inserts in parallel. Each test counted as one round trip, but compared with the old-school INSERTs, the simplified row constructor cut down the number of transactions and bytes sent from the server 1,000-fold and reduced the amount of client-sent data by a factor of three and client processing time by a factor of 300. Total execution time was reduced by more than 50%.
What’s in Store in 2012
SQL Server stored procedures get some additional tweaks in the upcoming release. In SQL Server 2012, stored-procedure performance in data warehouses can be sped up by 10, 100 or 1,000 times with the help of column-store indexes, a popular scalability feature in the third community technology preview, which was released this summer. Unlike traditional, row-based index structures (also called “row-store indexes”), column-store indexes group and store index data one column at a time, thus significantly reducing query execution time.
SQL Server 2012 also lets developers rebuild indexes on large-object data types online. In SQL Server 2008 R2 and earlier versions, users can’t perform online rebuild operations on, say, varchar(max), nvarchar(max) or varbinary(max) columns or extensible markup language (XML) data types. Stored procedures retrieving data from these data types do not have to wait for the database maintenance window to defragment indexes, hence increasing stored-procedure performance.
Another new feature in the upcoming release is the query optimizer hint called FORCESCAN. When the SQL Server query optimizer underestimates a returned number of rows and mistakenly chooses an index-seek operation, SQL Server developers can make the query scan the index instead. That way, the query process takes less time to complete.
I have only skimmed the surface of the features related to stored-procedure performance in SQL Server 2008, 2008 R2 and 2012. There are numerous subtle changes in these SQL Server versions that contribute indirectly to stored-procedure performance improvements as well, such as new functions and clauses, filtered indexes and new execution plan guides. But the features listed here -- regardless of the version you’re on -- are sure to give your queries a bigger-than-ever performance boost.
Get going with SQL Server 2012 stored-procedure features
Applying any new features without established guidelines is guaranteed to result in frustration. To that end, recommendations are in order for some of the more novel stored-procedure features like MERGE statements in SQL Server 2008 R2 and column-store indexes in SQL Server 2012.
- Index columns in joins for source and target tables.
- Specify search conditions in the ON clause by including columns in source and target tables only. Do not include extra comparisons, like constant values. If you use database objects like common table expressions or views to hold filtered rows, evaluate the results thoroughly. These are temporary result sets and may produce incorrect results.
- If you use a join hint, which forces the optimizer to use a certain join at some point in a query, use OPTION (LOOP JOIN) instead of a hash join, which bypasses indexes.
- Parameterize all literal values in ON and WHEN clauses in stored procedures. If you can’t, create a type TEMPLATE plan guide and specify the PARAMETERIZATION FORCE query hint. Make the setting carefully, because it will affect the entire database.
- Choose columns that provide the best compression, such as character or numeric columns.
- Follow a normal update procedure on a table with a column-store index by dropping the index, and then perform data modification operations to the table. Finally, rebuild the column-store index.
- Partition the table and switch the partitions. First, switch a partition out of the main table and into a staging table and drop in the column-store index. Update the data and then re-create the index in the staging table. Finally, switch the staging table back into the main table.
- Tables with column-store indexes cannot be updated; they become read-only once you create them. But do not use a column store index as a way to make a table read-only. Microsoft could very well remove the update restriction in future releases of SQL Server. If you need a read-only table, create a read-only file group and move the table to that file group.
ABOUT THE AUTHOR
Richard Ding is a SQL Server professional based in Boston. He has worked as a SQL database administrator and developer for more than 12 years. He has written articles for SQL Server Magazine and is a freelance writer for SearchSQLServer.com.
This was first published in January 2012