Thanks to the natural language roots of the SQL language, writing queries has become extremely easy for just about
anyone to pick up. But its simplicity also makes it easy to write poorly performing queries. Here are some simple changes you can make to improve not only query performance, but, in some cases, overall SQL Server system performance as well.
CREATE TABLE vs. SELECT INTO
Oftentimes, within stored procedures or other SQL scripts, temp tables must be created and loaded with data. When writing these queries, many SQL Server DBAs and developers like to use the SELECT INTO method, like this:
While this technique works fine for small tables, when dealing with large record sets or long-running queries, it creates locks on the system objects within the tempdb database. As a result, other queries and procedures that need to create objects within the tempdb database will have to wait for the long-running query to complete. This is because when an object is created, an exclusive lock is taken against the sysobjects, syscolumns, sysindexes, etc system tables (SQL Server 2000) or the sysallocunits, syscolpars, syshobtcolumns, sysschobjs, sysserefs, etc system tables (SQL Server 2005). You can see this easily by opening two query windows and running the following:
create table #test1 (c1 int)
(Second window SQL 2005)
select object_name(rsc_objid), *
where req_spid = 52 /*Where 52 = the SPID of the first window*/
order by 1
(Second window SQL Server 2000)
sp_lock 52 /*Where 52 = the SPID of the first window*/
When you have a very long-running query in a temporary table using the SELECT INTO format, those same system table locks are held until the query completes and data loads into the temp table. You can avoid system table locking by manually creating the table with the CREATE TABLE command – before loading the data into the table.
For example, this code …
CREATE TABLE #TempTable
INSERT INTO #TempTable
… will require much less locking than this code:
While the total number of locks taken is the same, the length of time the locks are held for the first query will be much shorter. This allows other processes to create temp tables.
Typically, when developing SQL code the development server has only a single user or few users. When working on SQL code, it's important to know when the code will impact sessions other than the current session. And unexpected interaction can cause major performance issues.
Accessing data across linked servers
Linked servers are an excellent way to get data in real time from one server to another. However, incorrectly written linked server queries can quickly decrease system performance on one or both servers. While it's easy to write these queries across linked servers, the query optimizer doesn't always work as you would expect. I often see queries that join a local table to two remote tables and the queries take hours to run. That's because the local optimizer doesn't know which records to request from the remote table.
It therefore requests that the remote server transmit the entire table, and all that data is then loaded into a temporary table and the join is done locally. Unfortunately, because the local table is a temporary table -- and not a physical table on the source system -- the indexes
on the remote table do not get created on the temporary table. Because of the lack of indexes, expected query execution time skyrockets.
There are a couple of techniques you can use to improve query response time. The first is to create a stored procedure on the remote database and have it return a record set, being a subset of the remote tables, which is then loaded into a local temporary table. It can then be indexed as needed. The trick with this method is to provide an input variable to the remote procedure where input values can be passed to. Thus, you will reduce the number of returned records by as much as possible. Fewer records will reduce the run time of that stored procedure as well as the network latency on transferring those records from the remote system to the local system.
The second technique you can use is a variation of the first method. You create local temporary tables for each of the remote tables and transfer over the columns and records needed from each of the remote tables. Next, index the tables as needed and join the temp tables locally.
While the second technique is easier and faster to set up and implement, the first method gives you a greater performance savings, as typically less data needs to be transferred between servers.
Subqueries as join partners
When working with joins, you may want to manually control the order that data is selected. An easy (and usually safe) way to do this is to use subqueries as the join object instead of joining directly to a table.
In some instances, you can decrease your query execution time by forcing the SQL Server to prefilter data in the table. This method is not foolproof and if used incorrectly it can increase the execution time of your query. The method should be fully tested before moving it to your production environment.
As we have seen, there are some quick and easy methods for improving query performance for some long-running processes. While these techniques will not apply to every issue you run across, they will help in some instances.
ABOUT THE AUTHOR
Denny Cherry has over a decade of experience managing SQL Server, including MySpace.com's over 175 million user installation, one of the largest in the world. Denny's areas of technical expertise include system architecture, performance tuning, replication and troubleshooting. Denny uses these skills on a regular basis in his current role as a Sr. Database Administrator and Architect at Awareness Technologies. Denny is a longtime member of PASS and Quest Software's Association of SQL Server Experts and has written numerous technical articles on SQL Server management.