SQL Server expert Matthew Schroeder gets down into the trenches to help resolve your SQL Server issues. Whether...
it's a failed cluster installation, a long-running stored procedure that causes a 3 a.m. phone call or a database in suspect mode, he's got you covered with this handbook for SQL Server DBAs. Learn how to solve SQL Server errors, problems and installation failures with this advice.
Most of the articles I write are about SQL Server concepts or how to do a specific process. I wanted to change things around a bit and write about DBA issues from the frontline. Here are real solutions to real SQL Server problems that occur when you're working on the leading edge of SQL Server.
Causes of SQL Server cluster installation failures
- Data Execution Prevention (DEP) (control panel\system\advanced\performance) -- Select Turn on DEP for Essential Programs Only, since any other setting will cause the installation to fail.
- Trailing spaces -- Trailing spaces on the NIC names will cause the installation to fail on the installation of the Database Engine piece with error number 67.
- Logging into other nodes -- If you or another user are RDPed into a node other than the one you are installing from, then the installation will fail.
- Internet Explorer Enhanced Security (control panel\add remove\system components) -- Often this will interfere with installations. It's best to leave it turned off to avoid problems.
SQL Server performance issues at 3 a.m.
- Profiler trace shows low CPU/IO consumption, but incredibly long duration. Typically, it is because of an out-of-date execution plan. Updating the stats will often eliminate this issue.
- Transactions are timing out, but disk space seems plentiful and it's a simple insert with just one row. Tracing the activity might reveal that the transactions are actually timing out while they try to grow the files. If a database is 100GB or more, having a growth percentage at 10% on a server under load can cause queries to start timing out. It's best to grow the databases out at a fixed rate rather than a percentage.
- Profiler trace shows a query timing out with high CPU/IO consumption. Assuming this is a change of behavior and the query behaved well before, chances are the indexes need to be defragmented, the statistics are out of date and/or the proc needs to be recompiled. Your best bet is to check the index fragmentation and, if necessary, re-index. Always try to update the statistics on the table and after that recompile the proc. If you update the statistics, but don't recompile the proc, sometimes it doesn't pick up the updated statistics immediately.
- A stored procedure takes forever to run. But if you paste the logic into query analyzer and run it directly getting extremely quick results, then be sure to check the parameters. If all the parameters have a default value, there is a good chance you're the victim of parameter sniffing. Basically, parameter sniffing is how Microsoft optimizes the execution plans of a stored procedure using certain parameters. In order for it to work correctly, you need at least one parameter on a stored procedure without any default value.
- The Max Degrees of Parallelism setting determines how many processors (cores) can be used per query. A setting of 0 indicates that queries can use all cores to execute. The only problem with this is that a long-running CPU-intensive operation can easily grab all the processors and block other users from running. On the flip side, limiting each query to a small number of processors causes statements such as re-indexing to take much longer to run and blocks users behind them. So it's a delicate setting.
- Table variables work excellent with small datasets. However, as the number of rows in the tables exceeds, the variables tend to cause lockups on production systems. Also, joins between table variables and real tables often cause table scans, since they are not indexed. The only way to index a table variable is to define a primary key constraint on it that generates an implicit clustered index. I would recommend replacing them with temporary tables, inserting the data, then applying indexes as needed that will match the production tables.
Backup and recovery options when there's a problem
- Database in suspect mode -- Choose one of two methods to resolve.
- Run sp_resetstatus 'dbname', then restart SQL Server.
- Run sp_configure, set Show Advanced Options to 1, run reconfigure, then run sp_configure; set Allow Updates to 1, then run reconfigure. Finally update the status in sys.databases and restart SQL Server.
You can't update sys.databases directly unless you set Allow Updates to 1. The first option resets a suspect database to normal and the second option allows you to modify the database setting to anything. If the database goes back into suspect mode, you have corruption. Once the database is no longer marked as suspect, run dbcc checkdb to repair damage.
- Database shrinking – Performing an end task, a lock-up or an accidental client shut off during this operation will occasionally throw the database into suspect mode. Don't worry, no corruption has occurred. Simply reset the status.
- Transaction log full – You have three choices here:
- Choose to back up the transaction log, thus clearing it.
- Switch the recovery mode to simple (not desirable in most cases).
- Add a new log file on an existing drive.
By far, option 2 is typically the fastest way to get the system back running. But it's often a poor choice because of the loss of point-in-time recovery and traceability. In most cases, option 1 is your best bet and is fairly fast. Running a full backup would have the same effect but would really delay your ability to get the system back running, since the transaction log isn't free until the full backup is complete.
Some odd SQL Server issues
- Errors in contains table relating to the use of the and and. Microsoft's full-text search functions are very picky about the use of "noise" words and will fail in many cases if they are passed as a search string. You can either edit the list of noise words contained on the SQL Server, then rebuild the catalogs, or you can use sp_configure transform noise words, 1 to tell SQL Server to ignore noise words that come in as part of the search criteria.
- Date constants in a multi-region system. When used in applications that might be deployed with various regional settings, data constants can be tricky. Declare @dDate datetime, set @dDate = '3/10/08' won't work for many regions and neither will set @dDate = '2008/03/10'. In order for your application to work in all regions, you should follow Microsoft's spec for date constants. The following spec works for all regions and allows your system to work internationally: set @dDate = '20080310'. It works for all regions and consists of the four-digit year, two-digit month and two-digit day.
- Creating a server login, then attempting to map it to a database, and it fails. Chances are the user already exists in the database and just needs to be linked to the server login. Run this command within the problem database: sp_change_users_login 'update_one', 'DBUser', 'ServerLogin'. Typically, the DBUser and the ServerLogin are the same names, but it's not required.
- "Property owner is not available for database dbname" error when right-clicking and selecting database properties. This occurs when the owner of a database is deleted from the system and can be resolved by altering authorization on database::dbname to NewLogin. DBs should be created with sa ownership wherever possible to avoid this situation.
- Serialization error when shrinking a log in SQL Server 2005. SQL Server thinks that some kind of backup is in process and can't shrink a log file while it is in process. If a backup is not in process, then this error is usually caused by a backup or a third-party backup tool that crashed. Running a full backup will reset the serialization and the lob will shrink.
I covered a lot of issues that you'll often discover by accident as part of your daily tasks. Hopefully, I helped point you in the right direction so you'll spend less time researching these issues. Stay tuned for further articles from the trench.
ABOUT THE AUTHOR
Matthew Schroeder is a senior software engineer who works on SQL Server database systems ranging in size from 2 GB to 3+ TB, with between 2k and 40+k trans/sec. He specializes in OLTP/OLAP DBMS systems as well as highly scalable processing systems written in .NET. Matthew is a Microsoft certified MCITP, Database Developer, has a master's degree in computer science and more than 12 years of experience in SQL Server/Oracle. He can be reached at firstname.lastname@example.org.
Do you have a comment on this tip? Let us know.
I'd never heard of most of those problems or their fixes. Thanks for going to the trouble to write it. I especially liked the "Problems at 3:00 a.m." section.