Some of the toughest SQL Server error messages to troubleshoot are memory related. Some error messages are logged
and say to "reduce system memory load or increase system memory," but most of the time troubleshooting memory problems in SQL Server is much harder than that. Typically, by the time messages of this type are displayed, the system is in dire need of attention.
When dealing with memory errors and memory problems in general, 95% of the time the memory problem is simply a symptom, not the root cause of the problem. Unfortunately, in cases when the 5% of the time is actually a low memory problem, the diagnoses is going to be an exclusionary diagnoses -- meaning you first have to check the other options. Once you explore and eliminate all other options, the remaining solution is that the server needs more physical memory installed.
Why proper indexing is important for SQL Server memory
The most common root cause of memory pressure is probably incorrect indexing options. As the size of the database increases, you'll want to look more and more into removing clustered indexes from your very large tables. By changing the clustered primary keys to nonclustered primary keys, you'll increase the amount of load on the disk subsystem. But because the clustered index contains the entire table, when the clustered index is loaded into memory, much more memory is used than loading a non-clustered version of the index. This will dramatically reduce your memory requirements for this table object.
As an example, I have a very wide table (average of 270 bytes per row) that has a total of 63 GB of data, meaning the clustered index is also 63 GB in size. By changing the primary key from a clustered index to a nonclustered index, the index space requirements drop from 63 GB to approx 1 GB (including padding).
This greatly reduces the amount of storage required, and when doing index scans against the primary key of the table, only 1 GB of data has to be loaded into memory – which means the data can be left in memory much longer.
This production server has only 16 GB of RAM installed, so having a 63 GB index in memory that must be queried constantly (the table is used very often within the application) is not a very effective use of memory. By making these changes to the primary key, disk load increased a small percentage. But memory load decreased dramatically allowing SQL Server to keep data in cache much longer, while also allowing SQL Server to cache other objects into memory as well.
When you look at objects that might be good candidates for having the clustered index converted to a nonclustered index, look at the sys.dm_os_buffer_descriptors Dynamic Management View (DMV). This DMV indicates which objects are in which data pages in memory. When looking at this DMV, you need to join it to the sys.allocation_units DMV to see which partition the allocation_unit is in. From there you can join to the sys.partitions DMV and you'll get the object_id of the object. While the sys.dm_is_buffer_descriptors, DMV is system-wide, the sys.allocation_units and sys.partitions DMVs are database-specific. This query tells you which tables and in what percentages they are loaded into memory. Only user tables are included in this query as system objects cannot be changed.
SELECT sys.tables.name TableName,
sum(a.page_id)*8 AS MemorySpaceKB,
SUM(sys.allocation_units.data_pages)*8 AS StorageSpaceKB,
CASE WHEN SUM(sys.allocation_units.data_pages) <> 0 THEN
SUM(a.page_id)/CAST(SUM(sys.allocation_units.data_pages) AS NUMERIC(18,2))
END AS 'Percentage Of Object In Memory'
FROM (SELECT database_id, allocation_unit_id, COUNT(page_id) page_id FROM
sys.dm_os_buffer_descriptors GROUP BY database_id, allocation_unit_id) a
JOIN sys.allocation_units ON a.allocation_unit_id =
JOIN sys.partitions ON (sys.allocation_units.type IN (1,3)
AND sys.allocation_units.container_id = sys.partitions.hobt_id)
OR (sys.allocation_units.type = 2 AND sys.allocation_units.container_id
JOIN sys.tables ON sys.partitions.object_id = sys.tables.object_id
AND sys.tables.is_ms_shipped = 0
WHERE a.database_id = DB_ID()
GROUP BY sys.tables.name
How to use PerfMon to check SQL Server memory usage
Some smaller objects can show a greater-than-100% loading in memory. This is normal. It appears to be due to SQL Server allocating additional space for the table to grow in memory without having to wait for additional data pages to be located before allocating the new page. From what I have seen, this happens when the data page holding the table data is fairly full.
If you've tuned your system to remove unneeded data from memory and are still having issues, you can use Peformance Monitor to see how the buffer cache is performing. Some good counters to look at are the SQLServer: Buffer Manager, Buffer Cache Hit Ratio and the SQLServer: Buffer Manager, Page life expectancy. The Buffer Cache Hit Ratio tells you how often SQL Server is able to find the data it is looking for in memory. The Page Life Expectancy shows you how long SQL Server expects to be able to keep data in memory before it is flushed out, allowing for additional data to be loaded from disk. This number is shown in seconds, and the higher the number the better. As this number falls lower and lower, disk I/O continues to increase because SQL Server must load the same data from the disk over and over and the data is flushed from memory to disk.
Differences in allocating Windows OS memory and SQL Server memory
When allocating memory from the operating system (OS) to SQL Server, it's important not to give SQL Server too much memory. By not reserving enough memory for the Windows OS and other applications running on the server, you can cause just as much harm to SQL Server than by not allocating enough memory to it. There are two schools of thought for deciding how much memory to allocate to SQL Server. One technique says to allocate 1 GB to 2 GB of RAM to the Windows OS. The second says to allocate approximetly 20% of the RAM to the Windows OS. The amount of memory in the server will dictate which technique you use.
No matter which technique you use, make sure you are allocating at least 512 MB to 1 GB of memory to the Windows OS. Any less and Windows won't have enough memory to handle the OS functions, and Windows will begin paging to the page file much more often than it needs to.
Note: If using SQL Server Enterprise Edition (SQL 2005 and up), you will want to assign the "Lock Pages In Memory" right to the account the SQL Server is running under. This setting tells SQL Server not to page data from RAM to disk when it comes under memory pressure. This setting is ignored in editions of SQL Server below Enterprise Edition.
ABOUT THE AUTHOR
Denny Cherry has over a decade of experience managing SQL Server, including MySpace.com's over 175-million-user installation, one of the largest in the world. Denny's areas of expertise include system architecture, performance tuning, replication and troubleshooting. He currently holds several Microsoft certifications related to SQL Server and is a Microsoft MVP.
Do you have a comment on this tip? Let us know.
Excellent article, as usual from Denny Cherry.
Presumably, adding a large number of fields as included columns to secondary indexes can also result in memory pressure, when one or more of the included columns are involved in the query restriction; is this correct? If one can avoid scans altogether, I believe none of this is an issue, although that's a tall order.
Two comments about this tip…
One alternative I have seen suggested by MVP Simon Sabin to changing the clustered index to nonclustered is to create a new nonclustered index that has the same columns as the clustered index. On a large table, the optimiser will identify the resource advantage of using the nonclustered index over the otherwise identical clustered index. In this way you avoid the increased IO that often occurs after a heap has had a few weeks of updates. The only downside to this idea is you need the extra disk space to hold the nonclustered index, but the cost of this is normally far less than the cost of lost performance in using a heap.
Denny suggests that a greater than 100% loading in memory may be due to SQL allocating extra space to allow for growth. An alternative view is that the % value is based on the number of extents in the bufferpool for the table. SQL performs its IOs in extents and SQL knows how much of each extent is used by the given table. Therefore for a small table that covers 2 extents but only uses 9 pages, we could see SQL reporting a 177% loading in memory for that table. The same algorithm would be used for tables of all sizes, but a larger table is unlikely to show more than 100% loading in memory, even if it does have unused space on many pages. Although in one way this is SQL reserving space for growth, it is due more to the natural way that SQL handles extents and memory, and not a deliberate decision to allocate more memory than would otherwise be needed.
Hi there! Awesome article Denny! I do think you might want to qualify that the "Enable Page Locking" policy only really applies to 32-bit installations. This TechNet article talks about how to enable lock pages in memory option.
With regard to allocating memory, why not leave SQL Server to dynamically adjust its usage? I'm sure I've read that SQL Server attempts to leave at least 5MB of free memory at all times, trimming its working set accordingly. Could Denny explain in a little more detail under what circumstances I would want to interfere with this policy of self-management.