Optimize disk configuration in SQL Server
One of the easiest ways to improve the lifetime performance of a SQL Server database is proper
setup of the physical and logical drives. While it's an easy technique, proper disk subsystem
configuration is often overlooked or handled by a member of the IT staff other than a SQL Server
DBA.
Premium Access
Register now for unlimited access to our premium content across our network of over 70 information Technology web sites.
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States.
Privacy
Dig Deeper
-
People who read this also read...
-
This was first published in June 2007
All too often the disk subsystem, including the disk array, is configured based only on storage
capacity, with no question of drive performance. Let's go beyond storage capacity requirements and
design for drive performance.
Storage specific terms that everyone should know
Return to Table of Contents
Before you sit down with a storage administrator or engineer to map out your plans for disk
configuration, here are basic preparation steps to take. Start by getting familiar with the terms
below and communicate your requirements much easier.
- RAID – Redundant Array of Inexpensive Disks, also known as Redundant Array of
Independent Disks.
- Disk subsystem – A general term that refers to the disks on the server.
- Spindle – Spindles are another way to refer to the physical disk drives that make up the
RAID array.
- I/O Ops – Input/Output operations, usually measured per second.
- Queuing – Number of I/O Ops that are pending completion by the disk subsystem.
- SAN – Storage area networks are collections of storage devices and fibre switches
connected together along with the servers that access the storage on the device. SAN has also
become a generic term, which refers to the physical storage drives such as EMC, 3PAR and
Hitachi.
- LUN – Logical Unit Number – This is the identification number assigned to a volume when
created on a SAN device.
- Physical drive – How Windows sees any RAID array, single drive or LUN that is attached
to the server.
- Logical drive – How Windows presents drives to the user (C:, D:, E:, etc.).
- Block size – The amount of data read from the spindles in a single read operation. This
size varies per vendor from 8 KB to 256 MB.
- Hardware array – A RAID array created using a physical RAID controller.
- Software array – A RAID array created within Windows using the computer management
snap-in.
- Hot spare – A spindle that sits in the drive cage and is added to the array
automatically in the event of a drive failure. While this does not increase capacity, it does
reduce the amount of time that the array is susceptible to data loss because of a second failed
drive.
- Recovery time – Amount of time needed for the RAID array to become fully redundant after
a failed drive has been replaced, either manually or automatically via a hot spare.
What's the best RAID level for your system?
Return to Table of Contents
There are many RAID levels available on modern RAID controllers. Only a subset of these is most
useful when configuring a Microsoft SQL Server. Each RAID level has a specific use and benefit.
Using the wrong type of RAID level can not only hurt system performance, but also add more cost to
your server configuration. Experts recommend that you never use software arrays on a database
server. Use of software arrays requires additional CPU power from Windows in order to calculate
which physical disk the data is written to. In hardware arrays, this overhead is offloaded to a
physical PCI, PCIe or PCIx card within the computer (or within the SAN device), which has its own
processor and software dedicated to this task.
RAID 1 – Mirror. A RAID 1 array is most useful for high write files, such as the page
file, transaction logs and tempdb database. A RAID 1 array takes two physical disks and creates an
exact duplicate of the primary drive on the backup drive. There is no performance gain or loss when
using a RAID 1 array. This array can survive a single drive failure without incurring any data
loss.
RAID 5 – Redundant Stripe Set. A RAID 5 array is most useful for high read files such as
the database files (mdf and ndf files) and file shares. It is the most cost-effective, high-speed
RAID configuration. With a RAID 5 array, there is a performance impact while writing data to the
array because a parity bit must be calculated for each write operation performed. For read
performance, the basic formula is (n-1)*o where n is the number of disks in the RAID 5 array
and o is the number of I/O operations each disk can perform. Note: While this calculation
is not perfectly accurate, it is generally considered close enough for most uses. A RAID 5
array can survive a single drive failure without incurring any data loss.
RAID 6 – Double Redundant Stripe Set. Like a RAID 5 array, a RAID 6 array is most useful
for high read files such as the database and file shares. With RAID 6, there is also a performance
impact while writing data to the array because two parity bits must be calculated for each write
operation performed. The same basic formula is used to calculate the potential performance of the
drives (n-2)*o. A RAID 6 array can survive two drive failures without incurring any data loss.
Because of the dual parity bits with RAID 6, it is more expensive to purchase than a RAID 5
array. However, RAID 6 offers a higher level of protection than RAID 5. When choosing between RAID
5 and RAID 6, consider the length of time to rebuild your array, potential loss of a second drive
during that rebuild time, and cost.
RAID 10 – Mirrored Strip Sets. A RAID
10 array is most useful for high read or high write operations. RAID 10 is extremely fast;
however, it is also extremely expensive (compared to the other RAID levels available). In basic
terms, a RAID 10 array is several RAID 1 arrays stripped together for performance. As with a RAID 1
array, as data is written to the active drive in the pair, it is also written to the secondary
drive in the pair. A RAID 10 array can survive several drive failures so long as no two drives in a
single pair are lost.
RAID 50 – Stripped RAID 5 Arrays. A RAID 50 array is an extremely high-performing RAID
array useful for very high-load databases. This type of array can typically only be done in a SAN
environment. Two or more RAID 5 arrays are taken and stripped together and data is then written to
the various RAID 5 arrays. While there is no redundancy between RAID 5 arrays, it's unnecessary
because the redundancy is handled within the RAID 5 arrays. A RAID 50 array can survive several
drive failures so long as only a single drive per RAID 5 array fails.
Best practices for drive and physical file layout
configuration
Return to Table of Contents
-
With the introduction of larger and larger hard drives, storage administrators have been
presented with a new issue: increased recovery times. In the past, 172 GB drives were the standard
size deployed. Even if you had five drives in a RAID 5 array, the total storage of the array was
only ~688 Gigs. At that size, recovery from a failed drive would take only hours. With those five
drives now closing in on the 1TB mark per drive, they now create a 4 TB array. This larger array
could take up to a day to recover (depending on system load, disk speed and so on).
Because of increased recovery time and, therefore, increased risk of data loss, it is
recommended that smaller disks be used in greater numbers. Instead of a single 2 TB array, use
smaller drives and make several 500 GB arrays. Not only will this improve your recovery time, but
your system performance will increase, too, as you now have many more disks in your arrays.
Whenever possible, connect each RAID array to its own RAID controller. That will minimize the
possibility of overloading the RAID controller with I/O operations.
- When setting up your disk subsystems, there are five basic groups of files to keep in mind.
These are the data files (mdf and ndf files), log files (ldf files), tempdb database, the SQL
Server binaries (files that are the actual SQL Server software) and Windows. Windows and
 |
| More on storage and disk arrays topics in SQL Server: |
|
|
|
|
 |
 |
the SQL binaries will perform nicely on a single RAID 1 array,
which should include the Windows page file. Note: Although it can be moved to another RAID 1
array, it's not necessary and will not provide any additional performance benefit.
Data files should be placed on one or more RAID 5 or RAID 6 arrays (based on system needs). In
certain systems, you should place the data files on RAID 1 or RAID 10 arrays, but those systems are
in the minority. Place the transaction log files on one or more RAID 1 or RAID 10 arrays (again,
based on size and performance needs). The tempdb database should be placed on its own RAID 1 or
RAID 10 array. None of these file groups (other than Windows and the SQL binaries) should share
physical drives. By separating your files into these groups, you will see a definite improvement in
performance.
- To determine the number of spindles,
you need to first know a few things about your hardware and your database and application. For the
hardware, find out the number of IOPs each physical drive can handle. For the application, you need
to know the typical high watermark that your database will need to function. The number of spindles
you have times the number of IOPs that each drive can handle must be higher than the high watermark
number of IOPs that your database requires.
Drive layout examples for SQL Server systems
Return to Table of Contents
Now that we have covered the basics, let's put all this information together and look at some
server drive layouts. These systems all have a RAID 1 drive holding the Operating System, page file
and SQL Server binaries:
The small database server. This would be a database system having a small number of users
and a low number of transactions per second. For a system of this type, putting the data files,
transaction logs and tempdb database on a single RAID 5 array will be fine. With a load this low,
there should be minimal to no performance loss because of having the files on the same array.
The medium database server. This would be a system that has a larger number of users and
no more than 100 transactions per second. At this system size, you will begin to separate parts of
the database. Start by separating the database files and log files onto different RAID arrays. The
tempdb database can probably remain on the same array as the database files. But, if the tempdb
database is heavily used for things like lots of temporary table usage, it should be moved to its
own drive array.
The large database server. This system is for hundreds of users and hundreds or thousands
of transactions per second. When designing systems of this size, break apart not only the database
files, log files and tempdb database files, but also want database into smaller data files on
multiple drives. This includes moving the indexes to separate files on RAID 1 or RAID 5 arrays,
moving blob data (data stored in TEXT, NTEXT, IMAGE, VARCHAR(MAX) and NVARCHAR(MAX) data types) to
a separate RAID 5 array. You can also place different types of tables on different drives by
assigning them to different file groups.
It is recommended that for larger systems, each file group of the database should have between
.25 to one physical file per physical CPU core in the server. The tempdb database should have one
data file per physical CPU core in the server.
It is highly recommended that all partitions are configured by using the DISKPAR (Windows
2000) or DISKPART (Windows 2003) commands. When using DISKPAR, adjust the alignment by 512 bytes,
and when using DISKPART, the disk should be aligned to 64. The reason for this is due to the
original master boot record design of WinTel based systems. The master boot record for all drives
is 63 blocks (1 block = 512 bytes).
The physical disks want to read and write data in 64 block chunks. Because the master boot
record is only 63 blocks, this puts the first block of actual data in block location 64, where it
should be in block location 65. That forces the disk to read 128 blocks for each 64 blocks read to
the disk, thereby increasing the work needed to be done and decreasing performance.
It is so highly recommended that volumes be created with this 64 block offset that
Microsoft is including this procedure as the standard when creating partitions starting in
Microsoft Windows 2008 Server. There are no published figures on what sort of performance
improvement will be seen by creating your disks using this method. It's because any numbers would
be relevant to only the system they were taken against, as all databases are different.
Unfortunately, once a partition has been created without the alignment offset, there is no easy way
to change the offset. The only method for doing that is to create a new volume and partition with
the offset, take down the SQL Server and manually migrate the files to the new drive in an offline
manor.
When designing larger database servers, it is important to pay careful attention to the disk
configuration. An improperly laid out disk configuration can easily have an adverse impact on your
database performance.
Even though laying out the disk subsystem of a SQL Server appears to be a simple task, take
great care to get the maximum amount of performance from your drive subsystem. A poorly configured
array can and will have a detrimental effect on the performance of your database system. When
planning and purchasing your storage hardware, think not only of the capacity, but also about the
amount of I/O operations that the disks will need to handle -- not only today, but far into the
future. Proper planning will give you a high-performing SQL Server today and years from now. When
designing a system, your goal should be that it performs well beyond the expected lifetime of the
hardware it resides on.
ABOUT THE AUTHOR
Denny Cherry is a DBA and database architect managing one of the largest SQL Server
installations in the world, supporting more than 175 million users. Denny's primary areas of
expertise are system architecture, performance tuning, replication and troubleshooting.
Disclaimer:
Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.
Join the conversationComment
Share
Comments
Results
Contribute to the conversation