The name of the game is I/O throughput. Running backup and restore sessions is one of the most I/O intensive activities...
you will perform on your SQL Server. That's because when you run a backup and/or restore you are reading the entire contents of the database and then writing the entire contents of the database or vice versa. The one big difference between that and executing queries is that the system reads the database file sequentially rather than in a random fashion, making it a much more efficient process. Because such a large amount of data needs to be accessed and written, the best approach is to split up your I/O activity as much as possible.
From a database perspective, there is not much you can do at this point. The database has been laid out already in filegroups, which have been placed on certain drives/disks on your system. As always, you can still modify and move your data to different drives/disks and filegroups, but this is something you would not be doing on a regular basis. Since we are talking about backups and restores, we will leave the database layout for another tip.
Most of what you must do should be pretty obvious. The more you can isolate disk activities, the greater throughput you will have for your backups. For instance, there are several tests that have been performed using third-party SQL Server backup tools where a several-terabyte-sized database can be backed up in under an hour. To achieve that kind of speed, a lot of thought and planning went into the hardware configuration.
As I mentioned earlier, it's important to optimize both the reading and writing of the backup process. In the example above, the vendors had the luxury of laying out the database as well as the disk subsystem for the backup file. Since you already have your database up and running, let's take a look at what can be done on the receiving end.
When setting up your database server consider such things as whether or not you should be locally attached or use storage area network (SAN) or network-attached storage (NAS). In addition, consider what types and how many controllers you will be using -- SCSI, iSCSI or Fibre Channel. After you have these things determined, think about the RAID levels you will be using for your database server.
RAID (redundant array of independent disks) levels come in a lot of different variations. The most common are:
- RAID 0 – striping
- RAID 1 – mirroring
- RAID 5 – distributed parity
- RAID 10 – stripe and mirror
Now, let's talk about the four RAID levels mentioned above. Since the backup process is a read process (reading your database file, which already exists) and a write process (writing your backup file), we will look at the writing side of the equation and which RAID level is best for your environment.
This RAID level offers the highest throughput from a write perspective, but there is no redundancy built in. Since your backup files are your last line of defense and there is not a redundant set of data, I would not recommend this option.
This level also offers a high level of I/O throughput as well as a redundant copy. The downside is that the size of the data partition using RAID 1 will be limited based on your physical disk size. One option would be to write your backup file to multiple RAID 1 arrays that are independent of each other and, possibly, to use different controllers or channels to further increase the I/O throughput.
With RAID 5 there is a write penalty because of the need to keep the parity in check. If you are really looking for faster backup processing, this RAID level does not make sense.
With RAID 10, you get the advantage of both RAID 0 and RAID 1. The downside is that it becomes very expensive to implement. But, the upside is your I/O throughput should be very fast. Since RAID 10 uses all of the drives in the array, to gain higher I/O rates, the more drives in the array will increase performance.
In addition to RAID levels, here are other things to take into consideration:
In setting up your RAID arrays, try to isolate the different types of I/O traffic to different controller cards or channels on the controller cards. Also, the better the controller card, the better the performance.
Disk drives come in multiple speeds. SCSI drives currently run as fast as 15 K rpm and are faster than the IDE drives. In most cases, enterprise-wide servers will probably be using SCSI technology which is the better choice.
You can now get drives that are several GBs in size. This is great if you want to store a lot of data that is not accessed very often. But for writing out a large amount of data as fast as possible, large disk size doesn't help. What will be more advantageous are more drives in the array versus a couple of very large drives.
As with a lot of things, the best approach is not always the most practical. There may be financial limitations, physical server limitations or other constraints. This should give you an idea of what you can do to ensure fast backups and restores for new servers. For existing servers, making these changes might not be as easy. In any case, now you have some additional information about what could or should be done whenever you get a chance to consolidate or migrate to new hardware.
About the Author: Greg Robidoux is the president and founder of Edgewood Solutions LLC, a technology services company delivering professional services and product solutions for Microsoft SQL Server. He has authored numerous articles and has delivered presentations at regional SQL Server users' groups and national SQL Server events. Robidoux, who also serves as the SearchSQLServer.com Backup and Recovery expert, welcomes your questions.
More from SearchSQLServer.com