Storage area networks, or SANs, are widely used to store data for Microsoft SQL Server installations with large...
databases. SANs are one of the most economical ways to deal with very large data sets. They're designed to scale better in this regard than disk arrays installed directly on the host. However, setting up SQL Server databases on a SAN requires some awareness on the part of the database administrator about the way SANs work. You can't simply dump a database onto the SAN and expect to get the same results you've been getting.
Here are some pointers about how to get the most out of a SAN setup when using SQL Server.
The first issue to consider is the bandwidth of the data channel. SANs are typically connected to their host over a fiber optic link. While these links are fast — the 20GFC protocol can in theory produce up to 2,000 MB/sec throughput — there's a big difference between their rated speed and the actual speed obtained when connected to a host. If you have an existing SQL Server installation that you're migrating to a SAN setup, find out what the I/O demands are for the existing setup first. Odds are the SAN will be able to comfortably encompass those demands, but make sure that is the case before you make the leap.
Measuring bandwidth demands in SQL Server 2005 isn't too tough -- just set up the performance monitoring application to derive logs of SQL Server's I/O usage. You can do this over the course of a day or even an hour of high utilization. This may also give you a chance to flag any previously undetected I/O performance issues that need to be squelched in a high-performance environment. SQLIO is another useful, albeit unsupported, tool for deriving live I/O stats.
The specific SAN you choose should be a robustly-designed product able to host SQL Server data reliably. Microsoft uses the term Stable media to describe any storage system that can survive a system reboot or failure without losing anything, including pending writes that might be currently held in a cache. The idea is to have a disk system that complements SQL Server's own needs as far as data consistency goes. To be frank, almost any SAN worth spending money on is already going to sport those sorts of features; if it doesn't, then you're not getting much of a SAN. One useful feature for SQL Server 2005 is the point-in-time snapshot. You use it in conjunction with things like Analysis Services in creative ways (the Analysis Services link goes to an article that tells you how to accomplish this).
Another issue to consider is the way SAN abstracts the physical devices it presents to the system. SANs present their devices to the computer as if they
were local disks, but the LUNs the computer sees and the actual disk arrangement can be radically different. You'll want to know about these things if the plan for your database, table and physical file structure is to take maximum advantage of parallelism -- and it better be, whenever you can).
For example: If you have a database that you want to place entirely on its own physical spindle, you may be inclined to do this by assigning it to a given LUN as advertised by the SAN. But, if you don't know that said LUN is actually split across disks that are shared by another LUN, then you won't get the performance you need. If you are not responsible for setting up the storage on the SAN, consult with the person who is and describe your needs to her in detail. The folks at the Microsoft SQL Server Development Customer Advisory Team have some suggestions about how to configure LUNs on a SAN how to configure LUNs on a SAN for SQL Server, with both rules and exceptions spelled out in detail.
My colleague Hilary Cotter has written a series of general suggestions for SANs in SQL Server that are worth noting. One suggestion is particularly applicable here: Use zoning, a feature supported by many SANs. Zoning allows specific disks in the SAN to be dedicated to a specific LUN, and therefore allows more accurate capacity and performance planning.
One final note, which also comes on behalf of Microsoft, is about which RAID level to use on the SAN. For SQL Server data and logs, Microsoft recommends using RAID 10, when possible, for a variety of reasons. It offers better availability than RAID 5 and better support for write-heavy environments (making it a good choice for the temporary database, too). The extra cost of implementing RAID 10 is more than worth it if you can afford it. If you can't shell out that much more, RAID 5 is an acceptable substitute in most cases, although it does come with a bit of a hit to performance.
ABOUT THE AUTHOR
Serdar Yegulalphas been writing about computers and information technology for more than 15 years for a variety of publications, including InformationWeek and Windows Magazine.