In the first part of this tips series, Edgewood Solutions' Greg Robidoux discusses hardware considerations for simpler failovers. In future tips, he'll discuss software and other supporting data needed for failover.
Failing over to another system in a disaster recovery situation is the last thing you want to do. Although you won't have to do this often, you may need to eventually -- and it's better to be prepared than not.
You may assume that the simpler the implementation the simpler the failover process will be. From a hardware perspective that may be the case, but from a SQL Server performance and high availability standpoint, simple is probably not the way to go.
In the following tip I'll describe what a simple SQL Server hardware implementation would look like and why that may be an ideal configuration for failover but not for performance and availability. I'll also discuss failover methods. Use the following table of contents to navigate this tip.
|TABLE OF CONTENTS
Simple hardware implementation for simple failover
Hardware and performance
Hardware and high availability
Primary hardware components for failover
Additional failover considerations
From the most basic standpoint your simplest hardware implementation would consist of something like the following:
- one physical server;
- one disk array made up of one or multiple disks;
- one controller;
- one network card;
- one or more CPUs;
- operating system software;
- SQL Server software;
- various service packs to keep the OS and SQL Server up to date;
- and that's about it!
This configuration is nice and simple, but lacks quite a bit in the high availability and performance area. So we need to think about what we are trying to achieve:
2. Make sure failover does not occur and, in the remote chance that it does, make it easier to failover to another server.
My choice, which I'm sure is also yours, is option number two.
In order to improve performance, most technical advice written about SQL Server and database systems tells you to spread I/O across as many disks as possible and separate different I/O functions, such as data, transactions, tempDB, paging file, work space, etc. In addition to breaking apart I/O functions, you must ensure different databases are not contending with the same physical I/O, which would require greater physical separation between them. So, you must first separate I/O for one database with data, logs, etc., and you may then need to separate disk I/O among multiple databases running on the same server.
You also have indexes and data that could be separated to further increase I/O, and you must consider controllers and their configurations to make sure they are not creating bottlenecks.
The overall disk configuration really depends on the type of application and database usage for your server. If your database needs do not warrant separating all disk activity, don't make the hardware overly complex by separating all disk activity.
Unfortunately there may not be a simple hardware configuration for failover. Performance impacts users every day, and they are well aware of performance changes. Systems must be tuned and hardware configured to maximize the user experience.
Hardware and high availability
On the other side we have high availability. In most cases a full-blown HA solution is often not implemented; instead a manual process is used to fail over to a secondary system. Although there have been several options available for quite some time, I am just now starting to see a market shift toward implementing complete failover solutions to minimize downtime as much as possible. Such interest may be driven by new database mirroring features in SQL Server 2005 and people's desire for greater HA capabilities, or the fact that more and more enterprise applications are being migrated to SQL Server and there is a business need for HA.
Several failover options exist, from free to expensive and simple to complex. Some are built into SQL Server and others are add-ons.
- Log shipping (free/simple)
This option has been pretty widely used for quite some time now. There are built-in capabilities in SQL Server 2000 Enterprise Edition, but this is method could easily be recreated in any version. SQL Server 2005 now has built-in tools for both Standard and Enterprise. The downside is that there is a lot of manual work for the failover.
- Hardware clustering (expensive/complex)
This option offers automatic failover, but also comes at a cost.
- Software clustering (expensive/complex)
This is another option that offers automatic failover at a cost.
- Backup and restore (free/simple)
This is a manual process for failover. It doesn't cost anything, but definitely provides no automation.
- Replication (free/complex)
This option is not the best option for a failover server. There are a lot off moving parts with replication and it is a lot of work to keep database changes in synch. If you employ this option there is still a lot of manual work required for failover.
- Database mirroring (free/simple-complex)
This is a new feature in SQL Server 2005. This option automates the failover for you, but other will need to be handled manually.
So what does make for a simple failover?
There are four primary SQL Server hardware components that come into play: memory, processors, network and disk I/O. These four items are core to your server's performance. From the memory, processor and network standpoint, in general, there is not much you need to do. SQL Server will take advantage of what is available. The faster the components the better the performance, so going from the primary server to the failover server is straight forward.
It's disk configuration that makes failover more complex. The failover option you choose determines what you must do for a successful failover. Disk configuration may be as simple as one large partition where all data, logs, work files, etc., are stored, to something as complex as having all disk activity separated on different arrays, drive letters, file shares, etc. Some other things that may make this configuration more complex are the use of one or more of the following:
- storage area network (SAN)
- network-attached storage (NAS)
- different RAID configurations
- filegroups on different physical drives
- file shares
In addition to data layout, data movement from the primary server to the secondary server should be a concern. If you have very large databases that need to be moved it will take some time to migrate data across the network. Having GB network cards and a network to support the movement of the data is also key, otherwise you could be waiting several hours just to move data from one server to another. If the bandwidth is already saturated, then the copy of the backup file will be very slow. Most companies have a management LAN or VLAN to support network traffic between machines.
Also consider the speed of your disk drives and how fast they can write data. The fastest drives available now are 15K SCSI drives. These drives will allow for faster data manipulation, which will also improve disk I/O.
Additional failover considerations
Up until this point, the difficult decision has been how to lay out data across disks. The real complex part is the software and other supporting data needed for failover. Since this tip is about the hardware, we will not go into those points just yet, but some of these things include:
- Other software applications
- Service packs
- Security settings
- Server names
- IP addresses
- OS versions
- Jobs, alerts and operators
- DTS (Data Transformation Services) or SSIS (SQL Server Integration Services) packages
- Application changes to point to failover server
In putting together a failover solution hardware is not the biggest issue. It is important to have comparable hardware for the failover, but more importantly you should consider what you are trying to prevent, make sure the systems can perform at there maximum ability and then think about what is the best solution for a failover. Is it clustering, database mirroring, log shipping, replication or some other technique?
You must also think about how long the failover process will take and how long your end users can be down. This will be the driving factor to determine what solution simple or complex, free or not, that makes the most sense for your environment.
About the author: Greg Robidoux is the president and founder of Edgewood Solutions LLC, a technology services company delivering professional services and product solutions for Microsoft SQL Server. He has authored numerous articles and has delivered presentations at regional SQL Server users' groups and national SQL Server events. Robidoux, who also serves as the SearchSQLServer.com Backup and Recovery expert, welcomes your questions.
More information from SearchSQLServer.com
- Ask the Experts: How does failover clustering work in SQL Server 2005
- Q&A: SQL Server Database Mirroring primer
- Checklist: How to maintain an effective DR strategy