Standby servers are a great mechanism for ensuring the high availability of your enterprise applications. In this tip I'll define what a standby server is and offer options and recommendations for configuring and maintaining such servers.
A standby server is a computer, usually located close to the production server(s), that can temporarily replace a production server that suffers a hardware failure. It can also be used to verify that you can recover databases from their full backups.
Important considerations when setting up a standby server
Ask yourself the following questions; I discuss each of them in greater detail below.
1. Will your standby server be used to verify backups, for log shipping, or for both?
2. If the standby server is used strictly to verify production backups, what is the total size of all databases that you must verify?
3. How many production servers do you intend to replace with the standby server in case of a catastrophic event? What is the most powerful server you would need to replace with the standby server?
4. How much disk space would you need to restore all databases currently residing on production servers to be replaced with the standby server?
5. How much transaction log space is used when you perform index maintenance on your servers?
6. How many SQL Server logins do you use in your applications?
7. If your data must be available 100% of the time, even in the case of a natural disaster, do you have a safe, geographically separate location for storing your standby server?
8. Do your operating system and database management system have the same service packs on production and standby servers?
Using a standby server to ensure that database backups are valid
Let's say you're an enterprise DBA responsible for numerous mission-critical servers. You feel great about your disaster recovery strategy because all of your backups seem to complete in a reasonable timeframe and your jobs never fail. Then one of your mission-critical servers dies and you're left with a tape containing the full backup from the night before the failure. You quickly transfer the backup to the server you want to use in place of the one that is no longer available, fire up Enterprise Manager, choose "Restore Database" and … oops, you get an error similar to the following: "The backup set isn't valid." Murphy's Law never fails: Your backups may be useless when you need them most.
What can you do to avoid such situations? Make sure you can always restore databases from your backups. Fortunately SQL Server has come a long way from its humble beginnings. If you instruct it to verify backups upon completion (as you absolutely should, by using the RESTORE VERIFYONLY command), it will alert you to a backup problem. RESTORE VERIFYONLY was available in previous releases of SQL Server but wasn't always dependable. The statement has been enhanced in SQL Server 2005 to perform additional checking and increase the probability of detecting errors. However, the only way to be 100% sure your backups are valid is to actually restore them on a standby server and check the data. "Why, I would need a full-time employee to verify every single backup," you may complain. Judge for yourself: Would you prefer to lose mission-critical data or hire an additional DBA to verify backups? In fact, if you automate the process of copying backups and restoring them to a standby server, you might not need an additional full-time DBA.
- Note that a server used simply to verify the validity of backups does not have to be as expensive or powerful as the production server; in this case, you don't intend to replace a production server with the standby server. Your goal is to ensure that you can restore databases from backups. As long as you have plenty of disk space, you could use a single standby server to validate backups from several production servers, depending on the size of your mission-critical databases.
- Notwithstanding the previous point, you don't want database backup verification to take any longer than necessary. Keep this in mind when investing in a standby server and ensure that it has plenty of memory and processing power to restore databases relatively quickly.
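As a rough illustration of what automating that verification could look like, the sketch below builds the T-SQL statements a scheduled job might run on the standby server for one backup file. It is only a sketch under stated assumptions: the function names, the database name and the backup path are hypothetical, the DBCC CHECKDB step is one possible way to "check the data" rather than something this tip prescribes, and a real restore onto a server with a different disk layout would also need WITH MOVE clauses.

```python
def quote_ident(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_str(value: str) -> str:
    """Quote a Unicode string literal, doubling any single quote."""
    return "N'" + value.replace("'", "''") + "'"


def build_verify_script(database: str, backup_path: str) -> list[str]:
    """Return the statements to verify one full backup on a standby server.

    Hypothetical helper: it only builds text and does not connect to SQL Server.
    """
    db = quote_ident(database)
    path = quote_str(backup_path)
    return [
        # Quick check that the backup set is readable and complete.
        f"RESTORE VERIFYONLY FROM DISK = {path};",
        # The real test: restore the database over any earlier test copy.
        f"RESTORE DATABASE {db} FROM DISK = {path} WITH REPLACE, RECOVERY;",
        # Assumed extra step: consistency check of the restored data.
        f"DBCC CHECKDB ({db}) WITH NO_INFOMSGS;",
    ]


if __name__ == "__main__":
    # Example values are made up for illustration.
    for statement in build_verify_script("Sales", r"D:\backups\Sales_full.bak"):
        print(statement)
```

The statements could be handed to whatever scheduler or command-line client you already use; a failure at any step is the signal that the backup needs attention.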
Using log shipping to keep a standby server ready to replace the production server after a catastrophic failure
SQL Server provides a great mechanism for setting up a standby server: log shipping. Configuring log shipping is easily accomplished by using the Maintenance Plan Wizard. Note that in SQL Server 2000 the built-in log shipping feature is only supported in Enterprise Edition. Any other edition requires you to configure log shipping manually, by writing your own jobs that copy production database log backups to the standby server and restore them there. Fortunately SQL Server 2005 supports log shipping with Standard Edition and Workgroup Edition as well as Enterprise Edition.
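For readers who have to set this up by hand, the restore half of a manual log shipping job might be sketched as follows. This is an illustration only: the function names and file names are hypothetical, the code assumes the log backup file names sort in the order the backups were taken (for example because they embed a timestamp), and it assumes the standby database was first restored from a full backup WITH NORECOVERY so that it can accept log restores.

```python
def quote_ident(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_log_restore_script(database: str, log_backup_paths: list[str]) -> list[str]:
    """Return RESTORE LOG statements, one per copied log backup, in name order.

    NORECOVERY leaves the standby database in a restoring state so that the
    next log backup can still be applied.
    """
    db = quote_ident(database)
    statements = []
    for path in sorted(log_backup_paths):  # assumes names sort chronologically
        literal = "N'" + path.replace("'", "''") + "'"
        statements.append(f"RESTORE LOG {db} FROM DISK = {literal} WITH NORECOVERY;")
    return statements


def build_failover_statement(database: str) -> str:
    """Return the statement that brings the standby database online."""
    return f"RESTORE DATABASE {quote_ident(database)} WITH RECOVERY;"


if __name__ == "__main__":
    # Hypothetical file names copied from the production server.
    logs = [r"E:\ship\Sales_200601011215.trn", r"E:\ship\Sales_200601011200.trn"]
    for statement in build_log_restore_script("Sales", logs):
        print(statement)
    print(build_failover_statement("Sales"))
```

The failover statement is run only once, when the standby server actually has to take over; after that the database no longer accepts further log restores.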
- If you use a standby database server for log shipping you must size the server appropriately. Keep in mind that this server might be serving part or all of your customers if the primary server fails (a log shipping server is not a good place to save money). For instance, suppose you have a family of eight and your minivan breaks down. Would you rent a Ford Escort or another economy car to replace it? Probably not. Similarly, if your production server has eight processors and 8 GB of RAM you shouldn't replace it with a single-processor desktop machine with 256 MB of RAM. Your log shipping server should be at least as powerful as your production server, if not more powerful.
Why would you want to use a more powerful server for log shipping? Purchasing a standby server for every production box is not always an option; you can use the same standby server to ship logs from multiple production servers. Typically you don't expect several of your production servers to die on the same day (unless your primary data center blows up), but it's not unusual to experience a hard disk failure on multiple servers on the same day. You could try to serve multiple applications (or multiple sets of customers) with the same log shipping server simultaneously for a short time. This is when the additional processing power, plenty of disk space and memory can come in handy.
- So how do you size a log shipping server? It depends on your requirements. If you have a single production server shipping transaction logs to a standby box, the two servers should be identical. If you ship transaction logs from two servers, the log shipping destination server should have at least three times as much disk space as it would take to restore all of your production databases. Why? To restore a database you must first have its full backup on hand, so the backup file and the restored database occupy disk space at the same time. If your database is 100 GB, you need at least 200 GB of free disk space to copy the backup file and then restore the database from it.
In addition, log shipping needs to copy transaction log backups to the destination server before those backups can be applied to a log shipped database. If you have ever done index maintenance on large tables, you know that transaction log backups can become quite large. In an environment that takes transaction log backups every 15 minutes, it's common to see 3, 4 or even 5 GB log backups during index maintenance. You can and should configure SQL Server to delete log backups after they are applied to the log shipping destination, but even so you'll find it difficult to manage disk space if your log shipping server only has the same space as the single production server.
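The arithmetic above can be turned into a small estimate. The sketch below is a back-of-the-envelope calculation, not a sizing rule: the function name is hypothetical, it assumes each full backup file is roughly the size of its database, and the default of four retained 5 GB log backups per database is an assumption chosen to mirror the index-maintenance figures mentioned above.

```python
def required_disk_gb(
    database_sizes_gb: list[float],
    peak_log_backup_gb: float = 5.0,
    retained_log_backups: int = 4,
) -> float:
    """Estimate free disk space needed on a log shipping destination server.

    Assumptions (not from any official sizing guide):
    - each full backup file is about as large as its database;
    - every database may have `retained_log_backups` copied log backups of
      `peak_log_backup_gb` each waiting to be applied or deleted.
    """
    restored_databases = sum(database_sizes_gb)
    full_backup_files = sum(database_sizes_gb)
    log_staging = peak_log_backup_gb * retained_log_backups * len(database_sizes_gb)
    return restored_databases + full_backup_files + log_staging


if __name__ == "__main__":
    # One 100 GB database: 100 GB restored + 100 GB backup file + 20 GB of logs.
    print(required_disk_gb([100.0]))
```

Plugging in your own database sizes and the largest log backups you have observed during index maintenance gives a starting figure; headroom for growth still has to be added on top.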
- If you use numerous SQL Server logins for your databases, you must ensure that those logins exist on the log shipping destination server before you can fail over from the production system. Microsoft recommends using Windows authentication whenever possible, but SQL Server logins have their place. Books Online has a good overview of the process you need in place for transferring logins. If you only use a single login to connect to all of your databases this won't be a huge factor in your failover strategy.
- Log shipping can be used to fail over to a geographically separate location. This is a good idea for mission-critical data that must be available even in the event of a terrorist attack or natural disaster. For example, what happens if your primary data center is destroyed by an earthquake? You can't expect your customers worldwide to wait until you find a new data center. Instead you keep your standby server(s) in a geographically separate data center and ship transaction logs to them from the production servers. Network connectivity between the primary and standby servers must be considered carefully. Transferring data across a slow network link may not be acceptable if you must guarantee data availability. Faster networks allow quicker transfer of transaction logs and thereby a shorter delay in getting your failover servers ready when they are called on to take over the duties of the primary servers.
Whether you use the standby server to verify database backups or for log shipping, you must ensure that its operating system and database engine have the same service pack level as the production server. For example, if your production server is running Windows Server 2003 Service Pack 1, so should your standby server. This way, if you fail over to the log shipping server, you can be sure your application will behave exactly as it does against the production server. The fewer variables you have to work with, the easier it will be to switch over to the log shipping server when the need arises.
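To get a feel for how much the network link matters, the following sketch estimates how long one log backup takes to cross the wire and whether it can arrive before the next one is taken. Everything here is an assumption for illustration: the function names are hypothetical, sizes are treated as decimal gigabytes, link speeds are in megabits per second, and the 0.8 efficiency factor is a rough allowance for protocol overhead rather than a measured value.

```python
def transfer_seconds(backup_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the time to copy one log backup over a network link.

    backup_gb is in decimal gigabytes; link_mbps is the nominal link speed in
    megabits per second; efficiency is the assumed usable share of that speed.
    """
    bits_to_send = backup_gb * 8 * 1000**3
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return bits_to_send / usable_bits_per_second


def keeps_up(backup_gb: float, link_mbps: float, interval_minutes: float = 15) -> bool:
    """True if a log backup can be copied before the next one is taken."""
    return transfer_seconds(backup_gb, link_mbps) <= interval_minutes * 60


if __name__ == "__main__":
    # A 5 GB log backup taken every 15 minutes, over two hypothetical links.
    for mbps in (10, 100):
        print(mbps, "Mbps:", round(transfer_seconds(5, mbps)), "s, keeps up:", keeps_up(5, mbps))
```

If the link cannot keep up during the busiest periods, such as index maintenance, the standby database falls further behind the primary exactly when a failover would lose the most data.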
About the author: Baya Pavliashvili is a DBA manager with Healthstream - the leader in online healthcare education. In this role, Baya oversees database operations supporting over one million users. Baya's primary areas of expertise include performance tuning, replication and data warehousing. He can be reached at [email protected].
More information from SearchSQLServer.com