No one understands the importance of tested backups more than database administrators (DBAs). For most organizations, however, relying on tape backups for disaster recovery is neither appropriate nor reliable. In this article, we consider the pros and cons of various disaster recovery solution options.
Before you select a disaster recovery technology, determine your goals. Some companies, such as those in the financial sector, cannot tolerate any data loss. For others, high availability is primary: News outlets and communications companies will experience peak demand during disasters.
Essentials for SQL Server disaster recovery
Your disaster recovery plan should be complete and include all dependencies. While the ability to restore an up-to-date SQL Server is essential, you must also ensure that everything that SQL Server depends on is in place: Windows accounts, file system dependencies, applications and other aspects of the server. Many disaster recovery plans have failed because the site's hardware dependencies were not in place. Examples include tape drives kept in a different location, missing passwords for password-protected tapes, or the wrong version of either the firmware or the tape drive itself.
Your plan should also be as simple and nonintrusive as possible. Some disaster recovery technologies limit what can be done on the source server, such as whether you can change the recovery model while using database mirroring. Others require significant steps to ensure that a disaster recovery site is up to date: Replication and log shipping will not replicate logins, and additional processes are necessary to ensure that all logins are in place and current at the disaster recovery site.
Other solutions may require significant time before a site is operational, as some software database mirroring solutions take time to quiesce before a database is accessible. Executing a disaster recovery plan is a complex operation involving many steps; a simple plan with fewer moving parts limits that complexity and increases the reliability of a disaster recovery plan.
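The login gap mentioned above for replication and log shipping lends itself to a simple scheduled check. The following is a minimal Python sketch, not a complete synchronization tool: it assumes you have already exported the login names from each server into plain lists (how you export them is up to you), and the login names shown are hypothetical. Names are compared case-insensitively, which is an assumption that matches a case-insensitive server collation.

```python
def login_drift(primary_logins, dr_logins):
    """Compare two lists of login names and report the differences.

    Names are lowercased before comparison (assumes a case-insensitive
    server collation).
    """
    primary = {name.lower() for name in primary_logins}
    dr = {name.lower() for name in dr_logins}
    return {
        "missing_at_dr": sorted(primary - dr),  # must be created at the DR site
        "extra_at_dr": sorted(dr - primary),    # stale logins to review
    }


# Hypothetical login names, for illustration only.
primary = ["CORP\\app_svc", "report_reader", "etl_loader"]
dr = ["CORP\\app_svc", "report_reader", "old_vendor"]

drift = login_drift(primary, dr)
print(drift["missing_at_dr"])
print(drift["extra_at_dr"])
```

A matching name does not by itself prove that a login is usable at the disaster recovery site, because SIDs and passwords can still differ, so treat this as a first-pass check only.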
Including a plan for failback is also important. Once the disaster recovery site is operational and running, it will contain changes. When you are ready to return to your original site, these changes will have to move with you.
Your site also needs adequate network bandwidth to support the data flow that keeps it synchronized. It also needs adequate hardware to support the load placed on it in the event of a disaster. A large media corporation's disaster recovery site, for example, can support a much larger load than its main data center. The reasoning is that during a disaster the public will use the company's resources more frequently than it would during a normal news cycle. Conversely, a luxury jewelry retailer designed its site to support only the minimum business-critical functions. The retailer realized that, in the event of a disaster, most customers would not shop for luxury goods.
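A rough way to size the link is to start from how much change data the source server generates and convert that to a line rate. The Python sketch below is a back-of-the-envelope estimate under stated assumptions, not a measurement: the function name, the peak factor, the compression ratio and the headroom fraction are all illustrative parameters that you would replace with figures from your own monitoring.

```python
def required_mbps(change_mb_per_hour, peak_factor=1.0,
                  compression_ratio=1.0, headroom=0.3):
    """Estimate the link speed (megabits per second) needed to keep a
    disaster recovery site synchronized.

    change_mb_per_hour: average change volume generated per hour, in megabytes
    peak_factor:        multiplier for busy periods (2.0 = twice the average)
    compression_ratio:  2.0 means the data shrinks to half its size on the wire
    headroom:           fraction of the link reserved for other traffic
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    on_wire_mb_per_hour = change_mb_per_hour * peak_factor / compression_ratio
    mbps = on_wire_mb_per_hour * 8 / 3600  # megabytes/hour -> megabits/second
    return mbps / (1 - headroom)


# Hypothetical workload: 9,000 MB of changes per hour, doubling at peak,
# 2:1 compression, 20% of the link kept free.
print(round(required_mbps(9000, peak_factor=2.0,
                          compression_ratio=2.0, headroom=0.2), 1))
```

The same arithmetic can be run in reverse to see how much change volume an existing link can absorb before the disaster recovery site starts to fall behind.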
Finally, the disaster recovery plan should be up to date. Having a site in place with older versions of your databases is useless to the current versions of the applications that access this data. A large online trading site with which I consulted does its development and quality assurance on its disaster recovery site, to ensure that the site remains an exact copy of its data center.
Disaster recovery solution options
The following is a list of the various database disaster recovery options available and their pros and cons:
The backup and restore option is conceptually simple and includes all database objects, but at the same time, it is not scalable for large databases. Transferring a database to a disaster recovery site can also take time and therefore creates a high risk of data loss.
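To see why transfer time matters for backup and restore, it helps to estimate how long a full backup takes to reach the disaster recovery site. The Python sketch below is illustrative only: it assumes decimal units (1 GB = 1,000,000,000 bytes), and the `efficiency` parameter is a hypothetical allowance for protocol overhead and link sharing rather than a measured value.

```python
def transfer_hours(backup_gb, link_mbps, efficiency=0.8):
    """Estimate the hours needed to copy a backup file across a network link.

    backup_gb:  backup size in gigabytes (decimal, 10**9 bytes)
    link_mbps:  nominal link speed in megabits per second
    efficiency: fraction of the nominal speed actually achieved (assumed)
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and efficiency in (0, 1]")
    total_bits = backup_gb * 8 * 10**9
    effective_bits_per_second = link_mbps * 10**6 * efficiency
    return total_bits / effective_bits_per_second / 3600


# Hypothetical figures: a 500 GB backup over a 100 Mbps link.
hours = transfer_hours(500, 100)
print(f"{hours:.1f} hours")
```

With these assumed figures the copy takes most of a day, and any changes made at the source during and after the backup are not yet at the disaster recovery site, which is the data loss exposure described above.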
The log shipping option also includes all database objects and is easily understood by DBAs. When log shipping is scheduled, the exposure to data loss is small, roughly five minutes. On the downside, you are vulnerable to events that break the log shipping chain, such as changing the database recovery model. Log shipping is also not scalable, and you cannot back up the transaction log while the database itself is being backed up.
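A broken log chain is easier to deal with when it is detected early. The Python sketch below shows one way to check a set of log backups for gaps. It assumes that each log backup records a first and last log sequence number (LSN) and that, in an unbroken chain, each backup starts at the LSN where the previous one ended. The `LogBackup` type, the file names and the LSN values are hypothetical, and loading the real values from your backup history is left out.

```python
from typing import NamedTuple


class LogBackup(NamedTuple):
    name: str
    first_lsn: int
    last_lsn: int


def find_chain_breaks(backups):
    """Return (previous, next) name pairs where the log chain has a gap.

    Assumes an unbroken chain means each backup's first_lsn equals the
    last_lsn of the backup before it.
    """
    ordered = sorted(backups, key=lambda b: b.first_lsn)
    breaks = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.first_lsn != prev.last_lsn:
            breaks.append((prev.name, cur.name))
    return breaks


# Hypothetical backup history with one gap between the second and third files.
history = [
    LogBackup("log_0900.trn", 100, 200),
    LogBackup("log_0905.trn", 200, 300),
    LogBackup("log_0915.trn", 350, 400),
]
print(find_chain_breaks(history))
```

When a gap is reported, the logs after it cannot be restored at the standby until the chain is re-established, typically by reinitializing from a new backup.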
With the database mirroring option, low latency and automatic redirection of clients to the standby site are a plus, but database mirroring is practical only for workloads that use the high-performance mode, which is available only in SQL Server's Enterprise Edition. The success of your mirroring technology therefore depends on the speed and available bandwidth of your network.
Database mirroring also requires your database to remain in the full recovery model. Those using SQL Server 2008, however, can compress their database mirroring traffic, which reduces the bandwidth that their mirroring sessions consume.
The replication option can selectively replicate a subset of the data or objects. With bi-directional transactional replication or peer-to-peer replication, failover and failback are also unproblematic. Latency, however, can be very high during batch operations, and not all database objects are replicated.
With SQL Server 2005 using bi-directional transactional replication or peer-to-peer replication, you may need to disconnect users from your database as you make schema changes, although SQL Server 2008 does not require this. Peer-to-peer replication is an Enterprise Edition-only feature, and bi-directional transactional replication can be set up only using stored procedures.
Software database mirroring solutions typically work at the file system level. They filter changes to the file system and replicate the ones you select to be mirrored. If desired, you can mirror all changes to your database's MDFs, NDFs and LDFs. The solution then copies these changes to the destination, and it can typically also compress them on the fly.
These technologies tend to be low cost, low bandwidth and highly reliable, with simple failover and failback. Nevertheless, older versions of the software would require some recovery time before the destination server was available, and Microsoft may not support database problems occurring with these technologies.
Finally, hardware database mirroring solutions involve a split write: An application issues a write operation on the source server, the write is also written to the destination server, and only then can the application continue with the next operation. This is a robust option that results in no data loss. The disaster recovery site, however, is not accessible while it is being mirrored to. These solutions are also costly and involve latency.
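The latency cost of a split write can be illustrated with simple arithmetic. The Python sketch below uses a deliberately simplified model in which a single stream of writes completes the local write, the network round trip and the remote write one after another before the next write starts. Real storage arrays may overlap some of this work, and all of the timings shown are hypothetical.

```python
def serial_writes_per_second(local_write_ms, round_trip_ms, remote_write_ms):
    """Upper bound on writes per second for one serial stream of split writes.

    Simplified model (an assumption): every write waits for the local write,
    the network round trip and the remote write, in sequence.
    """
    per_write_ms = local_write_ms + round_trip_ms + remote_write_ms
    if per_write_ms <= 0:
        raise ValueError("total time per write must be positive")
    return 1000.0 / per_write_ms


# Hypothetical timings in milliseconds.
nearby_site = serial_writes_per_second(0.5, 0.5, 0.5)
distant_site = serial_writes_per_second(0.5, 19.0, 0.5)
print(f"nearby site:  {nearby_site:.0f} writes/s")
print(f"distant site: {distant_site:.0f} writes/s")
```

Under this model the round trip dominates once the sites are far apart, which is one reason the distance between the data center and the disaster recovery site matters for this option.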
ABOUT THE AUTHOR
Hilary Cotter, SQL Server MVP, has been involved in IT for more than 20 years as a Web and database consultant. Microsoft first awarded Cotter the Microsoft SQL Server MVP award in 2001. Cotter received his bachelor of applied science degree in mechanical engineering from the University of Toronto and subsequently studied both economics at the University of Calgary and computer science at UC Berkeley. He is the author of a book on SQL Server transactional replication and is currently working on books on merge replication and Microsoft search technologies.
This was first published in March 2009