These days, high availability is a must-have for anything as mission-critical as SQL Server. Aside from numerous third-party high-availability options, SQL Server comes with several built-in features (which vary a bit by edition). It's a good idea to understand what they are and the differences among them, so that you can make smarter choices about your SQL Server environment.
The lowest form of high availability, log shipping is essentially an automated extension of SQL Server's backup-and-restore functionality. It's configured on a per-database level, not for an entire server.
Basically, you have your production SQL Server ship (or copy) a database's transaction log backups to a second, "hot spare" server. The hot spare has to start out with an identical copy of the database – something SQL Server takes care of when you initialize log shipping – and from there it replays the shipped logs to bring its copy of the database up to date. There's no automatic failover; if your production server dies, you'll have to manually redirect users to the hot spare. Log shipping also has built-in latency, because logs are shipped on a schedule rather than continuously, so it's possible to lose whatever data accumulated since the last log was shipped.
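Under the hood, each log-shipping cycle boils down to a log backup on the primary, a file copy, and a restore on the hot spare (normally run by SQL Server Agent jobs that the log-shipping wizard creates). A simplified sketch of one cycle – the database name and file paths here are hypothetical:

```sql
-- On the production server: back up the transaction log to a share
-- the hot spare can reach.
BACKUP LOG SalesDB
    TO DISK = N'\\SpareServer\LogShip\SalesDB_001.trn';

-- On the hot spare: replay the shipped log. WITH NORECOVERY keeps the
-- database in a restoring state so further logs can still be applied;
-- WITH STANDBY would instead allow read-only access between restores.
RESTORE LOG SalesDB
    FROM DISK = N'\\SpareServer\LogShip\SalesDB_001.trn'
    WITH NORECOVERY;
```

At failover time, you'd restore the final log WITH RECOVERY to bring the spare's copy online, then redirect users to it.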
Microsoft has fairly generous licensing terms for hot spares, meaning you may not have to pay for a separate SQL Server license, depending upon your exact licensing situation. Multiple production servers can ship their logs to a single hot spare.
Database mirroring is a step up from log shipping. In its simplest form, it works a lot like log shipping, except that the production server sends transaction log records to the mirror server continuously, as they're generated, meaning the mirror is much more up-to-date.
Failover is still a manual process by default, but you can add a third SQL Server to the mix, called a witness. The witness can act as a normal SQL Server, but in the background it keeps an eye on the two mirroring partners. When the principal server dies, the witness and the mirror can agree to have the mirror take over operations – providing a sort of automatic failover. Getting clients to switch over to the mirror is still a trick, although it's possible to configure them so that this happens automatically.
Any client transactions that are in-progress at the time of the failover will have to be restarted, and since there is still some latency, the mirror server may still be missing a bit of data when a failover occurs.
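Once mirroring endpoints exist on the servers involved, the pairing itself is just a few ALTER DATABASE statements. A sketch of the setup – the server names and port below are hypothetical:

```sql
-- On the mirror server: point at the principal.
ALTER DATABASE SalesDB
    SET PARTNER = N'TCP://principal.example.com:5022';

-- On the principal: point at the mirror, then add the witness
-- to enable automatic failover.
ALTER DATABASE SalesDB
    SET PARTNER = N'TCP://mirror.example.com:5022';
ALTER DATABASE SalesDB
    SET WITNESS = N'TCP://witness.example.com:5022';
```

On the client side, ADO.NET connection strings support a Failover Partner keyword (for example, Server=principal.example.com;Failover Partner=mirror.example.com;Database=SalesDB;Integrated Security=True), which is one way to make clients retry against the mirror automatically after a failover.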
Failover clustering is Microsoft's highest-end built-in form of availability, and it requires that you have a Windows cluster set up.
There's no fancy shipping or mirroring involved here; instead, two or more separate computers are each attached to shared external storage – often a Storage Area Network (SAN). The database files are physically located on this shared storage, and identically configured instances of SQL Server run on each of the cluster computers, or nodes.
Only one node's instance is actually active at any time; if that node fails, one of the other nodes starts its corresponding instance of SQL Server, attaches itself to the data files on the shared storage, and takes over. This failover process usually takes just seconds, and the magic built into Windows clustering makes sure clients are always "looking at" the active node for any given SQL Server instance.
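You can see this arrangement from inside SQL Server itself: SERVERPROPERTY reports both whether the instance is clustered and which physical node it's currently running on, while @@SERVERNAME returns the virtual name that clients connect to. A quick check (the database and node names you'd see are, of course, environment-specific):

```sql
SELECT SERVERPROPERTY('IsClustered')                 AS is_clustered,
       SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS active_node,
       @@SERVERNAME                                  AS virtual_name;
```

After a failover, active_node changes to whichever node took over, but the virtual name clients use stays the same – which is why they don't need to be reconfigured.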
Clustering can get pretty complicated, but it's a very effective high-availability technique. Unlike log shipping and mirroring, however, clustering relies on a single set of database files. If those files are corrupted, a failover won't help because the failover instance will simply be looking at the same, corrupted files. With mirroring and log shipping, you're actually making a real-time copy of the files (more or less), so file corruption is less of a concern. That said, I rarely see SQL Server file corruption – so clustering is still a solid choice.
Which high-availability option is right for you?
The answer to this question depends on several factors. Log shipping is the cheapest option, but provides what is probably the least protection and automation.
Database mirroring is a good middle ground in terms of cost vs. benefit, and both mirroring and log shipping can be configured on a per-database basis.
Clustering is a lot more expensive in most situations, but offers a great deal of flexibility and automation. It's definitely the option that provides the least data loss in almost every failure situation. In the end, you'll have to decide what tradeoffs are best for your environment.
Miss a tip? Check out the archive page for Don's series, SQL Server for the Reluctant DBA.
ABOUT THE AUTHOR:
Don Jones is a co-founder of Concentrated Technology LLC, the author of more than 30 IT books and a speaker at technical conferences worldwide. Contact him through his website at www.ConcentratedTech.com.