Given the choice, any sane person would scale a SQL Server database up rather than out. Up is easy: You get bigger hardware, more processor power, more memory, faster disks, a faster network. The idea is that you still leave the database running on a single server, but you make that server bigger. It’s expensive but it’s straightforward.
Some applications will, however, grow bigger than a single server can handle, and scaling up is no longer an option. Instead, your option is scaling SQL Server out, spreading the database across multiple servers. SQL Server offers a few options for making this possible; which one you’ll choose depends on your exact needs -- and your tolerance for pain.
Go to the cloud
Sometimes the easiest approach is to let someone else worry about it. Microsoft’s Windows Azure cloud service includes a cloud-based version of SQL Server, SQL Azure. This isn’t technically scaling out so much as it’s an infinite amount of scale-up. So switching to Azure doesn’t always require massive restructuring of your application. Essentially, you’ll need to point your applications to SQL Azure, and you’ll pay for storage, processing and data transfer. I t’s not free, but you’ll probably never have to worry about scale again.
SQL Server’s native replication can be a scale-out solution, depending upon how your database is built and used. What you’ll do is host copies of your database on multiple servers and direct different users to each server. This often works best for geographically distributed user populations, such as all the users in the Asia office using server 1 and the users in the North America office using server 2. Each server will have a complete copy of the data and attempt to replicate any changes with its partner servers.
There’s no automated load balancing, and this approach works best when users stick with their own subsets of data. In other words, if your Asia users tend to edit data associated with their office only -- mostly with customers in Asia, for example -- then replication will make sure copies of those records are available in the other copies of the database. If all users tend to edit the entire data set, replication can become complicated, because SQL Server has to deal with the possibility of two users trying to edit the same piece of data at the same time -- on two different servers.
SQL Server’s merge replication can handle those kinds of collisions, but you’ll have to do so through some custom merge handler programming, meaning your own developers will have to create the algorithms for determining who wins when two users touch the same data at once. Your client applications may also need reprogramming; that’s so they not only submit changes to the database, but also circle back to see if that change made it or if it was overwritten by another user at around the same time. Users will need retraining, too, since client applications might come back and say something like, “Hey, the data you were editing changed while you were editing it. You need to take a look and decide if your edits are still valid.”
The other major scale-out approach is federation. In this approach, you divide your database across multiple servers. In vertical partitioning, different rows from the same tables live on different servers. Again, geo-partitioning is the most obvious solution: Keep all the Asia data rows on one server and all the Europe data rows on another. This is different than whole-hog replication: Here, each location’s server won’t have the entire database, but will instead only have that location’s data. Users can see a “federated,” or combined, view of the data, when you implement something like a SQL Server distributed partitioned view to form the complete table. Horizontal partitioning splits the tables’ columns across separate servers, so every server works together to offer the combined table.
These kinds of databases are tough to create, and there’s an entire art around building them. You’ll need a lot of detailed information on how the data is accessed and used in order to make the right decisions, and you’ll need a SQL Server database architect who fully understands the technologies and can look at your business situation and set things up properly.
In some cases, this kind of scale-out can be accomplished with minimal changes to client applications. That’s especially true for well-designed applications that already rely primarily on views and stored procedures to access data, since those elements can be abstracted on the back end without changing the client. But these types of apps are hard to come by; more often, your scale-out will also involve some reprogramming of the client tier, so that clients don’t need to maintain an awareness of the back-end structure.
Yep, scale-out is tough
There’s no question that scaling SQL Server out is complicated -- which is one reason that cloud database systems like Azure hold so much promise. There’s also a robust third-party vendor market that helps to build scale-out technologies without relying entirely on SQL Server’s native features. You’ll need to do some homework to see what’s available and have a good understanding of how your data is accessed and used so that you can pick an approach that best fits your needs.
ABOUT THE AUTHOR
Don Jones is a Microsoft Most Valued Professional and a senior partner and principal technologist at Concentrated Technology, an IT education, strategic consulting and analysis firm. Contact him at www.ConcentratedTech.com.