Load-balancing enhancements in SQL Server 2005

SQL Server 2005 still does not support load balancing out of the box, but it does offer some useful new features to help you distribute transaction load and achieve high availability. Contributor Baya Pavliashvili identifies several significant load-balancing enhancements and the pros and cons of using each.

   
SQL Server 2005 still doesn't support load balancing out of the box --but it does offer some exciting improvements and supports all the load-balancing methods available in previous SQL Server versions.

My previous article described SQL Server 2000's load-balancing methods. Here, I'll briefly summarize SQL Server 2005's new features:

TABLE OF CONTENTS
   Transactional replication with peer-to-peer topology
   Table partitioning
   Backup and restore improvements (piecemeal restore)
   Database mirroring and snapshots

 
Transactional replication with peer-to-peer topology

Transactional replication has been enhanced with peer-to-peer (P2P) topology support in SQL Server 2005.

SQL Server 2000 supports bidirectional replication, which allows two servers to concurrently publish and subscribe to data from one other. The servers could update the same shared data, but you were limited to only two servers in such a topology.

The P2P topology supports an unlimited number of publishing servers that exchange transactions amongst each other. Of course as the number of participating publishers increases, so does the transactional latency. Even though there is no theoretical limit to the number of nodes involved in your P2P topology, only a finite number would provide acceptable performance. Microsoft recommends staying at or below 12 peers for optimal performance.

Nevertheless, the P2P topology is a huge step forward for SQL Server: Now multiple peer servers can change the data and replicate transactions to other publishers. This means that subscriber servers will no longer be limited to predominantly reporting environments. You could have servers distributed across the globe sharing the transactional load. As the number of users increases, simply add another server to the pool.

In addition to distributing load, this topology increases availability; if any peer server becomes unavailable, the remaining servers in the pool could share the load since each server would have the full set of data available on all other servers.

The following table summarizes the advantages and drawbacks of using P2P topology for load balancing.

 
Peer-to-peer topology pros and cons
Pros Cons
  • All participating servers have a full set of data.
  • Users can connect to any peer server to read or modify data.
  • Read performance is likely to benefit greatly by distributing load across servers.
  • Multiple servers could change the same data, which would lead to conflicts. Transactional replication doesn't support conflict resolution out of the box. You'd have to figure out how to handle or prevent potential conflicts.
  • Performance will suffer as the number of peer servers increases.
  • Write activity will perform the same, as though all data is on the same server.

Note: Replication has also been enhanced to handle database schema changes seamlessly. In previous releases, changing replicated object's schema required downtime. This is no longer the case with SQL Server 2005.
 
Table partitioning

Distributed partitioned views work the same way in SQL Server 2005 as they did in previous versions. However, SQL Server 2005 also supports table partitioning, which can help you improve performance by distributing the read-write load across multiple disks (or disk arrays).

To partition a table, you must identify which column will be used for partitioning and the ranges assigned to each partition. For example, the values of an identity column can define partition ranges; values one through one million could reside on one partition, one million through two million on the next partition, and so forth. Partition ranges are specified with partition functions. Then you must also create a partition scheme to map each range of values defined with a partition function to a separate filegroup. Each filegroup can then be placed on a different disk.

The following table identifies pros and cons of table partitioning:

 
Table partitioning pros and cons
Pros Cons
  • Easy to set up using partition schemes and functions
  • Eases the maintenance of large tables (with billions of rows)
  • Allows creation of separate indexes for each partition
  • There are certain limitations on data types supported by the partitioning columns.
  • Each table must be set up for separate partitioning, but you can reuse the same partition function for multiple tables.
  • Table partitioning allows you to spread the load across disks. However, all data will still be managed by a single server. If your performance bottleneck is associated with CPU or memory, this method probably isn't your best choice.

 
Backup and restore improvements (piecemeal restore)

SQL Server's backup and restore features haven't changed much, but Microsoft did add some new functionality that allows users to access databases being restored quicker than before.

SQL Server 2005 now supports a piecemeal database restore. Piecemeal restore allows you to restore the primary filegroup first and bring the database online. Then available secondary filegroups can also be restored. As soon as the primary filegroup is restored, users can connect to the database. Other filegroups can continue being restored while the database is available for querying and transactions. Filegroups being restored are marked offline.

Suppose you have a 100 GB database of which 75 GB is historical data being read infrequently. You could place the historical data in its own filegroup and have another filegroup for frequently accessed data. If you keep recent data in the primary filegroup, you will only have to restore 25 GB to allow users to connect to your database. Then you can restore the other filegroup containing historical data.

The following table identifies the pros and cons of this backup and restore solution:

 
Backup and restore pros and cons
Pros Cons
  • Very easy to implement and maintain
  • Allows both read and write access to the reporting database
  • Does not provide up-to-the-minute data
  • Database is not accessible while it's being restored. This means no reports can be generated.

 
Database mirroring and snapshots

SQL Server 2005 introduces the concept of database mirroring to aid in high availability. To be specific, database mirroring will work with SQL Server 2005 as soon as it is officially released. However, mirroring will not be supported until the release of SQL Server 2005 Service Pack 1, tentatively scheduled for early 2006.

Under the covers, mirroring works similar to log shipping.

  1. Transaction log records are applied to database files on two servers. Unlike log shipping, database mirroring doesn't require you to back up transaction logs and copy backups to the standby server.
  2. Database mirroring writes data twice continuously. Unlike log shipping, the standby database must remain in no-recovery mode, which prevents even read-only access to data. However, mirroring does allow snapshots of the standby database.

Database snapshots are yet another feature introduced in SQL Server 2005. A snapshot is a clone of the database at a particular point in time. Once you take a snapshot of the mirrored database, you can allow users to query the snapshot. Snapshot generation typically only takes a few seconds because no data is actually copied during this process. Therefore, to distribute the load across your primary and standby server, you could mirror your database and periodically take snapshots on the standby server. You could also use snapshots for reporting purposes on the primary server.

The following table summarizes pros and cons of database mirroring and snapshots:

 
Database mirroring and snapshots pros and cons
Pros Cons
  • Generating a snapshot from a mirrored database is very quick
  • Data is current because it is continuously written to the mirror.
  • Multiple snapshots can be generated on the same database.
  • Snapshots provide read-only access to the data.
  • Having snapshots adds certain overhead on the server and has a negative impact on performance.
  • If you happen to fail over to the mirrored server, both transactional and reporting activity would be directed to the same server (but different databases).

ABOUT THE AUTHOR
Baya Pavliashvili is a DBA manager with HealthStream Inc. -- the leader in online healthcare education. In this role, Pavliashvili oversees database operations supporting over one million users. His primary areas of expertise include performance tuning, replication and data warehousing. He can be reached at baya.pavliashvili@healthstream.com.


More information from SearchSQLServer.com


Reader comments:

Member Charles D. writes: You mention that in peer-to-peer replication "Multiple servers could change the same data, which would lead to conflicts. Transactional replication doesn't support conflict resolution out of the box. You'd have to figure out how to handle or prevent potential conflicts." How can this be? It is not much of a solution from Microsoft it this is the case.

Author's response:

This is true with SQL Server 2000 or 2005; Microsoft gives us the technology to build applications, but clever application design is still up to us developers and DBAs. What we can do is split the user base among multiple servers and only let each group of users update a subset of data.

Let's suppose I had three transactional servers. Users could read data modified on any server, yet they never need to update the same data. Each server could support a range of data values that can be updated on that server. For example:

Server 1 has key ranges between 1 and 1 million

Server 2 has key ranges between 1 million and 2 million

Server 3 has key ranges between 2 and 3 million

And so forth. All data can then combined (using replication) on a central reporting server to which all users have access. Alternatively (with SQL Server 2005) you could use peer-to-peer model to combine all data on all servers but only allow updates of the key range assigned to each server.

This was first published in October 2005

Dig deeper on Microsoft SQL Server 2005

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchBusinessAnalytics

SearchDataCenter

SearchDataManagement

SearchAWS

SearchOracle

SearchContentManagement

SearchWindowsServer

Close