Analysis Services 2005 (and later versions) support synchronization, a feature that greatly simplifies migrating...
analytical databases among servers. It also eases the separation of querying and processing duties among multiple instances of Analysis Services. In this tip, I'll introduce you to how synchronization is implemented and how to use this powerful feature for scaling your analytical applications.
First of all, if you worked with Analysis Services 2000 (and earlier versions), you'll recall that deploying changes to your cube design involved backing up a database on the development server and restoring it to the production server. An unsupported alternative was to copy files found in the "data" folder under Analysis Services installation from development to production server, overriding the existing files. Both options were cumbersome and had many limitations.
Developer servers often don't have the same security requirements as the production servers, so database administrators have to re-configure security after deploying a cube's structural changes. Furthermore, development and production cubes are normally populated from different data sources and, typically, relational databases in development environments have only a small subset of all production data. So, if you employed either of the methods mentioned above, you'd likely have to re-process your cubes, and that could take hours, days or even weeks depending on your data volume.
Implementing Analysis Services synchronization
Synchronization is a fairly simple concept resembling the snapshot replication feature available with a SQL Server relational database engine. Synchronization copies the data files from the source server to the destination server. For example, you could synchronize an Adventure Works sample database from server A to server B. If server B (destination) does not have this database yet, it will be created during synchronization. If server B already has the Adventure Works database, it will be overwritten.
If the destination database exists, it remains online while you synchronize it, so your users can continue querying the database. Analysis Services creates a separate folder in the data directory of the destination server. This folder will have a globally unique identifier (GUID) as its name -- for example 761A1D04B1C476A9886 -- and will contain only files that were modified since databases were last synchronized. In other words, synchronization is always incremental.
Note, however, that before incremental changes can be transferred you must first synchronize the entire database. Once you synchronize a primary database from a secondary database, you will only have to copy a subset of all files for subsequent synchronizations. The first synchronization must copy all files, even if the existing copies of data on both servers are identical. Keep in mind that since all files must be copied for the initial synchronization, you will need plenty of disk space to store the existing database files and those files that are being copied from the standby server.
After all necessary files are copied to the synchronization folder, Analysis Services deletes the existing folder, storing the current database files, and replaces it with the folder containing synchronized files. The change occurs very fast because it's a matter of renaming the folder from GUID to the database name (and version number). Users can continue querying the primary database, although queries will fail for a brief period while the folder is being renamed.
Synchronization is implemented through the SYNCHRONIZE XMLA command, which has the following syntax:
For example, the following command synchronizes a test database between two instances of Analysis Services:
By default, all partition data is copied to the data\database_name folder on the destination instance under Analysis Services installation directory. However, you can override this default behavior by using a
You can run the SYNCHRONIZE command much like any other XMLA command through SQL Server Management Studio (SSMS) or through ASCMD utility. ASCMD utility is provided as a sample application and can be found under \Program Files\Microsoft SQL Server\90\Samples\Analysis Services\Administrator\ascmd directory. You can use this utility to run XMLA, MDX or DMX (Data Mining eXpressions) queries.
In addition, you can synchronize Analysis Services databases using a wizard within SSMS instead of composing the entire SYNCHRONIZE command manually. To invoke the Synchronize Database wizard, connect to the destination server, right click on the Databases folder within SSMS and choose Synchronize. The wizard allows you to choose source server and database; destination will always be the server where you run the wizard. Although the wizard allows you to review cube partitions, it doesn't allow you to change their locations.
If you wish to change the destination directory for your partitions, you must edit the SYNCHRONIZE command created by the wizard.
You have three options for configuring security: Copy All, Skip Membership or Ignore All. The Copy All option transfers role definitions as well as role members when synchronizing databases. The Skip Membership option creates database roles, but does not add members to these roles. Finally, Ignore All does not transfer any part of security configuration.
Remember, Analysis Services 2005 supports a server administrator role and database roles. Synchronization works at a database level and can only transfer roles and their membership at that level. If you need to synchronize multiple databases, you must run multiple SYNCHRONIZE commands.
Synchronize Database wizard allows you to choose whether you want Analysis Services to apply compression when synchronizing databases. Compressed databases can be synchronized faster. Finally, you can choose whether you wish to synchronize databases immediately or save the SYNCHRONIZE command the wizard generates to a script file and execute it some other time. Of course, once you save the script, you can edit the command to suit your needs before executing it.
Synchronization typically doesn't use many processor cycles -- normally you could expect processor use to go up by 5% to 15%. However, depending on the size of the database you're synchronizing, it could add considerable overhead to the disk subsystem since Analysis Services has to copy numerous files to the target server. To minimize the disk overhead, synchronize databases during periods of limited querying activity.
Exploiting Analysis Services synchronization
You can use the SYNCHRONIZE command to:
- Have a standby Analysis Server dedicated to processing. The primary production server will be dedicated to user queries. After you process cubes on the standby server, you can transfer modified data files to the production server.
- Deploy changes from the development environment to production servers. You can include the security settings that exist on the development database or keep those settings that are defined in production. If the development database only has a subset of production data, then you will need to reprocess the production database once synchronization is complete.
- Keep multiple copies of the same database on multiple servers for scaling out your analytical solution. If there are thousands of users querying your cubes, you can direct a subset of users to each server.
- Keep multiple copies of the same database on multiple servers to assure high availability. Should any of your servers experience hardware issues, you could redirect users' queries to other servers.
- Copy the production database to development or quality assurance servers when you need to troubleshoot performance or data accuracy issues.
ABOUT THE AUTHOR
Baya Pavliashvili is a DBA manager with HealthStream, the leader in online healthcare education. In this role, he oversees database operations supporting over one million users. His primary areas of expertise include performance tuning, replication and data warehousing. You can reach Pavliashvili at firstname.lastname@example.org.