The first two parts of this short series on SQL Server 2008 R2’s Master Data Services (MDS) feature focused on getting started with MDS and loading data into the repository. In this article, we’ll switch gears and look at some options for creating your own master data management system without upgrading to SQL Server 2008 R2. We will focus on existing technologies and techniques that will allow you to use just about any version of Microsoft SQL Server as your master data repository.
The first thing you need to do before selecting a technique is decide which system will become the master system for your enterprise. For something like employee data, this will typically be your human resources system. Customer data will use your sales system or customer relationship management (CRM) system. This step is extremely important, as an incorrect master system selection will cause data collisions and conflicts that can put incorrect values into systems as authoritative data.
Basic techniques for master system setup
The easiest way to set up a master system is to follow the basic premise of moving the source data into all of the systems that need the master data. For instance, if your customer service system needs your employee data, copy the data into the needed tables from the source database to the destination database.
How you copy this data from system to system then becomes the key. Each of your destination systems probably has its own primary key that is different from your source system. For example, if you are replicating your employee data from your human resources system to your sales and CRM systems, the records that are already entered into your sales and CRM systems probably don’t match the employee ID values in the human resources database. To make this work, you’ll need somewhere to store the primary key from the human resources system in the sales and CRM databases. Depending on your database schema, you may already have a column you can use. If not, a column will need to be added to store this value.
Once you have identified where the primary key from your source table will be stored -- in this case the employee ID -- you need to identify the mapping of the other columns that will be replicated from the source system to the destination system. This is when you ensure that fields like employees’ first names are mapped to the first name field and team hierarchies, and are set up correctly. Once the mappings are done correctly, the data delivery between your systems needs to be identified, including your data replication frequency.
For information like employee data, you may decide that you only need to replicate data daily or faster, while customer data from your sales to CRM systems need to be replicated in near real time. Your data replication options depend on your version of SQL Server, as well as the database platforms used for your source and destination systems. If you work in a shop that runs only Microsoft SQL Server, then your options are a little more open. If you have to copy data from or to Oracle, IBM DB2 or Microsoft Access, however, then your options are much more limited.
Distributing your data
If you are working in an all-SQL Server shop and each of the SQL Servers in question is SQL Server 2005 or higher, then you will want to look at SQL Server Service Broker. This will provide you with the most flexibility to do data transformations. The easiest way to do this is to place a trigger on the source table or tables that need to be replicated. This will ensure that any time an insert, update or delete is done to the source tables the change is moved into the SQL Server Service Broker queue and transferred to all the destination databases. When the data is received at the source database, the data is removed from the queue by an activated stored procedure, transformed into the destination systems schema and loaded in.
If you are working in a multi-platform environment, SQL Server Service Broker won’t be able to deliver data to or from other data platforms. Therefore, another data delivery platform will need to be used. The next easiest choice is SQL Server replication. If your distribution database is SQL Server 2005 or higher, you can use this platform to replicate data to or from an Oracle database. If you are replicating data from SQL Server to a database other than SQL Server, versions as low as SQL Server 2000 can be used as your distribution database.
When replicating for SQL Server 2000, the older version gives you an advantage over newer versions. This is because SQL Server 2000 replication allows for the use of Data Transformation Services (DTS) to easily transform data from one schema to another. You can use DTS to transform data using replication with SQL Server 2005, but this is more complex to configure as it requires the replication to be configured using T-SQL stored procedures instead of the replication UI.
After using replication to get the data from the source database to the destination database, there are a couple of options for merging the data from the replicated table to the production table. The fastest option is to place a trigger on the table that you are replicating the data into. That way as soon as the data is replicated from the source system, the data is transformed into the final schema. This can be done when replicating to SQL Server, Oracle or any other platform that supports triggers.
No matter what solution you choose, you can potentially have a platform that gives you reliable data moving from your authoritative database to your line of business applications. Like any database design decision, proper planning and careful thought are the keys to a successful master data solution.
ABOUT THE AUTHOR
Denny Cherry has over a decade of experience managing SQL Server, including MySpace.com's over 175-million-user installation, one of the largest in the world. Denny's areas of expertise include system architecture, performance tuning, replication and troubleshooting. He currently holds several Microsoft certifications related to SQL Server and is a Microsoft MVP. Check out his blog at SQL Server with Mr. Denny.