Loading data into a Master Data Services repository

Configuration is an important first step for setting up Master Data Services. But what about loading the data? Here is the step-by-step process for R2.

In a previous article we looked at how to create and define Master Data Services (MDS) in order to have a single

authoritative source of data within an enterprise.  In this article, we will expand on this concept by taking our configured MDS and loading data into it, allowing that data to be consumed by other applications.

Samples in this article will use a Master Data Services model called Adventure Works, which has a single entity called Employee.  Within this entity there are three attributes known as LastName, FirstName and EmployeeId.  We will use the sample data provided in the AdventureWorks 2008 database to load data into Master Data Services.

Loading data is done by inserting tables into a few tables within the Master Data Services repository.  These tables include:

  • mdm.tblStgMember: Stages leaf members, consolidated members and collections.
  • mdm.tblStgMemberAttribute: Updates attribute values.
  • mdm.tblStgRelationship: Used to move members within explicit hierarchies and add members to collections.

We will use T-SQL to load the data into these tables.  In a larger data import, however, you can use SQL Server Integration Services (SSIS) or another ETL tool to import the data.

Loading the data

First we need to put rows into the mdm.tblStgMember table. This tells the import process which entities to load data for. Below is the sample code:

INSERT INTO mdm.tblStgMember
(ModelName, HierarchyName, EntityName, MemberType_ID, MemberName, MemberCode)
VALUES
('Adventure Works', NULL, 'FirstName', 1, '', 1),
('Adventure Works', NULL, 'LastName', 1, '', 2),
('Adventure Works', NULL, 'EmployeeId', 1, '', 3)

After running this T-SQL code, query the mdm.tblStgMember table and notice that the records have been inserted with a Status_ID of 0, as shown in Figure 1.  If we look at the MSDN document about the mdm.tblStgMember table we can see what the various values of the Status_ID column mean.

Figure 1. The mdm.tblStgMember table (click to enlarge)
mdm.tblStgMember table

The mdm.tblStgMemberAttribute table is the second table we need to load data into.  This tells the import process the values for the data you wish to import.  When looking at this table, you’ll notice that it looks a lot like a named value pair table.  In this case, we are importing values for two employees into the mdm.tblStgMemberAttribute table.  When looking at the queries to load data into the mdm.tblStgMemberAttribute and mdm.tblStgMember, you’ll notice that the MemberCode values appear to be arbitrarily assigned and should be unique within the loading batch.

INSERT INTO mdm.tblStgMemberAttribute
(ModelName, EntityName, MemberType_ID, MemberCode, AttributeName, AttributeValue)
VALUES
('Adventure Works', 'Employee', 1, 1, 'FirstName', 'Bill'),
('Adventure Works', 'Employee', 1, 2, 'LastName', 'Gates'),
('Adventure Works', 'Employee', 1, 3, 'EmployeeId', '1')

INSERT INTO mdm.tblStgMemberAttribute
(ModelName, EntityName, MemberType_ID, MemberCode, AttributeName, AttributeValue)
VALUES
('Adventure Works', 'Employee', 1, 1, 'FirstName', 'Steve'),
('Adventure Works', 'Employee', 1, 2, 'LastName', 'Balmer'),
('Adventure Works', 'Employee', 1, 3, 'EmployeeId', '2')

If we query the mdm.tblStgMemberAttribute, we can see the data that was just included in the table, as shown in Figure 2.   You can review the various Status_ID values from the MSDN document about the mdm.tblStgMemberAttribute table.

Figure 2. Reviewing imported data (click to enlarge)
Reviewing imported data

If you are using hierarchies within your master data, you’ll need to load data into the mdm.tblStgRelationship table. Since the sample used in this article does not use hierarchies, refer to the MDS Team’s blog post on the subject, as well as the MSDN document about the mdm.tblStgRelationship table.

Processing the data

After staging the data, we need to alert Master Data Services to begin processing it by running the mdm.udpStagingSweep stored procedure. This procedure uses three parameters: the UserId of the user who is staging the batch, the VersionId of the model to be processed, and a value that tells MDS if the data should be processed immediately or at a specified interval. The specified user must be a model administrator entered into the Master Data Services web-based application. The code below shows how to run the process immediately.

DECLARE @ModelName NVARCHAR(50) = 'Adventure Works'
DECLARE @UserName NVARCHARM(50)= 'CORP\dcherry'
DECLARE @UserId INT
DECLARE @VersionId INT

SET @UserId =  (SELECT ID
                  FROM mdm.tblUser u
                  WHERE u.UserName = @UserName)

SET @VersionId =  (SELECT MAX(ID)
                     FROM mdm.viw_SYSTEM_SCHEMA_VERSION
                     WHERE Model_Name = @ModelName)

EXEC mdm.udpStagingSweep @UserId=@UserId, @VersionId=@VersionId, @Process=1

In order to process the data at the next specified interval, change the @Process parameter from 1 to 0.

In order to verify that the batch was created correctly for processing, query the mdm.tblStgBatch table. A value of 2 in the Status_ID column means the processing of your batch has failed, while a value of 1 in the Status_ID column indicates success.

You can view the status of each row being processed by looking at the Status_ID and ErrorCode columns of the mdm.tblStgMember and mdm.tblStgMemberAttribute tables. A comprehensive list of the error codes that may occur in the MDS Staging tables can be found on the KnightReign blog post and this MSDN forum thread.

Any processing errors will also be logged into the ERRORLOG file on the server that is hosting your MDS database. These errors will be shown as output messages from an activated service broker procedure, as shown in Figure 3.

Figure 3. Output messages for MDS errors (click to enlarge)
Output messages from MDS errors

Validating the data

After processing the data, you have the option of validating it using the mdm.udpValidateModel stored procedure. This procedure tables four parameters: UserId, ModelId, VersionId and StatusId. The Status for completed processing is a value of 1. The below sample code will validate that the batch of recently processed data.

DECLARE @ModelName NVARCHAR(50) = 'Adventure Works'
DECLARE @ModelId INT
DECLARE @UserName NVARCHAR(50)= 'CORP\dcherry'
DECLARE @UserId INT
DECLARE @VersionId INT

SET @UserId =  (SELECT ID
                  FROM mdm.tblUser u
                  WHERE u.UserName = @UserName)

SET @ModelId = (SELECT Model_ID
                  FROM mdm.viw_SYSTEM_SCHEMA_VERSION
                  WHERE Model_Name = @ModelName)

SET @VersionId = (SELECT MAX(ID)
                    FROM mdm.viw_SYSTEM_SCHEMA_VERSION
                    WHERE Model_ID = @ModelId)

EXECUTE mdm.udpValidateModel @UserId, @ModelId, @VersionId, 1

ABOUT THE AUTHOR
Denny Cherry has over a decade of experience managing SQL Server, including MySpace.com's over 175-million-user installation, one of the largest in the world. Denny's areas of expertise include system architecture, performance tuning, replication and troubleshooting. He currently holds several Microsoft certifications related to SQL Server and is a Microsoft MVP.

This was first published in April 2010

Dig deeper on SQL Server Business Intelligence Strategies

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchBusinessAnalytics

SearchDataCenter

SearchDataManagement

SearchAWS

SearchOracle

SearchContentManagement

SearchWindowsServer

Close