Get started Bring yourself up to speed with our introductory content.

An introduction to SQL Server 2016 dynamic data masking

An expert provides a step-by-step guide for how to use dynamic data masking in SQL Server 2016 CTP2 and Azure SQL database to protect sensitive data.

Dynamic data masking is one of many new security features coming in SQL Server 2016. It is also in the Azure SQL...

database V12 preview. This feature lets you hide part of the data from normal users. For example, an application may display a Social Security Number (SSN) and mask out all but the last four digits of the number. Instead of displaying 123-45-6789, it would display XXX-XX-6789. If a privileged user is viewing the information, the masking is not applied. This is not encryption. The data is not masked on the storage device; it is masked or obfuscated on output based on whether the caller is normal or privileged. This is not a total security solution, but it can be deployed as part of your security strategy.

Dynamic data masking is useful in a couple of scenarios. The first is probably obvious: for regulatory compliance. Whether the regulation is governmental or dictated by an industrial body, protecting the display of sensitive data is often demanded or considered a best practice. Another scenario where this feature could be leveraged is internally with developers. It is not uncommon for developers to request real data for testing and development purposes. Often, when supplying data for this situation, you need to scrub sensitive data, whether through scrambling or actual removal. Scrubbed data can affect the effectiveness of testing by skewing query results. By applying dynamic data masking in this scenario, the data can be protected while providing a more realistic test bed. Often, static data masking, where data is stored in a masked format instead of only being masked upon output, is used in this situation, but this method introduces other problems, such as breaking data integrity.

In this article, I'll provide an example of how to set up dynamic data masking in SQL Server 2016 CTP2 and in Azure SQL database V12. An appealing characteristic of dynamic data testing is that it can be applied by adding to a table's definition without having to rewrite application code.

Setting up dynamic data masking

This article is based on the publicly available SQL Server 2016 CTP 2.2 release. The first step in dynamic data masking is to create a database like the one shown in Figure 1.

Create a table for testing

Figure 1. An example database for testing data masking.
Figure 1. An example database for testing data masking.

Insert a few rows for testing

Figure 2. Add rows of data.
Figure 2. Add rows of data.

Verify that the data is there

Figure 3. Check to make sure the data was inputted.
Figure 3. Check to make sure the data was inputted.

Because no masking rules are defined, all data is visible.

Create a test user and test permissions to the table

Figure 4. The employee table without masking.
Figure 4. The employee table without masking.

Because no masking has been defined, the TestUser login sees everything.

Masking is part of the table definition in SQL Server 2016

Figure 5. The employee table with a mask for HomePhone.
Figure 5. The employee table with a mask for HomePhone.

The Employee table definition was modified to add a customized mask to the HomePhone column (Figure 5). The function used in this case is the partial function. The partial(0,"XXX-XXX-",4) means to grab zero leftmost characters from the column to start with, use the string defined, and then display the last four characters of the column. There is nothing inherently meaningful about the Xs or the hyphens. The mask string was crafted to be similar to the data already in that column. Let's see what happens when the WorkPhone column is masked with a slightly different string (Figure 6).

Figure 6. The employee table with masks for HomePhone and WorkPhone.
Figure 6. The employee table with masks for HomePhone and WorkPhone.

The definition of the mask for WorkPhone is similar to the definition of the mask for HomePhone except for the characters in the string within the partial function.

Now that it is shown that there are mask definitions on each of the phone columns, can a query be run against the table referencing a masked column as the search criteria? Let's find out.

Figure 7. Search query on masked data.
Figure 7. Search query on masked data.

Since the mask is applied on display of the data, the ability to search on a masked column is not affected.

Let's add a mask to SSN.

Figure 8. Employee table with masks for HomePhone, WorkPhone and SSN
Figure 8. Employee table with masks for HomePhone, WorkPhone and SSN

There are also a few predefined functions to add a mask to a column. For example, there is a predefined mask for email. Let's take a look at it.

Figure 9. Employee table with masks for HomePhone, WorkPhone, SSN and workemail.
Figure 9. Employee table with masks for HomePhone, WorkPhone, SSN and workemail.

Notice that the predefined mask for email address displays the first character of the column, but the rest of the mask is XXX@XXXX.com to obfuscate the entire domain name (Figure 9).

Does a mask cause a sort to not work properly? Let's find out.

Figure 10. Sort query on employee table.
Figure 10. Sort query on employee table.

It is sorted properly, but the first character of the mask is presented (Figure 10). What if we took that away and applied a custom mask to the email address?

Figure 11. The employee table sorted with the custom mask.
Figure 11. The employee table sorted with the custom mask.

As shown above, the data is still sorted properly. Also note that I dropped the mask from the column and added a customized mask using the partial function. Again, the mask is being deployed upon display.

At this point, I've shown that a masked column can be used in a where clause for searching for specific records and that correct sorting is occurring. If a developer wanted to get at the unmasked data, could this be done (Figure 12)?

Figure 12. Unmasking the data.
Figure 12. Unmasking the data.

I needed to provide additional permissions to TestUser on the dbo schema, but the SELECT INTO copied the unmasked data into another table. The unmasked data is now in that table.

Dynamic data masking in Azure SQL database v12 (preview)

In Azure SQL database v12, the portal is used to set masking for columns in a GUI. Here, I'll use the Azure preview portal to demonstrate dynamic data masking setup. I assume you know how to create and access an Azure SQL database from SQL Server Management Studio (SSMS). Note that if you are using a "Downlevel Client" (TDS 7.3 and below) for accessing the data, you will want to use a modified connection string from servername.database.windows.net to servername.secure.windows.net.

The first step is to create a table in the Azure SQL database using the same syntax used in the SQL Server 2016 example (Figure 1). The syntax is shown again in Figure 13.

Figure 13. Syntax for creating the employee table.
Figure 13. Syntax for creating the employee table.

I'll also insert the same data used in Figure 1 into the Azure SQL Database table. Use the same INSERT statements from Figure 2 and check to see that the data is correct, as shown in Figure 14.

Figure 14. Verifying data in Azure SQL Database.
Figure 14. Verifying data in Azure SQL Database.

Defining a mask

From the Azure preview portal, select Browse ALL / SQL databases / <databasename> / Dynamic Data Masking (preview). This will display a summary of any masking rules already defined. From this point, select Add Mask. Select the HomePhone choice in the Column dropdown. Choose Custom String from the Masking Field Format dropdown. Use 0 for Exposed Prefix, XXX-XXX- for the Padding String, and 4 for the Exposed Suffix. Click Save on the Add Masking Rule blade and then Save again on the Dynamic Data Masking (preview) blade. The portal will display a message to indicate the changes are saved and ask you to acknowledge by clicking OK.

Figure 15. Create and add the HomePhone masking rule in Azure SQL Database.
Figure 15. Create and add the HomePhone masking rule in Azure SQL Database.

Let's look at the employee table to see if it worked. Note that it might take a minute or two for the masking rule to be applied fully.

Figure 16. The employee table with the masking rule for HomePhone applied.
Figure 16. The employee table with the masking rule for HomePhone applied.

The masking worked in the same manner in Azure SQL database as it did in SQL Server 2016 CTP2. At this point, it is clear that other masking rules can be defined as I did earlier but through the GUI. Currently, it appears that the GUI in the portal must be used to define masking rules because when I used the DDL to add a mask, the syntax appeared to be accepted, but the output was unchanged.

Some common masking rules are built in to Azure SQL database. One of these is for Social Security Number. Let's add it to the employee table's SSN column, as shown below.

Figure 17. Use the built-in masking rule for social security numbers.
Figure 17. Use the built-in masking rule for social security numbers.

The results are below in Figure 18.

Figure 18. The employee table with the social security number masking rule applied.
Figure 18. The employee table with the social security number masking rule applied.

Although the data is masked, it is not necessarily in the format that may have been desired. This mask would work fine for a SSN column that was wider and had the dashes included in the data instead of inferred. Figure 19 shows the masking rule for using the Number (random number range) option for the EmpID for the employee table.

Figure 19. The masking rule for Number (random number range) for the employee table.
Figure 19. The masking rule for Number (random number range) for the employee table.

Obviously, a random number mask for EmpID is not realistic, but it shows the built-in mask feature.

Also, note that there is a field on the dynamic data masking blade for privileged logins. These are logins that do not apply masking rules when displaying the data. A login would need to be created and another connection to the Azure SQL database established to demonstrate this. This is somewhat different from how this feature was presented earlier in the article where a user (without a login) could easily be created and impersonated for the demonstration.

About the author
Rick Heiges is a SQL Server MVP and Principal Solutions Architect for DB Best Technologies LLC. He works with customers educating them about the Microsoft Data Platform to solve business problems and to maximize the value of their data. Rick is very involved in the SQL Server Community via PASS where he spent nine years on the Board of Directors. You can read his blog on and follow him on Twitter @heigesr2.

Next Steps

Find out how to implement row-level security in Azure SQL Database

Learn about Always Encrypted, a new security feature for SQL Server 2016

This was last published in September 2015

Essential Guide

SQL Server 2016 release guide: News and analysis on the new version

Join the conversation

3 comments

Send me notifications when other members comment.

By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy

Please create a username to comment.

Which of the two methods of implementing dynamic data masking do you plan to use?
Cancel
Great article, thanks!
In addition, I can recommend reading this one, on the same topic: http://www.sqlshack.com/new-features-in-sql-server-2016-dynamic-data-masking/
Cancel
Recently, CTP 2.4 was release which has a slight change in behavior for the SELECT INTO example. Read more about it here - http://sqlblog.com/blogs/rick_heiges/archive/2015/10/06/update-on-my-dynamic-data-masking-article.aspx
Cancel

-ADS BY GOOGLE

SearchBusinessAnalytics

SearchDataCenter

SearchDataManagement

SearchAWS

SearchOracle

SearchContentManagement

SearchWindowsServer

Close