Choosing between ata encryption and data hashing is a fairly new concept for the SQL Server database administrator and developer. Yet, network and software engineers have been dealing with data encryption at the network level and from the Web server to the client for many years. While we've had the capabilities to encrypt data for some time, whether it be encrypting within the client application or through third-party DLLs attached to SQL Server as an extended stored procedure, Microsoft didn't include native encryption and hashing technologies in Microsoft SQL Server until SQL Server 2005.
When planning your encryption or hashing solution, first you need to decide if you will be storing an encrypted version of the data or a hashed copy of the data. The difference is that encrypted data can be decrypted, while hashed data cannot be decrypted. The deciding factor when choosing to encrypt or hash your data comes after you determine if you'll need to decrypt the data for offline processing.
A typical example of data that needs to be decrypted would be within a payment processing system that has client credit card accounts in it that you need to be able to charge' monthly. The credit card number should be stored using a strong two-way encryption so that the credit card number can be decrypted when the card needs to be charged.
A typical example of data that should be hashed is the password field of a Web-based application. While you must
Setting up your encryption hierarchy
First, decide if you should be using a symmetric key or asymmetric key. Without delving too deeply into the key structures, the high-level difference between the two is that asymmetric keys are secured with the database master key, while the symmetric keys are secured with a certificate, which is then secured with the database master key. The image here, provided by Microsoft and published in SQL Server 2005 Books OnLine, is an excellent diagram of data encryption methods.
Figure 1: Data encryption methods.
When using an asymmetric key, you can select from three encryption algorithms: RSA_512, RSA_1024 and RSA_2048. The length of the private key corresponds to the encryption algorithm selected. When creating the asymmetric key, you can select a strong-name file, executable, or .NET assembly to use as the source of the public key.
When using a symmetric key, you have more algorithm options available to select from: DES, TRIPLE_DES, RC2, RC4, RC4_128, DESX, AES_128, AES_192 and AES_256. When creating a symmetric key, you can secure the key with one or more than one certificate, password, symmetric key or asymmetric key. When you secure symmetric keys with a password, the password is encrypted with the TRIPLE_DEC encryption.
If you secure the symmetric key with a password and select an encryption algorithm greater than TRIPLE_DES, you are encrypting the key with a lower encryption level than that of the data being protected. Note: The AES encryption is not available on Windows XP or Windows 2000 servers.
There are several options available for hashing data: MD2, MD4, MD5, SHA and SHA1. I recommend using MD5 or SHA1. MD2 and MD4 are older hashing algorithms that can yield no perfect hashes of the data.
With both asymmetric and symmetric keys, the higher the encryption level the more CPU time it will take to encrypt and decrypt data. Account for this cost when selecting your encryption algorithm. Databases that do a large amount of data encrypting and decrypting and do not require high levels of encryption should use lower encryption levels. That way, the encryption does not overtax the processor on the server.
When encrypting and decrypting data with a symmetric key, the system functions EncryptByKey and DecryptByKey are used, and the functions are fairly basic. The EncryptByKey function accepts four parameters of which two are required and two are optional. The required parameters are the first two that are the GUID of the symmetric key to be used and the data to be encrypted. The third and fourth parameters are optional and relate to whether authenticator data should be added to the value to further mask the data and what that value should be. You can make the same value appear differently in different records by changing the authenticator. When using an authenticator, it's recommended to use the primary key for the record you are working with, as the primary key should not change.
When encrypting and decrypting data with an asymmetric key, the system functions EncryptByAsymKey and DecryptByAsymKey are used. The EncryptByAsymKey function accepts two required parameters. The parameters are the key of the asymmetric key and the text to be encrypted.
The DecryptByAsymKey function accepts three parameters: the key of the asymmetric key, the encrypted value and the password for the asymmetric key. Microsoft notes that asymmetric keys require much more CPU power to perform the process than symmetric keys and should be used when appropriate -- usually not when dealing with multiple rows of data at a time.
When hashing data, the HashBytes function is used. It accepts two parameters: The first is the hashing algorithm to use and the second is the data to be hashed. Hashing data is a one-way process and there is no way to reverse the process.
When encrypting or hashing data, the system functions return data in the varbinary format, not the same format that the data is passed in as. This returned data can be longer than the prior text, so take that into account when designing your tables.
Let's look at some sample code for hashing and encrypting.
First, we'll encrypt and decrypt some data using a symmetric key. We will create a symmetric key using the CREATE SYMMETRIC KEY command, if the requested key does not already exist.
IF NOT EXISTS (SELECT * FROM
sys.symmetric_keys WHERE name =
CREATE SYMMETRIC KEY PrivateData WITH
ALGORITHM = AES_256
ENCRYPTION BY PASSWORD='Pa$$w0rd'
Now that the key has been created, we need to open the symmetric key. If the key is not opened, it cannot be used. We'll also create a table to hold our encrypted data.
OPEN SYMMETRIC KEY PrivateData DECRYPTION
BY PASSWORD ='Pa$$w0rd'
CREATE TABLE UserData
Next, we can insert a record into our table. First, we declare a variable to hold the GUID of the smmetric key and find the GUID in the sys.symmetric_keys system view. We then encrypt the text value RawData and insert it into the table using the INSERT command and the EncryptByKey function.
DECLARE @Key_GUID UNIQUEIDENTIFIER
SELECT @Key_GUID = Key_GUID
WHERE Name = 'PrivateData'
INSERT INTO UserData
(EncryptByKey (@Key_Guid, 'RawData'))
We can now verify that our data was encrypted and view the decrypted data to ensure that it was encrypted correctly. The first record set will be the encrypted value, while the second will show the actual data.
SELECT CONVERT (VARCHAR (20),
DecryptByKey (UserData.Data, 1)) AS Data
Now we must close our symmetric key and drop the table we are using.
CLOSE SYMMETRIC KEY PrivateData
DROP TABLE UserData
Hashing data requires much less code because you do not need to create, open and close the hash like you do with a symmetric key.
First, we create a table to store the hash in.
CREATE TABLE Users
(Username varchar (255) PRIMARY KEY,
UserPassword varbinary (512))
Next, we insert a user account into the table.
INSERT INTO Users
('BillGates', HashBytes('MD5', 'ThisIsMyPa$$w0rd'))
To view the hashed password in the table:
Next, we write a query to validate the password against user supplied text. In verifying this password, we can see that the password does not match. If we change the password that we're checking to another value, say, ThisIsMyPa$$w0rd, which was hashed and inserted into the table, we see the Password Matched value returned.
IF EXISTS (SELECT * FROM Users WHERE
UserName = 'BillGates' AND UserPassword =
SELECT 'Password Matched'
SELECT 'Password Did Not Match'
It's time to clean up our table.
DROP TABLE Users
Before you migrate an existing SQL Server system from using non-encrypted values to encrypted values, think about it carefully because it's a very complex task. You cannot simply tell SQL Server to
encrypt the data within your existing table because the data type will not be correct. In addition, most likely the table isn't long enough to hold your existing data. Take care, too, when selecting the correct encryption or hashing algorithm(s) for your database, as not all methods fit all purposes.
While encryption and hashing are fairly new to SQL Server, they are an extremely important part of database design. Current and proposed state and federal laws require that consumer data be encrypted to protect the consumer in the event of a data breach. Many of these laws, such as the California Online Privacy Protection Act of 2003, give guidelines on what should be encrypted, but not to what level that data should be encrypted. While companies are not required to encrypt data under normal operations, if the servers are breached and the data is downloaded, under these laws, you may be required to notify customers that their data may have been viewed by a third party. This not only causes company embarrassment, but can also lead to a loss of customers and revenue due to the data breach.
You and your staff or outside counsel should analyze encryption laws to see how they apply to your organization.
ABOUT THE AUTHOR
Denny Cherry has over a decade of experience managing SQL Server, including MySpace.com's more than 175-million-user installation, one of the largest in the world. Denny's areas of expertise include system architecture, performance tuning, replication and troubleshooting. He uses these skills in his role as a senior database administrator and architect at Awareness Technologies. Denny is a longtime member of PASS and Quest Software's Association of SQL Server Experts and has written numerous technical articles on SQL Server management.
This was first published in December 2007