Tip

SQL Server blob data types for accessing and storing data

Data types used for storing blob data

Blob (binary large object) data is column data that holds a large amount of information per record, in either a text or a binary format. When working with blob data in SQL Server, the amount of data per record is typically well beyond the roughly 8 KB that can fit within a single row.

The data types used for storing blob data have changed over time. They were static for many years, but SQL Server 2005 introduced several new ones. SQL Server 2008 is expected to add even more, although, to date, they haven't appeared in the CTP releases.

In SQL Server 2000 and earlier, we were limited to the TEXT, NTEXT and IMAGE data types. The TEXT data type stores non-Unicode blob data, while the NTEXT data type stores Unicode blob data. The IMAGE data type stores binary blob data. You can use the IMAGE data type to store any binary data, such as images, office documents, compressed data, etc. Because the data is stored in a binary format, it can be either Unicode or non-Unicode.
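For example, a table using these legacy types might be defined like this (a minimal sketch; the table and column names are illustrative and are not used in the examples later in this tip):

CREATE TABLE LegacyBlobData
(ID int PRIMARY KEY,
Title varchar(100),
DocumentText TEXT,
DocumentTextUnicode NTEXT,
ScannedImage IMAGE)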

The SQL Server 2005 blob data types are based on VARCHAR, NVARCHAR, VARBINARY and XML. Normally you specify the maximum size of the data to be stored in these types; to store blob data, you specify MAX as the size instead. These data types allow you to store up to 2 GB of data per record (2^31-1 bytes for character and binary data, 2^30-1 characters of Unicode data). Microsoft plans to remove the TEXT, NTEXT and IMAGE data types in a future version of SQL Server, and their use should be transitioned to the new VARCHAR(MAX), NVARCHAR(MAX), VARBINARY(MAX) and XML data types.

CREATE TABLE BlobData
(ID int PRIMARY KEY,
FirstName nvarchar(30),
LastName nvarchar(30),
EmailAddress varchar(255),
[Resume] nvarchar(MAX))

In SQL Server 2008, it's rumored that we'll get a new data type called the FILESTREAM data type, which can store binary data such as office documents outside of the normal database files. By storing this data outside of the normal MDF and NDF data files, we can store it on separate hard drives to minimize the performance impact of accessing this blob data on the rest of the database.

How is SQL Server blob data stored and accessed?

When using the TEXT, NTEXT and IMAGE data types, data is stored outside of the normal row. A small binary value, called a pointer, is stored in the row; it points to the location within the data files where the data is actually stored. When storing data of the VARCHAR(MAX), NVARCHAR(MAX), VARBINARY(MAX) and XML data types, if the data fits within the roughly 8 KB maximum size of the row, it is stored within the page. However, if the data is larger than that, it is stored the same way as the legacy data types and accessed via a binary pointer value.

Note: This binary value and location is all done behind the scenes and is invisible to the user.
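If you need to influence where this data is placed, the sp_tableoption stored procedure exposes the relevant settings. Here is a minimal sketch, using the BlobData table created above and the TextDataExample table used later in this tip; the 1000-byte in-row limit is just an illustrative value:

-- Store the MAX-typed columns of BlobData off-row even when they would fit in the row
EXEC sp_tableoption 'BlobData', 'large value types out of row', 1

-- Allow TEXT/NTEXT/IMAGE values up to 1000 bytes to be stored in-row
EXEC sp_tableoption 'TextDataExample', 'text in row', 1000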

To read data from the TEXT, NTEXT and IMAGE data types, the READTEXT command is used. The command accepts a table and column name, a 16-byte binary text pointer, an offset to start reading from and the number of characters or bytes to read. The TEXTPTR function is used to obtain the binary pointer, which locates the data in the physical file:

DECLARE @ptr varbinary(16);
SELECT @ptr = TEXTPTR(LongText)
FROM TextDataExample
WHERE TextID = 1;
READTEXT TextDataExample.LongText @ptr 1 50;

There is also a WRITETEXT command for writing data into a record. Like READTEXT, it relies on a text pointer obtained with the TEXTPTR function to locate the data within the physical files:

DECLARE @ptr varbinary(16);
SELECT @ptr = TEXTPTR(LongText)
FROM TextDataExample
WHERE TextID = 1;
-- WRITETEXT overwrites the entire value of the column for this row
WRITETEXT TextDataExample.LongText @ptr 'Sample data';

The new data types, unlike their predecessors, can be accessed without the READTEXT and WRITETEXT commands. Part of the reason they have become popular so quickly is that no additional commands are required: you manipulate them with the same SELECT, INSERT and UPDATE statements used for any other column. Unfortunately, because the FILESTREAM data type has not yet been released, no sample code can be published about using it.
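As a quick illustration, here is the ordinary DML you would use against the BlobData table created earlier; the sample values are made up for the example:

INSERT INTO BlobData
(ID, FirstName, LastName, EmailAddress, [Resume])
VALUES (1, 'Jane', 'Doe', 'jane.doe@example.com',
N'Full resume text, which can be up to 2 GB per row...')

-- Update and read the blob column just like any other column
UPDATE BlobData
SET [Resume] = N'Revised resume text...'
WHERE ID = 1

SELECT [Resume]
FROM BlobData
WHERE ID = 1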

Cost efficiency comes with tiered storage

Using a storage design called tiered storage is an easy way to decrease the total cost per byte of your SQL Server blob storage. With the tiered storage technique, highly accessed data is located on faster, more expensive storage, while less active data is located on slower, less expensive storage. Blob data is typically accessed infrequently, and because of the amount of information stored in blob columns, tiered storage is often the most cost-efficient way to hold it. To create the less expensive tier, larger, slower hard drives are used to build a second RAID array, and that array houses a second file group, which contains the blob data.
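A minimal sketch of creating that second tier follows; the database name, file path and size are assumptions for illustration, while TextDataFileGroup matches the file group used in the examples below:

-- Add a file group for blob data and place its file on the slower RAID array
ALTER DATABASE SalesDB
ADD FILEGROUP TextDataFileGroup

ALTER DATABASE SalesDB
ADD FILE
(NAME = 'TextData1',
FILENAME = 'E:\SlowerArray\TextData1.ndf',
SIZE = 10GB)
TO FILEGROUP TextDataFileGroup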

Tiered storage is used by telling SQL Server to store the blob data in a file group separate from the row data. This assignment takes place when the table is created, as part of the CREATE TABLE command, using the TEXTIMAGE_ON parameter. All blob data for a SQL Server table is stored in a single file group. If you want to split blob data between two file groups, you must horizontally partition your table into two tables and specify a different file group for each table's TEXTIMAGE_ON parameter:

CREATE TABLE TextDataExample
(TextID int IDENTITY(1,1),
TextDescription varchar(25),
LongText TEXT)
ON [PRIMARY] TEXTIMAGE_ON [TextDataFileGroup]

Unfortunately, there is no easy way to change the file group that contains the blob data once the table is created. The only way to move the blob data to another file group is to create a new table, copy the data into the new table, drop the old table, and then rename the new table to the original name.

Assuming you have a table created with this definition:

CREATE TABLE TextDataExample
(TextID int IDENTITY(1,1) PRIMARY KEY,
TextDescription varchar(25),
LongText TEXT)
ON [PRIMARY]

And you want to move the data to a table with this definition:

CREATE TABLE TextDataExample
(TextID int IDENTITY(1,1) PRIMARY KEY,
TextDescription varchar(25),
LongText TEXT)
ON [PRIMARY] TEXTIMAGE_ON [TextDataFileGroup]

You would use a script much like this:

CREATE TABLE TextDataExample_Temp
(TextID int IDENTITY(1,1) PRIMARY KEY,
TextDescription varchar(25),
LongText TEXT)
ON [PRIMARY] TEXTIMAGE_ON [TextDataFileGroup]

SET IDENTITY_INSERT TextDataExample_Temp ON
INSERT INTO TextDataExample_Temp
(TextID, TextDescription, LongText)
SELECT TextID, TextDescription, LongText
FROM TextDataExample WITH (TABLOCK)
SET IDENTITY_INSERT TextDataExample_Temp OFF

DROP TABLE TextDataExample

exec sp_rename 'TextDataExample_Temp',
'TextDataExample', 'object'

When performing this procedure, be sure to recreate all needed indexes on the new table after dropping the old table. Creating the indexes after the old table has been dropped lets you reuse the same index names, in case any scripts or procedures expect specific index names to exist.
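For example, if the original table had a nonclustered index on the TextDescription column, you would recreate it with the same name once the old table has been dropped (the index name here is hypothetical):

CREATE NONCLUSTERED INDEX IX_TextDataExample_TextDescription
ON TextDataExample (TextDescription)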

While SQL Server supports blob data, you should always be careful when storing it in the database. Storing large amounts of blob data can adversely impact database performance, and although there are steps that help, they often end with moving the blob data out of the database and into the file system. Putting blob data into the database may seem like an easy solution at the time, but it can lead to performance issues later as the system grows. The file system is usually the best location for blob data, but if you choose to store it in the database, use care.


 

ABOUT THE AUTHOR:   
 
Denny Cherry is a DBA and database architect managing one of the largest SQL Server installations in the world, supporting more than 175 million users. Denny's primary areas of expertise are system architecture, performance tuning, replication and troubleshooting.
Copyright 2007 TechTarget
 


This was first published in September 2007
