Home > SQL Server Tips > Database Management and Administration > Database partitioning methods in SQL Server
SQL Server Tips:
EMAIL THIS
 TIPS & NEWSLETTERS TOPICS 

DATABASE MANAGEMENT AND ADMINISTRATION

Database partitioning methods in SQL Server


Hilary Cotter, Contributor
09.08.2005
Rating: -3.60- (out of 5)


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us   


Partitioning is the distribution of a table over multiple subtables that may reside on different databases or servers in order to improve read/write performance. SQL Server partitioning is typically done at the table level, and a database is considered partitioned when groups of related tables have been distributed. Tables are normally partitioned horizontally or vertically. The following tip will help you understand these SQL Server partitioning methods and determine when to use one over the other.

Vertical partitioning improves access to data

In a vertically partitioned table, columns are removed from the main table and placed in child tables through a process called denormalization. This type of partitioning allows you to fit more rows on a database page, making tables narrower to improve data-access performance. Therefore, a single I/O operation will return more rows. By vertically partitioning your data, you may have to resort to joins to return the denormalized columns. Carefully consider the impact of vertical partitioning on your overall data access patterns.

SQL Server will, by default, vertically partition text and image datatype columns on the database page. Columns created using the text and image datatype will be stored in separate data pages from the non-text and image columns. That way they take up less space than if you stored the text data in columns using the varchar data type.

Consider the following table:

Create Table Table1
(PK int not null identity constraint pk primary key,
Charcol1 char(10),
Charcol2 char(10),
Textcol text)

You can store a total of 8096/(4+10+10) = 338 rows per page, ignoring per-row overhead. If you stored textual data in a varchar(8000) column, assuming you have 8 KB of data in this column, you wou...


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us   



RELATED CONTENT
Microsoft SQL Server Performance Monitoring and Tuning
Using traces in SQL Server Profiler
SQL Server Mailbag: CALs, witnesses and unwanted changes
SQL Server Mailbag: Data restoration and DB property management
Working with IntelliSense in SQL Server 2008 Management Studio
SQL Server Mailbag: Stored procedures, triggers and SSRS reports
Troubleshooting Distributed Transaction Coordinator errors in SQL Server
Clearing the Windows page file and its effect on server performance
Optimizing SQL Server indexes –- even when they're not your indexes
Performance implications of transaction log autogrowth in SQL Server
The short course on how SQL Server really works

Microsoft SQL Server 2005
End of life comes for SQL Server 2005 SP2, 2008
SQL Server Reporting Services Fast Guide
SQL Server Service Broker Tutorial and Reference Guide
Tips for tuning SQL Server 2005 to improve reporting performance
SQL Server consolidation: Why it's an optimization technique
Parent-child dimensions in SQL Server 2005 with Analysis Services MDX
Enforcing data integrity in a SQL Server database
SSIS error message due to installation problem on SQL Server 2005
Should you upgrade to SQL Server 2005 or SQL Server 2008?
Basics for working with DATETIME and SMALLDATETIME in SQL Server 2005
Microsoft SQL Server 2005 Research

Database Management and Administration
Password cracking tools for SQL Server
Using traces in SQL Server Profiler
Meet compliance requirements with improved database security practices
Hardening the network and OS for SQL Server security
Securing the server and database in SQL Server
How SQL Server 2008 components impact SharePoint implementations
Troubleshooting Distributed Transaction Coordinator errors in SQL Server
Achieving high availability and disaster recovery with SharePoint databases
Clearing the Windows page file and its effect on server performance
Deploying a SQL Server virtual appliance for Microsoft Hyper-V

RELATED GLOSSARY TERMS
Terms from Whatis.com − the technology online dictionary
contiguity  (SearchSQLServer.com)
contiguous  (SearchSQLServer.com)
drilldown  (SearchSQLServer.com)
hashing  (SearchSQLServer.com)
hybrid online analytical processing  (SearchSQLServer.com)

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary


ld have one row stored per page. If the majority of your queries do not return the text column, this storage-engine optimization will be highly beneficial to you. If the majority of your queries do return the text column, you may wish to use the text in row option. The table option text in row stores text data in a single database page if the amount of text is less than a configurable threshold.

Horizontal partitioning improves overall read/write performance, sort of

A horizontally positioned table distributes groups of rows into different tables or databases (sometimes residing on different servers) according to a criterion, which may include a primary key value, date column or a location identifier.

Consider an application where 10,000 users must access a single table on a single server: You can expect high levels of read and write contention on this table. To improve overall read performance, you may want to scale out the same table and distribute it across 10 SQL Servers, then load balance them so each of the 10 databases services 1,000 customers. This may seem like a good approach to improve performance -- but there are several inefficiencies:

  • Storage usage: The same copy of the data is distributed to each node.
  • Data distribution: How will you keep all nodes in sync with the same copy of data?
  • DML synchronization: How will you synchronize changes across all nodes?

SQL Server 2005 will offer a peer-to-peer (P2P) replication model designed to distribute uniform sets of data across multiple servers and keep them synchronized. Updates in a node are replicated to all other nodes in this topology: When a node comes back online, it is synchronized with updates that occurred in other nodes participating in the P2P model. This solution is designed to scale out servers in different geographic locations.

However, in most horizontal partitioning implementations, each server or database contains a subset of the data. Prior to SQL Server 7, these implementations were done at the application level where the calling application would make decisions on which table to access based on the SARG (Search Argument). If tables were partitioned within the same database, the T-SQL logic had to account for the partitions so the correct partition was referenced or updated. Such partitioning methods quickly became complex and did not scale well. If you created a view to present a complete results set, all the underlying base tables would have to be referenced.

Distributed partition views

In SQL Server 7, the Query Optimizer would consider check constraints in the execution plan. If you had a check constraint on a column and your SARG referenced this column, only the correct partition would be referenced in the execution plan. Consider a table with a check constraint on a last name that looks like this:

CONSTRAINT [CK_Employee] CHECK ([LastName] > 'a' and [LastName] < 'b')

Such a table would only accept employees with a last name beginning with the letter A. Say you had a distributed topology consisting of 26 tables, one table for each letter, and these tables were referenced in a view of the union of these 26 tables. If you were to issue a query like the following, the Query Optimizer would reference the check constraints for each table and only reference the table containing last names beginning with the letter A.

Select * from EmployeeTableView where LastName ='a'

However, inserts, updates and deletes would have to be performed against the correct base tables. In SQL Server 2000, you could perform DML operations against the view and the correct base table would be referenced.

Partitioning in SQL Server 2005

SQL Server 2005, available in November 2005, introduces several database partitioning enhancements to improve performance and reduce the administrative overhead involved with managing the partitioned data set.

Prior versions of SQL Server required you to manually create the tables to form a partition. In our example above we would have to create 26 tables (one for each letter) to form the Employee partition. In SQL Server 2005, you can create a partition function, then a partitioned table using this function, and SQL Server will handle the distribution of data ranges to the different partitions according to your range function. This will look like a single table. For performance reasons, you would probably want to have the partitioned table span multiple file groups (possibly on different disks or arrays) with each partition allocated to its own file group. To do this, you would create a partition scheme that contains all file groups forming your partition, and then create your table to use this partition scheme instead of the normal file group.

You can also create partitioned indexes. By default, indexes created on a partition will use the same partitioning scheme as the partitioned table. The best resource for how to use the partitioning capabilities of SQL Server 2005 is Kimberly Tripp's white paper.

Summary

Both partitioning methods have their strengths and weaknesses. SQL Server 2000 offers several options to partition your database: Vertical partitioning is used to narrow the effective width of your rows through denormalization and careful use of text datatype columns. Horizontal partitioning breaks VLDB (very large database) tables into smaller tables according to ranges. Query Optimizer in SQL Server 7 is aware of check constraints and will only reference the base table that corresponds to the where condition. SQL 2000, on the other hand, allows you to perform DML against the views. And SQL Server 2005 will allow you to scale out using the peer-to-peer replication model, simplifying the creation and administration of a partitioning solution.

About the author: Hilary Cotter has been involved in IT for more than 20 years as a Web and database consultant. Microsoft first awarded Cotter the Microsoft SQL Server MVP award in 2001. Cotter received his bachelor of applied science degree in mechanical engineering from the University of Toronto and subsequently studied both economics at the University of Calgary and computer science at UC Berkeley. He is the author of a book on SQL Server transactional replication and is currently working on books on merge replication and Microsoft search technologies.


More information from SearchSQLServer.com

  • Tip: Get normalization rules you must follow
  • Tip: Learn how to improve partition views in SQL Server
  • Topic: Get tips and advice on how to improve SQL Server performance



    Rate this Tip
    To rate tips, you must be a member of SearchSQLServer.com.
    Register now to start rating these tips. Log in if you are already a member.




    DISCLAIMER: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.



  • SQL Server Development - .NET, C#, T-SQL, Visual Basic
    HomeNewsTopicsITKnowledge ExchangeTipsAsk the ExpertsMultimediaWhite PapersIT Downloads
    About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
    SEARCH 
    TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

    TechTarget Corporate Web Site  |  Media Kits  |  Site Map




    All Rights Reserved, Copyright 2005 - 2009, TechTarget | Read our Privacy Policy
      TechTarget - The IT Media ROI Experts