
Database index design and optimization: Some guidelines

What are the latest tips and tricks for database design and optimization? Check out this tip from Basit Farooq to learn more.

Well-designed indexes are an important part of database optimization because they are a key factor in achieving optimal query performance and response time. Poorly designed indexing, on the other hand, such as missing or excessive indexes on tables, indexes placed on the wrong columns or inadequate index maintenance strategies, can impair query and database performance.

There are several guidelines you can follow to make indexes more effective and improve performance during the creation, implementation and maintenance of indexes:

Avoid overindexing tables. Indexes solve many performance problems, but too many indexes on a table slow down INSERT, UPDATE and DELETE statements. SQL Server must maintain every affected index when you add (INSERT), change (UPDATE) or remove (DELETE) data, so the more indexes a table has, the more work each data manipulation language (DML) operation requires and the greater the load on resources. Therefore, create only the indexes that are required, based on an analysis of how the application and its users access the data and how often they need to do so. Also, small tables generally don't benefit measurably from being indexed.
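
The write cost described above can be sketched with a toy in-memory model. This is not how SQL Server stores indexes; the `ToyTable` class, its column names and the row data are all hypothetical, and the sketch only counts how many index entries have to be maintained as rows are inserted.

```python
from bisect import insort


class ToyTable:
    """Toy in-memory table that keeps one sorted entry list per index."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {col: [] for col in indexed_columns}
        self.index_writes = 0  # index entries maintained so far

    def insert(self, row):
        self.rows.append(row)
        row_id = len(self.rows) - 1
        # Every index on the table needs its own entry for the new row.
        for col, entries in self.indexes.items():
            insort(entries, (row[col], row_id))
            self.index_writes += 1


def index_writes_for(indexed_columns, row_count=1000):
    """Insert row_count rows and return the total number of index writes."""
    table = ToyTable(indexed_columns)
    for i in range(row_count):
        table.insert({"id": i, "customer": i % 50, "status": i % 3, "total": i * 1.5})
    return table.index_writes


print(index_writes_for(["id"]))                                 # 1000
print(index_writes_for(["id", "customer", "status", "total"]))  # 4000
```

In this model, the same 1,000 inserts cost four times as many index writes with four indexes as with one, which is the trade-off to weigh before adding each new index.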

When using clustered indexes, create the clustered index before creating nonclustered indexes. The leaf level of a clustered index is made up of the data pages that contain the table rows, whereas the leaf level of a nonclustered index is made up of index pages that contain pointers to the data rows. Additionally, SQL Server stores table rows in clustered index order based on the key column values, whereas nonclustered indexes don't affect the order in which rows are stored. When you define nonclustered indexes on a table that has no clustered index (a heap), each nonclustered index entry contains the key values and a row locator that points directly to the row in the heap. If the table has a clustered index, however, the row locator in each nonclustered index entry is the clustered index key for that row. So, when you create a clustered index on a table that already has nonclustered indexes, drop the clustered index or change its key columns, the row locators change and SQL Server has to rebuild every nonclustered index. That's why you should create the clustered index first: The nonclustered indexes are then built only once, against their final row locators.

Index columns used in foreign keys. Columns used in foreign keys are usually good candidates for nonclustered indexes because they are mostly used in join operations. SQL Server does not create an index on a foreign key column automatically, so be sure to create a nonclustered index on it yourself.

Index columns frequently used in joins. Create nonclustered indexes on columns that are frequently used in join operations. This improves join performance by reducing the time required to locate the matching rows in each table.

Use composite indexes and covering indexes to give the query optimizer greater flexibility. A composite index is defined on two or more columns from the same table. When you use composite indexes, you create fewer indexes for a table. A composite index can also improve query performance, because a query that filters on several of its columns requires fewer disk I/Os than the same query using an index on a single column.

Covering indexes also improve query performance. An index covers a query when all the data needed to satisfy the query exists within the index itself, so the data is retrieved from the index pages instead of going to the data pages. Covering indexes therefore also help reduce overall disk I/O.
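
The difference between a plain index and a covering composite index can be seen in a query plan. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in because it runs without a server; SQLite's planner and plan text differ from SQL Server's, and the `orders` table, its columns and the index names are hypothetical. In SQL Server, the equivalent covering index would be built with a composite key or with the `INCLUDE` clause of `CREATE INDEX`.

```python
import sqlite3


def query_plan(index_sql):
    """Create a small table plus one index and return the plan text for a fixed query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
        "status TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, status, total) VALUES (?, ?, ?)",
        [(i % 100, "open" if i % 2 else "closed", i * 1.5) for i in range(1000)],
    )
    conn.execute(index_sql)
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT total FROM orders WHERE customer_id = ?", (42,)
    ).fetchall()
    conn.close()
    # The last column of each plan row is the human-readable detail string.
    return " | ".join(row[-1] for row in rows)


# Index on the filter column only: the engine still has to visit the table for "total".
single_column = query_plan("CREATE INDEX ix_customer ON orders (customer_id)")

# Composite index holding both columns: the query is answered from the index alone.
composite = query_plan("CREATE INDEX ix_customer_total ON orders (customer_id, total)")

print(single_column)
print(composite)
```

With the composite index, SQLite reports that it uses a covering index, whereas the single-column index still requires a lookup into the table for each matching row.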

Limit key columns to columns with a high level of selectivity. The general rule for efficient indexing is to limit the key columns to columns with high selectivity, that is, columns in which most values are distinct, so that a given value matches only a small fraction of the rows. The higher the selectivity of a column, the better a candidate it is for an index key column. For example, good candidates for index key columns are highly selective columns used in DISTINCT, WHERE, ORDER BY, GROUP BY and LIKE clauses.
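
One common way to put a number on selectivity is the ratio of distinct values to total rows. The following is a minimal sketch under that assumption; the `selectivity` helper and the sample `order_id` and `status` columns are hypothetical and not part of any SQL Server API.

```python
def selectivity(values):
    """Return distinct-value count divided by row count (0.0 for an empty column)."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical table of 1,000 orders: unique IDs, but only three status values.
rows = [
    {"order_id": i, "status": ("open", "shipped", "closed")[i % 3]}
    for i in range(1, 1001)
]

order_id_sel = selectivity(r["order_id"] for r in rows)
status_sel = selectivity(r["status"] for r in rows)

print(order_id_sel)  # 1.0   -> every value is unique, a strong key column candidate
print(status_sel)    # 0.003 -> each value matches about a third of the rows
```

A ratio close to 1 means a lookup on that column narrows the search to very few rows, while a ratio close to 0 means an index on that column alone filters out little.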

Pad indexes and specify fill factor to reduce page splits. When the database engine needs to add a row to a full index page, it has to split that page, moving roughly half of its rows to a new page, to make space for the new row. This process keeps the index hierarchy intact.

Splitting pages is resource-intensive; how costly it is depends on the size of the index and on other activity in the database, and it can result in a significant loss of performance. To prevent splits, or at least reduce the need for them, you should pad the index and specify a fill-factor value. The fill-factor value specifies the percentage of space on each leaf-level page to be filled with data when the index is created or rebuilt, reserving the remainder for future growth. The fill factor can be set to 0 or to a percentage between 1 and 100. The server-wide default for the fill-factor value is 0, which means the leaf-level pages are filled to capacity.

Padding an index leaves the same open space on each page in the intermediate levels of the index. The pad index option is useful only when a fill factor is specified, because it uses the percentage specified by the fill factor. By default, SQL Server ensures that each intermediate-level page has enough empty space to accommodate at least one row of the maximum size the index can have, given the set of keys on the intermediate pages. When you pad an index, if the percentage specified for the fill factor is not large enough to accommodate one row, SQL Server internally overrides the percentage to allow the minimum.
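
The space trade-off behind the fill factor can be estimated with some simple arithmetic. The sketch below assumes fixed-size index rows, about 8,096 usable bytes on an 8 KB page and a 2-byte slot entry per row, which are the figures used in Microsoft's index size estimation guidance; the function names are hypothetical, and real indexes with variable-length columns will differ.

```python
PAGE_DATA_BYTES = 8096  # assumed usable bytes on an 8 KB page after the page header
SLOT_BYTES = 2          # assumed per-row entry in the page's row-offset array


def rows_per_leaf_page(row_bytes, fill_factor=0):
    """Estimate how many fixed-size rows a leaf page holds right after a rebuild."""
    if not 0 <= fill_factor <= 100:
        raise ValueError("fill factor must be between 0 and 100")
    # A fill factor of 0 behaves like 100: pages are filled to capacity.
    effective = 100 if fill_factor == 0 else fill_factor
    fill_bytes = PAGE_DATA_BYTES * effective // 100
    # Always keep at least one row per page, even for a very low fill factor.
    return max(1, fill_bytes // (row_bytes + SLOT_BYTES))


def leaf_pages_needed(row_count, row_bytes, fill_factor=0):
    """Estimate the number of leaf pages for row_count rows (ceiling division)."""
    per_page = rows_per_leaf_page(row_bytes, fill_factor)
    return -(-row_count // per_page)


print(rows_per_leaf_page(100))                # 79 rows per page when filled to capacity
print(rows_per_leaf_page(100, 80))            # 63 rows per page at fill factor 80
print(leaf_pages_needed(1_000_000, 100))      # 12659 leaf pages
print(leaf_pages_needed(1_000_000, 100, 80))  # 15874 leaf pages
```

Under these assumptions, a fill factor of 80 leaves room for about 16 more rows on each page before a split is needed, at the cost of roughly 25 percent more leaf pages to store and read.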

Rebuild indexes based on fragmentation level. Index fragmentation can occur in an active database, because SQL Server maintains indexes on an ongoing basis during DML operations so that they reflect data changes. As database administrators, our main goal is to check for index fragmentation and correct fragmentation with minimal impact on user operations.

Luckily, SQL Server provides a dynamic management function, sys.dm_db_index_physical_stats, which we can use to detect fragmentation in a specific index, all indexes on a table or indexed view, all indexes in a database, or all indexes in all databases. Its avg_fragmentation_in_percent column returns the percentage of logical fragmentation (out-of-order pages) in an index. Depending on the level of fragmentation, you can either reorganize the index, which is the lighter-weight operation, or rebuild it.
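
The rebuild-or-reorganize decision can be sketched as a small helper that turns an avg_fragmentation_in_percent reading into an ALTER INDEX statement. The 5 percent and 30 percent thresholds are assumptions based on commonly cited starting points, not values from this tip, and should be tuned for each workload; the function names and the sample index, schema and table names are hypothetical.

```python
def maintenance_action(avg_fragmentation_in_percent, reorganize_at=5.0, rebuild_at=30.0):
    """Return 'REBUILD', 'REORGANIZE' or None for a fragmentation percentage.

    The default thresholds are assumed starting points, not fixed rules.
    """
    if avg_fragmentation_in_percent > rebuild_at:
        return "REBUILD"
    if avg_fragmentation_in_percent > reorganize_at:
        return "REORGANIZE"
    return None  # fragmentation is low enough to leave the index alone


def quote_name(name):
    """Bracket-quote a T-SQL identifier, doubling any closing bracket inside it."""
    return "[" + name.replace("]", "]]") + "]"


def maintenance_statement(index_name, schema_name, table_name, avg_fragmentation_in_percent):
    """Build the ALTER INDEX statement for one index, or None if no action is needed."""
    action = maintenance_action(avg_fragmentation_in_percent)
    if action is None:
        return None
    target = f"{quote_name(schema_name)}.{quote_name(table_name)}"
    return f"ALTER INDEX {quote_name(index_name)} ON {target} {action};"


print(maintenance_statement("IX_Orders_CustomerId", "dbo", "Orders", 45.5))
# ALTER INDEX [IX_Orders_CustomerId] ON [dbo].[Orders] REBUILD;
print(maintenance_statement("IX_Orders_CustomerId", "dbo", "Orders", 12.0))
# ALTER INDEX [IX_Orders_CustomerId] ON [dbo].[Orders] REORGANIZE;
print(maintenance_statement("IX_Orders_CustomerId", "dbo", "Orders", 3.2))
# None
```

Keeping the decision separate from statement generation makes it easy to adjust the thresholds or to log the planned actions before running anything against the database.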
