How are users going to access data?
- A single row matched to a single value provided
- Multiple rows based on a single value provided
- Multiple rows based on multiple values provided
- Ranges of rows, such as for a specific period of time
What are commonly indexed columns?
- Primary Keys
- Foreign Keys
- Columns that are used in JOINs, WHERE, ORDER BY, GROUP BY, HAVING and other clauses
Which index should be selected?
- Clustered index -- This is reserved for a single index when data is physically ordered in the table. It is not possible to sort the table physically in more than one way. A good candidate for a clustered index is the Primary Key, a column that uniquely identifies each row, or a column that supports a range, such as a date.
- Non-clustered index -- This is used with ordered data based on a clustered index or without a clustered index, just based on the data. Good candidates for a non-clustered index could be foreign keys or columns used in JOINs, WHERE, ORDER BY, GROUP BY, HAVING and other clauses.
- None -- Use this for tables with a small number of rows that are consistent, such as lookup tables where SQL Server can scan the table faster than it can using using an index.
Do I always need to have a clustered index?
No, but "always" is the key word. Think of the 80/20 rule: Clustered indexes should be used in most, if not all, circumstances to physically order the data. Typically, when they are not used, it is because a table has a large number of transactions and the perceived overhead for SQL Server to maintain the clustered index is too costly. I should note, though, that I have personally seen large tables benefit from clustered indexes where conventional wisdom frowned on using them. The benefit was substantial in terms of improved data-access time for queries and minimizing I/O resources. It was a big win!
Do I always need to have a Primary Key?
No, but "always" is the key word once again. Based on my experience, you need a Primary Key 90% of the time or more to maintain referential integrity or to support third-party tools that compare data.
How many columns should be in the index?
- One column per index is probably the best approach to start with if you are unsure of your exact indexing needs.
- Multiple columns per index are valuable if the columns in the index match the column order used in many queries or key queries that are issued frequently. The only caveat is that the statistics for the index are only based on the first column, not the group of indexes.
How else can I configure the indexes?
- Index order -- You can create indexes either in ascending or descending order.
- Fill factor -- Determine the fill factor for each index to establish how much free space remains on the page when the index is created or rebuilt.
- Statistics -- Ensure statistics are either manually created for the indexes or else permit SQL Server to create and update them automatically depending on the database size.
The art and science of SQL Server indexing
Part 1: SQL Server indexing Q&A
Part 2: SQL Server 2000 indexing dos and don'ts
Part 3: Tricks for using the Index Tuning Wizard
ABOUT THE AUTHOR
Jeremy Kadlec is the Principal Database Engineer at Edgewood Solutions, a technology services company delivering professional services and product solutions for Microsoft SQL Server.