Before I begin, there's a general point I must make clear: A surrogate key value has no intrinsic meaning.
Surrogate key values can be used to facilitate table relationships, but that's all they do. There should be no business logic or intelligence attached to surrogate key values. The values in a surrogate key column should not be assumed to imply any useful order for the rows. There should be no expectations about the range or continuity of values in a surrogate key column. I want to stress these aspects of surrogate keys because some of the utility routines could lead readers to believe that I think surrogate key values have intrinsic meaning.
The utility routines in this article provide some helpful manipulations of IDENTITY values. An IDENTITY value is just one form of surrogate key, but it happens to be a very convenient and efficient form for most SQL Server databases. A GUID value is another common form of surrogate key. GUID values are beneficial when surrogate key values must be unique across multiple databases, but the data type is relatively large and that can become a hindrance for performance. The architecture proposed in my earlier article uses IDENTITY values for surrogate keys.
There are a few database "experts" out there who have made false statements about IDENTITY values. For instance, at least one individual has stated that IDENTITY values are exposed physical locators created by looking at the internal state of hardware. This is categorically wrong.
An IDENTITY value does not reference a physical storage location. If you have a table with no clustered index, new rows are inserted where sufficient free space exists. There is no intentional correlation between the physical location of a new row and the IDENTITY value generated for the new row. If you add a clustered index to the table it physically rearranges the rows of data by copying them to completely different data pages in the order of the index values. Despite this physical rearrangement the IDENTITY value for any given row never changes.
An IDENTITY value is not derived from any hardware state, it's not derived from any operating system state, and it's not derived from the state of the SQL Server application. You could say that it's derived from the state of the table at the time of insertion. Envision a Brand X server box running Windows 2003 Server and SQL Server 2000 Enterprise. Envision a Brand Y server box running Windows 2000 Server and SQL Server 2000 Standard. If you copy a SQL Server database from one to the other the generating of IDENTITY values is unaffected. A new row inserted into both copies of a table (one on each server) will be assigned the same IDENTITY value.
Now for the utility routines . . .
Utilities to maintain a surrogate key architecture
Part 1: What are surrogate key values
Part 2: T-SQL code to create stored procedures
Part 3: Utility routine: sp_CheckRowCounts
Part 4: Utility routine: sp_CheckKeyValues
Part 5: Utility routine: sp_ResetNextKeyValue
Part 6: Utility routine: sp_OrganizeKeyValues
Part 7: Utility routine: sp_FindDuplicateRows
Part 8: Utility routine: sp_ChangeParentValue
Surrogate key architecture to perform powerful database operations
Framework to support a surrogate key architecture
|ABOUT THE AUTHOR:|
| Brian Walker
Brian Walker is a senior database architect in an IS department that uses SQL Server 2000 and the .NET Framework. He has more than 25 years of experience in the IT industry with the last several years focused on databases and SQL Server. Walker is a software developer, database developer, database administrator and database consultant. He develops utility software as a hobby, including a large collection of SQL Server utilities.
Copyright 2006 TechTarget