The rules for this proposed architecture are quite simple. Here they are in a loose order of importance (with 1 being the most important):
1. Every table has a primary key
2. The primary key is a single column.
3. The primary key is the first column.
4. The primary key column is named to correspond with the table name.
5. The primary key migrates to child tables as a foreign key with the same characteristics.
6. The primary key column is numeric.
7. The primary key column is a 4-byte integer data type.
8. The primary key column uses the IDENTITY property (starting at 1 and incrementing by 1).
By submitting your personal information, you agree that TechTarget and its partners may contact you regarding relevant content, products and special offers.
I know, I know, some purists will complain about rule 3. They will say there's no such thing as column order in Relational Theory. In theory, there is no order. In practice, there is an order to the columns. It appears that the vast majority of databases using surrogate keys already follow rule 3, so all I have done is document current practice.
Perhaps the best way to describe this surrogate key architecture is with an example database. The simplistic table definitions below are intended to serve as examples of the rules listed above. They are not intended to represent a realistic model for any particular business.
CREATE TABLE dbo.Region (RegionID int IDENTITY(1,1), RegionCode char( 2) NOT NULL, RegionName varchar(40) NOT NULL, Representative varchar(40) NOT NULL) CREATE TABLE dbo.Customer (CustomerID int IDENTITY(1,1), RegionID int NOT NULL, Name varchar(80) NOT NULL, Address varchar(80) NOT NULL, Phone varchar(20) NOT NULL) CREATE TABLE dbo.Vendor (VendorID int IDENTITY(1,1), Name varchar(80) NOT NULL, Address varchar(80) NOT NULL, Phone varchar(20) NOT NULL) CREATE TABLE dbo.Product (ProductID int IDENTITY(1,1), VendorID int NOT NULL, Description varchar(80) NOT NULL, WholesaleCost decimal(7,2) NOT NULL, RetailPrice decimal(7,2) NOT NULL) CREATE TABLE dbo.Purchase (PurchaseID int IDENTITY(1,1), CustomerID int NOT NULL, OrderNumber int NOT NULL, OrderDate smalldatetime NOT NULL) CREATE TABLE dbo.PurchaseItem (PurchaseItemID int IDENTITY(1,1), PurchaseID int NOT NULL, ProductID int NOT NULL, Quantity smallint NOT NULL, LineNumber smallint NOT NULL) CREATE TABLE dbo.Shipment (ShipmentID int IDENTITY(1,1), Carrier varchar(20) NOT NULL, TrackingNumber varchar(20) NOT NULL, ShipDate smalldatetime NOT NULL) CREATE TABLE dbo.PurchaseItemShipment (PurchaseItemShipmentID int IDENTITY(1,1), PurchaseItemID int NOT NULL, ShipmentID int NOT NULL) CREATE TABLE dbo.Payment (PaymentID int IDENTITY(1,1), PurchaseID int NOT NULL, Method varchar(20) NOT NULL, Amount decimal(7,2) NOT NULL)
You may notice that the Region table and the PurchaseItemShipment table each have an IDENTITY column that will become the primary key. Some would consider an IDENTITY column to be excessive in such tables. The RegionCode column could be the primary key for the Region table. The PurchaseItemID column and the ShipmentID column could form a composite primary key for the PurchaseItemShipment table. It's hard to argue against the point from a pure performance perspective. However, a main objective with this architecture is consistency and I'm willing to sacrifice an insignificant amount of performance to achieve it. In order to remain very consistent this example database follows all eight rules above. The rules are universally applied to every table, including tables such as Region and PurchaseItemShipment. I believe the benefits of consistency far outweigh the costs and this tip will try to demonstrate some of the benefits.
You may notice that these table definitions do not include a primary key declaration. That's where the support tools come into play. Because these tables follow a consistent architecture, and do so universally, it's easy to create tools that allow you to implement declarative referential integrity (DRI) for the entire database with two stored procedure calls. I have provided the stored procedures here.
A surrogate key architecture for powerful database operations
- Home: Introduction
- Part 1: Why use surrogate keys
- Part 2: A surrogate key architecture
- Part 3: Stored procedures: Create and delete constraints and indexes
- Part 4: Stored procedure: Examine children
- Part 5: Stored procedure: Investigate related data
ABOUT THE AUTHOR:
Brian Walker is a senior database architect in an IS department that uses SQL Server 2000 and the .NET Framework. He has more than 25 years of experience in the IT industry with the last several years focused on databases and SQL Server. Walker is a software developer, database developer, database administrator and database consultant. He develops utility software as a hobby, including a large collection of SQL Server utilities.
Copyright 2006 TechTarget