SQL Server database design disasters: What not to do

Would you be embarrassed if an outsider took a good look at your SQL Server database design? Is it possible to implement a foreign key constraint in your tables? Are you using the right data types for your columns? Are your naming conventions defined? Or does the database architecture appear chaotic to anyone else? Inspired by what he recently found in some SQL Server shops, database architect Brian Walker shares advice for improved database design – and SQL Server performance.

Would you be embarrassed if an outsider took a good look at your SQL Server database design? Is it possible to

implement a foreign key constraint in your tables? Are you using the right data types for your columns? Are your naming conventions defined? Or does the database architecture appear chaotic to anyone else? Inspired by what he recently found in some SQL Server shops, database architect Brian Walker shares advice for improved database design – and SQL Server performance.

The condition of some business databases is deplorable.

I had the chance to examine three business databases in five weeks and I'm still staggered by what I found. These are SQL Server databases serving as critical business foundations for companies generating millions of dollars in revenue per year. Every day, hundreds of employees are counting on these databases to deliver accuracy, stability and performance. Personally, I would not trust these databases to store a simple catalog of the 2,600 songs on my iPod.

The databases support their respective businesses, but they are plagued by performance problems. There are many issues within the databases, but the businesses are simply working around the issues with server hardware and production DBAs. It would be very difficult to fix the issues because several applications have been coded to accommodate the database design flaws. With extra expenses and lost productivity, business is hurting. The database design is also making SQL Server look weak.

Evil in retrospect

So, what's wrong with the business databases I examined? Everything. Seriously. They are poorly normalized. Some of the tables have no primary key constraints. Many of the relationships between tables are not enforced with foreign key constraints. The use of indexing is haphazard. Essential business logic is hidden in massive triggers. Many of the column data types are inappropriate. How about consistency? Well, there's more consistency in style among first round contestants on "American Idol."

The benefits of normalization are typically understood, and I will not dwell on them here. Suffice it to say, many of the tables in these databases are not in 1NF. The tables without primary key constraints obviously fail the criteria for 1NF. In addition, a few tables include columns containing multiple delimited values. Most columns in most tables are nullable, which violates controversial criteria for 1NF. Personally, I think nullable columns are sure to be a pain for developers somewhere along the way.

Implementing foreign key constraints should be second nature, and the benefit of ensuring data integrity is a huge reason. Several tables in these databases contain orphaned rows by the millions. Those responsible for the lack of proper Declarative Referential Integrity (DRI) in these databases deserve eternal damnation as COBOL programmers. A couple of tables in one database have situations that send a shiver up my spine. A fictitious example of the gruesome details follows.

Let's say you have a Book table with an AuthorID foreign key column. In most rows, the AuthorID column contains a foreign key reference to an Author table. In some rows, the AuthorID column contains a foreign key reference to an Editor table. In other rows, the AuthorID column contains a foreign key reference to a Publisher table. Another column indicates which kind of reference is contained in the AuthorID column. Therefore, it's impossible to implement a foreign key constraint. Brilliant!

Complete and proper DRI makes it possible to generate some very useful SQL code. For example, indexes can be generated for all foreign keys, which can dramatically improve the performance of statements involving JOIN clauses. The generated indexes can be a great base from the start, instead of adding such indexes one by one as a performance-tuning activity. As another example, SQL Server triggers can be generated for auditing or logging purposes. In general, I think triggers should be reserved for such tasks while business logic belongs in stored procedures.

The databases I examined have many columns with inappropriate SQL Server data types. They have a mixture of single-byte (standard) character strings and double-byte (international) character strings. They have hundreds of columns as fixed-width strings 20, 30, 40 and 50+ characters wide. They have hundreds of columns as the datetime data type when the smalldatetime data type is more than adequate for their purposes. These data type choices are wasting storage space and decreasing read/write performance.

Be consistent!

The technical issues found in these databases make me queasy and the lack of consistency gives me the creeps. The main area of inconsistency is naming conventions. For every 10 database developers there are at least 10 "Best Practices" naming conventions. Sure, I have my own, but it's only a suggestion for those who might want one. The main point is: Define a naming convention and use it. Everywhere. Always. Please. It's a sickening sight when a dozen different naming conventions are used in one database.

As you are defining a naming convention, be sure to cover all the bases. Decide on singular or plural for table names. Decide on use of abbreviations, capitalization, special characters and prefixes/suffixes. Define a naming convention for all objects. Include tables,

More on database design for SQL Server:

 columns (primary keys, foreign keys, other), stored procedures, functions, views, triggers, indexes, primary key constraints, foreign key constraints, check constraints and defaults. Anything you leave out may cause chaos.

Consistency and predictability in table structure can have enormous benefits. For example, archiving and purging old data is heavily impacted by table structure. It's probably the most intricate thing you can do with a relational database. The idea is to select one parent row (or a set of parent rows) and copy/remove it along with every related row in every descendent table. If all tables have a consistent structure, then you can archive and purge with simple calls to generic stored procedures.

The databases I examined use several different table structures at the same time. The primary key columns are not predictable. The foreign key columns are not predictable. The auditing columns are not predictable. The IDENTITY property usage is not predictable. Therefore, archiving and purging requires carefully written custom SQL code to handle each table, resulting in a maintenance nightmare. Personally, I think inconsistency in table structure may be the most serious design flaw in many business databases.

Check out part two SQL Server database design disasters: How it all starts where Walker explains how business databases result in disastrous conditions and what you can do to avoid trouble.

ABOUT THE AUTHOR
Brian Walker is a senior database architect in an IS department that uses SQL Server 2000 and the .NET Framework. He has more than 25 years of experience in the IT industry with the last several years focused on databases and SQL Server. Walker is a software developer, database developer, database administrator and database consultant. He develops utility software as a hobby, including a large collection of SQL Server utilities.

MEMBER FEEDBACK TO THIS TIP

Do you have a comment on this tip? Let us know.

Your tip's bottom line seems to be a delineation between production and development DBA responsibilities, and your advice seems to be not to get into "mongrel" database situations in the first place. However, most companies have their "mongrels." I've recently started working with a company that could be ranked in the top five all-time disasters. After years of growth, our management has become convinced that we have to do a total redesign, based on analysis by a private consulting firm that any one of us could have analyzed ourselves. We now are about to initiate a total redesign while knowing that firefighting is still the priority.

In my opinion, a better tip would have been about organizing a redesign effort within a constant firefighting environment.
—Ron K.

******************************************

Identifying the problem, usually, is only half the bottle. To get out of these situations, you need to have a game plan. Often times, you have very limited resources and not much time; therefore, it is important to do a structural analysis of the data relationships while trying to identify small groups of tables that could be cleaned.

Usually you can narrow down the problem, but it can also leave a good-sized clump of tables at the end that need treatment. One solution is to create mirrored tables that have the correct best practices applied and a stored procedure to populate them via a trigger. Over time, the application or the stored procedures that process the data can be migrated.

That's just a broad description of how to do it, though. An expert's article would be a real benefit, and a light in the end of a tunnel for the weary.
—Terry B.

******************************************

The article was written to share my recent experiences, speculate about reasons for the problems, and propose a definition for a common area of confusion. The article was not written to be a tip and the criticism seems to come from being presented as a tip.

In response to the calls for a tip/article explaining how to fix a mongrel database I would like to paraphrase a line that seems to be working very well for a presidential candidate: The problem is very difficult to solve because we should not have come to this point in the first place.
—Brian Walker, Writer

******************************************

Unlike the others who commented, I read this as somewhere between a rant ("Oh, my sweet Lord, would you look at what these people did!?!?!?!") and a "Boy, this should make you feel better about where you are working!"

Sometimes, it is kind of nice to realize that either yours is not the only disaster on the planet or, compared to some other situations, yours may NOT be a total disaster. Of course, if one were to read your descriptions and feel envious of the "quality" of the databases, one might have the feeling that one does, indeed, have the only true disaster on the planet.
—Ralph W.

******************************************

I had a similar experience with a health insurance provider I was working with. Their database had been migrated from a mainframe system to Oracle by COBOL programmers. Most of the tables had composite primary keys, up to 5 columns, and 1 or 2 datetime data types. My suggestion to move the primary keys to a simple unique composite key index and introduce a surrogate key had not been even considered. The foreign key and declarative referential integrity (DRI) had not been set up at all.

As a result, my development team had to:

 

  • Introduce a validation engine to filter out the bad data upon reading from database.
  • Document the 5-6 worst use cases that had been recognized as issues have the client at least put in some effort towards data cleanup.

This situation with the database, from my experience, is the rule and not the exception. Most databases have been created quite a while ago, and there is a big generation gap in the DBA skillset.
—Igor R.

******************************************

You seem to have ignored the requirement for a data dictionary, which is the most important first step in designing a database. All DDLs, columns and tables should be defined in terms of descriptions and values before you even think of creating a database. One of the worst situations is to be told to go look at the code to determine what the values of a column represent.

It appears to me that most, if not all, of your suggestions would be solved with a data dictionary. Most companies don't want to bear the expense of creating a data dictionary and maintaining it throughout the life of the product. This includes companies like Microsoft.
—Merle C.

******************************************

In the real world, data is often unknown, missing or not applicable. Nullable columns are a wonderful and useful thing when used as part of a properly normalized system.

Most COBOL programs are better written and more structured than most VB6 programs. Do not blame the language for poor programming practice.
—Mark P.

******************************************

I have 23 years experience in software development, along with 10 years as a SQL Developer. I am about to go into full-time DBA work.

Because of my rigorous software background, I could only agree with the author on all points. The problems he saw in the various databases mimic most of the problems I have seen in software. Consistency, good naming conventions, efficient data typing, proper normalization, referential integrity, and performance are all extremely important. Because the author is a senior DBA, his specific criticisms and recommendations are valuable to me.

As a senior software developer, I have come to the realization that most software developers, especially young ones, have enormous egos and believe they are born knowing how to use properly design and use a database. They can't be bothered with any considerations that aren't source code-centric.

Every system and/or project I have ever worked on, and I have worked for a Fortune 70 company, failed to implement best practices for most, sometimes all, of their databases. The usual reason is that developers designed the databases, not DBAs. They cobbled and hacked them together, much like they implemented their source code.

As a SQL Developer, I have worked for years to learn best database practices, something I find most software developers completely uninterested in mastering.
—Lee C.

******************************************

Thanks for the great comments!

Ralph is right; I started writing the article as a rant.

Merle is right; a data dictionary is a great start for a database project, but I'm not sure it actually prevents "most, if not all" of the issues.

Igor wrote, "This situation with the database, from my experience, is the rule and not the exception," and Lee wrote, "Every system and/or project I have ever worked on ...failed to implement best practices for most, sometimes all, of their databases." It's scary to think that mongrel databases are even more common than I feared.

FYI - My mention of COBOL was a lame attempt at humor.
—Brian Walker, Writer

******************************************

Let's not forget to point at least one finger of blame at management, who hire DBA's to advise them and then refuse to listen to their advice. They usually prefer to have a crappy product now than a stable product later, too. So IT isn't given the time and resources to do the job right, even if they know best practices.
—James I.

******************************************

Whilst the author is no doubt right in general terms, I would challenge the offhand dismissal of Access as a kludge tool for amateurs (My reading of his words). Access is a good relational database and with properly designed tables normalized and indexed, it is entirely possible to build and run a multi million pound business with an Access based MRP system. The design is no less important using any RDBMS.
—Cliff S.

******************************************

I had no intention of dismissing Access (or Excel) as a kludge tool. I think Access is a very good database tool for a small number of users and a modest amount of data. It also has my all-time favorite report designer and a great UI designer. My comment about Access (and Excel) was a criticism of how the tool is often used, but not a criticism of the tool itself.
—Brian Walker, Writer


This was first published in September 2008

Dig deeper on SQL Server Database Modeling and Design

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

SearchBusinessAnalytics

SearchDataCenter

SearchDataManagement

SearchAWS

SearchOracle

SearchContentManagement

SearchWindowsServer

Close