Home > SQL Server Tips > Database Management and Administration > Navigate SQL Server hierarchies using recursive common table expressions
SQL Server Tips:
EMAIL THIS
 TIPS & NEWSLETTERS TOPICS 

DATABASE MANAGEMENT AND ADMINISTRATION

Navigate SQL Server hierarchies using recursive common table expressions


Adam Machanic, Contributor
07.14.2005
Rating: --- (out of 5)


Expert advice on database administration
Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google


Our world is, by and large, composed of hierarchies. Within both the natural and business worlds, we see different types of entities related to each other in a hierarchical fashion. Universes are made up of solar systems, which are made up of planets, each with moons. Living things are classified scientifically using the hierarchy of domain, kingdom, phylum, class, order, family, genus and species. And the populated areas of our planet are hierarchically classified using designators such as countries, states, counties, cities and postal codes.

In our business databases, we know different hierarchies. They might involve classes of salable items, narrowing in scope from broad categories to more specific categories, and eventually resolving into SKUs. The most common example within a database context is probably that of employees within an organization. Virtually every firm has some sort of organizational chart and -- matrix and other nouveau management techniques not withstanding -- these charts are hierarchical in nature. Modeling them seems so very simple.

Adjacency lists

This simple technique for modeling hierarchies is known as an adjacency list:

It's the most straightforward way of representing a tree. Yet, as most database professionals have discovered, these seemingly simplistic data structures can be maddeningly difficult to query. For instance, given the above table, how would one determine if a given employee was a subordinate of a manager several levels up? Or how would a developer write a query to pull back the entire tree, in an order suitable for creation of an organizational chart? The following query does just that, using the Employee table in the SQL Server 2005 Adventure Works sample database:

This query is obviously quite complex -- but it's also nowhere near perfect. The main issue, aside from that of readability, is maintainability. What will happen if senior management decides that a new level of manag...


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google



RELATED CONTENT
SQL Server Database Modeling and Design
Optimizing SQL Server indexes –- even when they're not your indexes
Top tips and tricks for SQL Server database development
Managing the development lifecycle with Visual Studio Team System 2008
A first look at Visual Studio Team System 2008 Database Edition
Testing transaction log autogrowth behavior in SQL Server
Top 10 SQL Server Tips of 2008
Tutorial: SQL Server indexing tips to improve performance
Tutorial: Learn SQL Server basics from A-Z
SQL Server database design disasters: How it all starts
Can you shrink your SQL Server database to death?

SQL/Transact SQL (T-SQL)
Using DELETE and TRUNCATE TABLE statements to delete data in SQL Server
SQL language crash course (just enough to be dangerous)
Working with IntelliSense in SQL Server 2008 Management Studio
SQL Server Mailbag: Stored procedures, triggers and SSRS reports
Working with sparse columns in SQL Server 2008
Determining the source of full transaction logs in SQL Server
New GROUP BY option provides better data control in SQL Server 2008
Using the OPENROWSET function in SQL Server
Loading data files with SQL Server's BULK INSERT statement
Importing and exporting bulk data with SQL Server's bcp utility
SQL/Transact SQL (T-SQL) Research

Microsoft SQL Server 2005
End of life comes for SQL Server 2005 SP2, 2008
SQL Server Reporting Services Fast Guide
SQL Server Service Broker Tutorial and Reference Guide
Tips for tuning SQL Server 2005 to improve reporting performance
SQL Server consolidation: Why it's an optimization technique
Parent-child dimensions in SQL Server 2005 with Analysis Services MDX
Enforcing data integrity in a SQL Server database
SSIS error message due to installation problem on SQL Server 2005
Should you upgrade to SQL Server 2005 or SQL Server 2008?
Basics for working with DATETIME and SMALLDATETIME in SQL Server 2005
Microsoft SQL Server 2005 Research

RELATED GLOSSARY TERMS
Terms from Whatis.com − the technology online dictionary
binary tree  (SearchSQLServer.com)
block  (SearchSQLServer.com)
data structure  (SearchSQLServer.com)
DDBMS  (SearchSQLServer.com)
entity-relationship model  (SearchSQLServer.com)
initial extent  (SearchSQLServer.com)
primary key  (SearchSQLServer.com)
segment  (SearchSQLServer.com)
tablespace  (SearchSQLServer.com)
view  (SearchSQLServer.com)

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary


ers-in-training should be added between the junior managers (see the correlation name JuniorManagement) and the lowest level employees (using the correlation name PeopleWhoDoWork)? Other than making working conditions even worse for those on the bottom rungs, this decision will mean that a developer will have to edit this query, adding in yet another join to the Employee table in order to facilitate getting the data for the new level. This is hardly an optimal solution.

Getting around adjacency list limitations

A variety of techniques have been created for getting around these and other limitations of adjacency lists. Itzik Ben-Gan wrote an excellent article on representing hierarchies using materialized paths. Joe Celko popularized a technique called the Nested Sets Model. And more recently, Vadim Tropashko created a technique called the Nested Intervals Model. Although each of these techniques greatly improves upon adjacency lists when it comes to querying, none are as simple or easy to manage when it comes to actually populating and maintaining the data.

The best solution would be one that would allow data modelers to make use of the simplicity of adjacency lists, but ease the pain of querying them. Recursive Common Table Expressions (CTEs) are a new feature in SQL Server 2005 that meets these requirements. A common table expression is a construct very similar to a derived table. The following two queries, the first using a derived table and the second a (non-recursive) CTE, are functionally equivalent:

Both of these queries compute the average total bonus per territory of the sales team in the Adventure Works database. The common table expression in the second query acts just like the derived table in the first, and can be referenced by its correlation name (SalesBonusTotals) throughout the query. In this way, derived tables and CTEs are virtually identical.

However, CTEs have a feature that derived tables do not: The ability to recursively refer to themselves. This recursion is specified by using a UNION ALL within the CTE. The second query in the UNION ALL should refer to the CTE itself, using it as a key to the next level of recursion. This is best shown by example. The following query can be used to find the EmployeeID and ManagerID of the CEO of Adventure Works:

We know that any employee without a manager must be the CEO of the firm -- the top level of the hierarchy represented in the Employee table. To find all of the subordinates of this top level employee (every employee in the company), we can the following recursive CTE:

The results of this query aren't too interesting; all it does is display a list of EmployeeIDs from the Employee table. But what is interesting is what it's actually doing. The query first gets the CEO's EmployeeID, then calls the recursive piece, querying for every employee managed by the CEO (x.EmployeeID = E.ManagerID). For each row returned by that recursive call, the same logic is repeated again and again until the recursion is terminated by a lack of rows returned in the recursive step.

This query can be extended. For instance, we might want to know the level of each employee within the hierarchy:

In this case, the level of the CEO defaults to 1. Each time the recursive step is called, the level will be incremented for those rows.

At this point, the query is returning almost the same data as the flawed query above, yet it is much more readable and will require no maintenance when management makes an asinine decision. The query will dynamically adapt to new levels in the hierarchy without a problem.

The only issue left to solve is that the order of the rows returned by this query may not be quite right for use in an organizational chart. The rows should be returned in the order as if one were walking the tree from the root (the CEO), and visiting each node (employee). This can be solved by taking a cue from Ben-Gan's article and materializing a path as the CTE works:

By converting the EmployeeID into binary format and appending on each recursion, the query forms a trail of visited nodes. Each path representation will begin with 0x0000006D, the binary representation of 109, the Adventure Works CEO's EmployeeID. Following that will be the binary for each EmployeeID down the hierarchy, thereby creating an ordered representation that can be used in the ORDER BY clause to produce the correct results.

Using this technique, any tree or hierarchy represented by an adjacency list can easily be navigated, without using complex techniques such as the Nested Sets or Nested Intervals models. Given that hierarchical data makes up such a large part of our world, this is a very powerful feature for SQL Server 2005!

About the author: Adam Machanic is a senior database engineer for GetConnected Inc. He has several years of experience developing a variety of applications using SQL Server as a data repository and is active on numerous online technical forums. He is a Microsoft Certified Professional and a SQL Server MVP. Machanic, who also serves as our SQL Server 2005 expert, welcomes your questions.


More information from SearchSQLServer.com

  • Tip: Nested queries vs. temporary tables
  • Ask the Experts: Joining a derived table
  • Ask the Experts: Using a derived table instead of a view


  • Rate this Tip
    To rate tips, you must be a member of SearchSQLServer.com.
    Register now to start rating these tips. Log in if you are already a member.




    DISCLAIMER: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.



    SQL Server Development - .NET, C#, T-SQL, Visual Basic
    HomeNewsTopicsITKnowledge ExchangeTipsAsk the ExpertsMultimediaWhite PapersIT Downloads
    About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
    SEARCH 
    TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

    TechTarget Corporate Web Site  |  Media Kits  |  Site Map




    All Rights Reserved, Copyright 2005 - 2009, TechTarget | Read our Privacy Policy
      TechTarget - The IT Media ROI Experts