Home > SQL Server Tips > Microsoft SQL Server > Simplify queries with SQL Server 2005 common table expressions (CTEs)
SQL Server Tips:
EMAIL THIS
 TIPS & NEWSLETTERS TOPICS 

MICROSOFT SQL SERVER

Simplify queries with SQL Server 2005 common table expressions (CTEs)


Robert Sheldon
02.13.2008
Rating: -4.83- (out of 5)


Expert advice on database administration
Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google


SQL Server 2005 introduced a valuable new Transact-SQL language component – a common table expression (CTE) -- that can be a handy alternative to derived tables and views. By using CTEs, you can create named result sets to reference from within your SELECT, INSERT, UPDATE, and DELETE statements without having to persist any metadata about the result set structure. In this article, I explain how to create CTEs in SQL Server 2005 – including how to create a recursive query using a CTE – and provide several examples that demonstrate how they work. Note that for these examples I use the AdventureWorks sample database that ships with SQL Server 2005.

Creating a basic CTE in SQL Server 2005

You construct CTEs by adding a WITH clause before the SELECT, INSERT, UPDATE, or DELETE statement. The following syntax shows the basic structure of the WITH clause and CTE definition:

[WITH <CTE_definition> [,...n]]
<SELECT, INSERT, UPDATE, or DELETE statement that
calls the CTEs>

<CTE_definition>::=
CTE_name [(column_name [,...n ])]
AS
(
CTE_query
)

As the syntax shows, you can define multiple CTEs within the optional WITH clause. The CTE definition includes the name of the CTE, the CTE column names, the AS keyword and the CTE query enclosed in parentheses. Note that the number of CTE column names must match the number of columns returned by the CTE query. In addition, the column names are optional if the CTE query supplies all column names.

Now that you have a basic idea of the CTE syntax in SQL Server, let's look at an example of a CTE definition to help better understand this syntax. The following example defines a CTE named ProductSold and then references the CTE within a SELECT statement:

WITH ProductSold (ProductID, TotalSold)
AS
(
SELECT ProductID, SUM(OrderQty)
FROM Sales.SalesOrderDetail
GROUP BY ProductID
)
SELECT p.ProductID, p.Name, p.ProductNumber,
ps.TotalSold
FROM Production.Product AS p
INNER JOIN ProductSold AS ps
ON p.ProductID = ps.ProductID

As you can see, a WITH clause precedes the SELECT statement that references the CTE. The first line of the WITH clause includes the name of the CTE (ProductSold) and the names of the two columns within the CTE (ProductID and TotalSold). Next comes the AS keyword, followed by the CTE query in parentheses. In this case, the CTE query returns the total number of individual products sold.

The SELECT statement that follows the WITH clause references the CTE by name when joining the Product table to the CTE, based on the ProductID columns. The statement calls the CTE as it would a table or view. However, unlike a table or view, the CTE is available only to the statement that immediately follows the WITH clause. If you reference the CTE in a subsequent statement -- without redefining the CTE -- you'll receive an error.

One advantage to using common table expressions is that you can reference a CTE multiple times in the calling statement. For example, the following statement defines a CTE named Employees and then calls that CTE twice in the SELECT statement that follows the WITH clause:

WITH Employees (EmpID, MgrID, FName, LName, Email)
AS
(
SELECT e.EmployeeID, e. ManagerID,
c.FirstName, c.LastName, c.EmailAddress
FROM HumanResources.Employee e
INNER JOIN Person.Contact c
ON e.ContactID = c.ContactID
)
SELECT e.FName + ' ' + e.LName AS EmpName,
e.Email AS EmpEmail,
m.FName + ' ' + m.LName AS MgrName,
m.Email AS MgrEmail
FROM Employees e
LEFT OUTER JOIN Employees m
ON e.MgrID = m.EmpID

In this example, the SQL Server CTE query returns a list of employee IDs, names and email addresses, as well as their managers' IDs. The SELECT statement following the WITH clause then joins the CTE with itself to return the managers' names and email addresses. You can achieve the same results by using derived tables (subqueries), but that means repeating the same subquery multiple times and working with code that's more complex.

Creating multiple CTEs in a WITH clause

As you saw in the CTE syntax, you can define multiple CTEs in your WITH clause and then call each of those CTEs as often as necessary in the statement that follows. Take a look at the example below that demonstrates how this works. The following WITH clause includes two CTE definitions:

WITH
Cost (ProductID, AvgCost)
AS
(

SELECT ProductID, AVG(StandardCost)
FROM Production.ProductCostHistory
GROUP BY ProductID
),
Sold (ProductID, AvgSold)
AS
(
SELECT ProductID, AVG(OrderQty)
FROM Sales.SalesOrderDetail
GROUP BY ProductID
)
SELECT p.ProductID, p.Name,
(AvgCost * AvgSold)AS TotalCost
FROM Sold s
INNER JOIN Production.Product p
ON s.ProductID = p.ProductID
INNER JOIN Cost c
ON p.ProductID = c.ProductID

The first CTE definition defines a CTE named Cost, and the second definition defines one named Sold. The Cost CTE returns the average cost of each product based on its cost history. The Sold CTE returns the average number of products sold per order. The SELECT statement after the WITH clause joins these two CTEs with the Product table to return the total cost of each product based on the average number sold and the average cost of the product.

You can easily define multiple CTEs within the WITH clause, but you can take this a step further by defining CTEs that reference the CTEs defined before it. For example, the WITH clause below defines three CTEs (Cost, Sold, and Total):

WITH
Cost (ProductID, AvgCost)
AS
(

SELECT ProductID, AVG(StandardCost)
FROM Production.ProductCostHistory
GROUP BY ProductID
),
Sold (ProductID, AvgSold)
AS
(
SELECT ProductID, AVG(OrderQty)
FROM Sales.SalesOrderDetail
GROUP BY ProductID
),
Total (ProductID, TotalCost)
AS
(
SELECT c.ProductID, (AvgCost * AvgSold)
FROM Cost c
INNER JOIN Sold s
ON c.ProductID = s.ProductID
)
SELECT p.ProductID, p.Name, t.TotalCost
FROM Production.Product p
INNER JOIN Total t
ON p.ProductID = t.ProductID

Notice that the CTE query in the Total CTE definition joins the Cost and Sold CTEs, whose CTE definitions precede Total. You cannot reference an undefined CTE. For example, you cannot reference the Total CTE within the Sold CTE. Now, the SELECT statement that follows the WITH clause needs to join the Product table to the Total CTE only, rather than joining to both the Cost and Sold CTEs.

Creating a recursive common table expression

One of the most valuable features of a CTE in SQL Server is its ability to create a recursive query -- a type of query that repeatedly references itself in order to return subsets of data. A recursive query is most commonly used to return hierarchical data. For instance, the Employee table in the AdventureWorks database includes the manager ID of each employee. The manager ID is actually the employee ID of that manager. As a result, the Employee table contains the entire hierarchical direct reports structure from the CEO on down.

You can define a CTE to retrieve this hierarchical structure by creating a CTE query that uses a UNION ALL, UNION, INTERSECT, or EXCEPT operator to join multiple SELECT statements. The best way to show you how this works is through an example.

In this WITH clause, the CTE query includes two SELECT statements joined by a UNION ALL operator:

WITH Reports (EmpLevel, EmpID, ContactID, MgrID)
AS
(
SELECT 1, EmployeeID, ContactID, ManagerID
FROM HumanResources.Employee
WHERE ManagerID IS NULL
UNION ALL
SELECT r.EmpLevel + 1, e.EmployeeID, e.ContactID,
e.ManagerID
FROM HumanResources.Employee e
INNER JOIN Reports r
ON e.ManagerID = r.EmpID
)
SELECT r.EmpLevel, r.EmpID,
c.FirstName + ' ' + c.LastName AS EmpName,
MgrID
FROM Reports r
INNER JOIN Person.Contact c
ON r.ContactID = c.ContactID
ORDER BY r.EmpLevel, r.MgrID, r.EmpID

The first SELECT statement in the CTE query retrieves only the top-level employee -- the CEO. To select the top-level employee, use a WHERE clause that specifies that the ManagerID value is null.
Tips on tuning SQL Server performance:
  • T-SQL error handling with Try…Catch blocks
  • Tricks to increase SQL Server query performance
  • T-SQL performance problems and solutions
  • In other words, this is the person who does not directly report to another manager. Note: The first column in the SELECT statement is 1. This is used to designate that the employee returned by this query is at the top level, Level 1.

    The second SELECT statement (after the UNION ALL operator) joins the Employee table to the Reports CTE itself based on the manager and employee IDs. By self-referencing the CTE in this way, SQL Server automatically treats this as a recursive query and repeats the query as many times as necessary to return each level of employees. Every time the query runs, the first column adds 1 to the value so that each level is incremented by 1.

    The SELECT statement that follows the WITH clause joins the Reports CTE to the Contact table to retrieve the employees' name. The following query results show a sample of the data returned by this statement:

    EmpLevel EmpID EmpName MgrID
    1 109 Ken Sanchez NULL
    2 6 David Bradley 109
    2 12 Terri Duffy 109
    2 42 Jean Trenary 109
    2 140 Laura Norman 109
    2 148 James Hamilton 109
    2 273 Brian Welcker 109
    3 2 Kevin Brown 6
    3 46 Sariya Harnpadoungsataya 6
    3 106 Mary Gibson 6
    3 119 Jill Williams 6
    3 203 Terry Eminhizer 6
    3 269 Wanida Benshoof 6
    3 271 John Wood 6
    3 272 Mary Dempsey 6
    3 3 Roberto Tamburello 12
    3 66 Janaina Bueno 42
    3 102 Dan Bacon 42
    3 117 François Ajenstat 42
    3 128 Dan Wilson 42
    3 149 Ramesh Meyyappan 42
    3 150 Stephanie Conroy 42
    3 176 Karen Berg 42
    3 30 Paula Barreto de Mattos 140
    3 71 Wendy Kahn 140
    3 103 David Barber 140
    3 139 David Liu 140
    3 21 Peter Krebs 148
    3 44 A. Scott Wright 148
    3 200 Hazem Abolrous 148
    3 218 Gary Altman 148
    3 268 Stephen Jiang 273
    3 284 Amy Alberts 273
    3 288 Syed Abbas 273
    4 4 Rob Walters 3
    4 9 Gail Erickson 3
    4 11 Jossef Goldberg 3

    The results are listed according to the employee's level. Notice that the second through seventh rows show a MgrID value of 109, which is the ID of the top-level employee shown in the first row. The subsequent rows reflect the same hierarchical nature of the data.

    Recursive CTEs, like other common table expressions in SQL Server, provide powerful tools for retrieving data. Unlike views, you don't have to persist the metadata. And unlike derived tables, you don't have to repeat code unnecessarily. CTEs help simplify complex code by letting you more easily separate your code into discrete units. And when it comes to recursive queries, CTEs can't be beat. When you first start using CTEs, you might have to play around a little to become comfortable with them, but once you do, you'll never go back.

    ABOUT THE AUTHOR:   
    Robert Sheldon is a technical consultant and the author of numerous books, articles and training material related to Microsoft Windows and various relational database management systems (including SQL Server) as well as business intelligence design and implementation. He is also the author of the novel Dancing the River Lightly. You can find more information at http://www.rhsheldon.com.

    Rate this Tip
    To rate tips, you must be a member of SearchSQLServer.com.
    Register now to start rating these tips. Log in if you are already a member.




    Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google


    RELATED CONTENT
    SQL/Transact SQL (T-SQL)
    Create a computed column in SQL Server using XML data
    Using the OUTPUT clause for practical SQL Server applications
    Create DDL table in SQL Server 2005 to audit DDL trigger activity
    SQL Server source code analysis and management adds database security
    SQL and SQL Server Tutorial and Reference Guide
    Retrieve XML data values with XQuery in SQL Server 2005
    How to construct and use SQL OUTER JOINs optimally
    How to use the LEFT vs. RIGHT OUTER JOIN in SQL
    Using the FULL OUTER JOIN in SQL
    SQL OUTER JOIN sample statements for queries
    SQL/Transact SQL (T-SQL) Research

    SQL Server 2005 (Yukon)
    Manage traces in SQL Server 2005 Analysis Services with XMLA commands
    Tutorial: SQL Server 2005 Analysis Services
    Create a computed column in SQL Server using XML data
    Open SSIS packages without validation using these SQL properties
    Using the OUTPUT clause for practical SQL Server applications
    Create DDL table in SQL Server 2005 to audit DDL trigger activity
    How to process SQL Server 2005 Analysis Services for data availability
    Configure SQL Server Service Broker for sending stored procedure data
    SQL Server 2005 log shipping setup using the wizard
    Retrieve XML data values with XQuery in SQL Server 2005
    SQL Server 2005 (Yukon) Research

    SQL Server performance and tuning
    Tutorial: SQL Server 2005 Analysis Services
    Tutorial: Migrating to SANs from local SQL Server disk storage
    SQL Server memory configurations for procedure cache and buffer cache
    Using the OUTPUT clause for practical SQL Server applications
    Check SQL Server database and log file size with this stored procedure
    SQL Server PerfMon counters for access methods and buffer manager
    Find size of SQL Server tables and other objects with stored procedure
    Monitor SQL Server disk I/O with PerfMon counters
    SQL Server tempdb best practices increase performance
    SQL Server PerfMon counters for Windows operating system (OS)

    RELATED GLOSSARY TERMS
    Terms from Whatis.com − the technology online dictionary
    ACID  (SearchSQLServer.com)
    commit  (SearchSQLServer.com)
    DAO  (SearchSQLServer.com)
    fetch  (SearchSQLServer.com)
    OLE DB  (SearchSQLServer.com)
    query  (SearchSQLServer.com)
    SQL  (SearchSQLServer.com)
    T-SQL  (SearchSQLServer.com)

    RELATED RESOURCES
    2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
    Search Bitpipe.com for the latest white papers and business webcasts
    Whatis.com, the online computer dictionary

    DISCLAIMER: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.

    HomeNewsTopicsITKnowledge ExchangeTipsAsk the ExpertsMultimediaWhite PapersIT Downloads
    About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
    SEARCH 
    TechTarget provides enterprise IT professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective IT purchase decisions and managing their organizations' IT projects - with its network of technology-specific Web sites, events and magazines.

    TechTarget Corporate Web Site  |  Media Kits  |  Reprints  |  Site Map




    All Rights Reserved, Copyright 2005 - 2008, TechTarget | Read our Privacy Policy
      TechTarget - The IT Media ROI Experts