SQL Server 2005 introduced a valuable new Transact-SQL language component – a common table expression (CTE) -- that can be a handy alternative to derived tables and views. By using CTEs, you can create named result sets to reference from within your SELECT, INSERT, UPDATE, and DELETE statements without having to persist any metadata about the result set structure. In this article, I explain how to create CTEs in SQL Server 2005 – including how to create a recursive query using a CTE – and provide several examples that demonstrate how they work. Note that for these examples I use the AdventureWorks sample database that ships with SQL Server 2005.
Creating a basic CTE in SQL Server 2005
You construct CTEs by adding a WITH clause before the SELECT, INSERT, UPDATE, or DELETE statement. The following syntax shows the basic structure of the WITH clause and CTE definition:
calls the CTEs>
CTE_name [(column_name ...
To continue reading for free, register below or login
To read more you must become a member of SearchSQLServer.com
');
// -->

1;,...n ])]
AS
(
CTE_query
)
As the syntax shows, you can define multiple CTEs within the optional WITH clause. The CTE definition includes the name of the CTE, the CTE column names, the AS keyword and the CTE query enclosed in parentheses. Note that the number of CTE column names must match the number of columns returned by the CTE query. In addition, the column names are optional if the CTE query supplies all column names.
Now that you have a basic idea of the CTE syntax in SQL Server, let's look at an example of a CTE definition to help better understand this syntax. The following example defines a CTE named ProductSold and then references the CTE within a SELECT statement:
WITH ProductSold (ProductID, TotalSold)
AS
(
SELECT ProductID, SUM(OrderQty)
FROM Sales.SalesOrderDetail
GROUP BY ProductID
)
SELECT p.ProductID, p.Name, p.ProductNumber,
ps.TotalSold
FROM Production.Product AS p
INNER JOIN ProductSold AS ps
ON p.ProductID = ps.ProductID
As you can see, a WITH clause precedes the SELECT statement that references the CTE. The first line of the WITH clause includes the name of the CTE (ProductSold) and the names of the two columns within the CTE (ProductID and TotalSold). Next comes the AS keyword, followed by the CTE query in parentheses. In this case, the CTE query returns the total number of individual products sold.
The SELECT statement that follows the WITH clause references the CTE by name when joining the Product table to the CTE, based on the ProductID columns. The statement calls the CTE as it would a table or view. However, unlike a table or view, the CTE is available only to the statement that immediately follows the WITH clause. If you reference the CTE in a subsequent statement -- without redefining the CTE -- you'll receive an error.
One advantage to using common table expressions is that you can reference a CTE multiple times in the calling statement. For example, the following statement defines a CTE named Employees and then calls that CTE twice in the SELECT statement that follows the WITH clause:
In this example, the SQL Server CTE query returns a list of employee IDs, names and email addresses, as well as their managers' IDs. The SELECT statement following the WITH clause then joins the CTE with itself to return the managers' names and email addresses. You can achieve the same results by using derived tables (subqueries), but that means repeating the same subquery multiple times and working with code that's more complex.
Creating multiple CTEs in a WITH clause
As you saw in the CTE syntax, you can define multiple CTEs in your WITH clause and then call each of those CTEs as often as necessary in the statement that follows. Take a look at the example below that demonstrates how this works. The following WITH clause includes two CTE definitions:
(
(
INNER JOIN Production.Product p
INNER JOIN Cost c
(
),
Sold (ProductID, AvgSold)
(
),
Total (ProductID, TotalCost)
(
INNER JOIN Sold s
)
INNER JOIN Total t
Notice that the CTE query in the Total CTE definition joins the Cost and Sold CTEs, whose CTE definitions precede Total. You cannot reference an undefined CTE. For example, you cannot reference the Total CTE within the Sold CTE. Now, the SELECT statement that follows the WITH clause needs to join the Product table to the Total CTE only, rather than joining to both the Cost and Sold CTEs.
Creating a recursive common table expression
One of the most valuable features of a CTE in SQL Server is its ability to create a recursive query -- a type of query that repeatedly references itself in order to return subsets of data. A recursive query is most commonly used to return hierarchical data. For instance, the Employee table in the AdventureWorks database includes the manager ID of each employee. The manager ID is actually the employee ID of that manager. As a result, the Employee table contains the entire hierarchical direct reports structure from the CEO on down.
You can define a CTE to retrieve this hierarchical structure by creating a CTE query that uses a UNION ALL, UNION, INTERSECT, or EXCEPT operator to join multiple SELECT statements. The best way to show you how this works is through an example.
In this WITH clause, the CTE query includes two SELECT statements joined by a UNION ALL operator:
(
e.ManagerID
INNER JOIN Reports r
)
c.FirstName + ' ' + c.LastName
MgrID
INNER JOIN Person.Contact c
The first SELECT statement in the CTE query retrieves only the top-level employee -- the CEO. To select the top-level employee, use a WHERE clause that specifies that the ManagerID value is null. In other words, this is the person who does not directly report to another manager. Note: The first column in the SELECT statement is 1. This is used to designate that the employee returned by this query is at the top level, Level 1.
The second SELECT statement (after the UNION ALL operator) joins the Employee table to the Reports CTE itself based on the manager and employee IDs. By self-referencing the CTE in this way, SQL Server automatically treats this as a recursive query and repeats the query as many times as necessary to return each level of employees. Every time the query runs, the first column adds 1 to the value so that each level is incremented by 1.
The SELECT statement that follows the WITH clause joins the Reports CTE to the Contact table to retrieve the employees' name. The following query results show a sample of the data returned by this statement:
EmpLevel
EmpID
EmpName
MgrID
1
109
Ken Sanchez
NULL
2
6
David Bradley
109
2
12
Terri Duffy
109
2
42
Jean Trenary
109
2
140
Laura Norman
109
2
148
James Hamilton
109
2
273
Brian Welcker
109
3
2
Kevin Brown
6
3
46
Sariya Harnpadoungsataya
6
3
106
Mary Gibson
6
3
119
Jill Williams
6
3
203
Terry Eminhizer
6
3
269
Wanida Benshoof
6
3
271
John Wood
6
3
272
Mary Dempsey
6
3
3
Roberto Tamburello
12
3
66
Janaina Bueno
42
3
102
Dan Bacon
42
3
117
François Ajenstat
42
3
128
Dan Wilson
42
3
149
Ramesh Meyyappan
42
3
150
Stephanie Conroy
42
3
176
Karen Berg
42
3
30
Paula Barreto de Mattos
140
3
71
Wendy Kahn
140
3
103
David Barber
140
3
139
David Liu
140
3
21
Peter Krebs
148
3
44
A. Scott Wright
148
3
200
Hazem Abolrous
148
3
218
Gary Altman
148
3
268
Stephen Jiang
273
3
284
Amy Alberts
273
3
288
Syed Abbas
273
4
4
Rob Walters
3
4
9
Gail Erickson
3
4
11
Jossef Goldberg
3
The results are listed according to the employee's level. Notice that the second through seventh rows show a MgrID value of 109, which is the ID of the top-level employee shown in the first row. The subsequent rows reflect the same hierarchical nature of the data.
Recursive CTEs, like other common table expressions in SQL Server, provide powerful tools for retrieving data. Unlike views, you don't have to persist the metadata. And unlike derived tables, you don't have to repeat code unnecessarily. CTEs help simplify complex code by letting you more easily separate your code into discrete units. And when it comes to recursive queries, CTEs can't be beat. When you first start using CTEs, you might have to play around a little to become comfortable with them, but once you do, you'll never go back.
ABOUT THE AUTHOR:
[IMAGE]Robert Sheldon is a technical consultant and the author of numerous books, articles and training material related to Microsoft Windows and various relational database management systems (including SQL Server) as well as business intelligence design and implementation. He is also the author of the novel Dancing the River Lightly. You can find more information at http://www.rhsheldon.com.