Tip

Combining result sets from multiple SQL Server queries

SQL Server lets you combine multiple queries into one statement in order to produce a single result set. To connect the queries together, you can use a UNION, INTERSECT, or EXCEPT operator. Each operator produces a different result when combining the individual result sets. For example, the UNION operator simply combines the data returned by the individual queries into one result set. The INTERSECT operator, on the other hand, returns only those values that are contained in all the individual result sets, whereas the EXCEPT operator returns only those values found in the first individual result set.

The differences between the operators will be made clearer in the sections that follow. Note, however, that the operators do share a number of characteristics in terms of how you use them to combine queries.

    Requires Free Membership to View

First, when you combine queries, you need to simply specify the operator between the queries in order to connect them. In addition, you must follow these guidelines:

  • Each query's SELECT list must include the same number of columns or expressions.
  • The columns and expressions in each query's SELECT list must be in the same order. For example, if the first query's SELECT list contains the columns EmployeeID, FirstName, and LastName (in that order), the columns included in the SELECT lists of the other queries must list the columns in that order as well.
  • The data type of each item in a query's SELECT list must be compatible with the data type of the corresponding item in the SELECT lists of the other queries. In other words, if the first column in the SELECT list of the first query is the int data type, the first column in the SELECT lists of the other queries must be an int data type or a data type that can be implicitly or explicitly converted to int.
  • You can use the ORDER BY clause only at the end of the last query. The clause is then applied to the result set returned by all queries. You cannot use ORDER BY for the individual queries.
  • You can use the GROUP BY and HAVING clauses only within individual queries. The clauses cannot be applied to the result set from all the queries.
  • The first query can contain an INTO clause to create a table. The table will then hold the result set from all the queries.
  • You can use the UNION, INTERSECT, and EXCEPT operators within an INSERT statement.
  • You cannot use the FOR BROWSE clause in statements that include the UNION, INTERSECT, and EXCEPT operators.

These are a lot of rules to try to remember as you're learning how to use the UNION, INTERSECT, and EXCEPT operators, so refer back to these guidelines as necessary. In the meantime, let's look at several examples that demonstrate how to implement the operators. For these examples, I ran the following Transact-SQL script to create three tables in the AdventureWorks2008 database:

USE AdventureWorks2008;

IF OBJECT_ID
('Employees1', 'U') IS NOT NULL
DROP TABLE dbo.Employees1;
SELECT BusinessEntityID,
  FirstName,
  LastName
INTO dbo.Employees1
FROM HumanResources.vEmployee
WHERE BusinessEntityID BETWEEN 1 AND 4;

IF OBJECT_ID  ('Employees2', 'U') IS NOT NULL
DROP TABLE dbo.Employees2;
SELECT BusinessEntityID,
  FirstName,
  LastName
INTO dbo.Employees2
FROM HumanResources.vEmployee
WHERE BusinessEntityID BETWEEN 3 AND 6;

IF OBJECT_ID  ('Employees3', 'U') IS NOT NULL
DROP TABLE dbo.Employees3;
SELECT BusinessEntityID,
  FirstName,
  LastName
INTO dbo.Employees3
FROM HumanResources.vEmployee
WHERE BusinessEntityID BETWEEN 5 AND 8;

I created the tables in a SQL Server 2008 instance. The principles discussed here also apply to SQL Server 2005, however, so if you want to create these tables in SQL Server 2005, use the AdventureWorks database, rather than AdventureWorks2008, and change the name of the BusinessEntityID column to EmployeeID.

Using the UNION Operator to combine result sets

The UNION operator lets you combine two or more queries into a single statement in order to produce one result set that contains all values. For example, the following statement retrieves data from the Employees1 and Employees2 tables:

SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees1
UNION
SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees2;

Notice that the UNION operator connects the individual queries. Each query includes the same number of columns and the columns are in the same order. In addition, the data types of the corresponding columns are compatible. (Note that if the column names in the second query differ from the first query, the column names in the first query are used.) The example above returns the following results:

BusinessEntityID   FirstName   LastName  
---------------- --------- -----------------
1 Ken Sánchez
2 Terri Duffy
3 Roberto Tamburello
4 Rob Walters
5 Gail Erickson
6 Jossef Goldberg

As you can see, the values from both tables are included in the results. You might have noticed, however, that only one row is returned for BusinessEntityID 3 and one row for BusinessEntityID 4, even though each table contains these rows. By default, the UNION operator returns only distinct values. In other words, although each result set returns four rows, the final result set contains only six rows because duplicates are removed.

You can override the default behavior by specifying the ALL option, along with UNION, as shown in the following example:

SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees1
UNION ALL
SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees2;

The final result set will now include all rows from both tables, even if there are duplicates, as shown in the following results:

BusinessEntityID   FirstName   LastName  
---------------- --------- -----------------
1 Ken Sánchez
2 Terri Duffy
3 Roberto Tamburello
4 Rob Walters
3 Roberto Tamburello
4 Rob Walters
5 Gail Erickson
6 Jossef Goldberg

As you can see, there are now two rows each for BusinessEntityID values 3 and 4, though the IDs are not listed consecutively. Instead, the result set first lists the data returned by the first query, and then lists the data returned by the second query. However, you can sort the final result set by adding an ORDER BY clause after the final query, as shown in the following example:

SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees1
UNION ALL
SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees2;
ORDER BY BusinessEntityID;

As you can see in the following data, the final result set is now sorted according to the BusinessEntityID value:

BusinessEntityID   FirstName   LastName  
---------------- --------- -----------------
1 Ken Sánchez
2 Terri Duffy
3 Roberto Tamburello
3 Roberto Tamburello
4 Rob Walters
4 Rob Walters
5 Gail Erickson
6 Jossef Goldberg

The ORDER BY clause points to another, more subtle issue. When multiple queries are joined by a UNION operator, the database engine, by default, processes the queries in the order that they're specified. For example, in the following statement, the database engine first retrieves data from the Employees1 table and combines it with the data retrieved from the Employees2 table, and then combines that result set with the data retrieved from the Employees3 table:

SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees1
UNION ALL
SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees2;
UNION
SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees3
ORDER BY BusinessEntityID;

Notice, however, that the first two queries are connected by a UNION ALL operator, but the second and third queries are connected by a UNION operator. This means that when the first result set is combined with the second result set, all results are included, but when this new result set is then combined with the third result set, all the duplicates are eliminated because UNION is used without ALL, as shown in the following results:

BusinessEntityID   FirstName   LastName  
---------------- --------- -----------------
1 Ken Sánchez
2 Terri Duffy
3 Roberto Tamburello
4 Rob Walters
5 Gail Erickson
6 Jossef Goldberg
7 Dylan Miller
8 Diane Margheim

You can override the default behavior by using parenthesis. The queries enclosed in parentheses are processed first, allowing you to better control the result set. In the following example, I enclose the second and third queries in a single set of parentheses:

SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees1
UNION ALL
(SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees2;
UNION
SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees3)
ORDER BY BusinessEntityID;

Now the second and third queries are processed first and this result is then combined with the first query. As a result, the UNION ALL is preserved, as shown in the following result set:

BusinessEntityID   FirstName   LastName  
---------------- --------- -----------------
1 Ken Sánchez
2 Terri Duffy
3 Roberto Tamburello
3 Roberto Tamburello
4 Rob Walters
4 Rob Walters
5 Gail Erickson
6 Jossef Goldberg
7 Dylan Miller
8 Diane Margheim

As you can see, there are now two rows each for the BusinessEntityID values 3 and 4, but the duplicates are still removed from the query that combines the Employees2 and Employees3 tables.

Using the INTERSECT operator to combine result sets

The INTERSECT operator works much like the UNION operator except that, rather than returning all values, it returns only those values that are contained in the tables on both sides of the operator. In addition, the operator only returns distinct values; you cannot override this behavior like you can with UNION ALL.

In the following example, I use an INTERSECT operator to retrieve the data that is in both the Employees1 and Employees2 tables:

SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees1
INTERSECT
SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees2;

As you can see, I simply join the two queries together with the INTERSECT operator. The statement then returns the following results.

BusinessEntityID   FirstName   LastName  
---------------- --------- -----------------
3 Roberto Tamburello
4 Rob Walters

Notice that only the rows with the BusinessEntityID values of 3 and 4 are returned because they're the only rows contained in both tables. Also notice that only distinct values are returned.

Using the EXCEPT operator to combine result sets

The EXCEPT operator works in just the opposite way of the INTERSECT operator. The final result set produced includes only those values returned by the first query that are not returned by the second query. For instance, the following statement returns the values in the Employees1 table that are not included in the Employees2 table:

SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees1
EXCEPT
SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees2;

As you can see in the following result set, the rows with BusinessEntityID values of 1 and 2 are included, but not the rows with values 3 and 4:

BusinessEntityID   FirstName   LastName  
---------------- --------- -----------------
1 Ken Sánchez
2 Terri Duffy

If you want to return the rows from the Employees2 table, instead of the Employees1 table, you must reverse the order of the tables, as shown in the following example:

SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees2
EXCEPT
SELECT BusinessEntityID,
  FirstName,
  LastName
FROM Employees1;

Now the results include only those rows with BusinessEntityID values of 5 and 6:

BusinessEntityID   FirstName   LastName  
---------------- --------- -----------------
5 Gail Erickson
6 Jossef Goldberg

As the examples have shown, the UNION, INTERSECT, and EXCEPT operators allow you to combine multiple queries, but each operator produces different results. For more details on any of these operators, refer to SQL Server Books Online. There you will find additional examples that will help you better understand how each operator works. Overall, however, you should find that using the three operators is a very straightforward process and easily incorporated into your T-SQL code.

ABOUT THE AUTHOR

Robert Sheldon is a technical consultant and the author of numerous books, articles, and training material related to Microsoft Windows, various relational database management systems, and business intelligence design and implementation. You can find more information at  http://www.rhsheldon.com.

This was first published in December 2009

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

Disclaimer: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.