Tip

The nuts and bolts of T-SQL aggregate functions in SQL Server 2008

A recent article discussed the basics of T-SQL string functions with examples of how to truncate, modify and collect data from them. This next two-part series reviews

    Requires Free Membership to View

T-SQL aggregate functions, with details on how to find the average and sum of a set of numeric values, as well as the number of values in a group. Part two details how to obtain additional data from a set of values, run a checksum, and access statistical data.

When working with data in a SQL Server database, there may be times when you want to perform a calculation on a set of values. For example, your data might include the amounts of individual retail sales made by your organization. As a result, you may want to find the total sales for a particular year or the average amount of sales for each representative.

Transact-SQL includes aggregate functions in SQL Server 2008 that let you perform these types of calculations. A T-SQL aggregate function, which returns a single value, can be used as an expression in a SELECT list or in a COMPUTE, COMPUTE BY, or HAVING clause. All aggregate functions ignore null values, except for COUNT.

In this article, I will demonstrate how to use the aggregate functions available to SQL Server 2008. The examples created for this article are based on a local instance of SQL Server 2008 and the AdventureWorks2008 sample database. The examples are organized according to functionality, though this organization is meant only as a way to help explain and demonstrate the functions.

The basics of T-SQL aggregate functions

The first T-SQL aggregate functions I’ll start with are AVG and SUM. The AVG function returns the average of a set of numeric values, while the SUM function returns the sum total of such a set. In both cases, null values are ignored.

The following example uses both of these functions to provide summary sales data from the SalesOrderHeader table in the AdventureWorks2008 database:

   SELECT
     TerritoryID,
     AVG(SubTotal) AS AvgSales,
     SUM(SubTotal) AS TotalSales
   FROM
     Sales.SalesOrderHeader
   GROUP BY
     TerritoryID
   ORDER BY
     TerritoryID

In this example, I’ve grouped the values by Territory ID, and then for each group I’ve provided the average and total amount of sales based on the SubTotal column.

As you can see, to use a T-SQL aggregate function I have to specify the function name, followed by the column (in parentheses) that contains the values I want to aggregate. Note that the column can actually be any expression that adheres to the function’s requirements. For details about what expressions you can use with a function, see the applicable topic in SQL Server Books Online.

That’s all there is to using the AVG and SUM aggregate functions. The statement above returns the results shown in the following table:

Table 1

As the results show, the statement returns the average and total sales for each Territory ID in the SalesOrderHeader table.

Now let’s look at two other aggregate functions: MAX and MIN. The MAX function returns the highest value from a set of values, and the MIN function returns the lowest value. Both functions ignore null values. In the following statement, I’ve modified the previous example to include the MAX and MIN functions:

   SELECT
     TerritoryID,
     AVG (SubTotal) AS AvgSales,
     SUM(SubTotal) AS TotalSales,
     MAX(SubTotal) AS MaxSale,
     MIN(SubTotal) AS MinSale
   FROM
     Sales.SalesOrderHeader
   GROUP BY
     TerritoryID
   ORDER BY
     TerritoryID

Notice that I once again had to specify the function name, followed by the column name in parentheses. The statement results now show the maximum and minimum sales value for each Territory ID, as shown in the following table:

Table 2 (click to enlarge)

One aspect of the MAX and MIN functions that makes them different from AVG and SUM is that you can also use MAX and MIN with character and datetime data. When working with character data, the maximum and minimum values are based on the column’s collating sequence. For instance, in the following example I’ve retrieved the first and last names of products listed in the Product table:

   SELECT
     MIN(Name) AS FirstProduct,
     MAX(Name) AS LastProduct
   FROM
     Production.Product

The name column is configured with the nvarchar data type. When I use the MAX or MIN functions against that column, the maximum and minimum values returned are based on the alphabetical list of names in that column. As a result, the statement returns the first and last product names in that list, as shown in the following table:

Table 3

If I had used the MIN or MAX function against a datetime column, the value returned would be based on the earliest or latest datetime value stored in the column.

Determining row counts with T-SQL aggregate functions

Another valuable aggregate function supported by SQL Server is COUNT. The COUNT function returns the number of values in a group. Unlike other aggregate functions, you can specify whether null values are included in that count. In addition, you can specify whether to count all values or distinct values. The following SELECT statement shows several ways you can use the COUNT function to determine the number of values in a group:

   SELECT
     COUNT(*) AS ProductCount,
     COUNT(ProductModelID) AS ModelCount,
     COUNT(DISTINCT ProductModelID) AS DistinctCount
   FROM
     Production.Product

As you can see, the statement uses the COUNT function to arrive at three different counts. The first instance of the function uses the format COUNT(*) to return the number of values in the group, including null values. In this case, the group includes the entire Product table. Because there are 504 rows in that table, the function returns a value of 504, as shown below:

Table 4

The second instance of the COUNT function simply counts the values in the ProductModelID column. The count returned by this instance of the function does not include null values, so the number returned is 295 rather than 504, as shown in the preceding table.

The third instance of the COUNT function includes the DISTINCT keyword, which indicates that only distinct, non-null values should be included in the count. As a result, the function returns a value of only 119.

NOTE: Most aggregate functions support the DISTINCT keyword so you can specify that the aggregation is performed only on a distinct set of values from the specified group. Refer to each function’s topic in SQL Server Books Online for details on how to use DISTINCT with that aggregate function.

In addition to the COUNT aggregate function, SQL Server supports the COUNT_BIG function. The only difference between the two functions is that COUNT returns an int value and COUNT_BIG returns a bigint value.

Part two: Grouped data, checksums and stat retrieval

ABOUT THE AUTHOR
Robert Sheldon is a technical consultant and the author of numerous books, articles, and training material related to Microsoft Windows, various relational database management systems, and business intelligence design and implementation. You can find more information at www.rhsheldon.com.

This was first published in May 2010

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

Disclaimer: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.