How to use rank function in SQL Server 2005

SQL Server 2005 T-SQL includes a set of functions that let you rank the rows in your result set. SQL Server expert Robert Sheldon defines and gives examples of how to use the four ranking functions: ROW_NUMBER, RANK, DENSE_RANK and NTILE. Include one or more of these functions in your query SELECT clause and automatically assign a rank to each row in your result.

SQL Server 2005 Transact-SQL includes a set of functions that lets you rank the rows in your result set. By including...

one or more of these SQL Server rank functions in the SELECT clause of your query, you can automatically assign a rank to each row. The way in which the rows are ranked depends on the function you use.

T-SQL currently supports four ranking functions: ROW_NUMBER, RANK, DENSE_RANK, and NTILE. I'll define these rank functions in SQL Server and show you how they work, but let's first look at the test environment I'll be using to demonstrate these functions.

To obtain the data I needed, I used the following code to create the Sales.Quota table in the SQL Server 2005 AdventureWorks sample database:

USE AdventureWorks
--Drop Sales.Quotas table if it exists
IF OBJECT_ID (N'Sales.Quotas', N'U') IS NOT NULL
DROP TABLE Sales.Quotas
--Create Sales.Quotas table
SELECT e.FirstName, e.LastName, q.SalesQuota AS Quota,
DATENAME(m,q.QuotaDate) AS [Month], YEAR(q.QuotaDate) AS [Year]
INTO Sales.Quotas
FROM Sales.SalesPersonQuotaHistory q
INNER JOIN HumanResources.vEmployee e
ON q.SalesPersonID = e.EmployeeID
WHERE SalesQuota BETWEEN 210000 and 280000
ORDER BY e.LastName, q.QuotaDate

As you can see, I simply pull data from a couple other tables in the database in order to create a set of meaningful test data.

Here's the SELECT statement I use to query the new table:

LastName, Quota, [Month], [Year]
FROM Sales.Quotas

The SELECT statement uses all four ranking functions to rank the rows. I include all the functions in one statement, so you can compare the results returned by each function, as shown in the following result set:

RowNumber Rank DenseRank NTile LastName Quota Month Year
1 1 1 1 Campbell 280000.00 January 2002
2 1 1 1 Vargas 280000.00 January 2004
3 3 2 1 Campbell 267000.00 April 2002
4 4 3 2 Vargas 266000.00 January 2002
5 5 4 2 Ansman-Wolfe 264000.00 January 2002
6 6 5 2 Jiang 263000.00 July 2003
7 7 6 3 Saraiva 247000.00 April 2003
8 8 7 3 Vargas 244000.00 July 2001
9 9 8 3 Vargas 239000.00 January 2003
10 10 9 4 Campbell 234000.00 January 2004
11 11 10 4 Ansman-Wolfe 226000.00 October 2002
12 11 10 4 Campbell 226000.00 July 2001
13 13 11 5 Ansman-Wolfe 224000.00 April 2003
14 14 12 5 Varkey Chudukatil 217000.00 January 2003
15 15 13 5 Ansman-Wolfe 210000.00 July 2002

(15 row(s) affected)

As you work through the following sections, refer back to the SELECT statement and the result set as necessary to better understand how the ranking function works.

ROW_NUMBER function

The ROW_NUMBER function is the most basic of the ranking functions. As you can see in the result set (the RowNumber column), the function numbers each row sequentially, beginning with 1. If you refer back to the query, you'll see that the first element in the SELECT clause is the ROW_NUMBER function. When you use this function, first specify the function name, followed by the empty parentheses. You do not pass any values into the function.

After the ranking function, specify the OVER function. For this function, you pass in an ORDER BY clause as an argument. The clause specifies the column (or columns) you want to rank. In this case, I am ranking the values in the Quota column -- in descending order. As a result, the rows in the result set are ranked starting with the highest Quota amount. If you refer again to the result set, you'll see the row with the highest Quota value is ranked 1 and the row with the lowest value is ranked 15. (The result set contains 15 rows.)

That's all there is to using the ROW_NUMBER function, and the other ranking functions work in much the same way, only the results are slightly different.

RANK function in SQL Server

The next ranking function in the SELECT list is RANK. Once again, you specify the function name, followed by the OVER function, which again includes the ORDER BY clause. However, as you can see in the result set (the Rank column),

More T-SQL features in SQL Server 2005:

 the ranked values are slightly different than what you saw for the ROW_NUMBER function. Yes, the highest Quota value is ranked 1, but, because two rows share the same highest value, they are both ranked 1.

When you use the RANK function, all shared values will be ranked the same. But notice that the rank value itself is based on the row's position in the result set, not on the sequential number of the row. For example, the Quota value in the third row is 267,000. That is the second highest Quota value, yet because it falls in the third row, it receives a ranking of 3, rather than 2. The RANK function skips the 2 because the second row matches the first row. If the fourth row shared the same value as the third row, it would also be ranked as 3. But because the value is lower and it is in the fourth row, it is ranked 4.

DENSE_RANK function

The DENSE_RANK function takes a different approach. Like the RANK function, the first two rows are assigned a value of 1. However, the DENSE_RANK function uses sequential numbering, rather than tying the rank to the row number. As a result, the third row is assigned a value of 2 because the Quota column contains the second highest value, and the fourth row is assigned a value of 3 because it is the third highest value, and so on.

The ROW_NUMBER, RANK, and DENSE_RANK functions are similar in how they return results. The difference is in whether the numbering is sequential and whether it is tied to the row number. The NTILE function, however, is a bit different than these three functions.

NTILE function

If you refer back to the SELECT statement, you can see that when you specify the NTILE function, you pass in an integer as an argument to the function -- unlike the other ranking functions where you pass in no argument. The NTILE function divides the result set

Vist our IT Knowledge Exchange forum:
  • "How can I write a SQL script to find the 10 busiest days of month?" Get the expert answer.
  • Ask your SQL Server question.

 into the number of groups specified by this argument. For example, in the SELECT statement, I specify 5, which means the result set will be split into five groups. Because there are 15 rows in the result set, each group will contain three rows. The rows are grouped together based on the value in the Quota column.

As a result, the three rows with the highest Quota values are in the first group and receive a ranking of 1. The three rows with the next highest Quota values are in the second group and receive a ranking of 2. and so on. Because there are only five groups, the highest ranking is 5, which is assigned to the group with the three lowest Quota values. Again, refer back to the result set to better understand how the NTILE function groups data and then ranks each group.

As you can see, rank functions in SQL Server are fairly straightforward, and they make ranking your result set a relatively simple process. Each function gives you a slightly different way to rank the result set. For more information about any of these functions, refer to Microsoft SQL Server Books Online.

Robert Sheldon is a technical consultant and the author of numerous books, articles, and training material related to Microsoft Windows, various relational database management systems, and business intelligence design and implementation. You can find more information at  http://www.rhsheldon.com.

Dig Deeper on SQL-Transact SQL (T-SQL)