CLR aggregates

The CLR aggregate is another new type of .NET database object that was introduced in SQL Server 2005. Learn more in this chapter excerpt.

The CLR aggregate is another new type of .NET database object that was introduced in SQL Server 2005. Essentially, a user-defined aggregate is an extensibility function that enables you to aggregate values over a group during the processing of a query. SQL Server has always provided a basic set of aggregation functions like MIN, MAX, and SUM that you can use over a query. User-defined aggregates enable you to extend this group of aggregate...

functions with your own custom aggregations. One really handy use for CLR aggregates is to enable the creation of aggregate functions for CLR UDTs. Like native aggregation functions, user-defined aggregates allow you to execute calculations on a set of values and return a single value. When you create a CLR aggregate, you supply the logic that will perform the aggregation. In this section you'll see how to create a simple aggregate that calculates the maximum variance for a set of numbers.

To create an aggregate using Visual Studio 2005, select the New | Project option, give your project a name, and click OK to create the project. This example uses the name of MaxVariance. After naming the project and clicking OK, complete the New Database Reference dialog using the required connection values for your SQL Server system and database. Next, to create the aggregate I used the Project | Add Aggregate option to display the Add New Item dialog that you can see in Figure 3-12.

Figure 3-12
Figure 3-12: Creating a CLR aggregate

Select Aggregate from the list of SQL Server templates and then enter the name for the class and click OK. As you can see in Figure 3-12, I used the name MaxVariance. Visual Studio will generate a starter project for the aggregate class. Much as with a UDT, the template for a SQL Server CLR aggregate implements four methods that SQL Server 2005 requires for all CLR aggregates. The four required methods are listed in Table 3-3.

Table 3-3
Table 3-3: Required Aggregate Methods

You can see the code to implement the MaxVariance aggregate in the following listing:

Imports System
Imports System.Data
Imports System.Data.Sql
Imports System.Data.SqlTypes
Imports Microsoft.SqlServer.Server


 
   _

  
    _
Public Structure MaxVariance

      Public Sub Init()
           m_LowValue = 999999999
           m_HighValue = -999999999
      End Sub

      Public Sub Accumulate(ByVal value As Integer)
           If (value > m_HighValue)
                 m_HighValue = value
           End If
           If (value < m_LowValue)
                 m_LowValue = value
           End If
       End Sub
       
       Public Sub Merge(ByVal Group as MaxVariance)
           If (Group.GetHighValue() > m_HighValue)
                m_HighValue = Group.GetHighValue()
           End If
           If (Group.GetLowValue() < m_LowValue)
                m_LowValue = Group.GetLowValue()
           End If
       End Sub

       Public Function Terminate() As Integer
               return m_HighValue - m_LowValue
       End Function

       ' Helper methods
       Private Function GetLowValue() As Integer
                return m_LowValue
       End Function

       Private Function GetHighValue() As Integer
                return m_HighValue
       End Function

       ' This is a place-holder field member
       Private m_LowValue As Integer
       Private m_HighValue As Integer

End Structure
  
 

At the top of this listing you can see the standard set of Imports statements used by CLR objects, followed by the serialization attribute that's required by CLR aggregate objects. After that, in the Init method the two variables, m_LowValue and m_HighValue, are assigned high and low values, ensuring that they will be assigned values from the list. These two variables are declared near the bottom of the listing, and they serve to hold the minimum and maximum values that are encountered by the aggregate routine. The Init method is called one time only—when the object is first initialized.

While the Init method is called just once, the Accumulate method is called once for each row in the result set. In this example, the Accumulate method compares the incoming value with the values stored in the m_HighValue and m_LowValue variables. If the incoming value is higher than the current high value, it is stored in the m_HighValue variable. If the value is lower than the value of m_LowValue, it is stored in m_LowValue. Otherwise, no action is performed by the Accumulate method.

The Merge method is used when the aggregate is processed in parallel, which typically won't be the case for most queries. If the Merge is called, its job is to import the current aggregation values from the parallel instance. You can see here that it does that using two helper methods that essentially export the values in the m_HighValue and m_LowValue variables. These values are compared to the existing values, and if they are higher or lower, they will replace the current values in m_HighValue and m_LowValue.

The Terminate method is called once after all of the results have been processed. For this example, the Terminate method simply subtracts the lowest value found from the highest value found and returns the difference to the caller.

Deploying the aggregate

After compiling the class into a DLL, you can import the DLL as a SQL Server assembly using either the Visual Studio 2005 Deploy option or manually using the CREATE ASSEMBLY statement and CREATE AGGREGATE statement as is shown in the following listing:

create assembly MaxVariance
from 'C:tempMaxVariance.dll'
go

CREATE AGGREGATE MaxVariance (@maXVar int)
RETURNS Int
EXTERNAL NAME MaxVariance.[MaxVariance.MaxVariance]
go

Like the earlier examples, this listing assumes that maxvariance.dll has been copied into the c:temp directory on the local SQL Server system. In the CREATE AGGREGATE statement and the EXTERNAL NAME clause the first part of the name specifies the assembly that will be used, and the second part of the name identifies the namespace and class. Here all of these values are named MaxVariance.

Using the aggregate

You can use the aggregate just like SQL Server's built-in aggregate functions. One small difference is that the UDAGG needs to be prefixed with the schema name to allow the system to locate it. The following line illustrates using the MaxVariance Aggregate:

SELECT dbo.MaxVariance(MinQty) FROM Sales.SpecialOffer

The result of this statement will show the difference between the high and low values found in the Sales.SpecialOffer column as is shown here:

-----------
61

(1 row(s) affected)

The previous tip is from "Developing CLR database objects: 10 tips in 10 minutes," excerpted from Chapter 3 of the book "Microsoft SQL Server 2005: A Developer's Guide" written by Michael Otey and Denielle Otey, courtesy of McGraw-Hill Publishing.


Developing CLR database objects

 Home: Introduction
 Tip 1: CLR architecture
 Tip 2: CLR assemblies in SQL Server 2005
 Tip 3: Creating CLR database objects
 Tip 4: CLR stored procedures
 Tip 5: User-Defined Functions
 Tip 6: CLR triggers
 Tip 7: User-Defined Types
 Tip 8: CLR aggregates
 Tip 9: Debugging CLR database objects
 Tip 10: .NET database object security
This was first published in May 2007

Dig deeper on Microsoft SQL Server 2005

Pro+

Features

Enjoy the benefits of Pro+ membership, learn more and join.

0 comments

Oldest 

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to:

-ADS BY GOOGLE

SearchBusinessAnalytics

SearchDataCenter

SearchDataManagement

SearchAWS

SearchOracle

SearchContentManagement

SearchWindowsServer

Close