Home > SQL Server Tips > Data Warehousing and Business Intelligence > Recommended practices for SQL Server Analysis Services aggregations
SQL Server Tips:
EMAIL THIS
 TIPS & NEWSLETTERS TOPICS 

DATA WAREHOUSING AND BUSINESS INTELLIGENCE

Recommended practices for SQL Server Analysis Services aggregations


By Baya Dewald, Contributor
08.18.2009
Rating: --- (out of 5)


Expert advice on database development
Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google


Retrieving data from a cache is the fastest way for SQL Server Analysis Services (SSAS) to resolve a query.

However, in order for the cache to have the necessary data, you need to either run all anticipated queries before your users do or you need to use the CREATE CACHE statement. Since it is difficult to predict every possible query (unless you have a very trivial cube with a handful of dimensions), it is more practical to experiment with CREATE CACHE. (Keep in mind, if you have pre-defined reports that run against the cube, it may be useful to warm up the cache on a set schedule before executing the reporting queries.)

Unfortunately, if you allow any flexibility in your analytical application, your queries may not be resolved by the cache. In this case, SSAS has two options: it can retrieve data from aggregation files or it can retrieve data from data files. Aggregations are pre-calculated summary values for a given set of SSAS measure group attributes. For example, an aggregation could contain reseller sales amounts for August 2009, volume discount promotions and a clothing product category.

Since aggregations files are often smaller than data files, it is faster to retrieve data from them.

Aggregation designs

A measure group can have zero or more aggregation designs, and each aggregation design can have one or more aggregations. You can also apply aggregation design to zero or more partitions within your measure group. Furthermore, each aggregation contains various combinations of dimension attributes that summary values should be calculated by.

Aggregation design doesn't speed up queries – it is simply the metadata used by Analysis Services when it builds aggregation files. You can review existing aggregation designs using the Business Intelligence Development Studio (BIDS), SQL Server Management Studio (SSMS) or Aggregation Manager sample tool.

In SSAS 2008, BIDS includes a new Aggreg...


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google



RELATED CONTENT
SQL Server Business Intelligence (BI) and Data Warehousing
DBA career paths could lead to business intelligence
Are data warehouses made for the cloud?
Q&A: Business intelligence gets a facelift in SQL Server 2008 R2
Project Gemini gets a new name, Madison earns buzz
Speed up reports in SQL Server Reporting Services with caching
Data Transformation Services vs. SSIS: The key differences
Using package configurations in SQL Server Integration Services (SSIS)
How SQL Server 2008 components impact SharePoint implementations
Achieving high availability and disaster recovery with SharePoint databases
SQL Server Reporting Services 2008 offers faster speeds, new variations

Data Warehousing and Business Intelligence
Creating and managing SQL Server Analysis Services partitions
Sharing SSAS cube data in SharePoint with PerformancePoint Server 2007
New data profiling tools in SQL Server 2008
Utilize SSAS for data predictions and classification using Excel
SQL Server 2008 Integration Services delivers new features
Parent-child dimensions in SQL Server 2005 with Analysis Services MDX
Change data capture in SQL Server 2008 improves BI reporting accuracy
Manage traces in SQL Server 2005 Analysis Services with XMLA commands
Tutorial: SQL Server 2005 Analysis Services
Open SSIS packages without validation using these SQL properties

RELATED GLOSSARY TERMS
Terms from Whatis.com − the technology online dictionary
data aggregation  (SearchSQLServer.com)
data preprocessing  (SearchSQLServer.com)
data warehouse  (SearchSQLServer.com)
FileMaker  (SearchSQLServer.com)
GIS  (SearchSQLServer.com)
MOLAP  (SearchSQLServer.com)
pivot table  (SearchSQLServer.com)
Quiz: SQL Server 2000  (SearchSQLServer.com)
SQL  (SearchSQLServer.com)
T-SQL  (SearchSQLServer.com)

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary


ations tab, which is pictured below.

Figure 1 (Click to enlarge)
[IMAGE]

The advanced view of the Aggregations tab allows you to examine attributes included in each aggregation and to add or remove attributes.

In the following advanced view, the first 30 aggregations for Internet Sales aggregation design are shown.

Figure 2 (Click to enlarge)
[IMAGE]

Aggregations are assigned counters starting with 0. SSMS 2008 allows you to script aggregation designs in XMLA format.

Here is an abbreviated XMLA for the Internet Sales aggregation design:

<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
    <ParentObject>
        <DatabaseID>Adventure Works DW 2008</DatabaseID>
        <CubeID>Adventure Works</CubeID>
        <MeasureGroupID> Fact Internet Sales 1</MeasureGroupID>
    </ParentObject>
    <ObjectDefinition>
        <AggregationDesign xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100">
            <ID>Internet Sales 1</ID>
            <Name>Internet Sales</Name>
            <EstimatedRows>32265>/EstimatedRows>
            <Dimensions>
                <Dimension>
                    <CubeDimensionID>Dim Promotion>/CubeDimensionID>
                    <Attributes>
                        <Attribute>
                            <AttributeID>Promotion Name</AttributeID>
                            <EstimatedCount>16</EstimatedCount>
                        </Attribute>
                        <Attribute>
                            <AttributeID>Discount Pct></AttributeID>
                            <EstimatedCount10>/EstimatedCount>
                        </Attribute>
                        <Attribute>
                            <AttributeID>Max Qty</AttributeID>
                            <EstimatedCount>4</EstimatedCount>
                        </Attribute>
                        <Attribute>
                            <AttributeID>Promotion Type>/AttributeID>
                            <EstimatedCount>6</EstimatedCount>
                        </Attribute>
                        <Attribute>
                            <AttributeID>Min Qty</AttributeID>
                            <EstimatedCount>6</EstimatedCount>
                        /Attribute>
                        <Attribute>
                            <AttributeID>Promotion Category</AttributeID>
                            <EstimatedCount>3</EstimatedCount>
                        </Attribute>
                        <Attribute>
                            <AttributeID>End Date</AttributeID>
                            <EstimatedCount>10</EstimatedCount>
                        </Attribute>
                        <Attribute>
                            <AttributeID>Start Date</AttributeID>
                            <EstimatedCount>8</EstimatedCount>
                        </Attribute>
                    </Attributes>
                </Dimension>

          </Dimensions>
            <Aggregations>
                <Aggregation>
                    <ID>Aggregation 0</ID>
                    <Name>Aggregation 0</Name>
                    
                        <Dimension>
                            <CubeDimensionID>Dim Promotion</CubeDimensionID>
                        </Dimension>
                        <Dimension>
                            <CubeDimensionID>Dim Sales Territory</CubeDimensionID>
                        </Dimension>
                        <Dimension>
                            <CubeDimensionID>Internet Sales Order Details</CubeDimensionID>
                        </Dimension>
                        <Dimension>
                            <CubeDimensionID>Dim Product</CubeDimensionID>
                            <Attributes>
                                <Attribute>
                                    <AttributeID>Model Name</AttributeID>
                                </Attribute>
                            </Attributes>
                        </Dimension>
                        <Dimension>
                            <CubeDimensionID>Dim Customer</CubeDimensionID>
                        </Dimension>
                        <Dimension>
                            <CubeDimensionID>Dim Currency</CubeDimensionID>
                        </Dimension>
                        <Dimension>
                            <CubeDimensionID>Destination Currency</CubeDimensionID>
                        </Dimension>
                        <Dimension>
                            <CubeDimensionID>Order Date Key - Dim Time</CubeDimensionID>
                        </Dimension>
                        <Dimension>
                            <CubeDimensionID>Ship Date Key - Dim Time</CubeDimensionID>
                        </Dimension>
                        <Dimension>
                            <CubeDimensionID>Due Date Key - Dim Time</CubeDimensionID>
                        </Dimension>
                        <Dimension>
                            <CubeDimensionID>Sales Reason</CubeDimensionID>
                        </Dimension>
                    </Dimensions>
                </Aggregation>>

     </Aggregations>
        </AggregationDesign>
    </ObjectDefinition>
</Create>

Notice that the aggregation design includes dimensions that the aggregations can be defined by. It also includes the list of attributes that will be pre-calculated by each aggregation.

In Aggregation Manager, pictured below, you can review aggregations and summarized attributes.

Figure 3 (Click to enlarge)
[IMAGE]

Assigning aggregation designs

You can have multiple aggregation designs for each measure group.

This is useful if you have a set of reports or expected queries that are slated to be executed at different times. In addition, partitions that are seldom used (historical partitions) could each be assigned a different aggregation design to save disk space. For small partitions that only take a few seconds to scan, you do not need to assign any aggregations since there are no performance benefits.

In BIDS, you can assign aggregation design to a partition by creating a new design for a specific part.

Alternatively, you can assign an existing aggregation design to a partition through SSMS by right-clicking a partition, choosing Assign Aggregation Design, selecting the aggregation design from a drop-down list and checking the partition to which you wish to apply this design.

In the background, Analysis Services executes an ALTER statement on a partition, similar to the command shown below.

<Alter ObjectExpansion="ObjectProperties" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW 2008</DatabaseID>
    <CubeID>Adventure Works</CubeID>
    <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
    <PartitionID>Internet_Sales_2001</PartitionID>
  </Object>
  <ObjectDefinition>
    <Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100">
      <ID>Internet_Sales_2001</ID>
      <Name>Internet_Sales_2001</Name>
      <Source xsi:type="QueryBinding">
        <DataSourceID>Adventure Works DW</DataSourceID>
        <QueryDefinition> SELECT * FROM [dbo].[FactInternetSales] WITH       WHERE OrderDateKey &lt;= '20011231'</QueryDefinition>
      </Source>
      <StorageMode>Molap</StorageMode>
      <ProcessingMode>Regular</ProcessingMode>
      <Slice>([Date].[Calendar].[Calendar Year].&amp;[2001], [Product].[Product Categories].[Category].&amp;[1])</Slice>
      <ProactiveCaching>
        <SilenceInterval>-PT1S</SilenceInterval>
        <Latency>-PT1S</Latency>
        <SilenceOverrideInterval>-PT1S</SilenceOverrideInterval>
        <ForceRebuildInterval>-PT1S</ForceRebuildInterval>
        <AggregationStorage>MolapOnly</AggregationStorage>
        <Source xsi:type="ProactiveCachingInheritedBinding">
          <NotificationTechnique>Server</NotificationTechnique>
        </Source>
      </ProactiveCaching>
      <EstimatedRows>1013</EstimatedRows>
      <AggregationDesignID>AggregationDesign1</AggregationDesignID>
    </Partition>
  </ObjectDefinition>
</Alter>

Aggregation processing and files

Aggregation files are created when you process partitions using ProcessFull or ProcessIndexes.

Each partition has a Process Mode property that can be set to either Regular or Lazy Aggregations. With Regular processing, aggregations are calculated as part of the partition processing. If you use the Lazy Aggregations option then aggregations are processed using a background thread after partition processing is complete.

The advantage of the Lazy Aggregations option is that your users can start querying the partition as soon as data is loaded. However, query performance won't benefit from the aggregations until they are calculated. You can find aggregation files under each partition folder. The files will be called N.agg.rigid.data and N.agg.flexible.data, where N represents the file's version number.

Aggregations can be rigid or flexible depending on the type of attribute relationships you setup in your hierarchies. If you aggregate an attribute that has a flexible relationship with the hierarchy's granularity attribute, the resulting aggregation is flexible. If relationship to the granularity attribute is rigid then the resulting aggregation is rigid.

From a query performance perspective, rigid and flexible aggregations behave identically. However, SSAS treats the two types of aggregations differently during incremental updates of dimensions. Flexible aggregations are dropped during dimension processing and must be rebuilt, while rigid aggregations are updated as part of processing.

The ProcessIndexes option processes indexes on a partition that the data has already been loaded using (the ProcessData option). If you want to delete existing aggregation files, you can use the ProcessClearIndexes option.

Building aggregations

You can build aggregations with BIDS and SSMS wizards.

With the Aggregation Design Wizard, you can pick single or multiple partitions, customize aggregation usage settings for each attribute, and customize the level of performance gain you expect from the new aggregation design.

The Usage Based Optimization Wizard is similar, but it also examines the MDX query log stored in a SQL Server database to help it decide which combination of attributes should be aggregated.

The cube structure tab in BIDS allows you to set the aggregation usage property for each attribute. To do this, simply select a cube dimension, choose an attribute and then set its aggregation usage property.

This property can take one of the following values:

  • None – aggregation wizards will NOT consider this attribute for any aggregation
  • Full – aggregation wizards will consider this attribute for all aggregations
  • Unrestricted – no restrictions are provided for aggregation wizards
  • Default – aggregation wizards will use default rules: if this is a key attribute it will be considered for all aggregations; non-key attributes will be treated as unrestricted for aggregation consideration

Aggregation Design and Usage Based Optimization wizards run a complex set of algorithms to determine the expected size of the aggregation and its perceived performance benefit.

Microsoft recommends that aggregation files be no greater than one-third of the partition data file size. The theory is if aggregation file size exceeds one-third of the data file size then its footprint on disk utilization will outweigh any performance benefit.

In some situations, you may need more granular control over the aggregations. This is when it is useful to know how to use the BIDS aggregations' tab or Aggregation Manager. As you saw earlier in this article, BIDS 2008 allows you to add attributes to existing aggregations by checking the attribute. You can also create new aggregations, copy an aggregation or delete an aggregation from an existing aggregation design.

Aggregation Manager allows you to add aggregations using the query log. Simply right-click on the Aggregation Designs folder for the measure group you wish to create aggregations for and choose Add from query log.

A screen, similar to the one below, appears.

Figure 4 (Click to enlarge)
[IMAGE]

Here, you can specify the SQL Server connection parameters and the SQL statement used to retrieve the attribute bitmap from the query log table. You can also customize the aggregation design name as well as aggregation prefix. Easily identifiable prefixes make it simpler to identify the custom designed aggregations when you review query's output in Profiler.

With Aggregation Manager you can create aggregations specifically for the troublesome query.

For example, say you have 10 queries that take particularly long because they scan large partitions. Truncate the query log table on your test server, then run the poorly performing MDX queries and make sure they're logged. Note that each query could examine multiple sub-cubes and therefore could create multiple records in a query log table.

Next create aggregations from the query log using Aggregation Manager. Clear the cache, re-run queries and monitor execution using Profiler. This time, they will use aggregations and be much faster.

Aggregations can add a significant footprint to your database size. It is important to keep in mind -- particularly if you synchronize your databases -- that the more aggregations you have, the longer it will take to synchronize the database.

ABOUT THE AUTHOR:   

[IMAGE]Baya Dewald is a database consultant who helps customers develop highly available and scalable applications with SQL Server and Analysis Services. Throughout his career. he has managed database administrator teams and databases of terabyte caliber. Baya's primary areas of expertise are performance tuning, replication and data warehousing. He can be reached at baya@bayasqlconsulting.com.
Copyright 2009 TechTarget


Rate this Tip
To rate tips, you must be a member of SearchSQLServer.com.
Register now to start rating these tips. Log in if you are already a member.




DISCLAIMER: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.



SQL Server Development - .NET, C#, T-SQL, Visual Basic
HomeNewsTopicsITKnowledge ExchangeTipsAsk the ExpertsMultimediaWhite PapersIT Downloads
About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
SEARCH 
TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

TechTarget Corporate Web Site  |  Media Kits  |  Site Map




All Rights Reserved, Copyright 2005 - 2009, TechTarget | Read our Privacy Policy
  TechTarget - The IT Media ROI Experts