Retrieving data from a cache is the fastest way for SQL Server Analysis Services (SSAS) to resolve a query.
However, in order for the cache to have the necessary data, you need to either run all anticipated queries before your users do or you need to use the CREATE CACHE statement. Since it is difficult to predict every possible query (unless you have a very trivial cube with a handful of dimensions), it is more practical to experiment with CREATE CACHE. (Keep in mind, if you have pre-defined reports that run against the cube, it may be useful to warm up the cache on a set schedule before executing the reporting queries.)
Unfortunately, if you allow any flexibility in your analytical application, your queries may not be resolved by the cache. In this case, SSAS has two options: it can retrieve data from aggregation files or it can retrieve data from data files. Aggregations are pre-calculated summary values for a given set of SSAS measure group attributes. For example, an aggregation could contain reseller sales amounts for August 2009, volume discount promotions and a clothing product category.
Since aggregations files are often smaller than data files, it is faster to retrieve data from them.
A measure group can have zero or more aggregation designs, and each aggregation design can have one or more aggregations. You can also apply aggregation design to zero or more partitions within your measure group. Furthermore, each aggregation contains various combinations of dimension attributes that summary values should be calculated by.
Aggregation design doesn't speed up queries – it is simply the metadata used by Analysis Services when it builds aggregation files. You can review existing aggregation designs using the Business Intelligence Development Studio (BIDS), SQL Server Management Studio (SSMS) or Aggregation Manager sample tool.
In SSAS 2008, BIDS includes a new Aggregations tab, which is pictured below.
The advanced view of the Aggregations tab allows you to examine attributes included in each aggregation and to add or remove attributes.
In the following advanced view, the first 30 aggregations for Internet Sales aggregation design are shown.
Aggregations are assigned counters starting with 0. SSMS 2008 allows you to script aggregation designs in XMLA format.
Here is an abbreviated XMLA for the Internet Sales aggregation design:
<DatabaseID>Adventure Works DW 2008</DatabaseID>
<MeasureGroupID> Fact Internet Sales 1</MeasureGroupID>
<AggregationDesign xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100">
<ID>Internet Sales 1</ID>
<CubeDimensionID>Dim Sales Territory</CubeDimensionID>
<CubeDimensionID>Internet Sales Order Details</CubeDimensionID>
<CubeDimensionID>Order Date Key - Dim Time</CubeDimensionID>
<CubeDimensionID>Ship Date Key - Dim Time</CubeDimensionID>
<CubeDimensionID>Due Date Key - Dim Time</CubeDimensionID>
Notice that the aggregation design includes dimensions that the aggregations can be defined by. It also includes the list of attributes that will be pre-calculated by each aggregation.
In Aggregation Manager, pictured below, you can review aggregations and summarized attributes.
Assigning aggregation designs
You can have multiple aggregation designs for each measure group.
This is useful if you have a set of reports or expected queries that are slated to be executed at different times. In addition, partitions that are seldom used (historical partitions) could each be assigned a different aggregation design to save disk space. For small partitions that only take a few seconds to scan, you do not need to assign any aggregations since there are no performance benefits.
In BIDS, you can assign aggregation design to a partition by creating a new design for a specific part.
Alternatively, you can assign an existing aggregation design to a partition through SSMS by right-clicking a partition, choosing Assign Aggregation Design, selecting the aggregation design from a drop-down list and checking the partition to which you wish to apply this design.
In the background, Analysis Services executes an ALTER statement on a partition, similar to the command shown below.
<Alter ObjectExpansion="ObjectProperties" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
<DatabaseID>Adventure Works DW 2008</DatabaseID>
<MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
<Partition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100">
<DataSourceID>Adventure Works DW</DataSourceID>
<QueryDefinition> SELECT * FROM [dbo].[FactInternetSales] WITH WHERE OrderDateKey <= '20011231'</QueryDefinition>
<Slice>([Date].[Calendar].[Calendar Year].&, [Product].[Product Categories].[Category].&)</Slice>
Aggregation processing and files
Aggregation files are created when you process partitions using ProcessFull or ProcessIndexes.
Each partition has a Process Mode property that can be set to either Regular or Lazy Aggregations. With Regular processing, aggregations are calculated as part of the partition processing. If you use the Lazy Aggregations option then aggregations are processed using a background thread after partition processing is complete.
The advantage of the Lazy Aggregations option is that your users can start querying the partition as soon as data is loaded. However, query performance won't benefit from the aggregations until they are calculated. You can find aggregation files under each partition folder. The files will be called N.agg.rigid.data and N.agg.flexible.data, where N represents the file's version number.
Aggregations can be rigid or flexible depending on the type of attribute relationships you setup in your hierarchies. If you aggregate an attribute that has a flexible relationship with the hierarchy's granularity attribute, the resulting aggregation is flexible. If relationship to the granularity attribute is rigid then the resulting aggregation is rigid.
From a query performance perspective, rigid and flexible aggregations behave identically. However, SSAS treats the two types of aggregations differently during incremental updates of dimensions. Flexible aggregations are dropped during dimension processing and must be rebuilt, while rigid aggregations are updated as part of processing.
The ProcessIndexes option processes indexes on a partition that the data has already been loaded using (the ProcessData option). If you want to delete existing aggregation files, you can use the ProcessClearIndexes option.
You can build aggregations with BIDS and SSMS wizards.
With the Aggregation Design Wizard, you can pick single or multiple partitions, customize aggregation usage settings for each attribute, and customize the level of performance gain you expect from the new aggregation design.
The Usage Based Optimization Wizard is similar, but it also examines the MDX query log stored in a SQL Server database to help it decide which combination of attributes should be aggregated.
The cube structure tab in BIDS allows you to set the aggregation usage property for each attribute. To do this, simply select a cube dimension, choose an attribute and then set its aggregation usage property.
This property can take one of the following values:
- None – aggregation wizards will NOT consider this attribute for any aggregation
- Full – aggregation wizards will consider this attribute for all aggregations
- Unrestricted – no restrictions are provided for aggregation wizards
- Default – aggregation wizards will use default rules: if this is a key attribute it will be considered for all aggregations; non-key attributes will be treated as unrestricted for aggregation consideration
Aggregation Design and Usage Based Optimization wizards run a complex set of algorithms to determine the expected size of the aggregation and its perceived performance benefit.
Microsoft recommends that aggregation files be no greater than one-third of the partition data file size. The theory is if aggregation file size exceeds one-third of the data file size then its footprint on disk utilization will outweigh any performance benefit.
In some situations, you may need more granular control over the aggregations. This is when it is useful to know how to use the BIDS aggregations' tab or Aggregation Manager. As you saw earlier in this article, BIDS 2008 allows you to add attributes to existing aggregations by checking the attribute. You can also create new aggregations, copy an aggregation or delete an aggregation from an existing aggregation design.
Aggregation Manager allows you to add aggregations using the query log. Simply right-click on the Aggregation Designs folder for the measure group you wish to create aggregations for and choose Add from query log.
A screen, similar to the one below, appears.
Here, you can specify the SQL Server connection parameters and the SQL statement used to retrieve the attribute bitmap from the query log table. You can also customize the aggregation design name as well as aggregation prefix. Easily identifiable prefixes make it simpler to identify the custom designed aggregations when you review query's output in Profiler.
With Aggregation Manager you can create aggregations specifically for the troublesome query.
For example, say you have 10 queries that take particularly long because they scan large partitions. Truncate the query log table on your test server, then run the poorly performing MDX queries and make sure they're logged. Note that each query could examine multiple sub-cubes and therefore could create multiple records in a query log table.
Next create aggregations from the query log using Aggregation Manager. Clear the cache, re-run queries and monitor execution using Profiler. This time, they will use aggregations and be much faster.
Aggregations can add a significant footprint to your database size. It is important to keep in mind -- particularly if you synchronize your databases -- that the more aggregations you have, the longer it will take to synchronize the database.
ABOUT THE AUTHOR
Baya Dewald is a database consultant who helps customers develop highly
available and scalable applications with SQL Server and Analysis Services. Throughout his career.
he has managed database administrator teams and databases of terabyte caliber. Baya's primary areas
of expertise are performance tuning, replication and data warehousing. He can be reached at
This was first published in August 2009