Home > SQL Server Tips > Microsoft SQL Server > Use cube partitions to improve Analysis Services performance
SQL Server Tips:
EMAIL THIS
 TIPS & NEWSLETTERS TOPICS 

MICROSOFT SQL SERVER

Use cube partitions to improve Analysis Services performance


Baya Pavliashvili, Contributor
06.06.2006
Rating: --- (out of 5)


Expert advice on database administration
Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google


Microsoft Analysis Services (MSAS) is a great tool for enabling business stakeholders to analyze data from various perspectives -- and its growing popularity can be primarily attributed to its exceptional performance.

Using pre-calculated summary value (aggregations), MSAS helps you generate reports within seconds whereas running similar reports from a relational database management system (RDBMS) can take hours. But even with a multitude of aggregations, MSAS can struggle to provide optimal performance when dealing with huge volumes of data. In this tip, I will discuss the advantages of using cube partitions – a powerful mechanism for improving query performance and reducing downtime associated with cube processing.

TABLE OF CONTENTS
   Partition benefits
   Partition drawbacks
   Figuring how many aggregations you need
   Implementing partitions

  Partition benefits Return to Table of Contents

Partitions offer many potential benefits summarized here:

1. Shorter downtime: A portion of a typical data warehouse is relatively volatile and needs to be processed often. Other portions are relatively static. For example, in a sales cube, you're likely to change the current year's data nightly (or perhaps even hourly), but sales from previous years might change only occasionally to account for merchandise returns and exchanges. You can create a partition for current year sales and another partition containing all sales from previous years. If your warehouse tracks sales that occurred in the past 10 years, processing only the current partition may be 10 times quicker than processing the entire cube.

2. Better query performance: Queries that summarize data over 10 years could take considerably longer than those that only search through the current year data. If you have appropriate partitions, Analysis Services only has to scan a small subset of all data to return query results, which can translate into dramatic performance improvements compared to queries running against a cube with a single (default) partition.

3. More focused aggregations: Both MSAS 2000 and MSAS 2005 allow you to have different sets of aggregations for each partition. The partition that is queried frequently could benefit from additional aggregations, which in turn could warrant better performance. Partition(s) that aren't used heavily can be processed less frequently -- perhaps monthly or quarterly -- with considerably fewer aggregations. Usage-based optimization wizard can be executed for each partition to determine any additional aggregations that can benefit performance of queries actually executed against a given partition.

4. Customized storage and processing settings: Frequently accessed partitions might benefit from proactive caching and ROLAP storage. On the other hand, other forms of storage and processing might be better for less frequently queried partitions.

5. Distributed query and processing load across servers: MSAS allows you to create remote partitions; a remote partition resides on a server different from its parent cube. This way the queries that affect the remote partition are processed on a server separate from its parent cube, allowing you to take advantage of additional processing power.

6. Partitions processed in parallel: With MSAS 2005, you can process multiple partitions in parallel on a server that has multiple processors. This can further reduce the total cube processing time.

  Partition drawbacks Return to Table of Contents

I mentioned that MSAS 2000 allows you to include the same data in multiple partitions, which may skew calculations on your reports. Fortunately, Microsoft fixed that in MSAS 2005. Perhaps what's more important is that partitions require more work. You must determine the appropriate data sets typically represented on reports and then partition data accordingly. Also, you have to decide which storage mode and aggregation scheme makes sense for each partition. More partitions mean additional maintenance overhead because you may have to merge partitions over time.

Let's suppose most of your analytical reports query current month's data; you could have a partition for the current month that you refresh daily (or perhaps even hourly), a partition for all previous months of the current year and a partition for all previous years. Once the current month is over, you'll have to merge the "current" partition with the current year's partition and create a new partition for the current month. As I mentioned earlier, MSAS only allows you to merge two partitions with identical aggregations -- so you might have to rebuild your aggregations before merging.

Even with the drawbacks, partitions present an invaluable tool for improving the performance of your analytical queries. And, more importantly, they reduce cube processing time and increase cube availability.

  Figuring how many aggregations you need Return to Table of Contents

The number of aggregations you create for your cubes depends on your users' analytical needs. More aggregations typically translate into better query performance, but they also use more disk space and extend cube processing time.

Disk space, typically, is not a huge concern for two reasons:

    1. Storage costs decrease daily and costs of building a data warehouse will far outweigh storage costs.
    2. MSAS offers the efficient multidimensional OLAP (MOLAP) storage option: A relational warehouse typically measured in a few gigabytes would only require several hundred megabytes to store aggregations.

On the other hand, long cube processing times present a more serious limitation. Full cube processing requires downtime, meaning no one can execute queries against cubes being processed. Longer processing time means lengthier downtime. As your warehouse grows larger, so will the cube processing time. An eight-hour downtime window (during non-business hours) may be acceptable for some environments, but users won't be happy when that window opens to 16 hours. Furthermore, downtime is not feasible at all if your business provides analytical solutions to customers across the globe, 24 hours a day, seven days a week.

MSAS also reaches a point of diminishing performance improvements with additional aggregations. Creating one-third of all possible aggregations typically provides the best performance; any aggregations beyond that achieve marginal performance improvement at best.

  Implementing partitions Return to Table of Contents

So how do you keep cube processing time to a minimum and boost analytical query performance? One option is to partition cubes based on certain criteria commonly used to filter queries. A partition is simply a portion of the full cube that can be limited by some criteria.

For example, if analytical queries normally limit output by customers' marital status, you could create two partitions -- one for married customers and one for single customers. Cubes are more commonly partitioned by time dimension members, as is the case when you create a separate partition for each year or each quarter.

Partitions are available in both MSAS 2000 and MSAS 2005, but they are only supported in the Enterprise Editions. This fact shouldn't come as a surprise because partitions typically make sense with large-scale data warehouses that would be more likely to use Enterprise Edition. Note also that partitions are seamless to cube users: They execute queries against the cube and don't have to specify partition names that contain particular data sets. Instead, MSAS determines which partition to poll at query runtime.

By default, each cube has a single partition containing all data in the fact table. To create a new partition, you can invoke the Partition Wizard and specify a portion of the data you wish to include in the partition. For example, if you have a time dimension that spans years 2000 through 2006, you can create a separate partition for each year.

You can also partition your cubes using multiple dimensions. For instance, the sales cube may be partitioned by promotion category, customers' education level and yearly income. Note that you can create different partitions based on the same fact table or different fact tables. For example, you could have a fact table for each year of sales, or perhaps a view for each year of sales data based on a single fact table, that can be used to build each partition.

One limitation of MSAS 2000 is that it allows you to create a partition containing data also included with other partitions. Suppose we create a partition for customers who make more than 150K per year; customers who make more than 150K will be counted twice -- once in the default partition and second in the new partition. MSAS doesn't automatically remove such data from the default partition, nor do you get a warning that you may be counting some data twice. To avoid incorrect calculations, you must edit the default partition. In addition, there is no way to exclude dimension members when designing partitions; you couldn't advise the Partition Wizard to include all income levels except those greater than $150K.

MSAS 2005 allows you to create partitions based on fact tables found in your data source views. You can also write a SQL query that can be used as the source for your partitions. MSAS 2005 warns you if you are using a fact table for an existing partition, and prevents you from counting the same data twice.

Analysis Services allows you to merge partitions, which is helpful when the current partitioning scheme is no longer effective. Suppose you have a partition for year 2006. In 2007, you may wish to combine this partition with the "historical" partition and create a new partition for 2007. You can merge two partitions if both have the same aggregations, same structure and belong to the same cube. In order to ensure that two partitions have the same aggregations with MSAS 2000, you can copy the aggregation design from an existing partition when you create a new partition. MSAS 2005 lets you copy aggregation design from one partition to another after partitions have been created.

Note that Analysis Services calculates the total number of dimension members and fact table size when dimensions and cubes are first created. Initial size of the dimension can be considerably smaller than the size of the same dimension six months later. However, MSAS 2000 never recalculates the size of the dimensions or facts. You should modify dimension level member counts and fact table size properties manually within the dimension editor and cube editor. Furthermore, MSAS 2000 doesn't have a way of determining the total number of members per partition; and you cannot specify the total number of rows for the partition. If the member counts are incorrect, chances are your aggregations will not be optimal. You should use the partition aggregation utility to specify each partition's fact table size.

About the author: Baya Pavliashvili is a DBA manager overseeing database operations supporting over one million users. Baya's primary areas of expertise include performance tuning, replication and data warehousing.

More information from SearchSQLServer.com

  • Tip: Using SQL Profiler with Analysis Services
  • Tip: How to back up and restore Analysis Services databases
  • Tip: Reporting Services tool set in Management Studio goes real-time

  • Rate this Tip
    To rate tips, you must be a member of SearchSQLServer.com.
    Register now to start rating these tips. Log in if you are already a member.




    Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google


    RELATED CONTENT
    SQL Server performance and tuning
    How to create a SQL Server linked server to DB2
    Using full-text search for symbols in SQL Server
    Monitor database mirroring and replication after a SQL Server upgrade
    How to use the SELECT statement in SQL
    Translating information requests into SQL SELECT statements
    Using DISTINCT in SQL to eliminate duplicate rows
    SQL SELECT statement and SELECT query samples
    Using the ORDER BY clause of the SELECT query in SQL
    How to configure Database Mail in SQL Server 2005 to send mail
    SQL Server stored procedures tutorial: Write, tune and get examples

    Microsoft SQL Server
    Tuning SQL Server performance via disk arrays and disk partitioning
    Should you upgrade to SQL Server 2005 or SQL Server 2008?
    SQL Server data conversions from date/time values to character types
    Using full-text search for symbols in SQL Server
    Monitor database mirroring and replication after a SQL Server upgrade
    Basics for working with DATETIME and SMALLDATETIME in SQL Server 2005
    How to configure Database Mail in SQL Server 2005 to send mail
    Upgrade live applications to SQL Server 2005 for high availability
    How to use rank function in SQL Server 2005
    SQL Server high availability when upgrading to SQL Server 2005

    SQL Server data warehousing/business intelligence
    Manage traces in SQL Server 2005 Analysis Services with XMLA commands
    Tutorial: SQL Server 2005 Analysis Services
    Top 10 SQL Server Integration Services (SSIS) and DTS tips
    Open SSIS packages without validation using these SQL properties
    How to process SQL Server 2005 Analysis Services for data availability
    Export SQL Server data to an Excel file using SSIS and Visual Studio
    Five steps to event handlers in SQL Server Integration Services (SSIS)
    Workaround to Reporting Services error in SQL Server Express
    Optimize SAN setup for improved SQL Server performance
    Table partitioning with SQL Server 2005

    RELATED GLOSSARY TERMS
    Terms from Whatis.com − the technology online dictionary
    contiguity  (SearchSQLServer.com)
    contiguous  (SearchSQLServer.com)
    drilldown  (SearchSQLServer.com)
    hashing  (SearchSQLServer.com)
    hybrid online analytical processing  (SearchSQLServer.com)

    RELATED RESOURCES
    2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
    Search Bitpipe.com for the latest white papers and business webcasts
    Whatis.com, the online computer dictionary

    DISCLAIMER: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.

    HomeNewsTopicsITKnowledge ExchangeTipsAsk the ExpertsMultimediaWhite PapersIT Downloads
    About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
    SEARCH 
    TechTarget provides enterprise IT professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective IT purchase decisions and managing their organizations' IT projects - with its network of technology-specific Web sites, events and magazines.

    TechTarget Corporate Web Site  |  Media Kits  |  Reprints  |  Site Map




    All Rights Reserved, Copyright 2005 - 2008, TechTarget | Read our Privacy Policy
      TechTarget - The IT Media ROI Experts