Microsoft Analysis Services (MSAS) offers exceptional performance for analytical queries. Reports that show the breakdown of revenues by customer, time and geography are probably the most common example of online analytics. What makes MSAS so great for analytical queries? One simple yet very powerful concept -- aggregations.
Aggregations are summary values calculated during cube processing. Aggregations allow MSAS to ready calculations before queries are executed, thereby dramatically improving application response times. Even if you have millions of transactions stored in your database, MSAS can generate a summary report of these transactions within a second. Analysis Services builds aggregations for you, but knowing how MSAS stores aggregations and data can help you reduce cube processing time and improve the performance of your queries.
Storage models for data and aggregations in MSAS 2000
MSAS cubes are built on top of a relational database (preferably having a dimensional model). When cubes are processed, the data is read and potential aggregates are recorded in one of the three storage models: MOLAP, ROLAP or HOLAP.
MOLAP stands for multidimensional OLAP and records both data and summary values in Analysis Services proprietary file format. MOLAP is the most efficient way to store data; it usually requires orders of magnitude less storage than the relational alternative. ROLAP stands for relational OLAP and records both data and aggregations in a relational database. ROLAP is useful for handling very large dimensions (those containing over 10,000 members), but it can increase your storage needs exponentially. Finally, HOLAP stands for hybrid OLAP and stores data in a relational database while aggregations are stored in multidimensional format.
Within MSAS 2000, MOLAP, ROLAP and HOLAP are your only available options for storing data and aggregations. Furthermore, you have to process cubes to reflect changes in the relational data source. Depending on the number of dimensions and measures within your cubes, processing could take a significant amount of time, often measured in hours. What's worse is that during cube processing none of your reports will be available.
One strategy for reducing report downtime is to process cubes on a server other than the one used for querying. For instance, you could back up a MSAS database on Server A, copy the backup and restore it on Server B and process cubes there. Once the cubes are ready on Server B, back up the database, copy it and restore it back on Server A. Restoring MSAS cubes is considerably faster than cube processing. Alternatively, you could copy the data directory for the MSAS database from the processing server to the querying server.
Storage modes for data and aggregations in MSAS 2005
With MSAS 2005 you have several additional options for providing high availability and excellent performance with your cubes. Many of these options don't require scheduled cube processing at all -- data is refreshed automatically as changes occur in the relational source.
An option called Real Time ROLAP listens to notifications of data changes in the relational source and provides virtually zero latency. You do not have to process cubes with Real Time ROLAP because data changes are propagated to cubes continuously. You do have to pay the penalty of increased storage cost and suboptimal performance, however, because MOLAP will typically outperform ROLAP.
Real Time HOLAP works similarly to Real Time ROLAP, but aggregations are stored in a multidimensional format. Again, MSAS propagates data changes to cubes continuously, but aggregations are recalculated as needed. During aggregation, design MSAS switches to ROLAP mode.
More information from SearchSQLServer.com
Both Real Time HOLAP and Real Time ROLAP are intended for environments where it's crucial to have the most recent data, such as a stock market. MOLAP options (discussed next) are best for environments where application performance is more important than having up-to-the-minute data. For example, sales analytics for a car dealership might not be concerned with most recent changes in data.
Low Latency MOLAP and Medium Latency MOLAP are identical except for how often cube processing occurs. Both of these options use MOLAP storage and listen to notification of data changes in the relational source. MSAS switches to Real Time OLAP while multidimensional objects are processed. Target latency with Low Latency MOLAP is 30 minutes; with Medium Latency MOLAP, it's four hours.
Automatic MOLAP is the other setting that updates MOLAP objects automatically as data changes in the relational source. However, with Automatic MOLAP the server never switches to Real Time OLAP. Instead, it returns data from the cache while new aggregations are being built.
Scheduled MOLAP does not listen to notification of data changes. Rather it automatically reprocesses cubes every 24 hours. With this setting, cubes are always available but they might not reflect recent data changes -- queries are executed against cache while cubes are processed.
All MOLAP options discussed so far provide 100% uptime for cubes (unless you encounter hardware failure of some sort); however, while cubes are processed, the server will have to perform a lot of processor-intensive and memory-intensive work. Therefore, you can reasonably expect some performance degradation while cube data is refreshed.
Finally MSAS 2005 still supports the regular MOLAP option, which does not cache data and requires reprocessing cubes as we did with MSAS 2000. This option provides the best performance while cubes are available, but it requires some downtime.
MSAS 2005 also allows you to choose a different storage model for each partition within your cube, so if you know the pattern of your data changes you can model your cube processing accordingly. For example, suppose your warehouse contains data from years 1999 to 2005. Although some transactions can be backdated, most of your data changes are likely to pertain to 2005. Therefore, you could model your cube so that 2005 data is processed using Low Latency MOLAP, whereas the partition that contains all previous years is processed nightly, using Scheduled MOLAP.
Top 5 SQL Server query tips
Tip 1: SQL Server query design: 10 mistakes to avoid
Tip 2: Troubleshoot SQL Server queries and improve I/O
Tip 3: Tracking query execution with SQL Server 2005 Profiler
Tip 4: Find and fix resource-intensive SQL Server queries
Tip 5: Running analytical queries with Analysis Services
For more help, check out our FAQ: SQL Server query errors and explanations.
About the author:
Baya Pavliashvili is a DBA manager with HealthStream Inc. -- a leader in online healthcare education. In this role, Pavliashvili oversees database operations supporting over one million users. His primary areas of expertise include performance tuning, replication and data warehousing. He can be reached at email@example.com.