It’s easy to think of data warehouses as being more or less maintenance-free. After all, apart from occasionally loading the data warehouse with new and updated data, what else are SQL Server database administrators (DBAs) expected to do with it? Quite a bit. Let’s start with a few data warehouse maintenance tips.
Purge data. While data warehouses are designed to handle enormous amounts of information, you shouldn’t consider them to be an infinite data store. Sure, your company may want to retain three or more years’ worth of data, but at some point, some data is going to have to come out in order to retain the system’s performance.
Data warehouses can easily attract the worst kind of “data packrat” in your organization, with all kinds of data making its way into the warehouse and sticking around for what seems to be forever. But you should create a plan to move older data out of the system, either deleting it outright or moving it to some less-expensive means of archival storage.
Tune the system. No plan of attack survives first contact with the enemy, and in the world of data warehousing, the “enemy” is your user population. Regardless how expertly designed your data warehouse, no expert can foresee the real, day-to-day uses that it will evolve to support. For that reason, you’re going to need to periodically review how the data warehouse is being used and fine-tune its configuration to provide the best performance. In some cases, that may even mean revisiting the data warehouse’s design and restructuring data. After all, if your users aren’t asking the kinds of questions that the original structure was intended to answer, a new structure is going to be necessary.
This plays into data maintenance as well. When the time does come to restructure data, you’ll be able to do so more easily if your data warehouse isn’t retaining years and years of now useless data. Keeping the data store trimmed to “just the right amount” of data will make restructuring and tuning easier to accomplish.
Maintain the database. I see a lot of DBAs who don’t realize that a data warehouse is just another database. True, it’s designed differently than a transaction-processing database, but it’s a database all the same. That means all the usual database maintenance tasks apply.
- Defragment and rebuild indexes as needed -- at least when indexes become more than 30% fragmented, which can easily happen after a major data loading operation.
- Delete dimensional data on a regularly scheduled basis. It’s a slow process, so this might only happen once or twice a year, but it needs to happen.
- Clear out obsolete logs and other historical data. This is especially important when the data warehouse is supporting a front-end system that is writing log data to the database.
- Review the database size, configuration settings and other attributes to make sure they’re properly configured for the workload that the data warehouse supports. This will differ from organization to organization; the moral is to never assume that what, from a configuration perspective, was a good idea on Day 1 is still a good idea today.
But if you’re still in doubt, ask around. A data warehouse will require maintenance like any other database. If your data warehouse is part of a packaged system that your company bought, contact the vendor to find out what maintenance is appropriate. If it’s part of a system that was developed in-house, work with your developers to find out what maintenance is appropriate.
ABOUT THE AUTHOR
Don Jones is a co-founder of Concentrated Technology LLC, the author of more than 30 books on IT and a speaker at technical conferences worldwide. He can be reached through his website at www.ConcentratedTech.com.
This was first published in July 2011