Problem solve Get help with specific problems with your technologies, process and projects.

In data warehousing, cost-benefit’s the issue with backup compression

Backup compression is well known to carry the benefit of cost savings to SQL Server shops, but organizations using SQL Server data warehouses may have a few more variables to consider.

Most people like simple answers to simple questions. Take this: “Is SQL Server backup compression worth it?” The...

short answer should be yes, but that’s a qualified yes. A better way to phrase the question would be, “Is there a real cost-benefit payoff for making SQL Server compressed backups?”

Native functionality for compressing database backups was first introduced in SQL Server 2008 Enterprise and is now available in all editions, as of version 2008 R2. The biggest potential benefit is cost savings: You can rely less on a third-party product (with the attendant extra cost), spend less on disk space and you don’t need as much network bandwidth to transmit the compressed backups.

But there are other implications for organizations that use SQL Server for data warehousing. On the plus side, compressed backups can be restored on a broader range of SQL Server systems than they’re made on. But compression could have a negative impact on SQL Server performance. And then businesses have to take into consideration the pros and cons of using SQL Server’s native compression technology versus third-party products.

Vincent Rainardi, who tackles data warehousing and business intelligence (BI) questions on his blog, dug into the riddle of cost-effectiveness. Doing his own calculations, he found the time saved in I/O for larger databases more than offsets the processing time to perform the compression. Since most database operations, especially backups, are heavily I/O bound, it’s not hard to see how a 60% compression ratio for a database of several terabytes will pay off. As Rainardi points out, the benefits extend past just speed: Less storage needed for backups means more room for other things, more money saved and less overall bandwidth taken up by the backup process.

The harder part is making sure these things happen consistently in the real world. Rainardi’s testing was of a much smaller scope than what you might find in your average BI analysis or data warehousing scenario, where billions of rows and terabytes of data are not unusual. A white paper by Unisys on SQL Server compression -- both backup compression and in-database compression -- used a test rig and a database size (2 TB) that’s a little closer to what many BI or data warehouse arrangements would be dealing with.

The authors noted, “It is possible in some cases to achieve an overall win in all areas, but doing so isn’t easy and certainly isn’t possible in all cases.” They pointed out that with compression the main tradeoffs are between storage space, I/O and overall performance. For instance, you lose some performance because of the CPU overhead needed to do the compression. But compressed backups require lower overall I/O, so that might translate into a net performance gain.

There are few hurdles to clear when using backup compression, the authors wrote; it’s simply a matter of using the WITH COMPRESSION command when generating backups. But Unisys also noticed a marked increase in CPU utilization when doing compressed backups. They used a metric they called “CPU-seconds” -- the backup elapsed time multiplied by the average CPU multiplied by the number of CPUs. The more cores and sockets you have to throw at the problem, the less of an impact compression will have. That said, Microsoft recommends using the management tool Resource Governor to keep backups from using more than a certain percentage of CPU, no matter how many cores or sockets are available.

Native compression vs. third-party products
What if you’re already compressing backups using a third-party product such as LiteSpeed for SQL Server? Does it make sense to dump that and use SQL Server native backup compression instead?

There’s one major advantage of using SQL Server’s native compression: Anyone running SQL Server 2008 can restore backups compressed with SQL Server 2008. You can only create compressed backups in SQL Server 2008 Enterprise or later editions but you can restore those compressed backups in any SQL Server 2008 edition. This is a handy cost-savings measure: If you don’t need to buy licenses for yet another third-party product, that’s nothing but good news.

Problems arise when you have data that compresses better with a third-party system. If, say, LiteSpeed offers better deduplication for the kind of data you’re warehousing, you’re better off sticking with that. My general rule of thumb is this: If you see a storage improvement of a third or better with any third-party product --that is, 33% compression or more -- it’s worth the effort. But if you don’t see drastic differences between a third-party product and SQL Server’s native compression, stick with native.

Sharing media with other backups
This is a minor issue, but it’s worth bringing up since it might have an impact on your media management, depending on how you use it to do your warehousing. SQL Server takes an all-or-nothing approach to compressing backups. If you save a compressed backup to a file, you can append only compressed backups to the same file. Likewise, only uncompressed backups can be appended to a file that already holds uncompressed backups. The same goes for those still dependent on tape: You can’t put a compressed SQL Server backup and a backup from NTBackup on the same tape.

For most people this shouldn’t be a giant issue; tape is cheap and backups are priceless. But if you’re a backup administrator and you’re contemplating putting multiple backup types on one tape for convenience -- one less thing to keep track of, right? -- now you know what not to do.

Serdar Yegulalp has been writing about computers and IT for more than 15 years for a variety of publications, including InformationWeekand Windows Magazine.

This was last published in March 2012

Dig Deeper on SQL Server Data Warehousing

Start the conversation

Send me notifications when other members comment.

By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy

Please create a username to comment.