Test Center: Deep dive into SQL Server 2008

03.11.2008

SQL Server's first attempt at Data Compression actually looks pretty good. As I noted in my beta preview, SQL Server 2008 provides two types of compression: row and page. Row compression is true compression, in which unused spaces at the ends of columns are removed to save storage. Page compression, aka dictionary compression, normalizes the data on each page: a repeated value is stored once, and each occurrence is replaced by a lookup pointer. In SQL Server 2008, page compression includes row compression, so if you turn on page compression, you get row compression in the bargain.
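To make the distinction concrete, here is a minimal Python sketch of the two ideas. It is a toy model under my own assumptions, not SQL Server's actual on-disk format: the function names `row_compress` and `page_compress` are hypothetical, rows are plain tuples of padded strings, and a "page" is just a short list of rows.

```python
def row_compress(rows):
    """Toy row compression: drop trailing pad spaces from each value."""
    return [tuple(value.rstrip(" ") for value in row) for row in rows]


def page_compress(rows):
    """Toy page compression: row-compress first, then replace every value
    with an index into a dictionary that is private to this page."""
    trimmed = row_compress(rows)  # page compression includes row compression
    dictionary = []               # each distinct value is stored once
    index_of = {}                 # value -> position in the dictionary
    encoded = []
    for row in trimmed:
        encoded_row = []
        for value in row:
            if value not in index_of:
                index_of[value] = len(dictionary)
                dictionary.append(value)
            encoded_row.append(index_of[value])
        encoded.append(tuple(encoded_row))
    return dictionary, encoded


# One hypothetical page of fixed-width (padded) values.
page = [
    ("WIDGET    ", "NY  "),
    ("WIDGET    ", "NY  "),
    ("GADGET    ", "NY  "),
]

trimmed = row_compress(page)
dictionary, encoded = page_compress(page)
```

In this sketch `trimmed` holds the rows with padding removed, while `dictionary` holds the three distinct values and `encoded` holds only small integer pointers into it, which is where repeated data pays off.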

Microsoft provides a handy compression calculation wizard that will give you a good estimate of the benefits you can expect. The wizard runs a test compression scenario against your data for each compression type (row and page) and tells you what the new size of the table should be. I tested the compression calculator against a number of data sets, and on average the calculation deviated from my final results by only 1 or 2 percent. That's pretty good, considering that the calculation is based on a relatively small amount of data.
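The article does not describe how the wizard arrives at its estimate, so the following Python sketch only illustrates the general idea of estimating from a small sample. Everything in it is an assumption for illustration: the table is synthetic, the sample is simply every seventh row, and "compression" is modeled as nothing more than trimming trailing spaces.

```python
def raw_size(rows):
    """Total bytes if every value keeps its fixed-width padding."""
    return sum(len(value) for value in rows)


def trimmed_size(rows):
    """Total bytes after trailing pad spaces are removed."""
    return sum(len(value.rstrip(" ")) for value in rows)


def savings_percent(rows):
    """Percentage of space saved by trimming, for the given rows."""
    return 100.0 * (1 - trimmed_size(rows) / raw_size(rows))


# Hypothetical table: 1,000 values padded to 20 characters.
table = [("item%d" % (i % 50)).ljust(20) for i in range(1000)]

# Estimate from a small systematic sample instead of the whole table.
sample = table[::7]

estimate = savings_percent(sample)
actual = savings_percent(table)
deviation = abs(estimate - actual)  # in percentage points
```

On this toy data the sample-based estimate lands within a fraction of a percentage point of the full-table figure; real data with skewed or clustered values could diverge more.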

How much will a table compress? That depends on both the type of data and the type of compression. If your data is something like SAP data, which has a lot of trailing spaces, then row compression will serve you well. If you have a lot of repeating data, then page compression applied to the table will be your best friend.

The order of your data also matters. Because SQL Server compresses at the page level, repeating data is de-duplicated only when it is clustered together on the same page. So it pays to be aware of both the nature and the order of your data.
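The effect of ordering can be shown with a small Python sketch. It is a simplified model, not SQL Server's storage engine: the helper `stored_values` is hypothetical, a page is assumed to hold a fixed number of rows, and each page is assumed to store every distinct value exactly once.

```python
def stored_values(values, rows_per_page):
    """Count the values that remain after per-page de-duplication."""
    total = 0
    for start in range(0, len(values), rows_per_page):
        page = values[start:start + rows_per_page]
        total += len(set(page))  # each page stores a distinct value once
    return total


customers = ["A", "B", "C", "D"]
interleaved = customers * 4        # A B C D A B C D ...
clustered = sorted(interleaved)    # A A A A B B B B ...

scattered_cost = stored_values(interleaved, rows_per_page=4)
clustered_cost = stored_values(clustered, rows_per_page=4)
```

Both lists contain the same 16 values, but the interleaved order leaves four distinct values on every page, while the clustered order leaves one per page, so the clustered layout stores a quarter as many values in this model.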

All that said, just as with Advanced Compression in Oracle Database 11g, my results with Data Compression in SQL Server 2008 were all over the map. I got as little as 17 percent compression on some data sets, and as much as 76 percent on another. The high of 76 percent was reached on denormalized order line data in a data warehouse.