Administration: Omnidex Features

Rollup Tables

Designing Rollup Tables

Rollup tables are a good tool for improving performance, but they are not always the best or only solution. The Omnidex Administrator must consider the advantages and disadvantages of using rollup tables and determine when it is best to use them. This is the same as with most performance options, including partitioning and indexing. The table below lists some of the advantages and disadvantages of using rollup tables:

Advantages Disadvantages
A rollup table will have fewer rows to read, so a query can be answered in a fraction of the time. For instance, a table that preaggregates 300 million customers by zipcode and other geographic distinction will have less than 100,000 rows. A query against 100,000 rows is much faster than a query against 300 million rows, regardless of indexing. To achieve its fewer number of rows, a rollup table usually contains a dozen or fewer columns. Only queries that are limited to these columns in their criteria and group by clauses can be redirected to the rollup table.
A rollup table is generally quite small and does not require a lot of disk space or indexing time. A rollup table itself does take time to build since it requires exhaustively reading and aggregating of the underlying data.
A rollup table does not require any changes to the underlying data. A rollup table is not automatically updated when the primary table is updated. Instead, it is a snapshot of the data taken at the time it was generated. It is reasonable to regenerate rollup tables on a daily or weekly basis, but rollup tables are not a good solution for up-to-the-minute queries against heavily-updated data.

The best design approach for rollup tables begins with analyzing the queries and looking for repeated aggregations. Within these situations, evaluate the following questions:

  • Is the value being aggregated a column value or the result of an expression? At present, only column values from the primary table may be referenced in a rollup table.
  • Is the aggregation being grouped by one or more columns in that same table? At present, only columns from the primary table may be referenced as group by columns in the rollup table.
  • Does the aggregation in the primary table substantially reduce the size of the result set? Specifically, does the aggregation produce a result set that is either 1 million rows or 10% of the size of the non-aggregated result set? A rollup table must be significantly smaller than the primary table to provide a performance advantage.

After answering these questions, it should be possible to design one or more rollup tables that would be helpful for a given primary table.

Additional Resources

 
Back to top
admin/features/rollups/design.txt ยท Last modified: 2016/06/28 22:38 (external edit)