Administration: Omnidex Indexing

ActiveCounts

The Optimization Behind ActiveCounts

The SQL statements that allow ActiveCounts are fairly ordinary. They are primarily a succession of grouped aggregations, with an additional piece of criteria added at each step. The key to ActiveCounts lies in the optimization of these queries. Omnidex has special optimizations for this particular kind of query.

Omnidex is specifically designed as a count engine. By allowing many indexes to be used together on the same search, and by maintaining a count at each step of the search, it is ideally suited to providing fast counts. As long as Omnidex is set up so that all criteria and aggregations can be satisfied within the indexes, Omnidex will be able to provide fast counts along the way.

Omnidex has two additional optimizations that are essential to ActiveCounts:

Cached Qualifications

When a succession of SQL statements is processed on the same connection, Omnidex watches for similarities between the queries. If the current query is the same as the previous query, except for the addition of another piece of criteria, it will use the index pointers from the previous query as a starting point for the current query. This can save a great deal of resources.

For example, if the original SQL statement presents genders and their counts, using state codes as criteria …

  select        GENDER, count(*)
    from        LIST
    where       STATE in ('AZ','CA','CO','NM','OR','UT','WA')
    group by    GENDER;

… and then the subsequent SQL statement presents educations and their counts, using state and gender codes as criteria …

  select        EDUCATION, count(*)
    from        LIST
    where       STATE in ('AZ','CA','CO','NM','OR','UT','WA') and 
                GENDER = 'F'
    group by    EDUCATION;

… Omnidex will recognize that the second query is similar to the first query, except for the addition of new criteria against GENDER. It will recognize that the index pointers from the first search, which represent the Western United States, can be reused. The current query simply requires the addition of the new criteria against GENDER. This saves a great deal of resources.

Consider a succession of statements, each of which adds one more piece of criteria. The number of index searches will be dramatically reduced using this optimization.

Criteria Index Searches
Without Optimization
Index Searches
With Optimization
STATE 1 1
STATE
GENDER
1
1
cached
1
STATE
GENDER
EDUCATION
1
1
1
cached
cached
1
STATE
GENDER
EDUCATION
EMPLOYMENT
1
1
1
1
cached
cached
cached
1
STATE
GENDER
EDUCATION
EMPLOYMENT
INCOME
1
1
1
1
1
cached
cached
cached
cached
1
Total 15 5

This optimization is automatic. Application developers can simply submit queries in sequence, and Omnidex will recognize when it can use this optimization. No special syntax is required from the developer.

This optimization does rely on the succession of queries occurring on the same connection. If a connection unexpectedly closes, Omnidex will simply reprocess all of the criteria, delivering the correct answer without any intervention by the application.

Rollup Tables

ActiveCounts on large databases will also benefit from the use of Rollup Tables. A Rollup Table pre-aggregates another table and stores the summary results. Rollup Tables can then be indexed on their own, and can be used to automatically optimize queries. With ActiveCounts, it can be quite helpful to pre-aggregate a large database using the commonly used criteria.

For example, a Rollup Table that contained the STATE, GENDER, EDUCATION, EMPLOYMENT and INCOME codes could entirely satisfy the query described above. The original table could contain hundreds of millions of rows, but the Rollup Table may contain only 100,000 rows. This ensures the performance of these queries.

Consult the section on Rollup Tables for instruction on designing and setting up Rollup Tables.

Additional Resources

See also:

 
Back to top
admin/indexing/activecounts/optimization.txt · Last modified: 2016/06/28 22:38 (external edit)