This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
admin:indexing:activecounts:optimization [2011/01/22 19:18] els |
admin:indexing:activecounts:optimization [2016/06/28 22:38] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ~~NOTOC~~ | ||
+ | |||
{{page>:top_add&nofooter&noeditbtn}} | {{page>:top_add&nofooter&noeditbtn}} | ||
Line 15: | Line 17: | ||
==== The Optimization Behind ActiveCounts ==== | ==== The Optimization Behind ActiveCounts ==== | ||
- | The SQL statements that allow ActiveCounts are fairly ordinary. They are primarily a succession of grouped aggregations, with an additional piece of criteria added at each step. The key to ActiveCounts lies in the optimization of these queries. Omnidex has special optimizations for this particular kind of queries. | + | The SQL statements that allow ActiveCounts are fairly ordinary. They are primarily a succession of grouped aggregations, with an additional piece of criteria added at each step. The key to ActiveCounts lies in the optimization of these queries. Omnidex has special optimizations for this particular kind of query. |
- | === Omnidex - The Count Engine === | + | Omnidex is specifically designed as a count engine. By allowing many indexes to be used together on the same search, and by maintaining a count at each step of the search, it is ideally suited to providing fast counts. As long as Omnidex is set up so that all criteria and aggregations can be satisfied within the indexes, Omnidex will be able to provide fast counts along the way. |
- | The first optimization is Omnidex is specifically designed as a count engine. By allowing many indexes to be used together on the same search, and by maintaining a count at each step of the search, it is ideally suited to providing fast counts. As long as all of the criteria and the aggregations can be satisfied using Omnidex indexing, Omnidex will be able to provide fast counts along the way. | + | Omnidex has two additional optimizations that are essential to ActiveCounts: |
=== Cached Qualifications === | === Cached Qualifications === | ||
- | As Omnidex processes SQL statements for a connection, it watches for queries that are very similar to the previous query. If the current query only differs from the previous query by the addition of an additional piece of criteria, it will recognize that it does not have to repeat the original criteria. Omnidex will simply append the additional criteria to the existing search, greatly speeding the search. | + | When a succession of SQL statements is processed on the same connection, Omnidex watches for similarities between the queries. If the current query is the same as the previous query, except for the addition of another piece of criteria, it will use the index pointers from the previous query as a starting point for the current query. This can save a great deal of resources. |
- | For example, if the original SQL statement presents genders and their counts, filtered by state codes ... | + | For example, if the original SQL statement presents genders and their counts, using state codes as criteria ... |
+ | |||
+ | <code> | ||
select GENDER, count(*) | select GENDER, count(*) | ||
from LIST | from LIST | ||
Line 32: | Line 36: | ||
group by GENDER; | group by GENDER; | ||
- | ... and then the subsequent SQL statement presents educations and their counts, filtered by state codes and genders ... | + | </code> |
+ | |||
+ | ... and then the subsequent SQL statement presents educations and their counts, using state and gender codes as criteria ... | ||
+ | |||
+ | <code> | ||
select EDUCATION, count(*) | select EDUCATION, count(*) | ||
Line 39: | Line 47: | ||
GENDER = 'F' | GENDER = 'F' | ||
group by EDUCATION; | group by EDUCATION; | ||
+ | </code> | ||
- | ... Omnidex will recognize that the state codes were processed in the previous search. Rather than repeating this search, it will start with the existing search and simply add the criteria for gender. This can save a great deal of resource. | ||
- | Consider an user who steps through five pieces of criteria. The first search has one criteria. The second has two ... the original criteria plus the new one. The third search has three criteria ... the original two criteria plus the new one. And so forth. By the time the user submits the SQL statement with the fifth piece of criteria, fifteen index searches will have been performed. | + | ... Omnidex will recognize that the second query is similar to the first query, except for the addition of new criteria against GENDER. It will recognize that the index pointers from the first search, which represent the Western United States, can be reused. The current query simply requires the addition of the new criteria against GENDER. This saves a great deal of resources. |
- | Since Omnidex can recognize this type of query sequence and reuse previous qualification, it will only process five pieces of criteria. Each new criteria is simply appended to the previous search. This greatly reduced the time for the query, as well as reducing the load on the system resources. | + | Consider a succession of statements, each of which adds one more piece of criteria. The number of index searches will be dramatically reduced using this optimization. |
+ | |||
+ | ^ Criteria ^ Index Searches\\ Without Optimization ^ Index Searches\\ With Optimization ^ | ||
+ | | STATE | 1 | 1 | | ||
+ | | STATE\\ GENDER | 1\\ 1 | cached\\ 1 | | ||
+ | | STATE\\ GENDER\\ EDUCATION | 1\\ 1\\ 1 | cached\\ cached\\ 1 | | ||
+ | | STATE\\ GENDER\\ EDUCATION\\ EMPLOYMENT | 1\\ 1\\ 1\\ 1 | cached\\ cached\\ cached\\ 1 | | ||
+ | | STATE\\ GENDER\\ EDUCATION\\ EMPLOYMENT\\ INCOME | 1\\ 1\\ 1\\ 1\\ 1 | cached\\ cached\\ cached\\ cached\\ 1 | | ||
+ | ^ Total ^ 15 ^ 5 ^ | ||
This optimization is automatic. Application developers can simply submit queries in sequence, and Omnidex will recognize when it can use this optimization. No special syntax is required from the developer. | This optimization is automatic. Application developers can simply submit queries in sequence, and Omnidex will recognize when it can use this optimization. No special syntax is required from the developer. | ||
- | This optimization relies on the succession of queries occurring on the same connection. Sometimes connections are unexpectedly closed. If this happens, Omnidex will automatically reprocess all of the criteria so that the correct results are returned. | + | This optimization does rely on the succession of queries occurring on the same connection. If a connection unexpectedly closes, Omnidex will simply reprocess all of the criteria, delivering the correct answer without any intervention by the application. |
+ | |||
+ | === Rollup Tables === | ||
+ | |||
+ | ActiveCounts on large databases will also benefit from the use of Rollup Tables. A Rollup Table pre-aggregates another table and stores the summary results. Rollup Tables can then be indexed on their own, and can be used to automatically optimize queries. With ActiveCounts, it can be quite helpful to pre-aggregate a large database using the commonly used criteria. | ||
+ | |||
+ | For example, a Rollup Table that contained the STATE, GENDER, EDUCATION, EMPLOYMENT and INCOME codes could entirely satisfy the query described above. The original table could contain hundreds of millions of rows, but the Rollup Table may contain only 100,000 rows. This ensures the performance of these queries. | ||
+ | |||
+ | Consult the section on [[admin:features:rollups:home|Rollup Tables]] for instruction on designing and setting up Rollup Tables. | ||
- | final search that has five criteria. The first search had one criteria; the second had two; the third had three, and so on. If all criteria was processed each time, there would be a total of fifteen index searches across all of the | ||