Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
admin:indexing:activecounts:optimization [2011/01/22 19:40]
els
admin:indexing:activecounts:optimization [2016/06/28 22:38] (current)
Line 1: Line 1:
 +~~NOTOC~~
 +
 {{page>:​top_add&​nofooter&​noeditbtn}} {{page>:​top_add&​nofooter&​noeditbtn}}
  
Line 15: Line 17:
 ==== The Optimization Behind ActiveCounts ==== ==== The Optimization Behind ActiveCounts ====
  
-The SQL statements that allow ActiveCounts are fairly ordinary. ​ They are primarily a succession of grouped aggregations,​ with an additional piece of criteria added at each step.  The key to ActiveCounts lies in the optimization of these queries. ​ Omnidex has special optimizations for this particular kind of queries.+The SQL statements that allow ActiveCounts are fairly ordinary. ​ They are primarily a succession of grouped aggregations,​ with an additional piece of criteria added at each step.  The key to ActiveCounts lies in the optimization of these queries. ​ Omnidex has special optimizations for this particular kind of query.
  
 Omnidex is specifically designed as a count engine. ​ By allowing many indexes to be used together on the same search, and by maintaining a count at each step of the search, it is ideally suited to providing fast counts. ​ As long as Omnidex is set up so that all criteria and aggregations can be satisfied within the indexes, Omnidex will be able to provide fast counts along the way. Omnidex is specifically designed as a count engine. ​ By allowing many indexes to be used together on the same search, and by maintaining a count at each step of the search, it is ideally suited to providing fast counts. ​ As long as Omnidex is set up so that all criteria and aggregations can be satisfied within the indexes, Omnidex will be able to provide fast counts along the way.
Line 23: Line 25:
 === Cached Qualifications === === Cached Qualifications ===
  
-As Omnidex processes ​SQL statements ​for a connection, ​it watches for queries that are very similar to the previous query.  If the current query only differs from the previous query by the addition of an additional ​piece of criteria, it will recognize that it does not have to repeat ​the original criteria. ​ Omnidex will simply append ​the additional criteria to the existing search, greatly speeding the search.  ​+When a succession of SQL statements ​is processed on the same connection, ​Omnidex ​watches for similarities between ​the queries.  If the current query is the same as the previous query, except for the addition of another ​piece of criteria, it will use the index pointers from the previous query as a starting point for the current query.  ​This can save a great deal of resources.
  
-For example, if the original SQL statement presents genders and their counts, ​filtered by state codes ... +For example, if the original SQL statement presents genders and their counts, ​using state codes as criteria ​... 
  
 +
 +<​code>​
   select ​       GENDER, count(*)   select ​       GENDER, count(*)
     from        LIST     from        LIST
Line 32: Line 36:
     group by    GENDER;     group by    GENDER;
  
-... and then the subsequent SQL statement presents educations and their counts, ​filtered by state codes and genders ​...+</​code>​ 
 + 
 +... and then the subsequent SQL statement presents educations and their counts, ​using state and gender codes as criteria ​... 
 + 
 +<​code>​
  
   select ​       EDUCATION, count(*)   select ​       EDUCATION, count(*)
Line 39: Line 47:
                 GENDER = '​F'​                 GENDER = '​F'​
     group by    EDUCATION;     group by    EDUCATION;
 +</​code>​
  
-... Omnidex will recognize that the state codes were processed in the previous search. ​ Rather than repeating this search, it will start with the existing search and simply add the criteria for gender. ​ This can save a great deal of resource. 
  
-Consider an user who steps through five pieces of criteria. ​ The first search has one criteria. ​ The second has two ... the original criteria plus the new one.  The third search has three criteria ... the original two criteria plus the new one.  ​And so forth. ​ By the time the user submits the SQL statement with the fifth piece of criteria, fifteen index searches will have been performed +... Omnidex will recognize that the second query is similar to the first query, except for the addition of new criteria ​against GENDERIt will recognize that the index pointers from the first search, which represent ​the Western United States, can be reused.  ​The current query simply requires ​the addition of the new criteria ​against GENDER. This saves a great deal of resources.
  
-Since Omnidex can recognize this type of query sequence and reuse previous qualificationit will only process five pieces ​of criteria.  ​Each new criteria is simply appended to the previous search.  ​This greatly reduced the time for the query, as well as reducing the load on the  system resources.+Consider a succession ​of statementseach of which adds one more piece of criteria.  ​The number of index searches will be dramatically reduced using this optimization. 
 + 
 +^ Criteria ​                                               ^  Index Searches\\ ​  ​Without Optimization ​ ^  Index Searches\\ ​  With Optimization ​      ^ 
 +| STATE                                                   ​| ​ 1                                        |  1                                          | 
 +| STATE\\ ​ GENDER ​                                        ​| ​ 1\\  1                                   ​| ​ cached\\ ​ 1                                | 
 +| STATE\\ ​ GENDER\\ ​ EDUCATION ​                           |  1\\  1\\  1                              |  cached\\ ​ cached\\ ​ 1                      | 
 +| STATE\\ ​ GENDER\\ ​ EDUCATION\\ ​ EMPLOYMENT ​             |  1\\  1\\  1\\  1                         ​| ​ cached\\ ​ cached\\ ​ cached\\ ​ 1            | 
 +| STATE\\ ​ GENDER\\ ​ EDUCATION\\ ​ EMPLOYMENT\\ ​ INCOME ​   |  1\\  1\\  1\\  1\\  1                    |  cached\\ ​ cached\\ ​ cached\\ ​ cached\\ ​ 1  | 
 +^ Total                                                   ​^ ​ 15                                       ​^ ​ 5                                          ^ 
  
 This optimization is automatic. ​ Application developers can simply submit queries in sequence, and Omnidex will recognize when it can use this optimization. ​ No special syntax is required from the developer. This optimization is automatic. ​ Application developers can simply submit queries in sequence, and Omnidex will recognize when it can use this optimization. ​ No special syntax is required from the developer.
  
-This optimization ​relies ​on the succession of queries occurring on the same connection.  Sometimes connections are unexpectedly closed.  If this happens, Omnidex will automatically ​reprocess all of the criteria ​so that the correct results ​are returned.  ​+This optimization ​does rely on the succession of queries occurring on the same connection. ​ If a connection unexpectedly closes, Omnidex will simply ​reprocess all of the criteria, delivering ​the correct ​answer without any intervention by the application.  
 + 
 +=== Rollup Tables === 
 + 
 +ActiveCounts on large databases will also benefit from the use of Rollup Tables. ​ A Rollup Table pre-aggregates another table and stores the summary ​results.  ​Rollup Tables can then be indexed on their own, and can be used to automatically optimize queries. ​ With ActiveCounts,​ it can be quite helpful to pre-aggregate a large database using the commonly used criteria. ​  
 + 
 +For example, a Rollup Table that contained the STATE, GENDER, EDUCATION, EMPLOYMENT and INCOME codes could entirely satisfy the query described above. ​ The original table could contain hundreds of millions of rows, but the Rollup Table may contain only 100,000 rows.  This ensures the performance of these queries. 
 + 
 +Consult the section on [[admin:​features:​rollups:​home|Rollup Tables]] for instruction on designing and setting up Rollup Tables. 
  
-final search that has five criteria. ​ The first search had one criteria; the second had two; the third had three, and so on.  If all criteria was processed each time, there would be a total of fifteen index searches across all of the  
  
  
 
Back to top
admin/indexing/activecounts/optimization.1295725239.txt.gz · Last modified: 2016/06/28 22:38 (external edit)