Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
admin:indexing:activecounts:optimization [2011/01/22 19:18]
els
admin:indexing:activecounts:optimization [2016/06/28 22:38] (current)
Line 1: Line 1:
 +~~NOTOC~~
 +
 {{page>:​top_add&​nofooter&​noeditbtn}} {{page>:​top_add&​nofooter&​noeditbtn}}
  
Line 15: Line 17:
 ==== The Optimization Behind ActiveCounts ==== ==== The Optimization Behind ActiveCounts ====
  
-The SQL statements that allow ActiveCounts are fairly ordinary. ​ They are primarily a succession of grouped aggregations,​ with an additional piece of criteria added at each step.  The key to ActiveCounts lies in the optimization of these queries. ​ Omnidex has special optimizations for this particular kind of queries.+The SQL statements that allow ActiveCounts are fairly ordinary. ​ They are primarily a succession of grouped aggregations,​ with an additional piece of criteria added at each step.  The key to ActiveCounts lies in the optimization of these queries. ​ Omnidex has special optimizations for this particular kind of query.
  
-=== Omnidex ​- The Count Engine ===+Omnidex ​is specifically designed as a count engine. ​ By allowing many indexes to be used together on the same search, and by maintaining a count at each step of the search, it is ideally suited to providing fast counts. ​ As long as Omnidex is set up so that all criteria and aggregations can be satisfied within the indexes, Omnidex will be able to provide fast counts along the way.
  
-The first optimization is Omnidex ​is specifically designed as a count engine. ​ By allowing many indexes ​to be used together on the same search, and by maintaining a count at each step of the search, it is ideally suited to providing fast counts. ​ As long as all of the criteria and the aggregations can be satisfied using Omnidex indexing, Omnidex will be able to provide fast counts along the way.+Omnidex ​has two additional optimizations that are essential ​to ActiveCounts:​
  
 === Cached Qualifications === === Cached Qualifications ===
  
-As Omnidex processes ​SQL statements ​for a connection, ​it watches for queries that are very similar to the previous query.  If the current query only differs from the previous query by the addition of an additional ​piece of criteria, it will recognize that it does not have to repeat ​the original criteria. ​ Omnidex will simply append ​the additional criteria to the existing search, greatly speeding the search.  ​+When a succession of SQL statements ​is processed on the same connection, ​Omnidex ​watches for similarities between ​the queries.  If the current query is the same as the previous query, except for the addition of another ​piece of criteria, it will use the index pointers from the previous query as a starting point for the current query.  ​This can save a great deal of resources.
  
-For example, if the original SQL statement presents genders and their counts, ​filtered by state codes ... +For example, if the original SQL statement presents genders and their counts, ​using state codes as criteria ​... 
  
 +
 +<​code>​
   select ​       GENDER, count(*)   select ​       GENDER, count(*)
     from        LIST     from        LIST
Line 32: Line 36:
     group by    GENDER;     group by    GENDER;
  
-... and then the subsequent SQL statement presents educations and their counts, ​filtered by state codes and genders ​...+</​code>​ 
 + 
 +... and then the subsequent SQL statement presents educations and their counts, ​using state and gender codes as criteria ​... 
 + 
 +<​code>​
  
   select ​       EDUCATION, count(*)   select ​       EDUCATION, count(*)
Line 39: Line 47:
                 GENDER = '​F'​                 GENDER = '​F'​
     group by    EDUCATION;     group by    EDUCATION;
 +</​code>​
  
-... Omnidex will recognize that the state codes were processed in the previous search. ​ Rather than repeating this search, it will start with the existing search and simply add the criteria for gender. ​ This can save a great deal of resource. 
  
-Consider an user who steps through five pieces of criteria. ​ The first search has one criteria. ​ The second has two ... the original criteria plus the new one.  The third search has three criteria ... the original two criteria plus the new one.  ​And so forth. ​ By the time the user submits the SQL statement with the fifth piece of criteria, fifteen index searches will have been performed +... Omnidex will recognize that the second query is similar to the first query, except for the addition of new criteria ​against GENDERIt will recognize that the index pointers from the first search, which represent ​the Western United States, can be reused.  ​The current query simply requires ​the addition of the new criteria ​against GENDER. This saves a great deal of resources.
  
-Since Omnidex can recognize this type of query sequence and reuse previous qualificationit will only process five pieces ​of criteria.  ​Each new criteria is simply appended to the previous search.  ​This greatly reduced the time for the query, as well as reducing the load on the  system resources.+Consider a succession ​of statementseach of which adds one more piece of criteria.  ​The number of index searches will be dramatically reduced using this optimization. 
 + 
 +^ Criteria ​                                               ^  Index Searches\\ ​  ​Without Optimization ​ ^  Index Searches\\ ​  With Optimization ​      ^ 
 +| STATE                                                   ​| ​ 1                                        |  1                                          | 
 +| STATE\\ ​ GENDER ​                                        ​| ​ 1\\  1                                   ​| ​ cached\\ ​ 1                                | 
 +| STATE\\ ​ GENDER\\ ​ EDUCATION ​                           |  1\\  1\\  1                              |  cached\\ ​ cached\\ ​ 1                      | 
 +| STATE\\ ​ GENDER\\ ​ EDUCATION\\ ​ EMPLOYMENT ​             |  1\\  1\\  1\\  1                         ​| ​ cached\\ ​ cached\\ ​ cached\\ ​ 1            | 
 +| STATE\\ ​ GENDER\\ ​ EDUCATION\\ ​ EMPLOYMENT\\ ​ INCOME ​   |  1\\  1\\  1\\  1\\  1                    |  cached\\ ​ cached\\ ​ cached\\ ​ cached\\ ​ 1  | 
 +^ Total                                                   ​^ ​ 15                                       ​^ ​ 5                                          ^ 
  
 This optimization is automatic. ​ Application developers can simply submit queries in sequence, and Omnidex will recognize when it can use this optimization. ​ No special syntax is required from the developer. This optimization is automatic. ​ Application developers can simply submit queries in sequence, and Omnidex will recognize when it can use this optimization. ​ No special syntax is required from the developer.
  
-This optimization ​relies ​on the succession of queries occurring on the same connection.  Sometimes connections are unexpectedly closed.  If this happens, Omnidex will automatically ​reprocess all of the criteria ​so that the correct results ​are returned.  ​+This optimization ​does rely on the succession of queries occurring on the same connection. ​ If a connection unexpectedly closes, Omnidex will simply ​reprocess all of the criteria, delivering ​the correct ​answer without any intervention by the application.  
 + 
 +=== Rollup Tables === 
 + 
 +ActiveCounts on large databases will also benefit from the use of Rollup Tables. ​ A Rollup Table pre-aggregates another table and stores the summary ​results.  ​Rollup Tables can then be indexed on their own, and can be used to automatically optimize queries. ​ With ActiveCounts,​ it can be quite helpful to pre-aggregate a large database using the commonly used criteria. ​  
 + 
 +For example, a Rollup Table that contained the STATE, GENDER, EDUCATION, EMPLOYMENT and INCOME codes could entirely satisfy the query described above. ​ The original table could contain hundreds of millions of rows, but the Rollup Table may contain only 100,000 rows.  This ensures the performance of these queries. 
 + 
 +Consult the section on [[admin:​features:​rollups:​home|Rollup Tables]] for instruction on designing and setting up Rollup Tables. 
  
-final search that has five criteria. ​ The first search had one criteria; the second had two; the third had three, and so on.  If all criteria was processed each time, there would be a total of fifteen index searches across all of the  
  
  
 
Back to top
admin/indexing/activecounts/optimization.1295723937.txt.gz · Last modified: 2016/06/28 22:38 (external edit)