Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
admin:indexing:strategies:aggregations [2012/01/26 23:29]
doc
admin:indexing:strategies:aggregations [2016/06/28 22:38] (current)
Line 19: Line 19:
 Count aggregations are any SELECT statements that request "​COUNT(*)",​ "​COUNT(column)",​ or "​COUNT(DISTINCT column)",​ regardless of whether there is a GROUP BY clause. ​ These queries are optimized using a collection of Omnidex indexes. ​ Each column that is referenced in the GROUP BY clause and referenced in the COUNT clause must be indexed using a standard Omnidex index. ​ Omnidex will use all of these indexes in combination to optimize the query. Count aggregations are any SELECT statements that request "​COUNT(*)",​ "​COUNT(column)",​ or "​COUNT(DISTINCT column)",​ regardless of whether there is a GROUP BY clause. ​ These queries are optimized using a collection of Omnidex indexes. ​ Each column that is referenced in the GROUP BY clause and referenced in the COUNT clause must be indexed using a standard Omnidex index. ​ Omnidex will use all of these indexes in combination to optimize the query.
  
-The advantage of this approach ​to optimizing queries ​is that a small number of indexes can support nearly any combination of GROUP BY columns, providing ​the highest ​performance at the least cost.+The advantage of this optimization ​approach is that a small number of indexes can support nearly any combination of GROUP BY columns, providing ​high performance at little ​cost.
  
-There are some situations in which specialized ​index is better for optimizing a query. ​ When the GROUP BY column ​have especially high cardinality,​ such as over 64,000 distinct values, it will be necessary to create a multi-column index as described below. ​ Additionally,​ if the same GROUP BY columns are used repeatedly, a multi-column index will perform ​faster.+There are some situations in which specialized ​indexes are better for optimizing a query. ​ When GROUP BY column ​has especially high cardinality,​ such as over 64,000 distinct values, it is necessary to create a multi-column index as described below. ​ Additionally,​ if the same GROUP BY columns are used frequently, a multi-column index will provide even faster ​performance.
  
 ==== Optimizing Other Aggregations ==== ==== Optimizing Other Aggregations ====
Line 30: Line 30:
  
 It is fine for one index to service multiple aggregations. ​ This occurs when one index contains a superset of all of the columns for several queries. ​ For example, an index on columns A, B, C and D would allow sums of D grouped by A, and also averages of D and C grouped by B and A.  The performance of these aggregations will begin to degrade as the index width increases, so indexes should not contain an excessive number of columns. ​ A good rule of thumb is to keep the width of these indexes below 64 bytes, though they can be as large as 240 bytes. It is fine for one index to service multiple aggregations. ​ This occurs when one index contains a superset of all of the columns for several queries. ​ For example, an index on columns A, B, C and D would allow sums of D grouped by A, and also averages of D and C grouped by B and A.  The performance of these aggregations will begin to degrade as the index width increases, so indexes should not contain an excessive number of columns. ​ A good rule of thumb is to keep the width of these indexes below 64 bytes, though they can be as large as 240 bytes.
 +
 +==== Examples ====
  
 == Example 1.  Ungrouped COUNT(*) aggregations == == Example 1.  Ungrouped COUNT(*) aggregations ==
Line 35: Line 37:
 In this example, the COUNT(*) does not require any additional indexes. In this example, the COUNT(*) does not require any additional indexes.
  
-<​code ​sql>+<​code>​
  
   select ​    ​count(*)   select ​    ​count(*)
Line 49: Line 51:
 In this example, the COUNT(*) does not require any additional indexes, but the GROUP BY clause requires an index.  ​ In this example, the COUNT(*) does not require any additional indexes, but the GROUP BY clause requires an index.  ​
  
-<​code ​sql>+<​code>​
  
   select ​    ​GENDER,​ count(*)   select ​    ​GENDER,​ count(*)
Line 64: Line 66:
 In this example, the GROUP BY clause contains columns from multiple parents. In this example, the GROUP BY clause contains columns from multiple parents.
  
-<​code ​sql>+<​code>​
  
   select ​    ​C.DESCRIPTION,​ S.DESCRIPTION,​ H.CITY, count(DISTINCT H.ZIP)   select ​    ​C.DESCRIPTION,​ S.DESCRIPTION,​ H.CITY, count(DISTINCT H.ZIP)
Line 81: Line 83:
 In this example, both SQL statements can be satisfied by one Omnidex index. In this example, both SQL statements can be satisfied by one Omnidex index.
  
-<​code ​sql>+<​code>​
  
   select ​    ​GENDER,​ BIRTHDATE, count(DISTINCT HOUSEHOLD)   select ​    ​GENDER,​ BIRTHDATE, count(DISTINCT HOUSEHOLD)
Line 89: Line 91:
  
  
-<​code ​sql>+<​code>​
  
   select ​    ​HOUSEHOLD,​ GENDER, count(*)   select ​    ​HOUSEHOLD,​ GENDER, count(*)
Line 103: Line 105:
 This sample environment file shows the Omnidex indexes that will optimize these queries. This sample environment file shows the Omnidex indexes that will optimize these queries.
  
-<​code ​sql>+<​code>​
 create environment create environment
  ​in ​                  "​simple.xml"​  ​in ​                  "​simple.xml"​
 
Back to top
admin/indexing/strategies/aggregations.1327620598.txt.gz ยท Last modified: 2016/06/28 22:38 (external edit)