Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Next revision
Previous revision
admin:optimization:plans:optimization [2012/02/03 03:17]
doc created
admin:optimization:plans:optimization [2016/06/28 22:38] (current)
Line 19: Line 19:
 ==== Optimizing Queries ==== ==== Optimizing Queries ====
  
 +The basic optimization of a query with Omnidex has six main steps. Most of the time, following these steps is enough to insure that the query performs well.  ​
 +
 +=== Step 1. Understand the data model ===
 +
 +Omnidex has different indexing approaches based on the data model. ​ Be sure to understand the parent-child relationships of tables by following their primary and foreign constraints. ​ For more complex databases, it is often useful to diagram the data model.  ​
 +
 +=== Step 2. Understand the cardinalities ===
 +
 +For obvious reasons, Omnidex performance is affected by the cardinality of the table, meaning the number of rows in the table. ​ This is especially true when large tables need to be joined together. ​ Omnidex performance is also affected by the cardinality of the columns, meaning the number of unique values in the column. For example, columns with cardinalities of 32 or less are good candidates for Omnidex Bitmap indexes.
 +
 +=== Step 3. Generate a query plan for the query ===
 +
 +The query plan is the key to understanding the performance of the query. ​ A quick look at the warnings and notes in the Summary section provides clues for optimizing the query, and the Details section provide an understanding of how the query is processed.
 +
 +This sample query has been run on a database with no Omnidex indexing at all.  Moreover, this database uses raw data files, so there is no native indexing available in the underlying database, either. ​
  
 <​code>​ <​code>​
Line 33: Line 48:
   group by    I.GENDER;   group by    I.GENDER;
  
-Version: ​     5.2.01 ​ (Compiled ​Jan 23 2012 17:27:25)+Version: ​     5.2.01 ​ (Compiled ​Feb  2 2012 21:29:57)
 Warnings: ​    ​UNOPTIMIZED_CRITERIA,​ UNOPTIMIZED_AGGREGATION,​ UNOPTIMIZED_SORT,​ Warnings: ​    ​UNOPTIMIZED_CRITERIA,​ UNOPTIMIZED_AGGREGATION,​ UNOPTIMIZED_SORT,​
               SEQUENTIAL_SCAN,​ SEQUENTIAL_TABLE_JOIN               SEQUENTIAL_SCAN,​ SEQUENTIAL_TABLE_JOIN
-Notes: ​       Optimized aggregations are not possible because ​no ASK index +Notes: ​       Optimized aggregations are not possible because ​Omnidex indexes 
-                ​contains ​all columns and all links to dimension tables +                ​don't exist on all aggregated ​columns and GROUP BY columns and 
-              HDC optimization not used because table INDIVIDUALS has Zero+                ​all links to dimension tables 
 +              HDC optimization not used because table INDIVIDUALS has zero
                 cardinality.                 cardinality.
-              SortMerge optimization not used because table HOUSEHOLDS has Zero+              SortMerge optimization not used because table HOUSEHOLDS has zero
                 cardinality.                 cardinality.
               Sequential Table Join on INDIVIDUALS with H.HOUSEHOLD =               Sequential Table Join on INDIVIDUALS with H.HOUSEHOLD =
                 I.HOUSEHOLD                 I.HOUSEHOLD
               Filter on column STATE will not be optimized because there is not               Filter on column STATE will not be optimized because there is not
-                an MDK index installed on the column.+                an Omnidex ​index installed on the column.
               Filter on column CITY will not be optimized because there is not               Filter on column CITY will not be optimized because there is not
-                an MDK index installed on the column.+                an Omnidex ​index installed on the column.
               Filter on column BIRTHDATE will not be optimized because there is               Filter on column BIRTHDATE will not be optimized because there is
-                not an MDK index installed on the column.+                not an Omnidex ​index installed on the column.
 ----------------------------------- DETAILS ----------------------------------- ----------------------------------- DETAILS -----------------------------------
 Retrieve HOUSEHOLDS H sequentially;​ Retrieve HOUSEHOLDS H sequentially;​
Line 56: Line 72:
  ​Filter I.BIRTHDATE > '​January 1, 1980';  ​Filter I.BIRTHDATE > '​January 1, 1980';
  ​Filter H.STATE = '​CO';​  ​Filter H.STATE = '​CO';​
- ​Filter H.CITY = 'DENVER';+ ​Filter H.CITY = 'Denver';
  ​Filter FILTER 0 AND FILTER 1;  ​Filter FILTER 0 AND FILTER 1;
  ​Filter H.STATE = '​AZ';​  ​Filter H.STATE = '​AZ';​
- ​Filter H.CITY = 'PHOENIX';+ ​Filter H.CITY = 'Phoenix';
  ​Filter FILTER 3 AND FILTER 4;  ​Filter FILTER 3 AND FILTER 4;
  ​Filter FILTER 2 OR FILTER 5;  ​Filter FILTER 2 OR FILTER 5;
Line 67: Line 83:
 Return I.GENDER, COUNT('​*'​);​ Return I.GENDER, COUNT('​*'​);​
 ------------------------------------------------------------------------------- -------------------------------------------------------------------------------
-> 
 </​code>​ </​code>​
  
-A plan consists of two sections: the Summary and the Details The Summary section shows the SQL statement, the version of Omnidex, any optimization settings or warnings that affect the query, and even occasional suggestions for optimizing the query. ​ The Details ​section shows the steps that will be executed to satisfy ​the query.  Steps are executed in the order displayed, and are indented as needed to show loops of instructions.+=== Step 4Review ​the WarningsNotes and Details ​of the query plan ===
  
-In the simple example abovethree columns are retrieved from the INDIVIDUALS ​table.  ​Since no criteria ​was provided in WHERE clause, the details of the plan show that the table is retrieved sequentially ​and the columns are returned.  ​A simple warning indicates ​that a sequential scan is taking placewhich in this case cannot ​be avoided.+An analysis of the warnings and notes will provide many clues for optimizing the query: 
 + 
 +== Warnings == 
 + 
 +The Warnings section displays many warnings that show opportunities for improvement:​  
 + 
 +  * **UNOPTIMIZED_CRITERIA**. ​ At least one piece of criteria is not supported by an Omnidex index. ​ The goal is always to have all criteria processed through the Omnidex indexes. ​ This is especially true for queries that have table joinsaggregations or ordering, since Omnidex indexes cannot be used for these steps unless all criteria has been processed through Omnidex indexes.  
 +  * **UNOPTIMIZED_AGGREGATION**. ​ The aggregations in this query cannot be satisfied using Omnidex indexes. ​ This means that the data must be retrieved from the database and sorted. ​ It is preferable to have the aggregation processed through Omnidex indexes. 
 +  * **UNOPTIMIZED_SORT**. ​ Some aspect of this query required sorting at run time.  This is not always an optimization issue, as sometimes it is required to sort intermediate results that were derived from Omnidex indexes. ​ If sorting can be eliminated through the use of Omnidex indexes, it is an easy way to improve the query performance. 
 +  * **SEQUENTIAL_SCAN**. ​ At least one table is being sequentially scanned.  ​This is not always an optimization issue, especially if more than 10% of the table is to be retrieved. ​ Often it is an indication that Omnidex indexes were not present, either for processing ​criteria ​or table joins. 
 +  * **SEQUENTIAL_TABLE_JOIN**. ​ At least one table join required ​sequential table joinwhere the second table is sequentially scanned for each and every row in the first table. ​ This is the most inefficient approach to table joins and should always be avoided when possible. ​ It is usually an indication ​that an UPDATE STATISTICS command has not be run to register ​the table cardinalities,​ and that indexes are not present on the join columns.  
 + 
 +== Notes == 
 + 
 +The Notes section gives more specific recommendations for optimizing this particular query: 
 + 
 +  * **Optimized aggregations are not possible because Omnidex indexes don't exist on all aggregated columns and GROUP BY columns and all links to dimension tables**. This note describes how this query is performing aggregations,​ but they are not supported by Omnidex indexes. ​ The technique for optimizing most aggregations is to add Omnidex indexes to all columns that are being aggregated, or that are GROUP BY columns, or that are foreign keys to dimension tables containing GROUP BY columns. ​ In this query, the GENDER column must be indexed since it is a GROUP BY column. ​ No other indexes would be necessary since columns are not being aggregated, ​and the GROUP BY columns are not in dimension tables within a Star Schema. 
 +  * **HDC optimization not used because table INDIVIDUALS has zero cardinality**. ​ This note describes how Omnidex is trying to find a way to optimize the table join between HOUSEHOLDS and INDIVIDUALS. ​ Omnidex is evaluating whether it can use a Hashed Data Cache (HDC) to optimize the join, but cannot because the table cardinalities are not known. ​ This is an indication ​that an UPDATE STATISTICS has not been performed. 
 +  * **SortMerge optimization not used because table HOUSEHOLDS has zero cardinality**. This note also describes how Omnidex is trying to find way to optimize the table join between HOUSEHOLDS and INDIVIDUALS. ​ Omnidex ​is evaluating whether it can use a Sort/Merge to optimize the joinbut cannot because the table cardinalities are not known. ​ This is an indication that an UPDATE STATISTICS has not been performed. 
 +  * **Sequential Table Join on INDIVIDUALS with H.HOUSEHOLD = I.HOUSEHOLD**. ​ This note describes that Omnidex has found no other approach to processing this table join other than a Sequential Table Join, where the second table is sequentially scanned for each and every row of the first table. ​ This is the worst performing method of processing a table join.  In this case, Omnidex indexes should ​be added to the join columns, HOUSEHOLDS.HOUSEHOLD and INDIVIDUALS.HOUSEHOLD. 
 +  * **Filter on column STATE will not be optimized because there is not an Omnidex index installed on the column**. ​ This note describes how there is not an Omnidex index on the STATE column, and so Omnidex must process this filter by retrieving each row and comparing it at run time.  It is important to process all criteria with Omnidex indexes, both for the speed of processing the criteria and also for the speed of later steps, such as table joins, aggregations and ordering. ​ In this case, an Omnidex index should be added on the STATE column. 
 +  * **Filter on column CITY will not be optimized because there is not an Omnidex index installed on the column**. ​ Similar to the note above, an Omnidex index should be added on the CITY column. ​ Depending on whether the criteria is textual or not, the administrator should choose between an Omnidex index and a QuickText index. 
 +  * **Filter on column BIRTHDATE will not be optimized because there is not an Omnidex index installed on the column**. ​ Similar to the note above, an Omnidex index should be added on the BIRTHDATE column. ​  
 + 
 + 
 +=== Step 5. Apply any indexing changes or other recommendations === 
 + 
 +After considering all of these warnings and notes, we have determined that we should add Omnidex indexes to the following columns:
    
 +  * HOUSEHOLDS.HOUSEHOLD
 +  * HOUSEHOLDS.STATE
 +  * HOUSEHOLDS.CITY
 +  * INDIVIDUALS.HOUSEHOLD
 +  * INDIVIDUALS.GENDER  ​
 +
 +We have also determined that we should run the UPDATE STATISTICS command after indexing the database.  ​
 +
 +=== Step 6. Produce a new query plan and re-analyze the query ===
 +
 +After applying the indexing changes and other recommendations,​ a new query plan should be analyzed. ​ The changes made in each pass will affect the processing of the query, and may result in other warnings or suggestions for query optimization. ​ These steps should be repeated until the best optimization is obtained.
 +
 +After applying the changes described above, the new query plan shows that the query is well optimized. There are no warnings or notes, and the details show that all processing is done through the Omnidex indexes, rather than the database.
 +
 +<​code>​
 +----------------------------------- SUMMARY -----------------------------------
 +Select ​       I.GENDER,
 +              count(*)
 +  from        HOUSEHOLDS H
 +  join        INDIVIDUALS I on H.HOUSEHOLD = I.HOUSEHOLD
 +  where       ​((H.STATE = '​CO'​ and
 +                H.CITY = '​DENVER'​) or
 +               ​(H.STATE = '​AZ'​ and
 +                H.CITY = '​PHOENIX'​)) and
 +              I.BIRTHDATE > '​January 1, 1980'
 +  group by    I.GENDER;
 +
 +Version: ​     5.2.01 ​ (Compiled Feb  2 2012 21:29:57)
 +----------------------------------- DETAILS -----------------------------------
 +Qualify (HOUSEHOLDS)HOUSEHOLDS where CITY = '​DENVER';​
 +Qualify (HOUSEHOLDS)HOUSEHOLDS where and STATE = '​CO';​
 +Create index segment O1 on 1;
 +Qualify (HOUSEHOLDS)HOUSEHOLDS where CITY = '​PHOENIX';​
 +Qualify (HOUSEHOLDS)HOUSEHOLDS where and STATE = '​AZ';​
 +Qualify (HOUSEHOLDS)HOUSEHOLDS where or $ODXID = '​segment(O1)';​
 +Join HOUSEHOLDS using HOUSEHOLD to (INDIVIDUALS)INDIVIDUALS using HOUSEHOLD;
 +Qualify (INDIVIDUALS)INDIVIDUALS where and BIRTHDATE > '"​January 1, 1980"';​
 +Aggregate INDIVIDUALS using GENDER for GROUP(GENDER),​ COUNT(*);
 +Return I.GENDER, COUNT('​*'​);​
 +-------------------------------------------------------------------------------
 +</​code>​
 +
 +==== Further Optimization ===
 +
 +A query is considered "fully optimized"​ when there are no unwanted warnings or notes in the query plan.  The goal with Omnidex optimization is to process as much of the query in the Omnidex indexes as possible, and to access the underlying database as little as possible. ​ It is even possible for extremely complex queries to be processed entirely in the Omnidex indexes, with no access to the underlying database, even when there are many table joins, subqueries, complex criteria, aggregations and ordering.  ​
 +
 +Sometimes the techniques described above are not enough to achieve the desired query performance. ​ Some queries will perform poorly because of the fundamental structure of the query. ​ This can happen because the query is written in an unusual way, often using techniques that are recommended for an underlying relational database. ​ It is sometimes best to study the query, understand what the query is fundamentally trying to do, and then simplify the query.  ​
 +
 +While simplicity is usually the best approach for Omnidex optimization,​ there are times when it is beneficial to restructure the query to take advantage of Omnidex indexing techniques. ​ Ironically, this is similar to what may have been done for an underlying relational database, but this time we are favoring the optimization techniques of Omnidex.  ​
 +
 +If you are not able to achieve the level of performance you desired with a query, contact [[appendix:​contactus|Technical Support]]. ​ We can sometimes offer suggestions as part of your support contract, and if needed, we can discuss having an engineer analyze your environment and your query plans to achieve improved performance. ​
 +
 +
  
 =====  ===== =====  =====
 ---- ----
 \\ \\
-**[[admin:​optimization:​reference|Next]]**+**[[admin:​optimization:​plans:​configuration|Prev]]**
  
  
 ====== Additional Resources ====== ====== Additional Resources ======
 See also: See also:
-{{page>:​admin:​indexing:​see_also&​nofooter&​noeditbtn}}+{{page>:​admin:​optimization:​see_also&​nofooter&​noeditbtn}}
  
 {{page>:​bottom_add&​nofooter&​noeditbtn}} {{page>:​bottom_add&​nofooter&​noeditbtn}}
 
Back to top
admin/optimization/plans/optimization.1328239024.txt.gz · Last modified: 2016/06/28 22:38 (external edit)