This is an old revision of the document!
Overview | Anatomy of a Query Plan | Reading a Query Plan | Configuring Query Plans | Optimizing Queries
The basic optimization of a query with Omnidex has six main steps. Most of the time, following these steps is enough to insure that the query performs well.
Omnidex has different indexing approaches based on the data model. Be sure to understand the parent-child relationships of tables by following their primary and foreign constraints. For more complex databases, it is often useful to diagram the data model.
For obvious reasons, Omnidex performance is affected by the cardinality of the table, meaning the number of rows in the table. This is especially true when large tables need to be joined together. Omnidex performance is also affected by the cardinality of the columns, meaning the number of unique values in the column. For example, columns with cardinalities of 32 or less are good candidates for Omnidex Bitmap indexes.
The query plan is the key to understanding the performance of the query. A quick look at the warnings and notes in the Summary section provides clues for optimizing the query, and the Details provide a clear understanding of what is actually happening when the query is processed.
This sample query has been run an a database with no Omnidex indexing at all. Moreover, this is a database of raw data files, and so there is no indexing in the underlying database, either.
----------------------------------- SUMMARY ----------------------------------- Select I.GENDER, count(*) from HOUSEHOLDS H join INDIVIDUALS I on H.HOUSEHOLD = I.HOUSEHOLD where ((H.STATE = 'CO' and H.CITY = 'DENVER') or (H.STATE = 'AZ' and H.CITY = 'PHOENIX')) and I.BIRTHDATE > 'January 1, 1980' group by I.GENDER; Version: 5.2.01 (Compiled Feb 2 2012 21:29:57) Warnings: UNOPTIMIZED_CRITERIA, UNOPTIMIZED_AGGREGATION, UNOPTIMIZED_SORT, SEQUENTIAL_SCAN, SEQUENTIAL_TABLE_JOIN Notes: Optimized aggregations are not possible because Omnidex indexes don't exist on all aggregated columns and GROUP BY columns and all links to dimension tables HDC optimization not used because table INDIVIDUALS has zero cardinality. SortMerge optimization not used because table HOUSEHOLDS has zero cardinality. Sequential Table Join on INDIVIDUALS with H.HOUSEHOLD = I.HOUSEHOLD Filter on column STATE will not be optimized because there is not an Omnidex index installed on the column. Filter on column CITY will not be optimized because there is not an Omnidex index installed on the column. Filter on column BIRTHDATE will not be optimized because there is not an Omnidex index installed on the column. ----------------------------------- DETAILS ----------------------------------- Retrieve HOUSEHOLDS H sequentially; Retrieve INDIVIDUALS I sequentially; Filter H.HOUSEHOLD = I.HOUSEHOLD; Filter I.BIRTHDATE > 'January 1, 1980'; Filter H.STATE = 'CO'; Filter H.CITY = 'Denver'; Filter FILTER 0 AND FILTER 1; Filter H.STATE = 'AZ'; Filter H.CITY = 'Phoenix'; Filter FILTER 3 AND FILTER 4; Filter FILTER 2 OR FILTER 5; Pass to queue {1} [I.GENDER]; Sort {1} for GROUP BY [I.GENDER]; Retrieve {1} sequentially; Return I.GENDER, COUNT('*'); -------------------------------------------------------------------------------
An analysis of the warnings and notes will provide many clues for optimizing the query:
The Warnings section displays many warnings that show opportunities for improvement:
After considering all of these warnings and notes, we have determined that we need to add Omnidex indexes to the following columns:
We have also determined that we run the UPDATE STATISTICS command after indexing the database.
After applying the indexing changes and other recommendations, a new query plan should be analyzed. The changes made in each pass will affect the processing of the query, and may result in other warnings or suggestions for query optimization. These steps should be repeated until the best optimization is obtain.
After applying the changes described above, the new query plan shows that the query is well optimized. There are no warnings or notes, and the details show that all processing is done through the Omnidex indexes, rather than the database.
----------------------------------- SUMMARY ----------------------------------- Select I.GENDER, count(*) from HOUSEHOLDS H join INDIVIDUALS I on H.HOUSEHOLD = I.HOUSEHOLD where ((H.STATE = 'CO' and H.CITY = 'DENVER') or (H.STATE = 'AZ' and H.CITY = 'PHOENIX')) and I.BIRTHDATE > 'January 1, 1980' group by I.GENDER; Version: 5.2.01 (Compiled Feb 2 2012 21:29:57) ----------------------------------- DETAILS ----------------------------------- Qualify (HOUSEHOLDS)HOUSEHOLDS where CITY = 'DENVER'; Qualify (HOUSEHOLDS)HOUSEHOLDS where and STATE = 'CO'; Create index segment O1 on 1; Qualify (HOUSEHOLDS)HOUSEHOLDS where CITY = 'PHOENIX'; Qualify (HOUSEHOLDS)HOUSEHOLDS where and STATE = 'AZ'; Qualify (HOUSEHOLDS)HOUSEHOLDS where or $ODXID = 'segment(O1)'; Join HOUSEHOLDS using HOUSEHOLD to (INDIVIDUALS)INDIVIDUALS using HOUSEHOLD; Qualify (INDIVIDUALS)INDIVIDUALS where and BIRTHDATE > '"January 1, 1980"'; Aggregate INDIVIDUALS using GENDER for GROUP(GENDER), COUNT(*); Return I.GENDER, COUNT('*'); -------------------------------------------------------------------------------
A query is considered “fully optimized” when there are no wanted warnings or notes in the query plan. Moreover, the goal is always to process as much of the query in the Omnidex indexes as possible, and to access the underlying database as little as possible. It is not uncommon for extremely complex queries to be processed entirely in the Omnidex indexes, with no access to the underlying database, even when there are many table joins, subqueries, complex criteria, aggregations and ordering.
Sometimes the techniques described above are not enough to achieve the desired query performance. Some queries will perform poorly because of the fundamental structure of the query. This can happen because the query is written in an unusual way, often using techniques that are recommended for an underlying relational database. It is sometimes best to study the query, understand what the query is fundamentally trying to do, and then simplify the query.
Other times, it is beneficial to restructure the query to take advantage of Omnidex indexing techniques, similar to what may have been done the underlying relational database. It is usually preferable to maintain queries in their simplest form, but like all databases, certain approaches work better than others.
If you are not able to achieve the level of performance you desired with a query, contact Technical Support. We can sometimes offer suggestions as part of your support contract, and if needed, we can discuss having an engineer analysis your environment and your query plans to achieve improved performance.
See also: