Administration: Indexing Strategies

Advanced Strategies

Criteria

Criteria is the often the easiest aspect of a query to optimize. If all columns that are referenced in the WHERE clause are indexed, Omnidex will use those indexes to optimize the criteria. At the same time, there are advanced strategies that can help with more challenging situations.

Multi-Column Indexes

Most Omnidex indexes are placed on a simple column; however, Omnidex indexes can also consist of multiple columns. This can be beneficial when certain combinations of criteria are repeated for a large percentage of queries. When processing criteria, Omnidex normally searches one index and then intersects it with another index; however, if a multi-column index exists that includes both of those columns, Omnidex can resolve both criteria in one index search. This will save time and can make a noticeable difference across an entire application.

At the same time, this strategy implies redundant indexing. It is typical that the individual columns will be indexed independently so that they can be intersected with other columns as well. This strategy must produce enough benefit that it justifies the added indexing time and added disk space of redundant indexing.

This strategy makes the most sense in the following scenarios:

  • The table is very large, with at least a million rows.
  • The combination of criteria is used in a majority of queries.
  • At least one of the columns has cardinality greater than 32, meaning that it will not be an Omnidex Bitmap.

In these scenarios, it can make sense to create a small number of multi-column indexes to satisfy the most frequent combinations.

Using Expression-based Columns

Standard Omnidex indexes are appropriate for optimizing criteria where a column is being compared to one or more values. For example, standard Omnidex indexes are excellent for optimizing the criteria, “STATE = 'MA' and CITY = 'Boston'”. By default, Omnidex indexes cannot be used to optimize criteria where an expression is being compared to one or more values. For example, standard Omnidex indexes cannot be used to optimize the criteria, “((QUANTITY * UNIT_COST) * TAX_RATE) + COMMISSION) > 100.00”.

In these situations, Omnidex will look for an expression-based column that matches the expression in the SQL statement. If one is found, Omnidex will look to see if that expression-based column has been indexed. If an index is found, Omnidex will optimize the query by using that index. This has two advantages. First, it gains the speed of an indexed search, and it prevents the expression from be recalculated for each row as it is retrieved.

This same technique is appropriate for many types of expressions. For example, a search for area code using criteria of “SUBSTRING(PHONE from 2 for 3) = '303'” can be optimized using a similarly constructed expression-based column. A search for age using criteria of “$COMPARE_DATES(BIRTHDATE, CURRENT_DATE, 'YY') >= 24” can be similarly optimized. This technique creates a wide variety of new oppportunities for optimizing queries using Omnidex indexes.

Using Partitioned Data

In very large databases, it can be beneficial to partition the data into an Omnidex Grid. Each Grid Node is then limited to rows that meet certain criteria. This means that queries that reference the partition column can be restricted to accessing only those nodes that match that criteria. This can have a substantial impact on performance, allowing many queries to be limited to just a node on an Omnidex Grid, rather than the entire database. For example, if an Omnidex Grid is partitioned by the STATE column, then any query with criteria against the STATE column will only need to access the node(s) for those states.

Additional Resources

See also:

 
Back to top
admin/indexing/advanced/criteria.txt · Last modified: 2016/06/28 22:38 (external edit)