DRAFT

Omnidex Indexing

OverviewIndexesStrategiesAdvancedInstallationMaintenance

Basic Indexing Strategies

The basic strategy for creating Omnidex indexes is to index all of the columns that are used in criteria, table joins, aggregations and ordering. This is a different approach than is taken with relational databases. Database administrators are usually trained to identify selected columns that are most frequently used and create indexes only on those tables. Their general expectation is that the relational database will choose an index to access a table and will process the rest of the statement by directly evaluating the data.

Omnidex approaches indexing differently. Omnidex indexes all of the columns and then coordinates searches across all of the indexes to fulfill the different aspects of a query. Some indexes will be use to satisfy table joins or criteria. Other indexes will be used to fulfill aggregations or ordering. Most database administrators are surprised to learn that complex SQL statements which join many tables and contain intricate criteria can often be fulfilled without ever accessing the underlying data. This approach allows Omnidex to process queries very quickly. It also reduces the load on the servers since accessing the data is a common cause of performance problems.

Most indexing strategies are developed by analyzing the queries. Queries usually follow patterns and these patterns give clues to the best indexing approach. Analyze the queries to determine which columns are criteria in the WHERE clause, which tables are joined together in the FROM clause, which columns are aggregated or used in GROUP BY clauses, and which columns are used in the ORDER BY clause. Use the discussions below to determine indexes for these columns and then run sample queries to assess their performance. Query plans will show how the indexes were used, and if the query is not fully optimized, revisit the indexing strategy as needed. You can also read about Advanced Indexing Strategies for special optimization techniques.

Optimizing Criteria

Criteria is usually optimized by creating indexes on each column. This is true regardless of the use of Boolean operators or parentheses. As you consider the query patterns, look for opportunities to use QuickText indexes or indexing options.

Example 1. Basic Criteria

In the following statement, look for the columns used as criteria in the WHERE clause

select  NAME, ADDRESS1, ADDRESS2, PHONE
  from  INDIVIDUALS
  where ((STATE = 'CO' and CITY = 'Boulder') or 
         (STATE = 'IL' and CITY = 'Chicago'))

The STATE and CITY columns should be Omnidex indexes.

Example 2. Textual Criteria

In this example, you might want to use QuickText indexes.

select  NAME, ADDRESS1, ADDRESS2, PHONE
  from  INDIVIDUALS
  where STATE = 'CO' and NAME = 'John'

The STATE column should be an Omnidex index and the NAME column should be a Quicktext index.

Example 3. Case Sensitivity

In this same example, we might want to allow case insensitivity.

select  NAME, ADDRESS1, ADDRESS2, PHONE
  from  INDIVIDUALS
  where STATE = 'co' and NAME = 'john'

The STATE column should be an Omnidex index with the Case Insensitive option, and the NAME column should be a QuickText index. QuickText indexes are automatically case insensitive.

top of page

Optimizing Table Joins

Tables joins fall into two main categories. Some joins provide access to tables that are involved in criteria. Other joins provide access to tables that are only used for select items. Most of the time, Omnidex is only concerned about the tables that are involved in criteria. Table relationships can get complicated, so this rule doesn't always hold true.

Table joins are usually optimized by indexing the foreign keys in the child tables. Most table joins follow either a declared or implied constraint, meaning that they follow the columns that form a parent-child relationship between tables. In most situations, Omnidex will want try to start in the parent and join to the child. This means that the foreign key of that child must be indexed.

Example 1. Joins that do not involve criteria

In this example, the parent table is only used for select items and is not used for criteria.

select  INDIVIDUALS.NAME, HOUSEHOLDS.ADDRESS1, HOUSEHOLDS.ADDRESS2, INDIVIDUALS.PHONE
  from  INDIVIDUALS join HOUSEHOLDS on INDIVIDUALS.HOUSEHOLD = HOUSEHOLDS.HOUSEHOLD
 where  INDIVIDUALS.NAME = 'John'

The parent table HOUSEHOLDS is only used to obtain select items and is not used in criteria. For relational databases, the NAME column should be indexed to resolve the criteria, but no indexes are needed to optimize the join. Omnidex will identify the INDIVIDUALS rows based on the NAME index, and then Omnidex will ask the relational database to join to the HOUSEHOLDS table.

For raw data files, there is no underlying database. Omnidex must provide all of the indexing. In this situation, the primary and foreign constraints must be indexed with Omnidex, causing indexes on HOUSEHOLDS.HOUSEHOLD and INDIVIDUALS.HOUSEHOLD.

Example 2. Joins that do involve criteria

In this example, the parent table is used for criteria as well as select items.

select  INDIVIDUALS.NAME, HOUSEHOLDS.ADDRESS1, HOUSEHOLDS.ADDRESS2, INDIVIDUALS.PHONE
  from  INDIVIDUALS join HOUSEHOLDS on INDIVIDUALS.HOUSEHOLD = HOUSEHOLDS.HOUSEHOLD
 where  HOUSEHOLDS.STATE = 'CO' and INDIVIDUALS.NAME = 'John' 

The parent table HOUSEHOLDS is used to obtain select items and is also used in criteria. For relational databases, the INDIVIDUALS.HOUSEHOLD column must be indexed since it is the foreign key. The NAME column should also be indexed to resolve the criteria. Omnidex will first qualify the HOUSEHOLDS based on the HOUSEHOLDS.STATE index, then join from HOUSEHOLDS to INDIVIDUALS using the INDIVIDUALS.HOUSEHOLD index, and further qualify the INDIVIDUALS based on the INDIVIDUALS.NAME index.

For raw data files, all primary and foreign constraint columns should be indexed since there is no underlying database.

top of page

Optimizing Aggregations

top of page

Optimizing Ordering

top of page

Other Uses for Indexes

top of page




The next section on Omnidex Indexing is Basic Indexing Strategies.