Administration: Optimizing Queries

Dynamic Caches

Hashed Data Caching

Omnidex has several approaches to choose from when joining tables. Standard Omnidex indexes can be used to join tables, and pre-joined indexes can be used to accelerate joins between parents tabled and child tables. Omnidex can use the underlying database's indexes to perform table joins. Omnidex can use a sort/merge technique to join large tables that have no indexes. Omnidex can also load smaller tables into a temporary, memory-resident, hashed data cache for extremely fast table joins.

The Hashed Data Cache is used when a smaller table will be repeatedly accessed in a table join. A classic example is joining the fact table of a star schema to dimension or snowflake tables. Other examples include any time a table is being joined to small reference or code-lookup tables. In these situations, Omnidex indexes will be favored for resolving criteria; however, the smaller table will frequently be loaded into a Hashed Data Cache.

The Hashed Data Cache temporarily caches these smaller tables into memory and create hash indexes for the join column. Once the cache is loaded, all subsequent access is memory-resident and does not require access to the filesystem. This yields a tremendous gain in performance.

This temporary cache is unique to a connection, and only lasts the duration of the query. This means that the Hashed Data Cache will work fine in environments that use online updates.

Hashed Data Caches in Query Plans

If a Hashed Data Cache is used in a query, it can be seen using Omnidex query plans. The following query plan shows a “Build cache” step that builds the Hashed Data Cache, and then a subsequent “Retrieve” step that retrieves from the Hashed Data Cache.

----------------------------------- SUMMARY -----------------------------------
Select        INDIVIDUALS.NAME,
              GENDERS.DESCRIPTION
  from        INDIVIDUALS
  join        GENDERS on INDIVIDUALS.GENDER = GENDERS.GENDER;

Version:      5.2.01  (Compiled Feb  2 2012 21:29:57)
Warnings:     SEQUENTIAL_SCAN
----------------------------------- DETAILS -----------------------------------
Build cache {1} as (SELECT DESCRIPTION, GENDER FROM GENDERS) on 2;
Retrieve INDIVIDUALS sequentially;
 Retrieve {1} using PK_GENDER = INDIVIDUALS.GENDER;
 Return INDIVIDUALS.NAME, GENDERS.DESCRIPTION;
-------------------------------------------------------------------------------

Adjusting the Cache Size

The Hashed Data Cache size defaults to 32MB of memory per query. This can be adjusted using the HDC_THRESHOLD option of the SELECT statement, which sets the maximum cache size, measured in MB.

Additional Resources

See also:

 
Back to top
admin/optimization/caches/hdc.txt · Last modified: 2016/06/28 22:38 (external edit)