Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
admin:optimization:caches:hdc [2012/02/24 18:27]
doc
admin:optimization:caches:hdc [2016/06/28 22:38] (current)
Line 17: Line 17:
 ==== Hashed Data Caching ==== ==== Hashed Data Caching ====
  
-Omnidex has several approaches to choose from when joining tables. ​ Standard Omnidex indexes can be used to join tables, and pre-joined indexes can be used to accelerate joins between parents tabled and child tables. ​ Omnidex can use the underlying database'​s indexes to perform table joins. ​ Omnidex can use a sort/merge technique to join especially ​tables ​without ​indexes. ​ Omnidex can also load smaller tables into a temporary, memory-resident,​ hashed data cache for extremely fast table joins.+Omnidex has several approaches to choose from when joining tables. ​ Standard Omnidex indexes can be used to join tables, and pre-joined indexes can be used to accelerate joins between parents tabled and child tables. ​ Omnidex can use the underlying database'​s indexes to perform table joins. ​ Omnidex can use a sort/merge technique to join large tables ​that have no indexes. ​ Omnidex can also load smaller tables into a temporary, memory-resident,​ hashed data cache for extremely fast table joins.
  
-The Hashed Data Cache is used when a smaller table will be repeatedly accessed ​many times.  ​The classic example ​of this joining the fact table of a star schema to dimension or snowflake tables. ​ Other examples include any time a table is being joined to small reference tables. ​ In these situations, Omnidex indexes will be favored for resolving criteria; however, ​if the smaller table must be accessed to return data in the result set, then the Hashed Data Cache may be used.+The Hashed Data Cache is used when a smaller table will be repeatedly accessed ​in a table join.  ​classic example ​is joining the fact table of a star schema to dimension or snowflake tables. ​ Other examples include any time a table is being joined to small reference ​or code-lookup ​tables. ​ In these situations, Omnidex indexes will be favored for resolving criteria; however, the smaller table will frequently ​be loaded into a Hashed Data Cache.
  
-A classic example occurs when joining an INDIVIDUALS table to several reference tables, one for GENDERS, one for MARITAL_STATUSES,​ and one for EDUCATION. ​ Each of these reference tables has less then 20 rows, meaning that the same rows will be repeated accessed throughout the scan of the INDIVIDUALS table. ​ Assuming two GENDERS --- Male and Female --- a scan of one million INDIVIDUALS would result in each GENDERS row being accessed approximately a half-million times. ​  +The Hashed Data Cache temporarily ​caches ​these smaller tables into memory and create hash indexes for the join column.  ​Once the cache is loaded, all subsequent ​access ​is memory-resident ​and does not require access to the filesystem.  This yields a tremendous gain in performance.
- +
-The Hashed Data Cache will temporarily ​cache these smaller tables into memory and create hash indexes for the join column.  ​This means that the underlying data will only be accessed onceand all other access ​will be memory-resident. ​ This yields a tremendous gain in performance. +
- +
-** VERIFY **+
  
 This temporary cache is unique to a connection, and only lasts the duration of the query. ​ This means that the Hashed Data Cache will work fine in environments that use online updates.  ​ This temporary cache is unique to a connection, and only lasts the duration of the query. ​ This means that the Hashed Data Cache will work fine in environments that use online updates.  ​
Line 53: Line 49:
 === Adjusting the Cache Size === === Adjusting the Cache Size ===
  
-The Hashed Data Cache is limited ​to 32MB of memory per table by default.  This can be adjusted using the HDC_THRESHOLD option of the SELECT statement, which sets the maximum cache size, measured in MB.+The Hashed Data Cache size defaults ​to 32MB of memory per query.  This can be adjusted using the HDC_THRESHOLD option of the SELECT statement, which sets the maximum cache size, measured in MB.
  
  
 
Back to top
admin/optimization/caches/hdc.1330108059.txt.gz · Last modified: 2016/06/28 22:38 (external edit)