Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
admin:optimization:caches:hdc [2012/02/07 00:15]
doc
admin:optimization:caches:hdc [2016/06/28 22:38] (current)
Line 9: Line 9:
  
 [[admin:​optimization:​caches:​home|Overview]] | [[admin:​optimization:​caches:​home|Overview]] |
-**[[admin:​optimization:​caches:​hdc|Data Caching]]** |+**[[admin:​optimization:​caches:​hdc|Hashed ​Data Caching]]** |
 [[admin:​optimization:​caches:​cachequal|Qualification Caching]] | [[admin:​optimization:​caches:​cachequal|Qualification Caching]] |
-[[admin:​optimization:​caches:​files|File Caches]]+[[admin:​optimization:​caches:​filesystem|Filesystem Caching]]
  
 ---- ----
  
-==== Data Caching ====+==== Hashed ​Data Caching ====
  
-Omnidex has several approaches to choose from when joining tables. ​ Standard Omnidex indexes can be used to join tables, and pre-joined indexes can be used to accelerate joins between parents tabled and child tables. ​ Omnidex can use the underlying database'​s indexes to perform table joins. ​ Omnidex can use a sort/merge technique to join especially ​tables ​without ​indexes. ​ Omnidex can also load smaller tables into a temporary, memory-resident,​ hashed data cache for extremely fast table joins.+Omnidex has several approaches to choose from when joining tables. ​ Standard Omnidex indexes can be used to join tables, and pre-joined indexes can be used to accelerate joins between parents tabled and child tables. ​ Omnidex can use the underlying database'​s indexes to perform table joins. ​ Omnidex can use a sort/merge technique to join large tables ​that have no indexes. ​ Omnidex can also load smaller tables into a temporary, memory-resident,​ hashed data cache for extremely fast table joins.
  
-The Hashed Data Cache is used when a smaller table will be repeatedly accessed ​many times.  ​The classic example ​of this joining the fact table of a star schema to dimension or snowflake tables. ​ Other examples include any time a table is being joined to small reference tables. ​ In these situations, Omnidex indexes will be favored for resolving criteria; however, ​if the smaller table must be accessed to return data in the result set, then the Hashed Data Cache may be used.+The Hashed Data Cache is used when a smaller table will be repeatedly accessed ​in a table join.  ​classic example ​is joining the fact table of a star schema to dimension or snowflake tables. ​ Other examples include any time a table is being joined to small reference ​or code-lookup ​tables. ​ In these situations, Omnidex indexes will be favored for resolving criteria; however, the smaller table will frequently ​be loaded into a Hashed Data Cache.
  
-A classic example occurs when joining an INDIVIDUALS table to several reference tables, one for GENDERS, one for MARITAL_STATUSES,​ and one for EDUCATION. ​ Each of these reference tables has less then 20 rows, meaning that the same rows will be repeated accessed throughout the scan of the INDIVIDUALS table. ​ Assuming two GENDERS --- Male and Female --- a scan of one million INDIVIDUALS would result in each GENDERS row being accessed approximately a half-million times. ​  +The Hashed Data Cache temporarily ​caches ​these smaller tables into memory and create hash indexes for the join column.  ​Once the cache is loaded, all subsequent ​access ​is memory-resident ​and does not require access to the filesystem.  This yields a tremendous gain in performance.
- +
-The Hashed Data Cache will temporarily ​cache these smaller tables into memory and create hash indexes for the join column.  ​This means that the underlying data will only be accessed onceand all other access ​will be memory-resident. ​ This yields a tremendous gain in performance. +
- +
-** VERIFY **+
  
 This temporary cache is unique to a connection, and only lasts the duration of the query. ​ This means that the Hashed Data Cache will work fine in environments that use online updates.  ​ This temporary cache is unique to a connection, and only lasts the duration of the query. ​ This means that the Hashed Data Cache will work fine in environments that use online updates.  ​
Line 53: Line 49:
 === Adjusting the Cache Size === === Adjusting the Cache Size ===
  
-The Hashed Data Cache is limited ​to 32MB of memory per table by default.  This can be adjusted using the HDC_THRESHOLD option of the SELECT statement, which sets the maximum cache size, measured in MB.+The Hashed Data Cache size defaults ​to 32MB of memory per query.  This can be adjusted using the HDC_THRESHOLD option of the SELECT statement, which sets the maximum cache size, measured in MB.
  
  
Line 66: Line 62:
 ====== Additional Resources ====== ====== Additional Resources ======
 See also: See also:
-{{page>:​admin:​indexing:​see_also&​nofooter&​noeditbtn}}+{{page>:​admin:​optimization:​see_also&​nofooter&​noeditbtn}}
  
 {{page>:​bottom_add&​nofooter&​noeditbtn}} {{page>:​bottom_add&​nofooter&​noeditbtn}}
 
Back to top
admin/optimization/caches/hdc.1328573735.txt.gz · Last modified: 2016/06/28 22:38 (external edit)