Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Next revision
Previous revision
admin:optimization:caches:hdc [2012/02/06 22:03]
doc created
admin:optimization:caches:hdc [2016/06/28 22:38] (current)
Line 8: Line 8:
  
  
-**[[admin:​optimization:​caches:​home|Overview]]** +[[admin:​optimization:​caches:​home|Overview]] | 
-[[admin:​optimization:​caches:​hdc|Data ​Caches]] | +**[[admin:​optimization:​caches:​hdc|Hashed ​Data Caching]]** 
-[[admin:​optimization:​caches:​qualifications|Qualification ​Caches]] | +[[admin:​optimization:​caches:​cachequal|Qualification ​Caching]] | 
-[[admin:​optimization:​caches:​files|File Caches]]+[[admin:​optimization:​caches:​filesystem|Filesystem Caching]]
  
 ---- ----
  
-==== Data Caches ​====+==== Hashed ​Data Caching ​====
  
-Omnidex has several approaches to choose from when joining tables. ​ Standard Omnidex indexes can be used to join tables, and pre-joined indexes can be used to accelerate joins between parents tabled and child tables. ​ Omnidex can use the underlying database'​s indexes to perform table joins. ​ Omnidex can use a sort/merge technique to join especially ​tables ​without ​indexes. ​ Omnidex can also load smaller tables into a temporary, memory-resident,​ hashed data cache for extremely fast table joins.+Omnidex has several approaches to choose from when joining tables. ​ Standard Omnidex indexes can be used to join tables, and pre-joined indexes can be used to accelerate joins between parents tabled and child tables. ​ Omnidex can use the underlying database'​s indexes to perform table joins. ​ Omnidex can use a sort/merge technique to join large tables ​that have no indexes. ​ Omnidex can also load smaller tables into a temporary, memory-resident,​ hashed data cache for extremely fast table joins.
  
-The Hashed Data Cache is used when a smaller table will be repeatedly accessed ​many times.  ​The classic example ​of this joining the fact table of a star schema to dimension or snowflake tables. ​ Other examples include any time a table is being joined to small reference tables. ​ In these situations, Omnidex indexes will be favored for resolving criteria; however, ​if the smaller table must be accessed to return data in the result set, then the Hashed Data Cache may be used.+The Hashed Data Cache is used when a smaller table will be repeatedly accessed ​in a table join.  ​classic example ​is joining the fact table of a star schema to dimension or snowflake tables. ​ Other examples include any time a table is being joined to small reference ​or code-lookup ​tables. ​ In these situations, Omnidex indexes will be favored for resolving criteria; however, the smaller table will frequently ​be loaded into a Hashed Data Cache.
  
-A classic example occurs when joining an INDIVIDUALS table to several reference tables, one for GENDERS, one for MARITAL_STATUSES,​ and one for EDUCATION. ​ Each of these reference tables has less then 20 rows, meaning that the same rows will be repeated accessed throughout the scan of the INDIVIDUALS table. ​ Assuming two GENDERS --- Male and Female --- a scan of one million INDIVIDUALS would result in each GENDERS row being accessed approximately a half-million times. ​  +The Hashed Data Cache temporarily ​caches ​these smaller tables into memory and create hash indexes for the join column.  ​Once the cache is loaded, all subsequent ​access ​is memory-resident ​and does not require access to the filesystem.  This yields a tremendous gain in performance.
- +
-The Hashed Data Cache will temporarily ​cache these smaller tables into memory and create hash indexes for the join column.  ​This means that the underlying data will only be accessed onceand all other access ​will be memory-resident. ​ This yields a tremendous gain in performance.+
  
 This temporary cache is unique to a connection, and only lasts the duration of the query. ​ This means that the Hashed Data Cache will work fine in environments that use online updates.  ​ This temporary cache is unique to a connection, and only lasts the duration of the query. ​ This means that the Hashed Data Cache will work fine in environments that use online updates.  ​
  
-The Hashed Data Cache is limited to 32MB of memory per table by default. ​ This can be adjusted using the  
  
 +=== Hashed Data Caches in Query Plans ===
  
 +If a Hashed Data Cache is used in a query, it can be seen using [[admin:​optimization:​plans:​home|Omnidex query plans]]. ​ The following query plan shows a "Build cache" step that builds the Hashed Data Cache, and then a subsequent "​Retrieve"​ step that retrieves from the Hashed Data Cache.
  
 +<​code>​
 +----------------------------------- SUMMARY -----------------------------------
 +Select ​       INDIVIDUALS.NAME,​
 +              GENDERS.DESCRIPTION
 +  from        INDIVIDUALS
 +  join        GENDERS on INDIVIDUALS.GENDER = GENDERS.GENDER;​
  
-Caching is a common method of optimizing queries in Omnidex, as with other relational databases.  ​Omnidex works with caching in three areas:+Version: ​     5.2.01  (Compiled Feb  2 2012 21:29:57) 
 +Warnings: ​    ​SEQUENTIAL_SCAN 
 +----------------------------------- DETAILS ----------------------------------- 
 +Build cache {1} as (SELECT DESCRIPTION,​ GENDER FROM GENDERS) on 2; 
 +Retrieve INDIVIDUALS sequentially;​ 
 + ​Retrieve {1} using PK_GENDER = INDIVIDUALS.GENDER;​ 
 + ​Return INDIVIDUALS.NAME,​ GENDERS.DESCRIPTION;​ 
 +------------------------------------------------------------------------------- 
 +</​code>​
  
-  * **Data Caches** - Omnidex creates temporary caches of data during some query processing in order to minimize access to the underlying database. ​ Typically, this occurs when smaller tables are joined to from a much larger table, such as in a star schema data warehouse where a fact table joins to many dimension and snowflake tables. ​ In these cases, Omnidex will load the smaller tables into a hashed data cache, allowing the join to occur in memory rather than requiring repeated access to the disk.+=== Adjusting ​the Cache Size ===
  
-  * **Qualification Caches** - Omnidex is able to recognize patterns in queries, and will try to reuse previous qualifications when appropriate. ​ A very common scenario involves a succession of SQL statements, each of which adds another piece of criteria.  ​Another common scenario involves queries that begin with requesting a count of rows, and then either asks for aggregations or detailed rows using the same criteria. ​ In both of these casesOmnidex is able to reuse earlier qualificationsgreatly improving performance.+The Hashed Data Cache size defaults ​to 32MB of memory per query.  ​This can be adjusted ​using the HDC_THRESHOLD option ​of the SELECT statementwhich sets the maximum cache sizemeasured in MB.
  
-  * **File Caches** - Omnidex relies on file caching from the operating system in order to achieve high performance. ​ This file caching is performed automatically by the operating system and benefits Omnidex, but only if certain memory requirements are met. 
  
-All of these caches are designed to support standard Omnidex applications,​ and they are activated automatically. ​ These caches also reflect that Omnidex is most frequently used in read-only environments. ​ These caches can be deactivated in read-write environments as needed. 
  
 =====  ===== =====  =====
 ---- ----
 \\ \\
-**[[admin:​optimization:​caches:​hdc|Next]]**+**[[admin:​optimization:​caches:​home|Prev]]** | 
 +**[[admin:​optimization:​caches:​cachequal|Next]]**
  
  
 ====== Additional Resources ====== ====== Additional Resources ======
 See also: See also:
-{{page>:​admin:​indexing:​see_also&​nofooter&​noeditbtn}}+{{page>:​admin:​optimization:​see_also&​nofooter&​noeditbtn}}
  
 {{page>:​bottom_add&​nofooter&​noeditbtn}} {{page>:​bottom_add&​nofooter&​noeditbtn}}
 
Back to top
admin/optimization/caches/hdc.1328565832.txt.gz · Last modified: 2016/06/28 22:38 (external edit)