Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
admin:indexing:concepts:joins [2011/01/17 22:20]
els
admin:indexing:concepts:joins [2016/06/28 22:38] (current)
Line 1: Line 1:
 +~~NOTOC~~
 +
 {{page>:​top_add&​nofooter&​noeditbtn}} {{page>:​top_add&​nofooter&​noeditbtn}}
  
-====== Administration:​ Indexing ​Strategies ​======+====== Administration: ​Omnidex ​Indexing ======
  
 ===== Indexing Concepts ===== ===== Indexing Concepts =====
Line 10: Line 12:
 [[admin:​indexing:​concepts:​aggregations|Aggregations]] |  [[admin:​indexing:​concepts:​aggregations|Aggregations]] | 
 [[admin:​indexing:​concepts:​ordering|Ordering]] |  [[admin:​indexing:​concepts:​ordering|Ordering]] | 
-[[admin:​indexing:​concepts:​retrieval|Retrievals]] |  +[[admin:​indexing:​concepts:​retrieval|Retrievals]]
-[[admin:​indexing:​concepts:​alternatives|Alternatives]]+
  
 ---- ----
Line 17: Line 18:
 ==== Table Joins ==== ==== Table Joins ====
  
-Indexes are also valuable for processing table joins.  ​The best indexing approach varies depending on whether ​the join is required for retrieving rows, or simply to qualify rows based on criteria.  ​If a query joins to multiple tables so that their values can be used in criteria; however, the data in that table is not returned ​to the user, then this is called ​Qualification ​Join.If a query tries to retrieve ​data from multiple tables and return ​it to the user, then this is called ​Retrieval ​Join.  ​+Omnidex indexing is also valuable for processing table joins.  ​Omnidex first considers ​the purpose of the table join.  ​Some table joins are required ​to allow criteria to be processed. ​ These are called Qualification ​Joins Other table joins are required ​to obtain ​data to return to the user.  These are called Retrieval ​Joins.  ​
  
-=== Qualification ​Join ===+=== Qualification ​Joins ===
  
-It is common to join to a table so that criteria can be issued ​against it.  The following example shows a join that returns columns from the INDIVIDUALS table to the user, but also joins to the HOUSEHOLDS and STATES tables to process criteria. ​ These joins are Qualification Joins.+It is common to join to a table so that criteria can be processed ​against it.  The following example shows a join that returns columns from the INDIVIDUALS table to the user, but also joins to the HOUSEHOLDS and STATES tables to process criteria. ​ These joins are Qualification Joins since no data is returned to the user from these tables. 
 + 
 +<​code>​
  
   select ​       I.NAME, I.PHONE   select ​       I.NAME, I.PHONE
Line 28: Line 31:
     join        STATES S on H.STATE = S.STATE     join        STATES S on H.STATE = S.STATE
     where       ​S.TAX_RATE > 5.0;     where       ​S.TAX_RATE > 5.0;
 +</​code>​
 +
  
 Omnidex will always attempt to process Qualification Joins in the indexes without ever going to the disk drive, as this dramatically improves performance. ​ If the join columns are indexed with Omnidex, then Omnidex will process the criteria in the STATES table, use indexes to join to the HOUSEHOLDS table, and then use indexes to join to the INDIVIDUALS table.  ​ Omnidex will always attempt to process Qualification Joins in the indexes without ever going to the disk drive, as this dramatically improves performance. ​ If the join columns are indexed with Omnidex, then Omnidex will process the criteria in the STATES table, use indexes to join to the HOUSEHOLDS table, and then use indexes to join to the INDIVIDUALS table.  ​
  
-Generally speaking, ​Omnidex can do this type of optimization ​when joining FROM a parent table INTO a child table. ​ In this example, the STATES table is a parent of HOUSEHOLDS, and HOUSEHOLDS is a parent of INDIVIDUALS. ​ It is easy for Omnidex to perform this kind of join since the primary keys are always ​available ​in the indexes.  ​It is more difficult for Omnidex ​to do this type of optimization when joining INTO a parent table FROM a child table. ​ ​Omnidex will still perform the optimization if it can, but it will not always do so.  ​+Omnidex can optimize ​this type of join most efficiently ​when joining FROM a parent table INTO a child table. ​ In this example, the STATES table is a parent of HOUSEHOLDS, and HOUSEHOLDS is a parent of INDIVIDUALS. ​ It is easy for Omnidex to perform this kind of join in the indexes ​since the primary keys are available ​from the indexes. ​ Omnidex ​isolates the primary keys from the parent table and uses them as criteria in search against the child table.
  
-Omnidex also provides an advanced featured called pre-joined indexes.  ​Information about the table join is stored in a pre-joined ​index, making certain types of Qualification Joins even faster.  Pre-joined indexes ​are more expensive to build, but can be faster to search, depending on the nature of the query.  The [[admin:​indexing:​advanced:​joins|following article]] explains pre-joined indexes.+Omnidex also optimizes queries that join INTO a parent table FROM a child table, but it may require accessing the database.  ​Omnidex also provides an advanced featured called pre-joined indexes.  ​Pre-joined ​indexes are especially useful for processing criteria in multiple children ​of the same parent.  Pre-joined indexes ​store foreign key information in the indexes allowing faster access between a parent and multiple children.  The [[admin:​indexing:​advanced:​joins|following article]] explains pre-joined indexes ​in more detail.
  
-=== Retrieval ​Join ===+=== Retrieval ​Joins ===
  
-Some tables are joined so that data can be returned to the user.  The following example shows a join that returns columns from the INDIVIDUALS,​ HOUSEHOLDS and STATES tables. ​ These joins are Retrieval Joins.+Some tables are joined so that data can be returned to the user.  The following example shows a join that returns columns from the INDIVIDUALS,​ HOUSEHOLDS and STATES tables. ​ These joins are Retrieval Joins since data is returned from these tables to the user. 
 + 
 +<​code>​
  
   select ​       I.NAME, H.ADDRESS, H.CITY, H.STATE, I.PHONE, S.TAX_RATE   select ​       I.NAME, H.ADDRESS, H.CITY, H.STATE, I.PHONE, S.TAX_RATE
Line 44: Line 51:
     join        STATES S on H.STATE = S.STATE     join        STATES S on H.STATE = S.STATE
     where       ​I.NAME = 'John Smith'     where       ​I.NAME = 'John Smith'
 +</​code>​
  
-When tables share a fairly specific key, such as a HOUSEHOLD ID, an index makes it easy to retrieve common rows between the two tables. ​ As was the case with criteria, this is only efficient when retrieving a small portion of each table. ​ If two tables are to be joined in their entirety, indexing will actually slow things down.  In the example above, the criteria against the NAME column probably isolates a small number of rows.  Using indexes for this join makes sense.  ​ 
  
-If the query retrieved a large portion of each table, indexing would be less helpful. ​ This query retrieves all rows in the United States, which probably comprises a large percentage of this database.+These joins are easily optimized by simply indexing the join columns. ​  
 + 
 +=== When to Avoid Indexes === 
 + 
 +When tables share a fairly specific key, such as a HOUSEHOLD ID, an index makes it easy to retrieve common rows between the two tables. ​ This is only efficient when retrieving a small portion of each table. ​ If two tables are to be joined in their entirety, indexing will actually slow things down for the same reasons discussed in the [[admin:​indexing:​concepts:​criteria|Criteria]] section. ​ In the example above, the criteria against the NAME column probably isolates a small number of rows, and so using indexes for this join makes sense. ​  
 + 
 +If the query retrieved a large portion of each table, indexing would be less helpful. ​ This query below retrieves all rows in the United States, which probably comprises a large percentage of this database. 
 + 
 +<​code>​
  
   select ​       I.NAME, H.ADDRESS, H.CITY, H.STATE, I.PHONE, S.TAX_RATE   select ​       I.NAME, H.ADDRESS, H.CITY, H.STATE, I.PHONE, S.TAX_RATE
Line 54: Line 69:
     join        STATES S on H.STATE = S.STATE     join        STATES S on H.STATE = S.STATE
     where       ​H.COUNTRY = '​US';​     where       ​H.COUNTRY = '​US';​
 +</​code>​
 +
  
-In this case, it may be faster to use a technique ​called a Sort-Merge Join.  This technique ​first sorts the two tables and then scans both sorted tables, matching entries as it goes.  This can provide a significant improvement in performance.+In this case, it may be faster to use other techniques than an indexed join.  Omnidex provides ​feature ​called a Sort-Merge Join.  This technique sorts two tables ​or result sets and then scans them, matching entries as it goes.  This can provide a significant improvement in performance.
  
-A third approach ​can provide ​even better performance improvementsbut requires that the needed portion of one of the tables be fairly small. ​ If portion of a table can be read into memory with a hashed ​indexed, then joins to that table will be very fast.  The disk drives are hit once to load it in memory, but thereafter, the disk drives are not impacted. ​ This optimization is commonly used when joining ​a table to dimension tables or code lookup tables.  ​ +Omnidex provides another feature called the Hashed Data Cache. ​ This can be even faster stilland is ideal when joining to a fairly small table, such as a small dimension table.  If the required ​portion of a table can be read into memory with a hashed ​index, then joins to that table will be instantaneous.  The disk drives are hit once to load the required portion of a table into memory, but thereafter, the disk drives are not impacted. ​ This optimization is commonly used when joining to dimension tables or code lookup tables. ​ In the example above, the STATES would be small enough to use this technique ​since it probably has less than 100 entries.
-In the example above, the STATES would be small enough to use this technique.+
  
 Omnidex automatically chooses between these approaches to table joins, and no intervention is required by the user. Omnidex automatically chooses between these approaches to table joins, and no intervention is required by the user.
  
-The [[admin:​indexing:​basic:​joins|following article]] explains basic indexing strategies for optimizing table joins.+====  ==== 
 +The [[admin:​indexing:​strategies:​joins|following article]] explains basic indexing strategies for optimizing table joins.
  
 =====  ===== =====  =====
 
Back to top
admin/indexing/concepts/joins.1295302849.txt.gz · Last modified: 2016/06/28 22:38 (external edit)