Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
admin:indexing:concepts:joins [2011/01/17 20:55]
els
admin:indexing:concepts:joins [2016/06/28 22:38] (current)
Line 1: Line 1:
 +~~NOTOC~~
 +
 {{page>:​top_add&​nofooter&​noeditbtn}} {{page>:​top_add&​nofooter&​noeditbtn}}
  
-====== Administration:​ Indexing ​Strategies ​======+====== Administration: ​Omnidex ​Indexing ======
  
 ===== Indexing Concepts ===== ===== Indexing Concepts =====
  
-**[[admin:​indexing:​concepts:​home|Overview]]** |  +[[admin:​indexing:​concepts:​home|Overview]] | 
-[[admin:​indexing:​concepts:​basics|Basics]] | +
 [[admin:​indexing:​concepts:​criteria|Criteria]] |  [[admin:​indexing:​concepts:​criteria|Criteria]] | 
-[[admin:​indexing:​concepts:​joins|Joins]] | +**[[admin:​indexing:​concepts:​joins|Joins]]** 
 [[admin:​indexing:​concepts:​aggregations|Aggregations]] |  [[admin:​indexing:​concepts:​aggregations|Aggregations]] | 
 [[admin:​indexing:​concepts:​ordering|Ordering]] |  [[admin:​indexing:​concepts:​ordering|Ordering]] | 
-[[admin:​indexing:​concepts:​retrieval|Retrievals]] |  +[[admin:​indexing:​concepts:​retrieval|Retrievals]]
-[[admin:​indexing:​concepts:​alternatives|Alternatives]]+
  
 ---- ----
Line 18: Line 18:
 ==== Table Joins ==== ==== Table Joins ====
  
-Indexes are also valuable for processing table joins.  ​The best indexing approach varies depending on whether ​the join is required for retrieving rows, or simply to qualify rows based on criteria.  ​If a query joins to multiple tables so that their values can be used in criteria; however, the data in that table is not returned ​to the user, then this is called ​Qualification ​Join.If a query tries to retrieve ​data from multiple tables and return ​it to the user, then this is called ​Retrieval ​Join.  ​+Omnidex indexing is also valuable for processing table joins.  ​Omnidex first considers ​the purpose of the table join.  ​Some table joins are required ​to allow criteria to be processed. ​ These are called Qualification ​Joins Other table joins are required ​to obtain ​data to return to the user.  These are called Retrieval ​Joins.  ​
  
-=== Qualification ​Join ===+=== Qualification ​Joins ===
  
-It is common to join to a table so that criteria can be issued ​against it.  The following example shows a join that returns columns from the INDIVIDUALS table to the user, but also joins to the HOUSEHOLDS and STATES tables to process criteria. ​ These joins are Qualification Joins.+It is common to join to a table so that criteria can be processed ​against it.  The following example shows a join that returns columns from the INDIVIDUALS table to the user, but also joins to the HOUSEHOLDS and STATES tables to process criteria. ​ These joins are Qualification Joins since no data is returned to the user from these tables. 
 + 
 +<​code>​
  
   select ​       I.NAME, I.PHONE   select ​       I.NAME, I.PHONE
Line 29: Line 31:
     join        STATES S on H.STATE = S.STATE     join        STATES S on H.STATE = S.STATE
     where       ​S.TAX_RATE > 5.0;     where       ​S.TAX_RATE > 5.0;
 +</​code>​
 +
  
 Omnidex will always attempt to process Qualification Joins in the indexes without ever going to the disk drive, as this dramatically improves performance. ​ If the join columns are indexed with Omnidex, then Omnidex will process the criteria in the STATES table, use indexes to join to the HOUSEHOLDS table, and then use indexes to join to the INDIVIDUALS table.  ​ Omnidex will always attempt to process Qualification Joins in the indexes without ever going to the disk drive, as this dramatically improves performance. ​ If the join columns are indexed with Omnidex, then Omnidex will process the criteria in the STATES table, use indexes to join to the HOUSEHOLDS table, and then use indexes to join to the INDIVIDUALS table.  ​
  
-Generally speaking, ​Omnidex can do this type of optimization ​when joining FROM a parent table INTO a child table. ​ In this example, the STATES table is a parent of HOUSEHOLDS, and HOUSEHOLDS is a parent of INDIVIDUALS. ​ It is easy for Omnidex to perform this kind of join since the primary keys are always ​available ​in the indexes.  ​It is more difficult for Omnidex ​to do this type of optimization when joining INTO a parent table FROM a child table. ​ ​Omnidex will still perform the optimization if it can, but it will not always do so.  ​+Omnidex can optimize ​this type of join most efficiently ​when joining FROM a parent table INTO a child table. ​ In this example, the STATES table is a parent of HOUSEHOLDS, and HOUSEHOLDS is a parent of INDIVIDUALS. ​ It is easy for Omnidex to perform this kind of join in the indexes ​since the primary keys are available ​from the indexes. ​ Omnidex ​isolates the primary keys from the parent table and uses them as criteria in search against the child table.
  
-Omnidex also provides an advanced featured called pre-joined indexes.  ​Information about the table join is stored in a pre-joined ​index, making certain types of Qualification Joins even faster.  Pre-joined indexes ​are more expensive to build, but can be faster to search, depending on the nature of the query.  The [[admin:​indexing:​advanced:​joins|following article]] explains pre-joined indexes.+Omnidex also optimizes queries that join INTO a parent table FROM a child table, but it may require accessing the database.  ​Omnidex also provides an advanced featured called pre-joined indexes.  ​Pre-joined ​indexes are especially useful for processing criteria in multiple children ​of the same parent.  Pre-joined indexes ​store foreign key information in the indexes allowing faster access between a parent and multiple children.  The [[admin:​indexing:​advanced:​joins|following article]] explains pre-joined indexes ​in more detail.
  
-=== Retrieval ​Join ===+=== Retrieval ​Joins ===
  
-When two tables ​share a fairly specific key, such as a Customer ID or an Email Address, an index makes it easy to retrieve common rows between ​the two tables.  ​As was the case with criteriathis is only efficient when retrieving a small portion of each table.  ​If two tables ​are to be joined in their entirety, indexing will actually slow things down.+Some tables ​are joined so that data can be returned ​to the user.  ​The following example shows a join that returns columns from the INDIVIDUALSHOUSEHOLDS and STATES tables.  ​These joins are Retrieval Joins since data is returned from these tables to the user.
  
-When whole tables ​are to be joined, it can be better ​to use a technique called a Sort-Merge.  This technique first sorts the two tables, presuming they are not already sorted.  ​It then scans both sorted tables ​and matches entries as it goes.  ​This can provide ​significant improvement ​in performance.+<​code>​ 
 + 
 +  select ​       I.NAME, H.ADDRESS, H.CITY, H.STATE, I.PHONE, S.TAX_RATE 
 +    from        INDIVIDUALS I 
 +    join        HOUSEHOLDS H on I.HOUSEHOLD = H.HOUSEHOLD 
 +    join        STATES S on H.STATE = S.STATE 
 +    where       ​I.NAME = 'John Smith'​ 
 +</​code>​ 
 + 
 + 
 +These joins are easily optimized by simply indexing the join columns. ​  
 + 
 +=== When to Avoid Indexes === 
 + 
 +When tables share a fairly specific key, such as a HOUSEHOLD IDan index makes it easy to retrieve common rows between the two tables.  This is only efficient when retrieving a small portion of each table. ​ If two tables are to be joined in their entirety, indexing will actually slow things down for the same reasons discussed in the [[admin:​indexing:​concepts:​criteria|Criteria]] section.  ​In the example above, the criteria against the NAME column probably isolates a small number of rows, and so using indexes for this join makes sense.  ​ 
 + 
 +If the query retrieved ​large portion of each table, indexing would be less helpful. ​ This query below retrieves all rows in the United States, which probably comprises a large percentage of this database. 
 + 
 +<​code>​ 
 + 
 +  select ​       I.NAME, H.ADDRESS, H.CITY, H.STATE, I.PHONE, S.TAX_RATE 
 +    from        INDIVIDUALS I 
 +    join        HOUSEHOLDS H on I.HOUSEHOLD = H.HOUSEHOLD 
 +    join        STATES S on H.STATE = S.STATE 
 +    where       H.COUNTRY = '​US';​ 
 +</​code>​
  
-A third approach can provide even better performance improvements,​ but requires that the needed portion of one of the tables be fairly small. ​ If a portion of a table can be read into memory with a hashed indexed, then joins to that table will be very fast.  The disk drives are hit once to load it in memory, but thereafter, the disk drives are not impacted. ​ This optimization is commonly used when joining a table to dimension tables or code lookup tables.  ​ 
  
-Omnidex will automatically choose between these approaches to table joins, and no intervention is required by the user.+In this caseit may be faster to use other techniques than an indexed join.  Omnidex provides a feature called a Sort-Merge Join.  This technique sorts two tables or result sets and then scans them, matching entries as it goes.  This can provide a significant improvement in performance.
  
-The [[admin:​indexing:​basic:​joins|following article]] explains basic indexing strategies for optimizing ​table joins.+Omnidex provides another feature called the Hashed Data Cache. ​ This can be even faster still, and is ideal when joining to a fairly small table, such as a small dimension table. ​ If the required portion of a table can be read into memory with a hashed index, then joins to that table will be instantaneous. ​ The disk drives are hit once to load the required portion of a table into memory, but thereafter, the disk drives are not impacted. ​ This optimization is commonly used when joining to dimension tables or code lookup tables. ​ In the example above, the STATES would be small enough to use this technique since it probably has less than 100 entries.
  
 +Omnidex automatically chooses between these approaches to table joins, and no intervention is required by the user.
  
 +====  ====
 +The [[admin:​indexing:​strategies:​joins|following article]] explains basic indexing strategies for optimizing table joins.
  
 =====  ===== =====  =====
  
-**[[admin:​indexing:​concepts:​criteria|Next]]**+**[[admin:​indexing:​concepts:​criteria|Prev]]** 
 +**[[admin:​indexing:​concepts:​aggregations|Next]]**
  
 ====== Additional Resources ====== ====== Additional Resources ======
 
Back to top
admin/indexing/concepts/joins.1295297749.txt.gz · Last modified: 2016/06/28 22:38 (external edit)