Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Next revision
Previous revision
admin:indexing:concepts:home [2011/01/14 15:30]
els created
admin:indexing:concepts:home [2016/06/28 22:38] (current)
Line 1: Line 1:
 +~~NOTOC~~
 +
 {{page>:​top_add&​nofooter&​noeditbtn}} {{page>:​top_add&​nofooter&​noeditbtn}}
  
-====== Administration:​ Indexing ​Strategies ​======+====== Administration: ​Omnidex ​Indexing ​ ======
  
 ===== Indexing Concepts ===== ===== Indexing Concepts =====
  
 **[[admin:​indexing:​concepts:​home|Overview]]** |  **[[admin:​indexing:​concepts:​home|Overview]]** | 
-[[admin:​indexing:​concepts:​basics|Indexing Basics]] |  
 [[admin:​indexing:​concepts:​criteria|Criteria]] |  [[admin:​indexing:​concepts:​criteria|Criteria]] | 
-[[admin:​indexing:​concepts:​joins|Table Joins]] | +[[admin:​indexing:​concepts:​joins|Joins]] | 
 [[admin:​indexing:​concepts:​aggregations|Aggregations]] |  [[admin:​indexing:​concepts:​aggregations|Aggregations]] | 
 [[admin:​indexing:​concepts:​ordering|Ordering]] |  [[admin:​indexing:​concepts:​ordering|Ordering]] | 
-[[admin:​indexing:​concepts:​retrieval|Retrievals]] |  +[[admin:​indexing:​concepts:​retrieval|Retrievals]]
-[[admin:​indexing:​concepts:​alternatives|Alternatives]]+
  
 ---- ----
Line 18: Line 18:
 ==== Overview ==== ==== Overview ====
  
-  * Basics of indexing +Someone once told a story: 
-  ​* Indexing for criteria ​ + 
-    * Basic cost of qualifications +==  == 
-    * Use of multiple indexes +One day, a homeowner noticed that his kitchen sink was leaking. ​ He fiddled with the faucet and concluded that the washer needed replacing. ​ He headed down to the hardware store, purchased the replacement washer and returned home to install it.  As soon as he tried to put the new washer in, he realized that it was the wrong size, so he headed back to the hardware store. ​ After getting the correct washer, he returned, only to find that he once again had purchased the wrong washer. ​ In his frustration,​ he removed the whole faucet and took it to the hardware store. ​ Once there, he was able to match it up correctly. ​ He returned home, reinstalled the faucet, and finally was able to call the job finished. 
-    * MDK_RETRIEVAL_THRESHOLD (rename ​to QUALIFICATION_THRESHOLD) + 
-  * Indexing for joins +====  ==== 
-    * Basic costs of joins + 
-    * Prejoined indexes +This story tells us something about indexing. Hardware stores are like big databases. They have all the important things that we want, and they are usually organized well enough that we can find what we want. The most expensive thing, though, is the trip to the hardware store in the car. We want to make sure that we get everything in one trip. 
-    * Joining across ​databases + 
-  * Indexing for aggregations +It would certainly be painful if we had to make a trip to the hardware to get a “1/2-inch rubber washer for a Delta kitchen faucet”, but we didn't have the benefit ​of looking at the aisle markers or asking for help. Imagine if we had to close our eyes, reach for the first item in the first aisle, and return home to check whether we retrieved the right item. Once home, we realize that the broom that we blindly retrieved was not the washer that we needed, so we head back to the hardware store, retrieve the next item, return home, and discover that it is another identical broom. We repeat this process ​for all the brooms, mops, vacuum cleaners, tools, furnace filters and doorbells until we finally get to the washers. Then we make repeated trips checking every kind of washer until we find what we are looking for. This insane and amusing metaphor shows the cost of a full table scan for a database. A full table scan evaluates every single row, transports ​the row from the disk drive to the CPU where it can do the comparison, and repeats this process ad nauseum. In the end, the trips to the disk drive kill the performance. With database applications,​ it is essential to minimize the I/O to the disk drive. 
-    * Basic cost of index scan + 
-    * Use of partition qualifiers +This is where indexing is useful, but indexing comes in many different flavors. When we go to the hardware store, we're smart enough to go to the washer aisle, so we thankfully avoid the brooms, mops and vacuum cleaners, but with only one piece of criteria, we still have to bring each and every washer back to the house. That is still too many trips. It may surprise you, but many databases use this approach. They pick a primary index and retrieve rows based on that one index, but all other criteria is processed by the CPU after it is retrieved from the disk. We would not tolerate this approach at the hardware store, and we don’t have to tolerate it with databases either.
-    * ASK_RETRIEVAL_THRESHOLD (rename ​to AGGREGATION_THRESHOLD) +
-  * Indexing ​for ordering +
-    * Similar ​to aggregations +
-  * Indexing ​for retrieval +
-    * Obtaining data from indexes +
-    * Base cost of reading ​the index +
-  * Alternatives ​to indexing +
-    * Full table scans +
-    * Sort-merge +
-    * Hashed data caching+
  
 +When the homeowner took the faucet to the store and matched up the washer there, he applied all criteria at the same time. With databases, we can do the same thing using indexes. If there are six elements of criteria (such as a “1/2-inch rubber washer for a Delta kitchen faucet”), then use the intersection of six indexes. If there are twenty pieces of criteria across five tables, then use twenty indexes across five tables. Retrieve only the rows from the database that match all the criteria. This greatly reduces the I/O to the disk drives.
  
 +Companies who use Omnidex swear by the performance. It may look like magic, but it is really just intelligent use of indexes. Companies who have grown accustomed to making endless trips to the disk drives think it is a miracle when they reduce their I/O and queries become lightening fast. They realize that it never really helped them to retrieve rows from the disk that they didn't need, just like it didn't help to bring home a broom when we needed a washer.
  
 +This somewhat silly story is actually quite useful as a foundation for understanding Omnidex performance. Throughout these sections, you will see an emphasis on insuring that there is indexing to support each step of the query. There are indexing strategies for criteria, for table joins, for aggregations and for ordering. Ultimately, that is the goal of the Omnidex Administrator … to use indexing as fully as possible to make each query efficient and fast.
  
 =====  ===== =====  =====
 
Back to top
admin/indexing/concepts/home.1295019030.txt.gz · Last modified: 2016/06/28 22:38 (external edit)