This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
admin:indexing:text:proximity [2012/01/31 17:47] doc |
admin:indexing:text:proximity [2016/06/28 22:38] (current) |
||
---|---|---|---|
Line 10: | Line 10: | ||
[[admin:indexing:text:retrieve|External Files]] | | [[admin:indexing:text:retrieve|External Files]] | | ||
**[[admin:indexing:text:proximity|Proximity Searches]]** | | **[[admin:indexing:text:proximity|Proximity Searches]]** | | ||
+ | [[admin:indexing:text:advanced|Advanced Searches]] | | ||
[[admin:indexing:text:results|Displaying Results]] | | [[admin:indexing:text:results|Displaying Results]] | | ||
[[admin:indexing:text:relevancy|Relevancy]] | [[admin:indexing:text:relevancy|Relevancy]] | ||
Line 23: | Line 24: | ||
There are three basic categories of Proximity Searches: | There are three basic categories of Proximity Searches: | ||
- | == Phrase Searches == | + | === Phrase Searches === |
- | A phrase search finds occurrences of between two and eight words adjacent to each other in the column. To specify phrases in the search criteria, enclose the phrase in double quotation marks. For example, to look for the phrase cell phone, search for the criteria of "cell phone". | + | A phrase search finds occurrences of between two and eight words adjacent to each other in the column. To specify phrases in the search criteria, enclose the phrase in double quotation marks. The following example searches for the phrase "No place like home" in the sample BOOKS database and finds //The Wonderful Wizard of Oz//. |
- | == BEFORE Searches == | + | <code> |
+ | > select title from books where content = '"no place like home"'; | ||
- | BEFORE Searches A BEFORE search is an expansion on a Phrase Search. The BEFORE operator used in qualification criteria allows more control over how many words are allowed between two keywords. In fact, a Phrase Search is simply a special implementation of a BEFORE search. | + | TITLE |
+ | ----------------------------------------------------------------------------- | ||
+ | The Wonderful Wizard of Oz | ||
+ | </code> | ||
- | The BEFORE operator is used between two words, as in “word1 BEFORE(n) word2”. The BEFORE operator accepts an optional parameter containing a value between 1 and 999, representing the number of words by which word1may proceed word2. The Phrase Search example above could also be submitted as: cell BEFORE phone, or cell BEFORE(1) phone. | ||
- | |||
- | A Phrase Search with “cell phone” would not find the sentence: | ||
- | “Our company markets cell and mobile phones.” | + | === BEFORE Searches === |
- | To locate this sentence, the search criteria must be: cell BEFORE(3) phone. Any parameter of 3 or above would similarly find this sentence. | + | A BEFORE search is an expansion on a Phrase Search. The BEFORE operator used in qualification criteria allows more control over how many words are allowed between two keywords. In fact, a Phrase Search is simply a special implementation of a BEFORE search. |
- | If the BEFORE operator is used without a parameter, it defaults to 10 words. This provides the most relaxed and flexible search, and can dramatically increase the total number of rows found. In these situations, queries are often sorted by their relevancy score using the $SCORE function. This allows rows with close proximity to be presented first, while still including rows with distant proximity. | + | The BEFORE operator is used between two words, as in "word1 BEFORE(n) word2". The BEFORE operator accepts an optional parameter containing a value between 1 and 999, representing the number of words by which //word1// may precede //word2//. If no value is provided, the default value of 10 is used. The Phrase Search example above could also be submitted as: //no// BEFORE //place// BEFORE //like// BEFORE //home//, or //no// BEFORE(1) //place// BEFORE(1) //like// BEFORE(1) //home//. |
- | + | The following example searches for the word //place// within 25 words before the word //home//. Note that using operators within SQL criteria requires enclosing the criteria in parentheses, which indicates that the criteria form an expression with special operators rather than a literal string. | |
- | NEAR Searches A NEAR Search is very similar to a BEFORE Search. The only difference is that NEAR allows words to be in any order. | + | |
- | Neither a Phrase Search with “cell phone”, nor a BEFORE Search with | + | <code> |
- | cell BEFORE phone would find the sentence: | + | > select title from books where content = '(place before(25) home)'; |
- | “The call was dropped because we traveled outside of the mobile phone cell” | + | TITLE |
+ | ----------------------------------------------------------------------------- | ||
+ | Around the World in Eighty Days | ||
+ | The Wonderful Wizard of Oz | ||
+ | </code> | ||
- | To locate this sentence, the search criteria must be: cell NEAR(1) phone, or cell NEAR phone. | + | This example immediately raises the question of how to see the locations in the book where these criteria are met, and also how to sort the results based on relevancy. The [[admin:indexing:text:results|next two pages]] of this section describe functions that can be used to meet these requirements. |
- | If the NEAR operator is used without a parameter, it also defaults to 10 words. This provides the most relaxed and flexible search, and can dramatically increase the total number of rows found. In these situations, queries are often sorted by their relevancy score using the $SCORE function. This allows rows with close proximity to be presented first, while still including rows with distant proximity. Most search engines use NEAR Searches rather than BEFORE Searches to provide the greatest search flexibility. Results are then presented in order of the relevancy score. | + | === NEAR Searches === |
+ | A NEAR search is nearly the same as a BEFORE search, except that the words can be in any relative order to each other. The NEAR operator is used between two words, as in "word1 NEAR(n) word2". The NEAR operator accepts an optional parameter containing a value between 1 and 999, representing the number of words by which //word1// may precede or follow //word2//. If no value is provided, the default value of 10 is used. | ||
+ | The following example searches for the word //home// within 25 words of the word //place//. The words may be in any order relative to each other. | ||
- | There are some limitations to Proximity Searches. Columns indexed with the Proximity option are limited to 4 million keywords. The Proximity option cannot be used on pre-joined indexes. Lastly, Proximity Searches only pay attention to word proximity, and not to semantics, sentence structure or context. | + | <code> |
+ | > select title from books where content = '(place near(25) home)'; | ||
+ | TITLE | ||
+ | ----------------------------------------------------------------------------- | ||
+ | Around the World in Eighty Days | ||
+ | The Adventures of Tom Sawyer | ||
+ | The Wonderful Wizard of Oz | ||
+ | </code> | ||
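The difference between BEFORE(n) and NEAR(n) can be illustrated outside the database with a small Python sketch. This is only an approximation of the behavior described above, not the engine's implementation: the helper names ''word_positions'', ''before'' and ''near'' are hypothetical, the tokenizer is a simple assumption, and the distance of 1 for adjacent words follows the statement that a phrase is equivalent to BEFORE(1).

```python
import re

def word_positions(text, word):
    """Return the 0-based positions of `word` in `text`, ignoring case."""
    # Assumption: words are runs of letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [i for i, w in enumerate(words) if w == word.lower()]

def before(text, word1, word2, n=10):
    """True if some word1 occurs 1..n words ahead of some word2."""
    return any(1 <= b - a <= n
               for a in word_positions(text, word1)
               for b in word_positions(text, word2))

def near(text, word1, word2, n=10):
    """True if some word1 is within n words of some word2, in either order."""
    return any(1 <= abs(b - a) <= n
               for a in word_positions(text, word1)
               for b in word_positions(text, word2))

sentence = "There is no place like home"
print(before(sentence, "place", "home", 25))  # True: place is 2 words ahead of home
print(before(sentence, "home", "place", 25))  # False: the order is reversed
print(near(sentence, "home", "place", 25))    # True: order does not matter
print(before(sentence, "no", "place", 1))     # True: adjacent words, as in a phrase
```

In this sketch the default of 10 mirrors the documented default when no value is given; stemming, stop words and any other index behavior are deliberately left out.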
- | Figure 8 - The BEFORE and NEAR Operators | + | === Default Proximity Search === |
- | BEFORE[(n)] | + | Any search against a FullText index is inherently a Proximity Search. Specifically, the NEAR(999) operator is applied between words whenever no other operator is specified. This means that the following two examples are equivalent: |
- | NEAR[(n)] | + | |
- | n A number between 1 and 999 representing the number of other words allowed between the two requested words. The default value is 10. | + | <code> |
+ | > select title from books where content = 'place home'; | ||
+ | TITLE | ||
+ | ----------------------------------------------------------------------------- | ||
+ | Alice's Adventures in Wonderland | ||
+ | Around the World in Eighty Days | ||
+ | Hamlet | ||
+ | The Adventures of Tom Sawyer | ||
+ | The Wonderful Wizard of Oz | ||
+ | > select title from books where content = '(place NEAR(999) home)'; | ||
- | Figure 9 - Examples of Using the BEFORE and NEAR Operators | + | TITLE |
- | + | ----------------------------------------------------------------------------- | |
- | The following example shows a proximity search using the QUALIFY statement: | + | Alice's Adventures in Wonderland |
- | + | Around the World in Eighty Days | |
- | Qualify CATALOG where CONTENT = ‘cell BEFORE(3) phone’ | + | Hamlet |
- | + | The Adventures of Tom Sawyer | |
- | + | The Wonderful Wizard of Oz | |
- | The following example shows a proximity search using the $CONTAINS function of a SELECT statement: | + | |
- | + | ||
- | Select … from CATALOG | + | |
- | where $contains(CONTENT, ‘cell BEFORE(3) phone’) | + | |
- | + | ||
- | + | ||
- | + | ||
- | Proximity searches are automatically performed when criteria is submitted against columns installed with the Proximity option. However, the default processing of that criteria can be overridden with the PROXIMITY option. | + | |
- | + | ||
- | Figure 10 - The PROXIMITY Function | + | |
- | + | ||
- | PROXIMITY(‘criteria’[,’options’]]) | + | |
- | + | ||
- | criteria The qualification criteria to be converted using the passed options. | + | |
- | + | ||
- | options An optional parameter that controls options for the function. If no option is supplied, then NEAR(999) is assumed as the default. | + | |
- | + | ||
- | PHRASE Convert all unquoted spaces to BEFORE(1) operators, producing a phrase search. | + | |
- | + | ||
- | BEFORE(n) Convert all unquoted spaces to BEFORE(n) operators. | + | |
- | + | ||
- | NEAR(n) Convert all unquoted spaces to NEAR(n) operators. | + | |
+ | </code> | ||
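The equivalence shown above can be pictured as a rewrite of the criteria string. The following Python sketch is an illustration only: the function name ''default_proximity'' is hypothetical, and treating a double-quoted phrase as a single term is an assumption, not documented behavior of the index.

```python
import re

def default_proximity(criteria, n=999):
    """Join the terms of a plain criteria string with NEAR(n) operators."""
    # Assumption: a term is either a double-quoted phrase or a run of
    # non-space characters.
    terms = re.findall(r'"[^"]*"|\S+', criteria)
    if len(terms) < 2:
        return criteria  # a single term needs no operator
    return "(" + f" NEAR({n}) ".join(terms) + ")"

print(default_proximity("place home"))  # (place NEAR(999) home)
```

The result matches the explicit form used in the second query above, which is why both queries return the same titles.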
+ | === Proximity Search Limitations === | ||
+ | There are some limitations to Proximity Searches. Columns indexed with the Proximity option are limited to 4 million keywords. The Proximity option cannot be used on pre-joined indexes. Lastly, Proximity Searches only pay attention to word proximity, and not to semantics, sentence structure or context. | ||
===== ===== | ===== ===== | ||
**[[admin:indexing:text:retrieve|Prev]]** | | **[[admin:indexing:text:retrieve|Prev]]** | | ||
- | **[[admin:indexing:text:results|Next]]** | + | **[[admin:indexing:text:advanced|Next]]** |
====== Additional Resources ====== | ====== Additional Resources ====== |