Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
dev:sql:functions:contains:home [2009/12/18 16:51]
tdo
dev:sql:functions:contains:home [2016/06/28 22:38] (current)
Line 1: Line 1:
-====== $CONTAINS ====== +{{page>:​top_add&​nofooter&​noeditbtn}}
-===== Syntax =====+
  
-  $CONTAINS(column,​ criteria, [, options [,label]])+====== Development:​ Omnidex SQL ======
  
-column The column against which the criteria should be applied.+===== SQL Functions: $CONTAINS ======
  
-criteria Criteria to be applied against the column, consisting of either simple keywords, or an expression. ​ Expressions may contain nested parentheses,​ qualification ​functions, Boolean operators and other textual operators described in this document. ​ Expressions must be surrounded by outer parentheses.+**[[dev:​sql:​functions:​contains:​home|Description]]** |  
 +[[dev:​sql:​functions:​contains:​syntax|Syntax]] | 
 +[[dev:​sql:​functions:​contains:​customization|Customization]] | 
 +[[dev:​sql:​functions:​contains:​examples|Examples]] ​
  
-options An optional parameter that controls options for the function. ​ The primary options determine which features to apply to this search. ​ If multiple options are submitted, they will be performed in the order shown below:+==== Description ====
  
-PROXIMITY[=approach] Convert all unquoted spaces in criteria to proximity searches based on approach. ​ Valid approaches are:  +The $CONTAINS function expands ​the search ​capabilities ​of SQL to include more flexible textual ​searches.  ​
- PHRASE +
- BEFORE(n) +
- AFTER(n) +
- +
-SPELLCHECK[=list] Check each word in criteria to see if it is a commonly misspelled word, and automatically replace the word with the corrected spelling. ​ If list is not provided, then use the list declared in the environment file for this column. ​ If no list is provided in the environment file, then use the default list provided with Omnidex (recommended). +
- +
-SYNONYMS[=list] Expand ​the search ​to include the synonyms ​of each word in criteria. ​ If list is not provided, then use the list declared in the environment file for this column. ​ If no list is provided in the environment file, then use the thesaurus. +
- +
-FORMS[=approach] Expand the search ​to include ​the various forms of each word.  Approach can be set to one or more of PLURALS, CONJUGATIONS,​ and DERIVATIONS. ​ If approach is not provided, then use the approach declared in the environment file for this column. ​ If no approach is provided in the environment file, then all approaches will be used by default. +
- +
-STOPWORDS[=list] Restrict this search by excluding all words found in the stopwords list.  If list is not provided, then use the list declared in the environment file for this column. ​ If no list is provided in the environment file, then use the default list provided with Omnidex. +
- +
- Stopwords are not eliminated if a single stopword is the only word in criteria. ​  +
- +
-MISSPELLINGS[=list] Locate possible misspellings for each word in criteria. ​ If list is not provided, then use the spelling dictionaries created by ODXSQL’s UPDATE TEXT command. +
- +
-PHONETIC Expand the search to include the phonetic equivalents for each word.  This option requires that the Phonetic indexing option (;SX) be used for this column. +
- +
- The secondary options configure the way that the primary options are applied: +
- +
-INCLUSIVE Search for the original word as well as any additional words. ​ This option applies to the SYNONYMS option only, and is true by default. +
- +
-EXCLUSIVE Search only for the additional words, but not for the original word.  This option applies to the SYNONYMS only. +
-  +
-PARTOFSPEECH=pos When searching the thesaurus or dictionary, limit searches ​to the named part of speech.  ​Parts of speech are: +
- +
-  * NOUN +
-  * VERB +
-  * ADJECTIVE +
-  * ADVERB +
- +
- +
-SENSE=n When searching the thesaurus or dictionary, limit searches to the specified sense. ​ Senses can be seen by looking up the definition with the $LOOKUP function or ODXSQL’s LOOKUP command. +
-  +
-MAX_SENSES=n | ALL When searching the thesaurus or dictionary, limit searches to the specified number of senses. ​ By default, searches are limited to the most common sense. ​  +
- +
-label An optional label that describes this particular search. ​ This label is only needed when there are multiple $CONTAINS functions in a single SELECT statement. ​ The label is then used to associate a particular $CONTAINS function with a $SCORE or $CONTEXT function. ​ The label is case-insensitive,​ and may be up to 32 characters long. +
- +
- +
-The $CONTAINS function returns a Boolean TRUE or FALSE. +
- +
-===== Options ===== +
-PROXIMITY=<​PHRASE | BEFORE(n) | NEAR(n)>  +
-===== Examples ===== +
-This example shows how to enhance an existing SQL statement to use a $CONTAINS function: +
- +
-Original Omnidex SQL Select: +
-  select ACCT, COMPANY, CONTACT  +
-    from CUSTOMERS  +
-    where FIRST_NAME = ‘William’ +
- +
-Enhanced with $CONTAINS function: +
-  select ACCT, COMPANY, CONTACT  +
-    from CUSTOMERS  +
-    where $CONTAINS(FIRST_NAME,​ ‘William’,​  +
-                    ‘synonyms=all_given_names’) +
- +
- +
-This example shows how to enhance a LIKE search to use a $CONTAINS function: +
-  select ACCT, COMPANY, CONTACT  +
-    from CUSTOMERS  +
-    where ADDRESS LIKE ‘%FIRST%STREET%’ +
- +
-Enhanced with $CONTAINS function: +
-  select ACCT, COMPANY, CONTACT  +
-    from CUSTOMERS  +
-    where $CONTAINS(ADDRESS,​ ‘FIRST STREET’,​  +
-                      ‘synonyms=all_addresses’) +
- +
-===== From Omnidex TEXT class ===== +
- +
-<​code>​ +
-$CONTAINS Searches+
  
 Most SQL statements are written with criteria in simple predicates, such as: FIRST_NAME = ‘William’. ​ This predicate compares the entire contents of FIRST_NAME against the criteria ‘William’,​ providing no opportunities for more flexible searches. ​ If the criterion does not match the column exactly, then the row will not be included. ​ Given the likelihood of everything from nicknames to misspellings,​ this means that a lot of rows are “lost”, never to be seen by the user despite their importance and relevance. ​   Most SQL statements are written with criteria in simple predicates, such as: FIRST_NAME = ‘William’. ​ This predicate compares the entire contents of FIRST_NAME against the criteria ‘William’,​ providing no opportunities for more flexible searches. ​ If the criterion does not match the column exactly, then the row will not be included. ​ Given the likelihood of everything from nicknames to misspellings,​ this means that a lot of rows are “lost”, never to be seen by the user despite their importance and relevance. ​  
Line 91: Line 20:
 There is no standardized SQL to support textual searches involving keyword searches, proximity searches, synonym searches, spellchecking and similar features. ​ Some database vendors have implemented their own independent approaches, but they vary between databases. There is no standardized SQL to support textual searches involving keyword searches, proximity searches, synonym searches, spellchecking and similar features. ​ Some database vendors have implemented their own independent approaches, but they vary between databases.
  
-Omnidex specializes in textual searches and provides a rich suite of tools for locating rows containing these words and their variations. ​  ​Within SQL, the gateway to most of these features is the $CONTAINS ​function.  Each $CONTAINS function allows complex criteria ​ against a particular column. ​  These criteria can apply to parsed words within ​column, ​and can also include parenthetical and Boolean relationships between words. ​ The $CONTAINS function also allows phrase and proximity searches, and a wide array of options to trigger features such as synonym searches, spellchecking,​ form searches, phonetic searches and stopwords. ​ A typical application that uses Omnidex Text involves writing standard SQL statements that are then enhanced to use one or more $CONTAINS functions. ​   +This function ​returns ​BOOLEAN datatype ​and can only be referenced ​in the WHERE clause ​of a SELECT statement.
- +
-The examples below demonstrate a substantial improvement over the capabilities of traditional SQL.  In the first example, the use of the SYNONYMS option will trigger a synonym search using a list of first names. ​ This will expand the search to look for alternates of the name William, such as Will, Bill, Liam and others. ​ In the second example, the two criteria will be searched for as actual parsed words rather than as substrings. ​ Additionally,​ the search will be expanded to include alternative forms of those words, such as “1st St.”. ​ These specific capabilities are discussed in depth in the remainder of this document. ​ These examples simply demonstrate how the $CONTAINS function is the gateway to the textual features of Omnidex Text.   +
-  +
- +
-Figure 6 - Basic Examples of the $CONTAINS Function +
- +
-This example shows how to enhance an existing SQL statement to use a $CONTAINS function: +
- +
- select ACCT, COMPANY, CONTACT  +
-  from CUSTOMERS  +
-  where FIRST_NAME = ‘William’ +
- +
-… becomes …  +
- +
- select ACCT, COMPANY, CONTACT  +
-  from CUSTOMERS  +
-  where $CONTAINS(FIRST_NAME,​ ‘William’,​  +
-                      ‘synonyms=all_given_names’) +
- +
- +
-This second example shows how to enhance a LIKE search to use a $CONTAINS function: +
- +
- select ACCT, COMPANY, CONTACT  +
-  from CUSTOMERS  +
-  where ADDRESS LIKE ‘%FIRST%STREET%’ +
- +
-… becomes …  +
- +
- select ACCT, COMPANY, CONTACT  +
-  from CUSTOMERS  +
-  where $CONTAINS(ADDRESS,​ ‘FIRST STREET’,  +
-                      ‘synonyms=all_addresses’) +
- +
- +
- +
- +
-The $CONTAINS function is also a prerequisite for using the $SCORE and $CONTEXT functions described later in this document. ​ Those functions provide relevancy scoring and context excerpt, and must be applied to a particular $CONTAINS search. ​ If multiple $CONTAINS functions are used in a single SQL statement, then labels must be used to identify which $CONTAINS function to use. +
- +
-The $CONTAINS function is unusual in that it comprises an entire predicate rather than being used in a predicate. ​ Functions like CHAR_LENGTH and CASE can be as operands, such as  +
- +
-SELECT …  +
-  ​WHERE  ​ CHAR_LENGTH(column) > 16 AND  +
-… +
- +
-$CONTAINS, however, is used as an entire predicate, as in  +
- +
-SELECT …  +
-  WHERE $CONTAINS(column,​ ‘ABCD’) AND +
- … +
- +
- +
- +
-  +
- +
-Figure 7 - The $CONTAINS Function +
- +
-$CONTAINS(column,​ ‘criteria’,​ [, options [,​’label’]]) +
- +
-column The column against which the criteria should be applied. +
- +
-criteria Criteria to be applied against the column, consisting ​of either simple keywords, or an expression. ​ Expressions may contain nested parentheses,​ qualification functions, Boolean operators and other textual operators described in this document. ​ Expressions must be surrounded by outer parentheses. +
- +
-options An optional parameter that controls options for the function. ​ The primary options determine which features to apply to this search. ​ If multiple options are submitted, they will be performed in the order shown below: +
- +
-PROXIMITY[=approach] Convert all unquoted spaces in criteria to proximity searches based on approach. ​ Valid approaches are:  +
- PHRASE +
- BEFORE(n) +
- AFTER(n) +
- +
-SPELLCHECK[=list] Check each word in criteria to see if it is a commonly misspelled word, and automatically replace the word with the corrected spelling. ​ If list is not provided, then use the list declared in the environment file for this column. ​ If no list is provided in the environment file, then use the default list provided with Omnidex (recommended). +
- +
-SYNONYMS[=list] Expand the search to include the synonyms of each word in criteria. ​ If list is not provided, then use the list declared in the environment file for this column. ​ If no list is provided in the environment file, then use the thesaurus. +
- +
-FORMS[=approach] Expand the search to include the various forms of each word.  Approach can be set to one or more of PLURALS, CONJUGATIONS,​ and DERIVATIONS. ​ If approach is not provided, then use the approach declared in the environment file for this column. ​ If no approach is provided in the environment file, then all approaches will be used by default. +
- +
-STOPWORDS[=list] Restrict this search by excluding all words found in the stopwords list.  If list is not provided, then use the list declared in the environment file for this column. ​ If no list is provided in the environment file, then use the default list provided with Omnidex. +
- +
- Stopwords are not eliminated if a single stopword is the only word in criteria. ​  +
- +
-MISSPELLINGS[=list] Locate possible misspellings for each word in criteria. ​ If list is not provided, then use the spelling dictionaries created by ODXSQL’s UPDATE TEXT command. +
- +
-PHONETIC Expand the search to include the phonetic equivalents for each word.  This option requires that the Phonetic indexing option (;SX) be used for this column. +
- +
- The secondary options configure the way that the primary options are applied: +
- +
-INCLUSIVE Search for the original word as well as any additional words. ​ This option applies to the SYNONYMS option only, and is true by default. +
- +
-EXCLUSIVE Search only for the additional words, but not for the original word.  This option applies to the SYNONYMS only. +
-  +
-PARTOFSPEECH=pos When searching the thesaurus or dictionary, limit searches to the named part of speech. ​ Parts of speech are: +
- NOUN +
- VERB +
- ADJECTIVE +
- ADVERB +
- +
-SENSE=n When searching the thesaurus or dictionary, limit searches to the specified sense. ​ Senses can be seen by looking up the definition with the $LOOKUP function or ODXSQL’s LOOKUP command. +
-  +
-MAX_SENSES=n | ALL When searching the thesaurus or dictionary, limit searches to the specified number of senses. ​ By default, searches are limited to the most common sense. ​  +
- +
-label An optional label that describes this particular search. ​ This label is only needed when there are multiple $CONTAINS functions in single ​SELECT statement. ​ The label is then used to associate a particular $CONTAINS function with a $SCORE or $CONTEXT function. ​ The label is case-insensitive,​ and may be up to 32 characters long. +
  
-The $CONTAINS function returns a Boolean TRUE or FALSE.+====  ====
  
 +**[[dev:​sql:​functions:​contains:​syntax|Next]]**
  
 +====== Additional Resources ======
  
-</code>+See also: 
 +{{page>dev:​sql:​functions:​see_also_string&​nofooter&​noeditbtn&​noheader}}
  
 +{{page>:​bottom_add&​nofooter&​noeditbtn}}
 
Back to top
dev/sql/functions/contains/home.1261155104.txt.gz · Last modified: 2016/06/28 22:38 (external edit)