Differences

This shows you the differences between two versions of the page.

admin:indexing:text:relevancy [2012/01/31 23:03]
doc created
admin:indexing:text:relevancy [2016/06/28 22:38] (current)
Line 10: Line 10:
 [[admin:​indexing:​text:​retrieve|External Files]] |  [[admin:​indexing:​text:​retrieve|External Files]] | 
 [[admin:​indexing:​text:​proximity|Proximity Searches]] |  [[admin:​indexing:​text:​proximity|Proximity Searches]] | 
-[[admin:​indexing:​text:​contains|Advanced Searches]] | +[[admin:​indexing:​text:​advanced|Advanced Searches]] | 
 [[admin:​indexing:​text:​results|Displaying Results]] |  [[admin:​indexing:​text:​results|Displaying Results]] | 
 **[[admin:​indexing:​text:​relevancy|Relevancy]]** **[[admin:​indexing:​text:​relevancy|Relevancy]]**
Line 18: Line 18:
 ==== Relevancy ==== ==== Relevancy ====
  
-Proximity searches can qualify rows with large blocks of text using Phrase Searches, BEFORE Searches and NEAR Searches.  Once the rows are qualified, the obvious next step is to display the results.  This can be more difficult if the text is as long as an entire book, or some other large block of text. +When searching large blocks of text, relevancy becomes more important.  If the criteria occur many times in a large block of text, that block can be considered more relevant than one in which the criteria occur only once.  A block can also be considered more relevant when the search words are nearer to each other or nearer to the beginning of the text.  Omnidex provides a $SCORE function that returns relevancy scores based on these considerations.  Rows can be filtered or ordered based on relevancy scores so that the most valuable blocks of text are shown first.
- +
-Omnidex allows excerpts to be retrieved from large blocks of text to make viewing easier.  These excerpts show the portions of the text that qualified the row, with the search terms highlighted. +
- +
-Excerpts are retrieved using the $CONTEXT function.  The $CONTEXT function works hand-in-hand with the $CONTAINS function.  The $CONTAINS function is used to label a particular search, and the $CONTEXT function retrieves excerpts for that same label.  This is needed because there may be other criteria in the SQL statement, and even multiple Proximity Searches against multiple columns and tables.  Only one Proximity Search can feed these excerpts, necessitating the pairing of the $CONTAINS function and the $CONTEXT function. +
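The three considerations named in the new revision (how often the criteria occur, how near the words are to each other, and how near they are to the beginning of the text) can be illustrated with a toy scorer. This is only a sketch in Python: the page does not describe how $SCORE is actually calculated, so the function name `toy_relevancy`, the equal weighting, and every formula below are assumptions made for illustration.

```python
import re


def toy_relevancy(text: str, terms: list[str]) -> float:
    """Toy score from 0 to 100. NOT the Omnidex $SCORE formula (an assumption)."""
    words = re.findall(r"[a-z']+", text.lower())
    wanted = {t.lower() for t in terms}
    # (position, word) for every occurrence of a search term, in text order.
    hits = [(i, w) for i, w in enumerate(words) if w in wanted]
    if not hits:
        return 0.0

    # Frequency: more occurrences push this toward 1 (1 hit -> 0.5, 3 hits -> 0.75).
    frequency = len(hits) / (len(hits) + 1)

    # Position: a first hit at the very start of the text gives 1.0.
    earliness = 1.0 - hits[0][0] / len(words)

    # Proximity: smallest word gap between two *different* terms.
    gaps = [b[0] - a[0] for a, b in zip(hits, hits[1:]) if a[1] != b[1]]
    if gaps:
        closeness = 1.0 / min(gaps)
    else:
        # One term requested: nothing to measure. Several requested but
        # only one found: treat the words as not near each other at all.
        closeness = 1.0 if len(wanted) == 1 else 0.0

    # Equal weighting of the three factors is an arbitrary choice.
    return round(100.0 * (frequency + earliness + closeness) / 3.0, 2)
```

With this sketch, a text that repeats the term scores higher than one that mentions it once, and two terms that sit close together score higher than the same terms far apart, which is the behaviour the paragraph describes in general terms.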
  
 === $SCORE Function === === $SCORE Function ===
  
-The [[dev:sql:functions:context|$CONTEXT]] function retrieves excerpts of a text field based on a paired $CONTAINS function.  By default, a simple excerpt is displayed; however, options exist to allow embedding HTML tags to highlight the search terms for easy display in a web environment. +The [[dev:sql:functions:score|$SCORE]] function returns a relevancy score for a text field based on a paired $CONTAINS function.  The score is a number between 1 and 100, with 100 representing the highest relevancy.  Note that the $SCORE function only returns a relevancy score when paired with a $CONTAINS function; otherwise, it will always return a score of 100.
  
 <​code>​ <​code>​
-> select ​       TITLE,+> select ​       ​$score, 
 +>> ​             ​TITLE,
 >> ​             $context >> ​             $context
 >> ​ from        BOOKS >> ​ from        BOOKS
->> ​ where       ​$contains(CONTENT,​ 'missisipi',​ '​misspellings');+>> ​ where       ​$contains(CONTENT,​ '(place near(25) home)'); 
 + 
 +$SCORE 
 +--------------------------------
 TITLE TITLE
 ----------------------------------------------------------------------------- -----------------------------------------------------------------------------
 $CONTEXT(BOOKS.CONTENT) $CONTEXT(BOOKS.CONTENT)
 ----------------------------------------------------------------------------- -----------------------------------------------------------------------------
-Around the World in Eighty Days +                       ​49.410000 
---- at Nauvoo, on the *Mississippi*, numbering twenty-five thousand --- +The Wonderful Wizard of Oz 
->> night it crossed the *Mississippi* at Davenport, and by --- +--- There is no *place* like *home*." ---
  
 +                       ​37.210000
 The Adventures of Tom Sawyer The Adventures of Tom Sawyer
---- a point where the *Mississippi* River was a trifle --- and saw the broad +--- I want to go *home*."  "But, Joe, there ain't such another
->> *Mississippi* rolling by! --- +>> swimming-*place* anywhere." ---
-2 rows returned +
-</​code>​+
  
-Excerpts can be easily formatted for display using HTML, including assigning CSS classes as needed: +                       36.400000
- +
-<​code>​ +
-> select ​       TITLE, +
->> ​             $context(255,​ '​STYLE=HTML CLASSES'​) +
->> ​ from        BOOKS +
->> ​ where       ​$contains(CONTENT,​ '​missisipi',​ '​misspellings'​);​ +
-TITLE +
------------------------------------------------------------------------------ +
-$CONTEXT(BOOKS.CONTENT) +
------------------------------------------------------------------------------+
 Around the World in Eighty Days Around the World in Eighty Days
---- at Nauvoo, on the <span class="odx_word">Mississippi</span>, numbering +--- travelled nor stayed from *home* overnight, he felt...this would be the
->> ​twenty-five thousand ​--- night it crossed ​the <span +>> ​*place* he was after. ​--- which was to take *place* ​the next...found him 
->> ​class="​odx_word">​Mississippi</​span> ​at Davenport, and by --- +>> ​not at *home*. ​--- 
-The Adventures of Tom Sawyer +rows returned
---- a point where the <span class="​odx_word">​Mississippi</​span>​ River was a +
->> trifle --- and saw the broad <span class="​odx_word">​Mississippi</​span>​ +
->> rolling by! --- +
-rows returned+
 </​code>​ </​code>​
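Filtering and ordering by relevancy, as described above, can also be sketched on the client side once the rows have been fetched. The example below is a minimal Python sketch that reuses the three scores and titles from the output shown above; the helper name `top_matches` and the threshold of 37.0 are hypothetical, chosen only for illustration, and are not part of Omnidex.

```python
# (score, title) pairs copied from the sample output, deliberately unordered.
rows = [
    (36.40, "Around the World in Eighty Days"),
    (49.41, "The Wonderful Wizard of Oz"),
    (37.21, "The Adventures of Tom Sawyer"),
]


def top_matches(rows, min_score=0.0):
    """Keep rows scoring at least min_score, most relevant first.

    Hypothetical helper for illustration only.
    """
    kept = [row for row in rows if row[0] >= min_score]
    return sorted(kept, key=lambda row: row[0], reverse=True)


ranked = top_matches(rows, min_score=37.0)
for score, title in ranked:
    print(f"{score:6.2f}  {title}")
```

In practice it is usually preferable to let the database do this work, as the order by RELEVANCY desc clause in the labelled example on this page does, so that only the most relevant rows are returned.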
  
-If the statement contains multiple $CONTAINS functions, they should be labelled with distinct names, and the $CONTEXT ​should reference the appropriate $CONTAINS ​clause.  The excerpts ​will be created based on that column'​s criteria.+If the statement contains multiple $CONTAINS functions, they should be labelled with distinct names, and the $SCORE function ​should reference the appropriate $CONTAINS ​label.  The relevancy score will be created based on that column'​s criteria.
  
 <​code>​ <​code>​
-select ​       TITLE, +select ​       ​$score(, '​CONTENT'​) RELEVANCY,​ 
-              $CONTEXT(255, '​STYLE=TEXT',​ '​CONTENT'​) +>> ​             ​TITLE, 
-  from        BOOKS +>> ​             $context(255, '​STYLE=TEXT',​ '​CONTENT'​) 
-  where       ​$contains(LANGUAGE,​ '​English',,​ '​LANGUAGE'​) and +>> ​ ​from ​       BOOKS 
-              $contains(CONTENT,​ 'missisipi', ​'​misspellings'​, '​CONTENT'​);​+>> ​ ​where ​      ​$contains(LANGUAGE,​ '​English',,​ '​LANGUAGE'​) and 
 +>> ​             ​$contains(CONTENT,​ 'magic',, '​CONTENT'​) 
 +>> ​ order by    RELEVANCY desc;
  
 +RELEVANCY
 +--------------------------------
 TITLE TITLE
 ----------------------------------------------------------------------------- -----------------------------------------------------------------------------
 $CONTEXT(BOOKS.CONTENT) $CONTEXT(BOOKS.CONTENT)
 ----------------------------------------------------------------------------- -----------------------------------------------------------------------------
 +                       ​83.350000
 +The Wonderful Wizard of Oz
 +--- The *Magic* Art of the Great --- will use all the *magic* arts I know of
 +>> --- me to use my *magic* power to send you --- and then by her *magic*
 +>> arts made the iron ---
 +
 +                       ​56.380000
 Around the World in Eighty Days Around the World in Eighty Days
---- at Nauvoo, on the *Mississippi*, numbering twenty-five thousand --- +--- transferred by some strange *magic* to the antipodes. --- interest, as
->> night it crossed the *Mississippi* at Davenport, and by --- +>> if by *magic*---
-The Adventures of Tom Sawyer + 
---- a point where the *Mississippi* River was a trifle --- and saw the broad +                       52.360000
->> *Mississippi* rolling by! --- +Alice's Adventures in Wonderland
-rows returned +--- for Alice, the little *magic* bottle had now had ---
-+                       52.360000 
-</​code>​+Hamlet 
 +--- thrice infected, Thy natural *magic* and dire property, On ---
 +rows returned</​code>​
 =====  ===== =====  =====
  
-**[[admin:​indexing:​text:​contains|Prev]]** | +**[[admin:​indexing:​text:​results|Prev]]**
-**[[admin:​indexing:​text:​relevancy|Next]]**+
  
 ====== Additional Resources ====== ====== Additional Resources ======
 
admin/indexing/text/relevancy.1328051014.txt.gz · Last modified: 2016/06/28 22:38 (external edit)