This shows you the differences between two versions of the page.
admin:indexing:powersearch:sql [2011/01/24 05:27] els |
admin:indexing:powersearch:sql [2016/06/28 22:38] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | {{page>:top_add&nofooter&noeditbtn}} | ||
- | |||
- | ====== Administration: Omnidex Indexing ====== | ||
- | |||
- | ===== PowerSearch ===== | ||
- | |||
- | [[admin:indexing:powersearch:home|Overview]] | | ||
- | [[admin:indexing:powersearch:example|Example]] | | ||
- | **[[admin:indexing:powersearch:sql|SQL]]** | | ||
- | [[admin:indexing:powersearch:optimization|Optimization]] | | ||
- | [[admin:indexing:powersearch:tips|Tips]] | ||
- | |||
- | ---- | ||
- | |||
- | |||
- | |||
- | ==== The SQL Behind PowerSearch ==== | ||
- | |||
- | In most scenarios, the key to implementing PowerSearch is the $CONTAINS function in Omnidex SQL. This function provides access to most of the search capabilities shown on the previous page. | ||
- | |||
- | === The $CONTAINS Function === | ||
- | |||
- | The $CONTAINS function is used in the WHERE clause of a SELECT statement to provide criteria accompanied by special options. In traditional SQL statements, criteria is specified using a syntax of: | ||
- | |||
- | <code sql> | ||
- | select count(*) | ||
- | from LIST | ||
- | where FNAME = 'Bill'; | ||
- | </code> | ||
- | |||
- | The $CONTAINS function allows this same statement to be reworded as follows: | ||
- | |||
- | <code sql> | ||
- | select count(*) | ||
- | from LIST | ||
- | where $contains(FNAME, 'Bill'); | ||
- | </code> | ||
- | |||
- | The real advantage of the $CONTAINS function lies in the options that can be added. The $CONTAINS function is discussed in depth in the [[dev:sql:functions:contains:home|documentation]] on the function. Meanwhile, the following examples show how the options were used to provide each of the capabilities shown on the previous page. | ||
- | |||
- | |||
- | === First Name === | ||
- | |||
- | PowerSearch uses a given names synonym list to search for both the formal given name and the most common variations. For example: | ||
- | |||
- | ^ Given Name ^ Synonyms ^ | ||
- | |William |Bill, Billy, Will, Williams, Willie, Willis, Wilson | | ||
- | |James |Jim, Jimmy, Jimmie, Jay, Jaime, Jacob | | ||
- | |Elizabeth |Beth, Betty, Eliza, Elsa, Lisa, Liz, Liza | | ||
- | |Ann |Ana, Anita, Anna, Anne, Annette, Annie, Annmarie | | ||
- | |------------------------------------------|-----------------------------------------------------------------------------------------------------------------| | ||
- | |||
- | The SQL syntax for given name synonyms searches is shown below. | ||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(FNAME, 'Bill', 'SYNONYMS=ALL_GIVEN_NAMES') ... | ||
- | </code> | ||
- | |||
- | === Last Name === | ||
- | |||
- | PowerSearch uses phonetic algorithms to search for last names that sound similar, but are spelled differently. For example: | ||
- | |||
- | Here are some examples: | ||
- | |||
- | ^ Last Name ^ Phonetic Equivalents ^ | ||
- | |Smith |Schmit, Schmidt, Schmitt, Smidt, Smit, Smith, Smithey, Smyth, Smythe | | ||
- | |Meyers |Meyers, Myers, Myres, Meiers, Moyers, Meers, Mayers, Maris | | ||
- | |Nelson |Nelsen, Nielsen, Nielson, Neilsen, Neilson, Nilsen, Nilson, Nilsson | | ||
- | |Williams |Willaims, Wiliams, Willimas, Wilmes, Willams, Willems | | ||
- | |------------------------------------------|-----------------------------------------------------------------------------------------------------------------| | ||
- | |||
- | The SQL syntax for phonetic last name searches is shown below. | ||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(LNAME, 'Meyers', 'PHONETIC') ... | ||
- | </code> | ||
- | |||
- | === Address === | ||
- | |||
- | PowerSearch uses a postal abbreviations synonym list, combined with misspelling and transposition algorithms, to search street addresses. | ||
- | |||
- | Here are some examples: | ||
- | |||
- | ^ Category ^ Search Term ^ Equivalents ^ | ||
- | |Street Types |Ave, St, Blvd |Avenue, Street, Boulevard | | ||
- | |Street Directions |N, S, E, W |North, South, East, West | | ||
- | |Street Names (abbreviated) |1st, 2nd, 3rd, 4th, 5th |First, Second, Third, Fourth, Fifth | | ||
- | |Street Names (misspelled) |Canterbury |Cantaberry, Canterbury, Cantabury | | ||
- | |Unit Types |Apt, Ste, Rm |Apartment, Suite, Room | | ||
- | |Street Numbers |12345 |13245, 15324, 1234, 2345 | | ||
- | |------------------------------------------|------------------------------------------|----------------------------------------------------------------------| | ||
- | |||
- | |||
- | The SQL syntax for postal abbrevation synonym searches and misspelling searches is shown below. | ||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(ADDRESS1, '825 Fifth Avenue', | ||
- | 'SYNONYMS=ALL_ADDRESS_LINES,MISSPELLINGS') ... | ||
- | </code> | ||
- | === City === | ||
- | |||
- | PowerSearch uses a cities synonym list, combined with misspelling algorithms, to search city names. For example: | ||
- | |||
- | Here are some examples: | ||
- | |||
- | ^ Category ^ Search Term ^ Equivalents ^ | ||
- | |Abbreviations |SF, LA, NY |San Francisco, Los Angeles, New York | | ||
- | |Misspellings |Milwaukee |Milwalkee, Milwauki, Milwalke | | ||
- | |------------------------------------------|------------------------------------------|----------------------------------------------------------------------| | ||
- | |||
- | The SQL syntax for city synonym searches and misspelling searches is shown below. | ||
- | |||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(CITY, 'NY', 'SYNONYMS=CITY_ABBR,MISSPELLINGS') ... | ||
- | </code> | ||
- | |||
- | === Zip Code === | ||
- | |||
- | PowerSearch uses a geographic radius search to locate zip codes that are within a certain distance from the named zip code. This is an example of a PowerSearch search that does not use the $CONTAINS function. Instead, it uses the [[admin:features:geo:home|$DISTANCE function]] to perform the radius search. This search requires that the database have a table of postal codes that includes their geographic coordinates. For example: | ||
- | |||
- | Here are some examples: | ||
- | |||
- | ^ Zip Code ^ Adjacent Zip Code ^ Distance ^ | ||
- | |10021 |10162\\ 10155\\ 10028\\ 10044\\ 10052\\ 10022 |.64 miles\\ .67 miles\\ .73 miles\\ .80 miles\\ .82 miles | | ||
- | |------------------------------------------|------------------------------------------|----------------------------------------------------------------------| | ||
- | |||
- | The SQL syntax for zip code geographic radius searches is shown below. | ||
- | |||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where ZIP in | ||
- | (select ZIP | ||
- | from ZIPCODES | ||
- | where $distance((select LATITUDE, | ||
- | LONGITUDE | ||
- | from ZIPCODES | ||
- | where ZIP = '10022'), | ||
- | LATITUDE, | ||
- | LONGITUDE) <= 5) ... | ||
- | </code> | ||
- | |||
- | === Phone Area Code === | ||
- | |||
- | PowerSearch uses an area code synonym list to locate area codes that are shared within the same metropolitan area. For example: | ||
- | |||
- | Here are some examples: | ||
- | |||
- | ^ Metropolitan Area ^ Area Codes ^ | ||
- | |New York, NY |212, 646, 917 | | ||
- | |Denver, CO |303, 720 | | ||
- | |Seattle, WA |206, 360, 253, 425, 564 | | ||
- | |------------------------------------------|-----------------------------------------------------------------------------------------------------------------| | ||
- | |||
- | The SQL syntax for telephone area code synonym searches is shown below. | ||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(PHONE1_AREA, '917', 'SYNONYMS=AREACODES') ... | ||
- | </code> | ||
- | === Phone Prefix and Suffix === | ||
- | |||
- | PowerSearch uses a transposition search to locate telephone numbers that have been transposed or mistyped. For example: | ||
- | |||
- | Here are some examples: | ||
- | |||
- | ^ Phone Number ^ Transposition ^ | ||
- | |775-4866 | 755-4846 | | ||
- | |831-6584 | 381-8564 | | ||
- | |------------------------------------------|-----------------------------------------------------------------------------------------------------------------| | ||
- | |||
- | The SQL Syntax for telephone number transposition searches is shown below. | ||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(PHONE1_PREFIX, '755', 'MISSPELLINGS MIN_SCORE=70') and | ||
- | $contains(PHONE1_SUFFIX, '4686', 'MISSPELLINGS MIN_SCORE=70') ... | ||
- | </code> | ||
- | |||
- | === Email === | ||
- | |||
- | |||
- | PowerSearch uses a standard misspelling search to locate email addresses that have similar spellings. The SQL syntax for email misspelling searches is shown below. | ||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(EMAIL, 'wmeyers@med.cornell.edu', 'MISSPELLINGS MIN_SCORE=80') ... | ||
- | </code> | ||
- | |||
- | === Final SQL Statement === | ||
- | |||
- | When combined, the final SQL statement has many $CONTAINS functions which combine to provide the full power of PowerSearch: | ||
- | |||
- | <code sql> | ||
- | select count(*) | ||
- | from LIST | ||
- | where $contains(FNAME, 'Bill', 'SYNONYMS=ALL_GIVEN_NAMES') and | ||
- | $contains(LNAME, 'Meyers', 'PHONETIC') and | ||
- | $contains(ADDRESS1, '825 Fifth Avenue', | ||
- | 'SYNONYMS=ALL_ADDRESS_LINES,MISSPELLINGS') and | ||
- | $contains(CITY, 'NY', 'SYNONYMS=CITY_ABBR,MISSPELLINGS') and | ||
- | LIST.STATE = 'NY' and | ||
- | ZIP in | ||
- | (select ZIP | ||
- | from ZIPCODES | ||
- | where $distance((select LATITUDE, | ||
- | LONGITUDE | ||
- | from ZIPCODES | ||
- | where ZIP = '10022'), | ||
- | LATITUDE, | ||
- | LONGITUDE) <= 5) and | ||
- | $contains(PHONE1_AREA, '917', 'SYNONYMS=AREACODES') and | ||
- | $contains(PHONE1_PREFIX, '755', 'MISSPELLINGS MIN_SCORE=70') and | ||
- | $contains(PHONE1_SUFFIX, '4686', 'MISSPELLINGS MIN_SCORE=70') and | ||
- | $contains(EMAIL, 'wmeyers@med.cornell.edu', 'MISSPELLINGS MIN_SCORE=80'); | ||
- | </code> | ||
- | |||
- | |||
- | ===== ===== | ||
- | |||
- | **[[admin:indexing:powersearch:example|Prev]]** | | ||
- | **[[admin:indexing:powersearch:optimization|Next]]** | ||
- | |||
- | ====== Additional Resources ====== | ||
- | |||
- | See also: | ||
- | |||
- | {{page>:admin:indexing:see_also&nofooter&noeditbtn}} | ||
- | |||
- | {{page>:bottom_add&nofooter&noeditbtn}} | ||