This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
admin:indexing:powersearch:sql [2011/01/24 05:15] els |
admin:indexing:powersearch:sql [2016/06/28 22:38] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ~~NOTOC~~ | ||
+ | |||
{{page>:top_add&nofooter&noeditbtn}} | {{page>:top_add&nofooter&noeditbtn}} | ||
Line 9: | Line 11: | ||
**[[admin:indexing:powersearch:sql|SQL]]** | | **[[admin:indexing:powersearch:sql|SQL]]** | | ||
[[admin:indexing:powersearch:optimization|Optimization]] | | [[admin:indexing:powersearch:optimization|Optimization]] | | ||
+ | [[admin:indexing:powersearch:custom|Customization]] | | ||
[[admin:indexing:powersearch:tips|Tips]] | [[admin:indexing:powersearch:tips|Tips]] | ||
Line 17: | Line 20: | ||
==== The SQL Behind PowerSearch ==== | ==== The SQL Behind PowerSearch ==== | ||
- | In most scenarios, the key to implementing PowerSearch is the $CONTAINS function in Omnidex SQL. This function provides access to most of the search capabilities shown on the previous page. | + | === The POWERSEARCH Option === |
- | === The $CONTAINS Function === | + | In most scenarios, the key to implementing PowerSearch is to simply use the POWERSEARCH option on a SELECT statement. Alternatively, the SET POWERSEARCH ON command can be used to automatically activate PowerSearch for all queries on that connection. |
- | The $CONTAINS function is used in the WHERE clause of a SELECT statement to provide criteria accompanied by special options. In traditional SQL statements, criteria is specified using a syntax of: | + | <code> |
+ | > select name from individuals where name = 'William Cooke' with powersearch; | ||
- | <code sql> | + | NAME |
- | select count(*) | + | -------------------------------------------------- |
- | from LIST | + | BILLIE E COOK |
- | where FNAME = 'Bill'; | + | MR. WILL COOK |
</code> | </code> | ||
- | The $CONTAINS function allows this same statement to be reworded as follows: | + | === The USAGE Clause === |
- | <code sql> | + | The POWERSEARCH option works hand-in-hand with the data stored in the Omnidex Environment File. To use POWERSEARCH, administrators add a context to columns telling Omnidex what they contain. For example, a column named NAME may have a character datatype, but that alone is not enough to tell Omnidex what the column contains. Moreover, Omnidex cannot look to the column name, as there are many different conventions for naming columns. Administrators add a USAGE clause that identifies the contents as given names, as shown in this example: |
- | select count(*) | + | |
- | from LIST | + | |
- | where $contains(FNAME, 'Bill'); | + | |
- | </code> | + | |
- | + | ||
- | The real advantage of the $CONTAINS function lies in the options that can be added. The $CONTAINS function is discussed in depth in the [[dev:sql:functions:contains:home|documentation]] on the function. Meanwhile, the following examples show how the options were used to provide each of the capabilities shown on the previous page. | + | |
- | + | ||
- | + | ||
- | === First Name === | + | |
- | + | ||
- | When searching a database containing first names, you never know whether the formal given name is stored, or one of common variations. PowerSearch will automatically search for common synonyms for first names. | + | |
- | + | ||
- | Here are some examples: | + | |
- | + | ||
- | ^ Given Name ^ Synonyms ^ | + | |
- | |William |Bill, Billy, Will, Williams, Willie, Willis, Wilson | | + | |
- | |James |Jim, Jimmy, Jimmie, Jay, Jaime, Jacob | | + | |
- | |Elizabeth |Beth, Betty, Eliza, Elsa, Lisa, Liz, Liza | | + | |
- | |Ann |Ana, Anita, Anna, Anne, Annette, Annie, Annmarie | | + | |
- | |------------------------------------------|-----------------------------------------------------------------------------------------------------------------| | + | |
- | + | ||
- | The SQL syntax for given name synonyms searches is shown below. | + | |
- | <code sql> | + | <code> |
- | select ... | + | create table "INDIVIDUALS" |
- | from ... | + | physical "dat\individuals.dat" |
- | where $contains(FNAME, 'Bill', 'SYNONYMS=ALL_GIVEN_NAMES') ... | + | ( |
+ | "INDIVIDUAL" CHARACTER(12) omnidex, | ||
+ | "HOUSEHOLD" CHARACTER(12) omnidex , | ||
+ | "NAME" CHARACTER(50) quicktext usage "NAME", | ||
+ | "GENDER" CHARACTER(1) omnidex bitmap, | ||
+ | "BIRTHDATE" ANSI DATE omnidex, | ||
+ | "PHONE" CHARACTER(14) omnidex usage "PHONE", | ||
+ | "EMAIL" CHARACTER(60) quicktext usage "EMAIL", | ||
+ | constraint INDIVIDUALS_INDIVIDUAL_PK primary ("INDIVIDUAL"), | ||
+ | constraint INDIVIDUALS_HOUSEHOLD_FK foreign ("HOUSEHOLD") references "HOUSEHOLDS", | ||
+ | constraint INDIVIDUALS_GENDER_FK foreign ("GENDER") references "GENDERS", | ||
+ | ) | ||
+ | in "simple.xml"; | ||
</code> | </code> | ||
- | === Last Name === | + | Omnidex provides many different usages to meet the general needs of most applications; however, administrators can add custom usages to meet the specific needs of their application. The general usages provided by Omnidex include: |
- | Many last names sound similar, but are spelled differently. For this reason, phonetic algorithms such as Soundex or Metaphone are commonly used to find last names that sound similar to each other. PowerSearch will automatically search for phonetic equivalents for last names. | + | * SALUTATION |
+ | * FIRST_NAME | ||
+ | * MIDDLE_NAME | ||
+ | * LAST_NAME | ||
+ | * NAME | ||
+ | * COMPANY_NAME | ||
+ | * STREET_NUMBER | ||
+ | * STREET_DIRECTION | ||
+ | * STREET_NAME | ||
+ | * STREET_TYPE | ||
+ | * UNIT_TYPE | ||
+ | * UNIT_NUMBER | ||
+ | * CITY | ||
+ | * STATE | ||
+ | * COUNTRY | ||
+ | * ZIPCODE | ||
+ | * STREET_ADDRESS | ||
+ | * CITY_STATE_POSTAL | ||
+ | * ADDRESS | ||
+ | * PHONE_AREACODE | ||
+ | * PHONE_PREFIX | ||
+ | * PHONE_SUFFIX | ||
+ | * PHONE | ||
+ | |||
+ | * ... plus, you can add any custom usages needed for your application. | ||
- | Here are some examples: | ||
- | |||
- | ^ Last Name ^ Phonetic Equivalents ^ | ||
- | |Smith |Schmit, Schmidt, Schmitt, Smidt, Smit, Smith, Smithey, Smyth, Smythe | | ||
- | |Meyers |Meyers, Myers, Myres, Meiers, Moyers, Meers, Mayers, Maris | | ||
- | |Nelson |Nelsen, Nielsen, Nielson, Neilsen, Neilson, Nilsen, Nilson, Nilsson | | ||
- | |Williams |Willaims, Wiliams, Willimas, Wilmes, Willams, Willems | | ||
- | |------------------------------------------|-----------------------------------------------------------------------------------------------------------------| | ||
- | |||
- | The SQL syntax for phonetic last name searches is shown below. | ||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(LNAME, 'Meyers', 'PHONETIC') ... | ||
- | </code> | ||
- | |||
- | === Address === | ||
- | |||
- | Many addresses use both common and uncommon abbreviations. Street numbers can be easily transposed. Street names can be easily misspelled. PowerSearch will automatically look for abbreviations, misspellings and transpositions. | ||
- | |||
- | Here are some examples: | ||
- | |||
- | ^ Category ^ Search Term ^ Equivalents ^ | ||
- | |Street Types |Ave, St, Blvd |Avenue, Street, Boulevard | | ||
- | |Street Directions |N, S, E, W |North, South, East, West | | ||
- | |Street Names (abbreviated) |1st, 2nd, 3rd, 4th, 5th |First, Second, Third, Fourth, Fifth | | ||
- | |Street Names (misspelled) |Canterbury |Cantaberry, Canterbury, Cantabury | | ||
- | |Unit Types |Apt, Ste, Rm |Apartment, Suite, Room | | ||
- | |Street Numbers |12345 |13245, 15324, 1234, 2345 | | ||
- | |------------------------------------------|------------------------------------------|----------------------------------------------------------------------| | ||
- | |||
- | |||
- | The SQL syntax for postal abbrevation synonym searches and misspelling searches is shown below. | ||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(ADDRESS1, '825 Fifth Avenue', | ||
- | 'SYNONYMS=ALL_ADDRESS_LINES,MISSPELLINGS') ... | ||
- | </code> | ||
- | === City === | ||
- | |||
- | Some cities have common abbreviations and others can be easily misspelled. PowerSearch will automatically look for common abbreviations and misspellings. | ||
- | |||
- | Here are some examples: | ||
- | |||
- | ^ Category ^ Search Term ^ Equivalents ^ | ||
- | |Abbreviations |SF, LA, NY |San Francisco, Los Angeles, New York | | ||
- | |Misspellings |Milwaukee |Milwalkee, Milwauki, Milwalke | | ||
- | |------------------------------------------|------------------------------------------|----------------------------------------------------------------------| | ||
- | |||
- | The SQL syntax for city synonym searches and misspelling searches is shown below. | ||
- | |||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(CITY, 'NY', 'SYNONYMS=CITY_ABBR,MISSPELLINGS') ... | ||
- | </code> | ||
- | |||
- | === Zip Code === | ||
- | |||
- | Zip codes are sometimes guessed based on their approximate geographic location. PowerSearch will automatically search for zipcodes that are geographically adjacent to this zip code. | ||
- | |||
- | Here are some examples: | ||
- | |||
- | ^ Zip Code ^ Adjacent Zip Code ^ Distance ^ | ||
- | |10021 |10162 |.48 miles | | ||
- | | |10155 |.64 miles | | ||
- | | |10028 |.67 miles | | ||
- | | |10044 |.73 miles | | ||
- | | |10052 |.80 miles | | ||
- | | |10022 |.82 miles | | ||
- | |------------------------------------------|------------------------------------------|----------------------------------------------------------------------| | ||
- | |||
- | The SQL syntax for zip code geographic radius searches is shown below. | ||
- | |||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where ZIP in | ||
- | (select ZIP | ||
- | from ZIPCODES | ||
- | where $distance((select LATITUDE, | ||
- | LONGITUDE | ||
- | from ZIPCODES | ||
- | where ZIP = '10022'), | ||
- | LATITUDE, | ||
- | LONGITUDE) <= 5) ... | ||
- | </code> | ||
- | |||
- | === Phone Area Code === | ||
- | |||
- | |||
- | Some metropolitan areas use multiple area codes, leading people to be unsure which area code to use. PowerSearch will automatically search the alternate area codes. | ||
- | |||
- | Here are some examples: | ||
- | |||
- | ^ Metropolitan Area ^ Area Codes ^ | ||
- | |New York, NY |212, 646, 917 | | ||
- | |Denver, CO |303, 720 | | ||
- | |Seattle, WA |206, 360, 253, 425, 564 | | ||
- | |------------------------------------------|-----------------------------------------------------------------------------------------------------------------| | ||
- | |||
- | The SQL syntax for telephone area code synonym searches is shown below. | ||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(PHONE1_AREA, '917', 'SYNONYMS=AREACODES') ... | ||
- | </code> | ||
- | === Phone Prefix and Suffix === | ||
- | |||
- | Telephone numbers are easy to transpose or mistype. PowerSearch will automatically search for transpositions and typographical errors in your telephone numbers. | ||
- | |||
- | Here are some examples: | ||
- | |||
- | ^ Phone Number ^ Transposition ^ | ||
- | |775-4866 | 755-4846 | | ||
- | |831-6584 | 381-8564 | | ||
- | |------------------------------------------|-----------------------------------------------------------------------------------------------------------------| | ||
- | |||
- | The SQL Syntax for telephone number transposition searches is shown below. | ||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(PHONE1_PREFIX, '755', 'MISSPELLINGS MIN_SCORE=70') and | ||
- | $contains(PHONE1_SUFFIX, '4686', 'MISSPELLINGS MIN_SCORE=70') ... | ||
- | </code> | ||
- | |||
- | === Email === | ||
- | |||
- | |||
- | This column contains the e-mail address for the individual. You may also enter just the mailbox name or the domain name. For example, you may enter 'mailbox@domain.com' , or just 'mailbox', or just 'domain.com'. PowerSearch will additionally search for phonetic equivalents or misspellings of your email address. | ||
- | |||
- | <code sql> | ||
- | select ... | ||
- | from ... | ||
- | where $contains(EMAIL, 'wmeyers@med.cornell.edu', 'MISSPELLINGS MIN_SCORE=80') ... | ||
- | </code> | ||
- | |||
- | === Final SQL Statement === | ||
- | |||
- | When combined, the final SQL statement has many $CONTAINS functions which combine to provide the full power of PowerSearch: | ||
- | |||
- | <code sql> | ||
- | select count(*) | ||
- | from LIST | ||
- | where $contains(FNAME, 'Bill', 'SYNONYMS=ALL_GIVEN_NAMES') and | ||
- | $contains(LNAME, 'Meyers', 'PHONETIC') and | ||
- | $contains(ADDRESS1, '825 Fifth Avenue', | ||
- | 'SYNONYMS=ALL_ADDRESS_LINES,MISSPELLINGS') and | ||
- | $contains(CITY, 'NY', 'SYNONYMS=CITY_ABBR,MISSPELLINGS') and | ||
- | LIST.STATE = 'NY' and | ||
- | ZIP in | ||
- | (select ZIP | ||
- | from ZIPCODES | ||
- | where $distance((select LATITUDE, | ||
- | LONGITUDE | ||
- | from ZIPCODES | ||
- | where ZIP = '10022'), | ||
- | LATITUDE, | ||
- | LONGITUDE) <= 5) and | ||
- | $contains(PHONE1_AREA, '917', 'SYNONYMS=AREACODES') and | ||
- | $contains(PHONE1_PREFIX, '755', 'MISSPELLINGS MIN_SCORE=70') and | ||
- | $contains(PHONE1_SUFFIX, '4686', 'MISSPELLINGS MIN_SCORE=70') and | ||
- | $contains(EMAIL, 'wmeyers@med.cornell.edu', 'MISSPELLINGS MIN_SCORE=80'); | ||
- | </code> | ||