Multi-Dimensional Keyword (MDK) Indexes
MDK indexes enable keyword searches which allow users to find rows based
on words that appear anywhere in an indexed field, without serial reads
or sequential index searches.
Multiple columns can be grouped together to qualify rows where a single
keyword may appear in one column or another, but not both. For example,
an address field where ADDRESS1 might contain a street address and ADDRESS2
might contain a post office box or suite number, but the data is not necessarily
consistent through all rows.
MDK indexes allow users to find data without knowing exactly how it was
originally entered in the database, i.e. upper, lower, or mixed case text,
spelling differences, noise words like "the" and "and",
or inadvertent extra spaces in the data.
As of Omnidex version 4.0, by default, MDK indexed fields are indexed
completely intact. Text is case sensitive and not parsed, meaning that
spaces, special characters and punctuation are not stripped out. This
allows data retrievals on fields that contain blanks or special characters,
and zero values in a binary column.
To enable keyword searches on a column or group of columns, install an
MDK index using the ;KW (Keyword) option. With this option, spaces, percent
symbols, ampersands, and other special characters are parsed as keyword
separators.
To make searches case insensitive, use the ;CI option.
See Composite Indexes for more information and rules about creating composite
MDK indexes.
Advantages
Most data management systems offer some limited form of indexing. However,
Omnidex indexing allows far more flexibility in the way indexes are created
and what they allow the user to do with them, while providing tremendous
speed advantages over the dbms at the same time.
- Qualifying counts are always returned for queries against a column
installed with an MDK index. Omnidex gets the count from the indexes
and never touches the actual data unless a fetch is requested. In many
cases, a count is all that is needed from a specific query.
- MDK indexes can be used to qualify rows based on data in multiple
columns. For example, to find all customers in California involved in
software development, a query can be constructed to qualify all customers
with CA in the STATE column and then further qualify on the COMMENTS
column using AND SOFTWARE DEVELOPMENT.
- MDK indexes can also be grouped using the ;Gn option, to search multiple
columns simultaneously. For example, many databases contain Address1
and Address2 columns to handle suite numbers, PO Boxes, or just particularly
long addresses. If these columns were grouped, a qualification could
be performed against one of the Address columns and would qualify data
that matched the criteria whether it was in Address1 or Address2.
- Tables can be linked with an join operation, on columns that contain
the same information across tables. To perform join operations at search
time, the target column must be installed with an MDK index.
- MDK indexes can be installed on flat file databases where no dbms
exists, allowing the same search capabilities as any other database.
Noise words like "and", "the", "a", or
any word supplied in an Excluded Words list, can be filtered out of
the indexes.
When to Use MDK Indexes
Use MDK indexes for keyword retrieval and high-speed multi-column and
multi-table access.
MDK indexes should be installed on name and text fields such as people
and company names, part descriptions, notes, etc., where retrieval by
a word, part of a word, or a combination of multiple words is required.
Use MDK indexes on highly codified data when retrieval across multiple
columns is required. For example:
SELECT * FROM CONTACTS WHERE SIC_CODE='1045' AND STATE_CODE
IN (CO, TX, FL)
AND TYPE_CODE='CUST'
Use an excluded word list to eliminate noise words or values that will
not be needed in an MDK retrieval. This can reduce indexing time and index
size.
Use the Group (;Gn) option for MDK Indexes to combine columns logically
for retrieval ease and speed.
Do not group
columns in MDK indexes when the columns have similar data. For example,
while using Omnidex to retrieve records added in a particular date value
or range, grouping a Date_updated column with a Date_added column will
produce ambiguous results.
Keyword Searches
The true power of Omnidex is its ability to find rows based on words
that appear anywhere in an indexed field. For example,
CUSTOMERS
COMPANY_NAME;G1;SX;KW
CONTACT;G1;SX;KW
AKA_COMPANY;G1;SX;KW
you could use the search criterion INFORMATION - upper or lower case
- against a COMPANY_NAME index,
COMPANY_NAME="INFORMATION"
to find all rows where the company name contains the word "information".
The search might return the following rows:
Company State
Dynamic Information Systems Corporation
CO
Information Express
MN
Dun and Bradstreet Information Services
NY
More importantly, you can combine multiple keywords to narrow your search
still further. For example, changing the criteria to
COMPANY_NAME="%INFORMATION AND DYNAMIC"
would instantly return the first row from the list above, OR
COMPANY_NAME="%INFORMATION AND NOT DYNAMIC"
would return the second and third rows but not the first.
Top
|