Excluded Words (Stop Words)
Words like 'the', 'and', or 'that' provide little benefit in a search because
they occur so often that they do not help to uniquely identify a specific record.
An excluded words list prevents words like these from being indexed, which reduces
both the time it takes to load the indexes and the disk space required to store
them.
An excluded words list is an ASCII file that contains a customized list
of words considered to be "noise" words that should not be indexed.
The X or XCLUDE option is a database-level indexing option that applies
an excluded words list to an entire index installation, with the exception
of columns installed with the ;NE (No Exclude) option.
Because excluding words affects the contents of the Omnidex indexes,
the excluded words list should be loaded before building the indexes.
Otherwise, the indexes will have to be rebuilt after the excluded words
list is loaded.
NOTE: Because the ;NP (No Parse) option prevents Omnidex from parsing
the data in a field, excluded words will not be removed from these indexes,
unless the value of the entire field is contained in the excluded words
list.
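The difference between parsed and No Parse fields can be illustrated with a small Python model. This is only a conceptual sketch, not Omnidex's actual parsing logic: the function name indexed_keywords is hypothetical, splitting on whitespace is a simplifying assumption, and upshifting follows the default behavior described in this section.

```python
def indexed_keywords(value, excluded, parse=True):
    """Simplified model of which keywords would be indexed for one field value.

    Assumptions (not taken from Omnidex itself): parsing splits on
    whitespace only, and values are upshifted before comparison.
    """
    if parse:
        # Parsed field: each word is checked against the excluded list.
        return [w for w in value.upper().split() if w not in excluded]
    # No Parse field: the whole value is a single keyword, so it is
    # dropped only when the entire value appears in the excluded list.
    whole = value.upper()
    return [] if whole in excluded else [whole]


excluded = {"THE", "INC"}
print(indexed_keywords("The Acme Inc", excluded))               # parsed
print(indexed_keywords("The Acme Inc", excluded, parse=False))  # No Parse
print(indexed_keywords("Inc", excluded, parse=False))           # whole value excluded
```

In this model, the No Parse field keeps "THE ACME INC" as one keyword even though it contains excluded words, while a No Parse field whose entire value is "Inc" is not indexed.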
Creating an Excluded Words List
To exclude words from indexing, create an ASCII file that contains each
word to be excluded on its own line. This is called an excluded words
file.
Below is part of a sample excluded words file called “xcluded”.
A
AN
AND
CORP
INC
OF
ON
Although it is not required, DISC recommends alphabetizing the list of
excluded words to make it easier to maintain. Note that all of the words are
in upper case. This is because keywords are indexed in upper case by default,
so a lower case entry in the list will not exclude the corresponding keyword
from default Omnidex indexes.
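An excluded words file that follows these recommendations can be generated with a short script. The following Python sketch is one possible approach, not a DISC-supplied tool; the function name write_excluded_words and the sample word list are hypothetical.

```python
import tempfile
from pathlib import Path


def write_excluded_words(words, path):
    """Write one upper-case word per line, de-duplicated and alphabetized."""
    # Strip whitespace, drop empty entries, upshift, and remove duplicates.
    cleaned = sorted({w.strip().upper() for w in words if w.strip()})
    # The file is written as ASCII; a non-ASCII word raises UnicodeEncodeError.
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="ascii")
    return cleaned


# Usage: write a small list to a temporary directory and show the result.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "xcluded"
    written = write_excluded_words(["of", "And", "a", "and"], target)
    print(written)
    print(target.read_text(encoding="ascii"))
```

Sorting and upshifting in the script keeps the file consistent no matter how the source word list was typed.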
NOTE: No Translate prevents indexed values from being upshifted. When
excluding words for No Translate indexes, be sure to include each
case-sensitive spelling of the word that may occur in the data. For example,
if “CO” is an excluded word, the keyword “Co” is still indexed for
any No Translate indexes unless “Co” is also included in the excluded
words list.
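For No Translate indexes, the extra spellings can be produced mechanically. The Python sketch below assumes that upper case, lower case, and capitalized forms are the spellings of interest; the function name case_variants is hypothetical, and the data may contain other mixed-case spellings that would have to be added by hand.

```python
def case_variants(word):
    """Return upper, lower, and capitalized spellings without duplicates."""
    variants = [word.upper(), word.lower(), word.capitalize()]
    # dict.fromkeys removes duplicates while keeping the original order.
    return list(dict.fromkeys(variants))


# Usage: expand each excluded word into its common spellings.
for word in ["CO", "A"]:
    print(case_variants(word))
```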
Loading an Excluded Words List
To load an excluded words file, enter X at the DBINSTAL Cmd: prompt.
DBINSTAL will prompt for the file containing the excluded words.
Cmd: X
File of excluded words: xcluded
Word counts for each exclusion list:
1 letter words: 1
2 letter words: 3
3 letter words: 2
4 letter words: 1
5 letter words: 0
6 letter words: 0
7 letter words: 0
8 letter words: 0
9 letter words: 0
10 letter words: 0
11 letter words: 2
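The per-length summary can be cross-checked before a file is loaded. The Python sketch below is an independent tally, not DBINSTAL's own logic; the function name count_by_length is hypothetical. It uses only the words visible in the partial sample shown earlier, so it accounts for the 1 through 4 letter counts but not the 11 letter entries.

```python
from collections import Counter


def count_by_length(words):
    """Return a dict mapping word length to the number of words of that length."""
    counts = Counter(len(w.strip()) for w in words if w.strip())
    # Sort by length so the result reads like the DBINSTAL summary.
    return dict(sorted(counts.items()))


# Usage: tally the words visible in the sample excerpt.
sample = ["A", "AN", "AND", "CORP", "INC", "OF", "ON"]
for length, count in count_by_length(sample).items():
    print(f"{length} letter words: {count}")
```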
Whenever Omnidex is reinstalled on an environment catalog, the excluded
words file must also be reloaded for that environment.
To clear the excluded words list, enter X! at the Cmd: prompt. DBINSTAL
will verify the request before clearing the excluded words list.
Cmd: X!
Clear excluded words list? Y
After loading or clearing an excluded words list, the tables must be
reindexed in that Omnidex environment. Otherwise, the change to the excluded
words list is not reflected in the indexes.