OAGLOBAL

Overview

OAGLOBAL is the Omnidex Global Environment File. Like the environment files used in any Omnidex installation, it describes a set of tables; here, the tables contain items used by Omnidex itself. These items, which include error messages, synonyms, and thesaurus and dictionary entries, are stored in flat files, many of which can be customized by an Omnidex administrator.

The configuration of Omnidex Text relies on a set of tables that are described in the Omnidex Global Environment. These tables are stored as tab-delimited files (TDFs) in the config/english subdirectory of the Omnidex home directory. Each table has its own subdirectory, named after the table, containing one or more TDFs.

TDFs can be edited with applications such as Microsoft Excel, text editors, or word processors, as long as the column layout is maintained and the proper delimiters are used. Columns must be separated by a tab character and rows must be separated by a line feed. Note that Microsoft Windows NT/2000/XP requires a carriage return and line feed to terminate each row; most Windows tools maintain this automatically.
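The delimiter rules above can be sketched in Python. This is a minimal illustration, not part of Omnidex: the helper names format_tdf and check_tdf are hypothetical, and the two-column sample rows assume a $LIST, $WORD layout like the one used by the stopwords table.

```python
def format_tdf(rows, line_ending="\n"):
    """Join fields with tabs and terminate every row with line_ending.

    Pass line_ending="\r\n" when the file is intended for Windows.
    """
    lines = []
    for row in rows:
        for field in row:
            # A tab or line break inside a field would corrupt the layout.
            if "\t" in field or "\n" in field or "\r" in field:
                raise ValueError(f"field contains a delimiter: {field!r}")
        lines.append("\t".join(row))
    return "".join(line + line_ending for line in lines)


def check_tdf(text, expected_columns):
    """Return the 1-based numbers of rows with the wrong column count."""
    bad_rows = []
    # splitlines() accepts both line-feed and carriage-return/line-feed rows.
    for number, line in enumerate(text.splitlines(), start=1):
        if line and len(line.split("\t")) != expected_columns:
            bad_rows.append(number)
    return bad_rows


rows = [["SMALL", "and"], ["SMALL", "the"]]
unix_text = format_tdf(rows)
windows_text = format_tdf(rows, line_ending="\r\n")
```

Running check_tdf on a file before rebuilding the indexes is one way to catch rows where a tab was accidentally replaced by spaces.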

 

Customizing Lists in OAGLOBAL

Many of the lists maintained in the OAGLOBAL environment can be customized: existing lists can be edited or removed, and new custom lists can be added. All lists must be in proper TDF format, using tab characters as column delimiters and line feeds (carriage return and line feed on Windows platforms) as record delimiters.

Custom lists can have .tdf or .txt file extensions and must be placed in the appropriate subdirectory of the Omnidex_home/config/english directory.

The first column in each record is a unique list name, which groups the records of that list together. A single file can contain multiple lists. For example, the file "givennames.txt" in the synonyms folder contains several lists: FEMALE_GIVEN_NAMES, MALE_GIVEN_NAMES and MALE_FEMALE_GIVEN_NAMES.
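To illustrate how the first column groups records into lists, here is a small Python sketch. The function group_by_list is hypothetical, and the sample rows are invented for illustration; they are not the actual contents of givennames.txt.

```python
from collections import defaultdict


def group_by_list(text):
    """Group TDF records by the list name found in the first column."""
    lists = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank rows
        list_name, *remaining_columns = line.split("\t")
        lists[list_name].append(remaining_columns)
    return dict(lists)


# Invented sample rows: list name, word, replacement.
sample = (
    "FEMALE_GIVEN_NAMES\tBETH\tELIZABETH\n"
    "FEMALE_GIVEN_NAMES\tLIZ\tELIZABETH\n"
    "MALE_GIVEN_NAMES\tBILL\tWILLIAM\n"
)
grouped = group_by_list(sample)
```

One file therefore yields one entry per distinct list name, regardless of how the records are ordered within the file.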

The second column varies, depending on the list you are editing. See OAGLOBAL Tables below for a summary of the tables and the type of lists each holds.

After changing any list, whether by adding, editing or deleting it, you must rebuild the indexes using dbinstal. A build script is included with the Omnidex software, located in the Omnidex_home/config/english directory.

omnidex/config/english>dbinstal < oaglobalin

This will rebuild the indexes to include the changed values.

 

 

OAGLOBAL Tables

$LISTS - ../config/english/lists - $LIST, $REPLACEMENT, $COMMENTS - This table allows the declaration of composite lists for the other options. A composite list entered here may be referenced as though it were an actual list. For example, if you have custom lists named CHEVROLET, FORD, and DODGE, you can create a composite list named AMERICAN_CARS by adding "AMERICAN_CARS" to the $LIST column and "CHEVROLET, FORD, DODGE" to the $REPLACEMENT column. Then, you can pass "AMERICAN_CARS" as the list name and all three lists will be searched, without having to create an entirely new list with the same information.
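The composite-list idea can be sketched in Python as follows. This only models the behavior described above and is not how Omnidex implements it; expand_list is a hypothetical helper, and it assumes that $REPLACEMENT holds comma-separated list names and that composites are expanded one level only.

```python
def expand_list(name, composites):
    """Return the underlying list names for a list name.

    composites maps a $LIST value to its $REPLACEMENT string.
    A name that is not a composite expands to itself.
    """
    if name not in composites:
        return [name]
    members = composites[name].split(",")
    # Trim the spaces that follow the commas and drop empty entries.
    return [member.strip() for member in members if member.strip()]


composites = {"AMERICAN_CARS": "CHEVROLET, FORD, DODGE"}
searched = expand_list("AMERICAN_CARS", composites)
```

Passing AMERICAN_CARS produces the three underlying list names, while an ordinary list name such as FORD is returned unchanged.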

$STOPWORDS - ../config/english/stopwords - $LIST, $WORD, $COMMENTS - This table is organized into lists, each of which contains a list of words or phrases to be excluded from searches. The list named SMALL contains English language words commonly recognized as stopwords, such as "and" and "the". The list named MEDIUM contains common English language words like the personal pronouns "me", "she", "he", and "they", as well as common verbs like "are", "were", "be", and "been". The list named LARGE contains common English language words like "because", "meanwhile" and "otherwise".
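As a rough illustration of what excluding stopwords from a search means, the following Python sketch removes the words of one list from a set of search terms. The function remove_stopwords is hypothetical, single-word matching only, and the case-insensitive comparison is an assumption rather than documented Omnidex behavior.

```python
def remove_stopwords(terms, stopwords):
    """Drop every search term that appears in the stopword list."""
    # Assumption: comparison ignores case.
    blocked = {word.lower() for word in stopwords}
    return [term for term in terms if term.lower() not in blocked]


# Two of the words the SMALL list is described as containing.
small_list = ["and", "the"]
remaining = remove_stopwords(["The", "cat", "and", "dog"], small_list)
```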

$SYNONYMS - ../config/english/synonyms - $LIST, $WORD, $REPLACEMENT, $COMMENTS - This table is organized into lists, each of which contains a list of words or phrases and their respective synonyms. The list named THESAURUS contains an English language synonym list. Other lists are provided for individual scenarios, such as FIRST_NAME, STATE_ABBREVIATIONS and COUNTRY_ABBREVIATIONS. New lists may be added as needed.

$MESSAGES - ../config/english/messages - $ERROR, $MESSAGE - This table contains all of the error messages used by Omnidex.

$INCLUDED_TAGS - ../config/english/included_tags - $LIST, $TAG - This table is organized into lists, each of which contains a list of markup language tags to be included in indexing and search criteria. When an INCLUDED_TAGS list is used, indexing and searches are restricted to the tags in that list. There is no list by default.

$EXCLUDED_TAGS - ../config/english/excluded_tags - $LIST, $TAG - This table is organized into lists, each of which contains a list of markup language tags to be excluded from indexing and search criteria. When an EXCLUDED_TAGS list is used, indexing and searches skip the tags in that list.

 

 

 

OAGLOBAL Environment Source

 

environment "$OMNIDEX_GLOBAL_CONFIGURATION"

/* ====== DATABASE: $GLOBAL_CONFIGURATION ====== */

database "$OMNIDEX_GLOBAL_CONFIGURATION"
type flatfile
indexprefix "idx/odx_global_config_"


/* ----------- TABLE: $STOPWORDS ----------- */

table "$STOPWORDS"
type tdf
physical "{$OMNIDEX_LANG/stopwords/*.txt},
{$OMNIDEX_LANG/stopwords/*.tdf}"

column "$LIST" datatype CHARACTER length 32
column "$WORD" datatype C STRING length 128


/* ----------- TABLE: $SYNONYMS ----------- */

table "$SYNONYMS"
type tdf
physical "{$OMNIDEX_LANG/synonyms/*.txt},
{$OMNIDEX_LANG/synonyms/*.tdf}"

column "$LIST" datatype CHARACTER length 32
column "$WORD" datatype C STRING length 128
column "$REPLACEMENT" datatype C STRING length 4096


/* ----------- TABLE: $PHONETIC ----------- */

table "$PHONETIC"
type tdf
physical "{$OMNIDEX_LANG/phonetic/*.txt},
{$OMNIDEX_LANG/phonetic/*.tdf}"

column "$LIST" datatype CHARACTER length 32
column "$WORD" datatype C STRING length 128
column "$TYPE" datatype CHARACTER length 8
column "$SOUNDEX" datatype CHARACTER length 64
column "$METAPHONE1" datatype CHARACTER length 64
column "$METAPHONE2" datatype CHARACTER length 64


/* ----------- TABLE: $INCLUDED_HTML_TAGS ----------- */

table "$INCLUDED_HTML_TAGS"
type tdf
physical "{$OMNIDEX_LANG/included_html_tags/*.txt},
{$OMNIDEX_LANG/included_html_tags/*.tdf}"

column "$LIST" datatype CHARACTER length 32
column "$TAG" datatype C STRING length 256


/* ----------- TABLE: $EXCLUDED_HTML_TAGS ----------- */

table "$EXCLUDED_HTML_TAGS"
type tdf
physical "{$OMNIDEX_LANG/excluded_html_tags/*.txt},
{$OMNIDEX_LANG/excluded_html_tags/*.tdf}"

column "$LIST" datatype CHARACTER length 32
column "$TAG" datatype C STRING length 256


/* ----------- TABLE: $INCLUDED_XML_TAGS ----------- */

table "$INCLUDED_XML_TAGS"
type tdf
physical "{$OMNIDEX_LANG/included_xml_tags/*.txt},
{$OMNIDEX_LANG/included_xml_tags/*.tdf}"

column "$LIST" datatype CHARACTER length 32
column "$TAG" datatype C STRING length 256


/* ----------- TABLE: $EXCLUDED_XML_TAGS ----------- */

table "$EXCLUDED_XML_TAGS"
type tdf
physical "{$OMNIDEX_LANG/excluded_xml_tags/*.txt},
{$OMNIDEX_LANG/excluded_xml_tags/*.tdf}"

column "$LIST" datatype CHARACTER length 32
column "$TAG" datatype C STRING length 256


/* ----------- TABLE: $MESSAGES ----------- */

table "$MESSAGES"
type tdf
physical "{$OMNIDEX_LANG/messages/*.txt},
{$OMNIDEX_LANG/messages/*.tdf}"

column "$ERROR" datatype INTEGER length 4
column "$MESSAGE" datatype C STRING length 256

 

 
