$RETRIEVE_FILE
Some databases may be used to catalog a collection of external files.
In these situations, the database contains a series of columns such as
title, authorship and filename while the actual file is stored outside
the database.
$RETRIEVE_FILE allows text files external to the database to be indexed
in the same way as any other textual column. This function is used as
part of a pseudocolumn in the Omnidex environment catalog, with the data
type commonly declared as a CLOB or C STRING.
The $RETRIEVE_FILE function returns a buffer containing the contents
of an external file, using the data type and length specified in the parameters.
If the datatype and length parameters are omitted, the default data type and length are used.
The pseudocolumn can also be used in the WHERE clause or as a select-item
of a SELECT statement, although the latter is less common. Usually, the
existing application already has an established way of retrieving the
file by its filename, so there is little need for Omnidex to pass the
content through a SQL statement.
Syntax
$RETRIEVE_FILE(filename[, datatype[, length[, options]]])
$RETRIEVE_FILE
Required.
filename
Required. Can be a string literal, a column or an expression, and must
contain the filename to retrieve.
datatype
Optional. The data type to be used for retrieving the file's content.
Typically a CLOB or C STRING is used to retrieve ASCII data such as text
and HTML, and BLOB is used to retrieve binary data such as Microsoft Word
and Adobe PDF documents. Alternatively, a CLOB can be used to retrieve
the text from Microsoft Word and Adobe PDF documents if the EXTRACT_TEXT
option is used.
Data types are specified in textual form and may be used with or without
lengths. If no lengths are specified, then CLOB is presumed.
length
Optional. The length to be used when retrieving the file's content. Lengths
may also be specified in the datatype parameter using the standard Omnidex
syntax, for example C STRING(50KB). If no length is provided in either
place, the length defaults to 64KB.
options
Optional. Options to apply when retrieving the file.
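The length rules described above can be sketched in Python. This is only an illustration of the documented defaults, not Omnidex code: the function names are hypothetical, the byte multipliers (1KB = 1024 bytes) are assumed, and the rule that an explicit length parameter takes precedence over a length embedded in the datatype is an assumption, since the text does not say which one wins.

```python
import re

# 64KB default stated in the text; 1KB = 1024 bytes is an assumption.
DEFAULT_LENGTH = 64 * 1024

# Assumed unit multipliers; only KB and MB appear in the text.
_UNITS = {"": 1, "KB": 1024, "MB": 1024 ** 2}


def parse_length(text):
    """Convert a length such as '50KB' or '255' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*(KB|MB)?\s*", text, re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized length: {text!r}")
    unit = (match.group(2) or "").upper()
    return int(match.group(1)) * _UNITS[unit]


def effective_length(datatype=None, length=None):
    """Return the length in bytes for a datatype/length parameter pair."""
    if length is not None:
        # Assumption: an explicit length parameter wins.
        return parse_length(length)
    if datatype is not None:
        # Look for a trailing "(...)" such as the one in "C STRING(50KB)".
        match = re.search(r"\(([^()]*)\)\s*$", datatype)
        if match is not None:
            return parse_length(match.group(1))
    # No length in either place: fall back to the documented default.
    return DEFAULT_LENGTH
```

For instance, effective_length("C STRING(50KB)") yields 51200 under these assumptions, and effective_length("CLOB") falls back to 65536.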
Options
EXTRACT_TEXT
Extract the text from the file, rather than returning the exact contents
of the file. Strips formatting and other non-printable characters.
AUTO_EXTENSION
If the passed filename does not exist, AND the passed filename does not
contain an extension (suffix), AND a single file exists with the same
name plus an extension in the specified directory, use that file. This
option allows filenames to be included without an extension as long as
only one file is possible.
STOPWORDS=
Use the STOPWORDS list identified by this option.
INCLUDED_HTML_TAGS=
Use the INCLUDED_HTML_TAGS list identified by this option.
EXCLUDED_HTML_TAGS=
Use the EXCLUDED_HTML_TAGS list identified by this option.
INCLUDED_XML_TAGS=
Use the INCLUDED_XML_TAGS list identified by this option.
EXCLUDED_XML_TAGS=
Use the EXCLUDED_XML_TAGS list identified by this option.
PARSE
Parse the keywords from the text and discard all white space and punctuation.
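The three conditions listed under AUTO_EXTENSION can be read as a small lookup rule. The Python sketch below restates that rule for illustration only. It is not the Omnidex implementation, the function name is hypothetical, and details such as case sensitivity and the handling of multi-part extensions are assumptions.

```python
from pathlib import Path


def resolve_auto_extension(filename):
    """Return the file to use under the AUTO_EXTENSION rule, or None."""
    path = Path(filename)
    if path.exists():
        # The passed filename exists, so it is used as-is.
        return path
    if path.suffix:
        # The passed filename already has an extension: no fallback applies.
        return None
    if not path.parent.is_dir():
        return None
    # Collect files in the same directory named "<filename>.<extension>".
    candidates = [
        entry for entry in path.parent.iterdir()
        if entry.is_file() and entry.stem == path.name and entry.suffix
    ]
    # Use the match only when exactly one file qualifies.
    return candidates[0] if len(candidates) == 1 else None
```

With a directory holding report.txt, memo.doc and memo.pdf, the name "report" resolves to report.txt, while "memo" is ambiguous and resolves to nothing.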
Example
table "CATALOG"
column "FILENAME" datatype C STRING(255)
column "CONTENT" datatype CLOB(16MB)
as "$retrieve_file(FILENAME)"