This is an old revision of the document!


Administration: Omnidex Indexing

Omnidex Text

Textual Datatypes

Databases generally store text in character columns or in variable character columns (often called VARCHAR columns). These datatypes are good for storing smaller amounts of text, such as names, addresses and short descriptions. These datatypes are often limited in size, though. For larger amounts of text, a database may employ special datatypes designed specifically for this purpose. As examples, Oracle provides a CLOB (Character Large Object) datatype and SQL Server provides a TEXT datatype. These datatypes can often store up to 2-4 gigabytes of text.

Omnidex recommends storing text in either CHARACTER or C STRING datatypes. These are the simplest datatypes to use and provide great flexibility. Omnidex also supports VARCHAR and CLOB dataypes, but these datatypes are more difficulty to use and are more restricted. For most applications, the C STRING datatype will be preferred since it allows null-terminated data up to 16 MB.

Datatype Description
CHARACTERSpace-padded data up to 4,095 bytes.
C STRINGNull-terminated data up to 64MB. If indexed with Omnidex, the extracted text from this column may be up to 16MB.
VARCHARNon-terminated and non-padded data up to 4,095 bytes. This datatype may contain embedded null characters since it is not null-terminated; however, it should not be used to store binary data. Use of this datatype requires use of oadescribe and oabind so that the data_lengths variables can be used. This allows applications to know the length of the data in this column for each row returned.

This datatype is not appropriate for fixed length Flatfiles since the data length cannot be stored. Flatfiles should use CHARACTER, C STRING or OMNIDEX VARCHAR datatypes.
CLOBNon-terminated and non-padded data up to 64MB. If indexed with Omnidex, the extracted text from this column may be up to 16MB. This datatype may contain embedded null characters since it is not null-terminated; however, it should not be used to store binary data. Use of this datatype requires use of oadescribe and oabind so that the data_lengths variables can be used. This allows applications to know the length of the data in this column for each row returned.

The handling of CLOB data may be more expensive than the handling of CHARACTER, C STRING and VARCHAR data. It is better to use those datatypes if their size limitations will not be exceeded.

This datatype is not appropriate for fixed length Flatfiles since the data length cannot be stored. Flatfiles should use CHARACTER, C STRING, OMNIDEX VARCHAR or OMNIDEX CLOB datatypes.
—————-—————————————————————————————————————————–

The textual datatypes have different characteristics and have different restrictions within Omnidex SQL. The following table shows the capabilities of each datatype.

Characteristics CHARACTER C STRING VARCHAR CLOB

Datatype Characteristics
Character data allowed Yes Yes Yes Yes
Binary data allowed No No No No
Embedded nulls allowed No No Yes Yes
Null-terminated No Yes No No
Data_lengths required No No Yes Yes
Max size 4,095 64mb 4,095 64mb

Usage Characteristics
Select item of simple query Yes Yes Yes Yes
Select item of outer query Yes Yes Yes Yes
Select item of nested query Yes Yes Yes No
Select item of INSERT Yes Yes Yes No
Select item of set operation Yes Yes Yes Yes
Table joins Yes Yes Yes No
WHERE clause Yes Yes Yes Limited 1
GROUP BY clause Yes Yes Yes No
ORDER BY clause Yes Yes Yes No
HAVING clause Yes Yes Yes No
SELECT INTO clause Yes Yes Yes No
Aggregate functions Yes Yes Yes No
SQL Functions Yes Yes Yes Limited 2

Update Characteristics
Inserts, updates and deletes Yes Yes Yes No
————————————————————————–—————-—————-—————-—————-

1. Limited to use of LIKE, $CONTAINS and IS NULL operators.
2.See individual functions for compatibility with CLOB datatype.

Additional Resources

See also:

 
Back to top
admin/indexing/text/home.1295977469.txt.gz · Last modified: 2016/06/28 22:38 (external edit)