Data Types

Textual Data Types

CHARACTER | CHAR

C STRING

VARCHAR

CLOB

OMNIDEX VARCHAR

OMNIDEX CLOB

 

Textual data type lengths are declared by the number of characters. You can abbreviate thousands, millions, kilobytes and megabytes in the length argument, as follows:

K - Thousands - n*1000 characters (n*1000 bytes).
CHARACTER(4000) = CHARACTER(4K)

M - Millions - n*1000000 characters (n*1000000 bytes).
C STRING(3000000) = C STRING(3M)

KB - Kilobytes - n*1024 characters (n*1024 bytes, or n kibibytes).
VARCHAR(2048) = VARCHAR(2KB)

MB - Megabytes - n*1048576 characters (n*1048576 bytes, or n mebibytes).
CLOB(5242880) = CLOB(5MB)

See Text Indexing Length Variation for additional restrictions.
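The abbreviation rules above can be illustrated with a small sketch. This is not part of Omnidex; the function name parse_length is hypothetical, and the sketch assumes that suffixes are written in upper case directly after the number, as in the examples above.

```python
import re

# Multipliers taken from the abbreviation list above.
MULTIPLIERS = {None: 1, "K": 1000, "M": 1000000, "KB": 1024, "MB": 1048576}


def parse_length(argument):
    """Expand a length argument such as '4K' or '2KB' into a plain count."""
    match = re.fullmatch(r"(\d+)(KB|MB|K|M)?", argument)
    if match is None:
        raise ValueError(f"unrecognized length argument: {argument!r}")
    number, suffix = match.groups()
    return int(number) * MULTIPLIERS[suffix]
```

For example, parse_length("4K") and parse_length("4000") yield the same value, mirroring CHARACTER(4000) = CHARACTER(4K).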

 

CHARACTER | CHAR

Space-padded data up to 4095 characters. CHARACTER and CHAR can be used interchangeably.

COLUMN "PHONE" PHYSICAL "phone" DATATYPE CHAR(12)
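As a rough illustration of space padding, the following sketch pads a value to a declared length and strips the padding again. The helper names are hypothetical, and the sketch assumes the padding is trailing, which the description above does not state explicitly.

```python
def pad_char(value, length):
    """Pad value with trailing spaces up to the declared length (assumed)."""
    if len(value) > length:
        raise ValueError("value is longer than the declared length")
    return value.ljust(length)


def unpad_char(stored):
    """Remove trailing space padding from a stored value."""
    return stored.rstrip(" ")
```

With CHAR(12) as in the PHONE column, an 8-character value would occupy all 12 characters, the last four being spaces.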

 

 

C STRING

Null-terminated data up to 64MB. See Text Indexing Length Variation for restrictions on the 64MB limit.

COLUMN "ADDRESS" PHYSICAL "address" DATATYPE C STRING(40)
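The following sketch shows what null termination implies for a reader of the data: everything from the first null byte onward is ignored. The function name read_c_string is hypothetical and is not an Omnidex call.

```python
def read_c_string(buffer):
    """Return the bytes that precede the first null byte in buffer."""
    end = buffer.find(b"\x00")
    # If no terminator is present, the whole buffer is treated as data.
    return buffer if end == -1 else buffer[:end]
```

Because reading stops at the first null byte, data that legitimately contains embedded null characters cannot be represented this way.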

 

VARCHAR

Non-terminated and non-padded data up to 4,095 bytes. This data type may contain embedded null characters since it is not null-terminated; however, it should not be used to store binary data.

This data type requires the use of oadescribe and oabind so that the data-length variables can be used. This allows applications to know the length of the data in this column for each row returned.

This data type is not appropriate for fixed-length flat files, since the data length cannot be stored. Flat-file databases should use the CHARACTER, C STRING or OMNIDEX VARCHAR data types.

COLUMN "COMMENTS" PHYSICAL "company" DATATYPE VARCHAR(40)

The VARCHAR data type is very expensive compared to other data types and should therefore be used only when absolutely necessary. Specifically, VARCHAR should be used only if the data will contain embedded null characters. Otherwise, use a CHARACTER or C STRING data type.

 

CLOB

Non-terminated and non-padded data up to 64MB. See Text Indexing Length Variation for restrictions on the 64MB limit.

This data type may contain embedded null characters since it is not null-terminated; however, it should not be used to store binary data.

This data type requires the use of oadescribe and oabind so that the data-length variables can be used. This allows applications to know the length of the data in this column for each row returned.

The handling of CLOB data may be more expensive than the handling of CHARACTER, C STRING or VARCHAR data. It is better to use those data types if their size limitations will not be exceeded.

This data type is not appropriate for fixed-length flat files, since the data length cannot be stored. Flat-file databases should use the CHARACTER, C STRING, OMNIDEX VARCHAR or OMNIDEX CLOB data types.

COLUMN "PR_INFO" PHYSICAL "pr_info" DATATYPE CLOB(65535)

 

OMNIDEX VARCHAR

Non-terminated and non-padded data up to 4095 bytes.

The length of the textual data is stored in the first four bytes as a 4-byte integer. This length covers only the textual data and does not include the four bytes of the integer itself. The integer is stored in the native byte order (endianness) of the machine and is not necessarily aligned on a machine word boundary.

COLUMN "CONTACT" PHYSICAL "contact" DATATYPE OMNIDEX VARCHAR(50)
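The length-prefixed layout described above can be sketched with Python's struct module. The function names are hypothetical, and the sketch assumes the 4-byte integer is signed, which the description does not specify. The "=" format prefix selects native byte order with no alignment padding, so the prefix can be read at any offset.

```python
import struct

# "=i": native byte order, standard 4-byte size, no alignment padding.
# Signedness is an assumption; the text only says "4-byte integer".
LENGTH_PREFIX = struct.Struct("=i")


def pack_prefixed(text_bytes):
    """Prefix text_bytes with its own length (the prefix is not counted)."""
    return LENGTH_PREFIX.pack(len(text_bytes)) + text_bytes


def unpack_prefixed(buffer, offset=0):
    """Read a length-prefixed value starting at any byte offset."""
    (length,) = LENGTH_PREFIX.unpack_from(buffer, offset)
    start = offset + LENGTH_PREFIX.size
    return buffer[start:start + length]
```

Since the length is stored explicitly, a value such as b"Jo\x00Ann" survives a round trip even though it contains an embedded null character.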

The OMNIDEX VARCHAR data type is very expensive compared to other data types and should therefore be used only when absolutely necessary. Specifically, OMNIDEX VARCHAR should be used only if the data will contain embedded null characters. Otherwise, use a CHARACTER or C STRING data type.

 

OMNIDEX CLOB

Non-terminated and non-padded data up to 64MB. See Text Indexing Length Variation for restrictions on the 64MB limit.

The length of the textual data is stored in the first four bytes as a 4-byte integer. This length covers only the textual data and does not include the four bytes of the integer itself. The integer is stored in the native byte order (endianness) of the machine and is not necessarily aligned on a machine word boundary.

The handling of CLOB data may be more expensive than the handling of CHARACTER, C STRING or VARCHAR data. It is better to use those data types if their size limitations will not be exceeded.

COLUMN "ARTICLES" PHYSICAL "articles" DATATYPE OMNIDEX CLOB(2MB)
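The guidance in the sections above can be condensed into a small decision sketch. The function suggest_type and its parameters are hypothetical; it only restates the limits and recommendations given on this page (4095 for the short types, 64MB for the large ones) and is not an Omnidex facility.

```python
SHORT_LIMIT = 4095            # CHARACTER, VARCHAR, OMNIDEX VARCHAR
LARGE_LIMIT = 64 * 1048576    # C STRING, CLOB, OMNIDEX CLOB (64MB)


def suggest_type(max_length, embedded_nulls, flat_file):
    """Suggest a textual data type from the guidance on this page."""
    if max_length > LARGE_LIMIT:
        raise ValueError("length exceeds the 64MB maximum")
    if not embedded_nulls:
        # Without embedded nulls, the cheaper types are preferred.
        if max_length <= SHORT_LIMIT:
            return "CHARACTER or C STRING"
        return "C STRING"
    is_short = max_length <= SHORT_LIMIT
    if flat_file:
        # Flat files need the length stored with the data.
        return "OMNIDEX VARCHAR" if is_short else "OMNIDEX CLOB"
    return "VARCHAR" if is_short else "CLOB"
```

For instance, suggest_type(50, True, True) returns "OMNIDEX VARCHAR", matching the CONTACT column example for a flat file whose data may contain embedded null characters.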

 

Text Indexing Length Variation

The declared length of a C STRING, CLOB or OMNIDEX CLOB column can be as high as 64MB; however, the maximum amount of text that can be indexed is 16MB per row. This means that after all non-printable characters, such as formatting and line feed characters, have been removed and only plain text remains, the plain text cannot exceed 16MB.

This applies to both internal database columns and external documents.
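A rough way to estimate whether a piece of text fits under the indexing limit is sketched below. The helper names are hypothetical. The sketch assumes that Python's str.isprintable is an acceptable stand-in for the non-printable characters Omnidex removes, and that 16MB means 16*1048576 characters, following the MB abbreviation defined earlier; the actual rules may differ.

```python
INDEX_LIMIT = 16 * 1048576  # 16MB, using the MB multiplier from this page


def indexable_length(text):
    """Count the characters that remain after dropping non-printable ones."""
    return sum(1 for ch in text if ch.isprintable())


def fits_index_limit(text, limit=INDEX_LIMIT):
    """Return True if the remaining plain text is within the limit."""
    return indexable_length(text) <= limit
```

Line feeds and tabs are not counted by this estimate, while spaces are, because str.isprintable treats the space character as printable.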

 
