Integration: Raw Data Files

Querying Raw Data Files

Omnidex allows applications to issue SQL statements against raw data files using standard ODBC and JDBC interfaces. Once a raw data file has been included in an Omnidex Environment File, it is treated like a standard table. Applications can interact with the data file just as though it were a table resident in a relational database.

Applications can take advantage of the great breadth of the SQL language. Queries can filter rows using criteria, join tables, aggregate rows, order rows, and process functions.
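
As a rough illustration of the kind of statement this covers, the sketch below runs a single query that filters, joins, aggregates, and orders rows. It is only a sketch under stated assumptions: Python's built-in sqlite3 module stands in for an ODBC or JDBC connection to an Omnidex environment so that the example runs anywhere, and the customers and orders tables and their columns are hypothetical. Only standard SQL is used; no Omnidex-specific syntax is shown.

```python
import sqlite3

# Assumption: sqlite3 is a stand-in for an ODBC/JDBC connection to Omnidex.
conn = sqlite3.connect(":memory:")

# Hypothetical tables; in Omnidex these would be raw data files declared
# in the environment file rather than tables created here.
conn.executescript("""
    CREATE TABLE customers (customer_no INTEGER, state TEXT);
    CREATE TABLE orders (order_no INTEGER, customer_no INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'CO'), (2, 'CO'), (3, 'NY');
    INSERT INTO orders VALUES (10, 1, 50), (11, 1, 25), (12, 2, 40), (13, 3, 99);
""")

# Criteria (WHERE), a join, aggregation (GROUP BY), and ordering in one query.
query = """
    SELECT c.customer_no, COUNT(*) AS order_count, SUM(o.amount) AS total
    FROM customers c
    JOIN orders o ON o.customer_no = c.customer_no
    WHERE c.state = ?
    GROUP BY c.customer_no
    ORDER BY total DESC
"""
rows = conn.execute(query, ("CO",)).fetchall()
for row in rows:
    print(row)
```

With the sample rows above, the query returns one row per Colorado customer, with the largest order total first.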

Criteria

Omnidex supports the complete suite of criteria options in the SQL language against raw data files. Omnidex indexing should be installed on criteria columns to achieve the best performance.

Delimited and fixed-length raw data files have no way to mark a column as NULL, as a relational database does. To compensate, Omnidex treats empty fields as NULL. For character-class datatypes, an IS NULL test will qualify any row containing an empty space in that column. For binary-class datatypes, an IS NULL test will qualify any row containing a zero in that column.
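
The rule above can be modeled in a few lines. This is a hypothetical sketch of the described behavior, not Omnidex code: the function names are invented for illustration, and treating both an empty string and an all-blank string as "empty" is an assumption about what an empty character field looks like.

```python
def character_is_null(value: str) -> bool:
    """Model of IS NULL for a character-class column (assumed semantics)."""
    # Assumption: a field that is empty or contains only spaces is "empty".
    return value.strip(" ") == ""


def binary_is_null(value: int) -> bool:
    """Model of IS NULL for a binary-class column (assumed semantics)."""
    # Per the rule above, a zero qualifies as NULL.
    return value == 0


print(character_is_null("     "))  # blank fixed-length field
print(character_is_null("DENVER"))
print(binary_is_null(0))
print(binary_is_null(42))
```

One consequence worth keeping in mind is that, under this rule, a genuine zero in a binary column cannot be told apart from a missing value.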

Table Joins

Omnidex supports standard table joins between raw data files, including inner joins, outer joins, and cross joins. Since raw data files do not generally have native indexing capabilities like relational databases, it is important to install Omnidex indexing on the join columns. This will ensure the best optimization of table joins.

Aggregations

Omnidex supports standard aggregations such as COUNT, SUM, AVG, MIN, and MAX, both with and without the GROUP BY clause. Omnidex indexing should be installed to achieve the best performance.

Aggregations are generally appropriate for binary datatypes. For delimited files, be sure to declare numeric columns using binary datatypes to allow aggregations. Of course, binary datatypes can only be declared when the numeric data is consistent and valid across all rows.
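
Before declaring a delimited column with a binary datatype, it can help to confirm that every row really holds a valid number. The sketch below is one possible pre-check written in plain Python. It is not an Omnidex utility; the function name, the comma delimiter, and the sample data are assumptions made for illustration.

```python
import csv
import io


def column_is_integer(lines, column_index, delimiter=","):
    """Return True if every row has a valid integer in the given column."""
    reader = csv.reader(lines, delimiter=delimiter)
    for row in reader:
        try:
            int(row[column_index])
        except (ValueError, IndexError):
            # Non-numeric text, an empty field, or a short row fails the check.
            return False
    return True


# Hypothetical file contents: the second column is the candidate numeric column.
clean = io.StringIO("1,100\n2,250\n3,75\n")
dirty = io.StringIO("1,100\n2,N/A\n3,75\n")

print(column_is_integer(clean, 1))
print(column_is_integer(dirty, 1))
```

This check is deliberately strict: it rejects empty fields as well as non-numeric text, so a column that fails may need cleanup before a binary datatype is declared for it.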

Ordering

Omnidex supports ordering result sets using the ORDER BY clause. Omnidex indexing should be installed to achieve the best performance.

Functions

Omnidex allows a wide selection of SQL Functions that can be used against raw data files.

Omnidex also allows Expression-based Columns in a table declaration, which create a virtual column that returns the result of a SQL expression.

Datatypes

Raw data files can support retrieving data in a wide variety of datatypes. They can store character and binary data, and data can be further converted using functions in SQL statements. Raw data files do not support VARCHAR and CLOB datatypes due to their reliance on length variables; however, CHARACTER and STRING datatypes may be used instead with greater convenience.


In general terms, applications should treat a raw data file just like a table in a relational database. Many companies transfer their data from relational databases to raw data files, and the only difference their applications notice is an increase in performance.

integration/rawdata/queries.txt · Last modified: 2016/06/28 22:38 (external edit)