This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
integration:rawdata:delimited:environments [2011/03/14 14:21] 127.0.0.1 external edit |
integration:rawdata:delimited:environments [2016/06/28 22:38] (current) |
||
---|---|---|---|
Line 3: | Line 3: | ||
{{page>:top_add&nofooter&noeditbtn}} | {{page>:top_add&nofooter&noeditbtn}} | ||
- | ====== Adminstration: Raw Data Files ====== | + | ====== Integration: Raw Data Files ====== |
===== Delimited Files ===== | ===== Delimited Files ===== | ||
Line 25: | Line 25: | ||
=== Declaring Databases for Raw Data Files === | === Declaring Databases for Raw Data Files === | ||
- | Databases in Omnidex are of a particular type, such as Oracle, SQL Server, MySQL or ODBC. A database that accesses raw data file has its own type: FILE. All tables within this database must be raw data files, though they can be a mixture of the types of raw data files supported by Omnidex. | + | Databases in Omnidex are of a particular type, such as Oracle, SQL Server, MySQL or ODBC. A database that accesses a raw data file has its own type: FILE. All tables within this database must be raw data files, though they can be a mixture of the types of raw data files supported by Omnidex. |
<code> | <code> | ||
Line 39: | Line 39: | ||
Tables will point to one or more delimited files, and the files must match the record layout specified in the [[dev:sql:statements:create_table:home|CREATE TABLE]] statement. In the [[dev:sql:statements:create_table:home|CREATE TABLE]] statement, the [[integration:rawdata:delimited:options|OPTIONS clause]] is used to identify this as a delimited file and indicate which delimiters to use. The DELIMITED option indicates that this is a delimited file. The remaining options govern the specific delimiters and behavior to use when parsing the file. | Tables will point to one or more delimited files, and the files must match the record layout specified in the [[dev:sql:statements:create_table:home|CREATE TABLE]] statement. In the [[dev:sql:statements:create_table:home|CREATE TABLE]] statement, the [[integration:rawdata:delimited:options|OPTIONS clause]] is used to identify this as a delimited file and indicate which delimiters to use. The DELIMITED option indicates that this is a delimited file. The remaining options govern the specific delimiters and behavior to use when parsing the file. | ||
- | In the CREATE TABLE statement, the PHYSICAL clause is used to point to the physical location of the raw data file. This PHYSICAL clause can contain a relative path name (relative to the location of the Omnidex Environment File) such as "dat/states.csv", or an absolute path name such as "d:/class/lab1/dat/states.csv". This PHYSICAL clause can also contain wildcards as described below, and can also reference [[appendix:reference:envvar|environment variables]] using a special syntax. | + | In the CREATE TABLE statement, the PHYSICAL clause is used to point to the physical location of the raw data file. This PHYSICAL clause can contain a relative path name (relative to the location of the Omnidex Environment File) such as "dat/states.csv", or an absolute path name such as "d:/class/lab1/dat/states.csv". This PHYSICAL clause can also contain wildcards as described below, and can also reference [[appendix:reference:envvar:home|environment variables]] using special syntax. |
- | <code sql> | + | <code> |
create table "STATES" | create table "STATES" | ||
options "DELIMITED COLUMN=',' RECORD='\r\n' QUOTES" | options "DELIMITED COLUMN=',' RECORD='\r\n' QUOTES" | ||
Line 50: | Line 50: | ||
In the example above, a single file was used as the source of data for a table. It is also possible for multiple files to be used as the source of data for a table. As long as the files all share the same record layout, they will all be processed when reading the table. Filesets can be declared as a comma-separated list of files, or they can be declared using wildcards in the filename. In this example, three files are declared individually in the PHYSICAL clause: | In the example above, a single file was used as the source of data for a table. It is also possible for multiple files to be used as the source of data for a table. As long as the files all share the same record layout, they will all be processed when reading the table. Filesets can be declared as a comma-separated list of files, or they can be declared using wildcards in the filename. In this example, three files are declared individually in the PHYSICAL clause: | ||
- | <code sql> | + | <code> |
create table "STATES" | create table "STATES" | ||
options "DELIMITED COLUMN=',' RECORD='\r\n' QUOTES" | options "DELIMITED COLUMN=',' RECORD='\r\n' QUOTES" | ||
Line 61: | Line 61: | ||
In this second example, wildcards are used to declare these files: | In this second example, wildcards are used to declare these files: | ||
- | <code sql> | + | <code> |
create table "STATES" | create table "STATES" | ||
options "DELIMITED COLUMN=',' RECORD='\r\n' QUOTES" | options "DELIMITED COLUMN=',' RECORD='\r\n' QUOTES" | ||
Line 71: | Line 71: | ||
Files can reside in multiple directories and in multiple drives or volumes to allow a table to span multiple storage devices. Files may also reside on a Storage Area Network (SAN); in fact, this is one of the most common implementations of Omnidex. Files may also reside on Microsoft Windows mapped drives; however, users should expect performance degradation to occur in this scenario. | Files can reside in multiple directories and in multiple drives or volumes to allow a table to span multiple storage devices. Files may also reside on a Storage Area Network (SAN); in fact, this is one of the most common implementations of Omnidex. Files may also reside on Microsoft Windows mapped drives; however, users should expect performance degradation to occur in this scenario. | ||
- | Wildcards provide a great flexibility since they allow more data to be added to a table by simply moving another file into place. Care should be taken with this approach, though, since the table must be reindexed as soon as new data is in place. | + | Wildcards provide greater flexibility since they allow more data to be added to a table by simply moving another file into place. Care should be taken with this approach, though, since the table must be reindexed as soon as new data is in place. |
===== ===== | ===== ===== |