Differences

This shows you the differences between two versions of the page.

integration:rawdata:types [2011/03/14 14:21]
127.0.0.1 external edit
integration:rawdata:types [2016/06/28 22:38] (current)
Line 11: Line 11:
 [[integration:​rawdata:​updates|Updates]] | [[integration:​rawdata:​updates|Updates]] |
 [[integration:​rawdata:​creation|Generating Data Files]] | [[integration:​rawdata:​creation|Generating Data Files]] |
-[[integration:​rawdata:​implementation|Implementation ​Guides]]+[[integration:​rawdata:​integration|Integration ​Guides]]
  
  
Line 17: Line 17:
 ==== File Types ==== ==== File Types ====
  
-Omnidex supports a variety of raw data files. ​ Fixed-length files are a simple binary, flat file that uses a consistent number of bytes per row.  Delimited files store data in character format and use delimiters to separate columns and rows.  Omnidex Standalone Tables (OST's) are proprietary files that store data in a compressed fashion and can be easily moved around. ​ Support is planned for the future for Hadoop Distributed File System (HDFS) files.+Omnidex supports a variety of raw data files. ​ Fixed-length files are a simple binary, flat file that uses a consistent number of bytes per row.  Delimited files store data in character format and use delimiters to separate columns and rows.  Omnidex Standalone Tables (OSTs) are proprietary files that store data in a compressed fashion and can be easily moved around. ​ Support is planned for the future for Hadoop Distributed File System (HDFS) files.
  
-In general, Omnidex data files must maintain a consistent structure, meaning that the data has consistent rows and columns. ​ For example, relational database systems allow data to be exported ​data into data files, and these files are ideal for indexing with Omnidex. ​ Similarly, companies often receive data from vendors or suppliers in this same form, and these files can be indexed directly without having to load the data into a relational database.  ​+In general, Omnidex data files must maintain a consistent structure, meaning that the data has consistent rows and columns. ​ For example, relational database systems allow data to be exported into data files, and these files are ideal for indexing with Omnidex. ​ Similarly, companies often receive data from vendors or suppliers in this same form, and these files can be indexed directly without having to load the data into a relational database.  ​
  
 === Fixed-length Files === === Fixed-length Files ===
  
-Fixed length files will always use the same number of bytes for each column and each row, regardless of the content of the data.  No delimiters are used, and instead each column and each row can be located based on its offset within the file.  Binary data such as integers, floating point and date datatypes are stored in their native, binary format. +Fixed length files will always use the same number of bytes for each column and each row, regardless of the content of the data.  No delimiters are used, and instead each column and each row can be located based on its offset within the file.  Binary data such as integers, floating point, and date datatypes are stored in their native, binary format.
  
 In the example below, each row consumes 44 bytes of the file, with the first row starting at the beginning of the file, the second row beginning at offset 44, the third row beginning at offset 88, and so forth. ​ Note that the STRING datatype stores one more byte than the number of characters allowed, which is storage for the terminating NULL character. ​ Also note that the FLOAT datatype requires 4 bytes regardless of the number of digits displayed, since a binary floating point value always requires 4 bytes of storage. In the example below, each row consumes 44 bytes of the file, with the first row starting at the beginning of the file, the second row beginning at offset 44, the third row beginning at offset 88, and so forth. ​ Note that the STRING datatype stores one more byte than the number of characters allowed, which is storage for the terminating NULL character. ​ Also note that the FLOAT datatype requires 4 bytes regardless of the number of digits displayed, since a binary floating point value always requires 4 bytes of storage.
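Separately from the page's own example, the offset arithmetic described above can be sketched in Python. This is only an illustration under stated assumptions: the column layout (a 4-byte INTEGER, a STRING(15), a STRING(19), and a 4-byte FLOAT), the column names, the little-endian byte order, and the helper names are all hypothetical, chosen so that a row totals 44 bytes. None of it is taken from the Omnidex documentation.

```python
import struct

# Hypothetical 44-byte row layout (an assumption, not an Omnidex definition):
#   INTEGER id        4 bytes
#   STRING(15) name  16 bytes (15 characters + terminating NUL)
#   STRING(19) city  20 bytes (19 characters + terminating NUL)
#   FLOAT balance     4 bytes (binary float, regardless of digits displayed)
# "<" selects little-endian with no padding; the byte order is also an assumption.
ROW = struct.Struct("<i16s20sf")


def pack_row(row_id, name, city, balance):
    """Build one fixed-length row; short strings are NUL-padded by struct."""
    # Truncate to width - 1 so at least one terminating NUL byte always remains.
    return ROW.pack(row_id,
                    name.encode("ascii")[:15],
                    city.encode("ascii")[:19],
                    balance)


def read_row(data, n):
    """Locate row n purely by its offset (n * row size) and decode it."""
    offset = n * ROW.size
    row_id, name, city, balance = ROW.unpack_from(data, offset)
    # Keep only the bytes before the first NUL terminator.
    name = name.split(b"\0", 1)[0].decode("ascii")
    city = city.split(b"\0", 1)[0].decode("ascii")
    return row_id, name, city, balance


# Three rows stored back to back: offsets 0, 44, and 88.
data = b"".join([
    pack_row(1, "Alice", "Boulder", 100.0),
    pack_row(2, "Bob", "Austin", 250.25),
    pack_row(3, "Carol", "Denver", 12.5),
])

print(ROW.size)           # size of one row in bytes
print(read_row(data, 2))  # third row, read from offset 88
```

Because every row has the same size, any row can be read directly without scanning the rows before it, which is the property the fixed-length format relies on.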
Line 54: Line 54:
 </​code>​ </​code>​
  
-=== Omnidex Standalone Tables (OST's) ===+=== Omnidex Standalone Tables (OSTs) ===
  
-Omnidex Standalone Tables (OST's) are a proprietary storage format that stores data, indexes and metadata for a table all in one file.  Data is compressed for faster access.  The primary purpose of an OST is to allow data to be easily moved around and dynamically attached to different environments; however, OST's can be directly referenced in Omnidex Environment Files as well.  +Omnidex Standalone Tables (OSTs) are a proprietary storage format that stores data, indexes, and metadata for a table all in one file.  Data is compressed for faster access.  The primary purpose of an OST is to allow data to be easily moved around and dynamically attached to different environments; however, OSTs can be directly referenced in Omnidex Environment Files as well.
  
-Omnidex Standalone Tables are a good solution for tables that have lots of binary data as well as lots of large textual data.  ​OST'​s ​will provide excellent compression and performance in these situations.  ​+Omnidex Standalone Tables are a good solution for tables that have lots of binary data as well as lots of large textual data.  ​OSTs will provide excellent compression and performance in these situations.  ​
  
  
 
integration/rawdata/types.1300112479.txt.gz · Last modified: 2016/06/28 22:38 (external edit)