Differences

This shows you the differences between two versions of the page.

integration:rdbms:servers [2011/03/14 19:20]
127.0.0.1 external edit
integration:rdbms:servers [2016/06/28 22:38] (current)
Line 10: Line 10:
 [[integration:​rdbms:​queries|Queries]] | [[integration:​rdbms:​queries|Queries]] |
 [[integration:​rdbms:​updates|Updates]] | [[integration:​rdbms:​updates|Updates]] |
-[[integration:​rdbms:​creation|Generating Data Files]] | +[[integration:​rdbms:​integration|Integration ​Guides]]
-[[integration:​rdbms:​implementation|Implementation ​Guides]]+
  
  
Line 17: Line 16:
 ==== Servers ==== ==== Servers ====
  
-Omnidex supports a variety of raw data files.  Fixed-length files are simple binary flat files that use a consistent number of bytes per row.  Delimited files store data in character format and use delimiters to separate columns and rows.  Omnidex Standalone Tables (OST's) are proprietary files that store data in a compressed fashion and can be easily moved around.  Support is planned for the future for Hadoop Distributed File System (HDFS) files. +Omnidex benefits from being on a separate server from the relational database whenever possible.  Relational databases tend to use strategies that consume most of the resources on a server.  It is not uncommon for a relational database to consume nearly all of the memory and CPU while processing large quantities of data using multiple threads.  While this maximizes the performance of the relational database, it does not leave many resources for other processes such as Omnidex.
  
-In general, Omnidex data files must maintain a consistent structure, meaning that the data has consistent rows and columns.  For example, relational database systems allow data to be exported into data files, and these files are ideal for indexing with Omnidex.  Similarly, companies often receive data from vendors or suppliers in this same form, and these files can be indexed directly without having to load the data into a relational database.  +Typically, Omnidex is placed on an Omnidex Server and the relational database is placed on a Relational Database Server.  This allows each to use system resources as needed without interference from the other.
  
-=== Fixed-length Files === +Applications can be written to direct all query traffic to Omnidex, or the application can split the traffic between Omnidex and the relational database.  Applications that direct all query traffic to Omnidex simply switch their connections from the relational database to Omnidex.  Applications that split the traffic maintain concurrent connections to Omnidex and the relational database.
- +
-Fixed-length files will always use the same number of bytes for each column and each row, regardless of the content of the data.  No delimiters are used, and instead each column and each row can be located based on its offset within the file.  Binary data such as integers, floating point and date datatypes are stored in their native, binary format.  +
- +
-In the example below, each row consumes 44 bytes of the file, with the first row starting at the beginning of the file, the second row beginning at offset 44, the third row beginning at offset 88, and so forth. ​ Note that the STRING datatype stores one more byte than the number of characters allowed, which is storage for the terminating NULL character.  ​Also note that the FLOAT datatype requires 4 bytes regardless of the number of digits displayed, since a binary floating point value always requires 4 bytes of storage. +
- +
-^Column ​ ^  STATE  ^  DESCRIPTION ​ ^  STATE_CODE ​ ^  REGION ​ ^  COUNTRY ​ ^  TAX_RATE ​ ^ +
-^,,​Datatype,, ​ ^  ,,​CHAR(2),, ​ ^  ,,​STRING(31),, ​ ^  ,,​CHAR(2),, ​ ^  ,,​CHAR(2),, ​ ^  ,,​CHAR(2),, ​ ^  ,,​FLOAT,, ​ ^ +
-^,,Bytes of Storage,, ​   ^  ,,2,,  ^  ,,​32,, ​ ^  ,,2,,   ​^ ​ ,,2,,  ^  ,,2,,  ^  ,,4,,  ^ +
-^,,Offset 0,,   ​| ​ AK            |  Alaska ​       |  02          |  PC          |  US          |  0.000    | +
-^,,Offset 44,,  |  AL            |  Alabama ​      ​| ​ 01          |  ES          |  US          |  4.000    | +
-^,,Offset 88,,  |  AR            |  Arkansas ​     |  05          |  WS          |  US          |  4.625    | +
- +
- +
-=== Delimited Files === +
- +
-Delimited files separate the content of columns and rows using special delimiter characters. ​ These are commonly used by relational databases and other data-related tools as a standard, portable data format. ​ Tabs and commas are the most common delimiters between columns, and linefeeds are the most common delimiters between rows.  Delimited files always store data in character format, though they can correlate ​to binary datatypes in a table definition. ​  +
- +
-Omnidex ​supports a wide variety of delimited files. ​ Omnidex supports ​the standard tab-delimited and comma-delimited formats, but also allows any combination of one or two characters for delimiters. ​ This allows great flexibility when receiving data from other sources, and also allows administrators ​to create delimited files based on their specific needs.  ​Omnidex ​also allows configuration options specifying how quotation marks are handled, how escape characters are used and whether header rows are present.  ​ +
- +
-In the example below, each row consumes a variable amount of space, based on the actual length of the data.  Columns are separated by commas and rows are separated by linefeeds, so Omnidex ​will parse the file to read it as a table. ​ All data is stored in character format, including numeric columns. ​  +
- +
-<​code>​ +
-AK,​Alaska,​02,​PC,​US,​0.000000 ​                  +
-AL,​Alabama,​01,​ES,​US,​4.000000 ​                 +
-AR,​Arkansas,​05,​WS,​US,​4.625000 ​                +
-AZ,​Arizona,​04,​MT,​US,​5.000000 ​                 +
-CA,​California,​06,​PC,​US,​6.000000 ​              +
-CO,​Colorado,​08,​MT,​US,​3.000000 ​                +
-CT,​Connecticut,​09,​NE,​US,​6.000000 ​             +
-... +
-</​code>​ +
- +
-=== Omnidex ​Standalone Tables (OST'​s) === +
- +
-Omnidex Standalone Tables (OST'​s) are a proprietary storage format that stores data, indexes ​and metadata for a table all in one file.  Data is compressed for faster access. ​ The primary purpose of an OST is to allow data to be easily moved around and dynamically attached to different environments;​ however, OST's can be directly referenced in Omnidex Environment Files as well.   +
- +
-Omnidex Standalone Tables are a good solution for tables that have lots of binary data as well as lots of large textual data.  OST's will provide excellent compression and performance in these situations.  ​+
  
 +{{:​integration:​rdbms:​rdbms_server_architecture.png|}}
  
 +Most companies rely on a Storage Area Network (SAN) for accessing their data.  Omnidex indexes will usually reside on the SAN as well, though this is not a requirement.  High-quality SANs provide great flexibility and excellent performance, though it is important to ensure sufficient cache in the SAN as well as a sufficient number of paths to the servers.
  
 =====  ===== =====  =====
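
The added text in the current revision describes two application patterns: sending all query traffic to Omnidex, or splitting traffic while holding concurrent connections to both servers. The sketch below illustrates the split pattern in Python under stated assumptions. It does not use any Omnidex or database driver API: `FakeConnection`, `QueryRouter`, and the `selects_only` rule are hypothetical names invented for illustration, and the page does not say how an application should decide which statements go where.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Protocol


class Connection(Protocol):
    """Minimal connection interface assumed for this sketch."""

    def execute(self, sql: str) -> str: ...


@dataclass
class FakeConnection:
    """Stand-in for a real driver connection; it only records what it receives."""

    name: str
    log: List[str] = field(default_factory=list)

    def execute(self, sql: str) -> str:
        self.log.append(sql)
        return self.name  # report which server handled the statement


@dataclass
class QueryRouter:
    """Holds connections to both servers and picks one per statement."""

    omnidex: Connection
    rdbms: Connection
    use_omnidex: Callable[[str], bool]  # routing rule supplied by the application

    def execute(self, sql: str) -> str:
        target = self.omnidex if self.use_omnidex(sql) else self.rdbms
        return target.execute(sql)


def selects_only(sql: str) -> bool:
    """Hypothetical routing rule: send SELECT statements to Omnidex."""
    return sql.lstrip().upper().startswith("SELECT")


omnidex = FakeConnection("omnidex")
rdbms = FakeConnection("rdbms")

# Split traffic: the rule decides per statement.
split_router = QueryRouter(omnidex, rdbms, selects_only)

# All traffic to Omnidex: the rule always answers True.
all_omnidex_router = QueryRouter(omnidex, rdbms, lambda sql: True)
```

In a real application the two `FakeConnection` objects would be replaced by whatever connection objects the chosen drivers provide. Keeping the routing rule as a separate function means the rule can change without touching the code that issues queries.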
 
integration/rdbms/servers.1300130416.txt.gz · Last modified: 2016/06/28 22:38 (external edit)