These instructions show how to create Omnidex Grids using environment files that are compiled with OACOMP. This is an approach that pre-dated the use of SQL statements such as CREATE DATABASE or CREATE TABLE. Environment files that are compiled with OACOMP are still supported for backward compatability. These instructions are for Omnidex Version 5.1 and later. Click here for instructions for Omnidex Version 5.0.
Omnidex Grids are straightforward to administer. Each grid node is a separate Omnidex environment, with an environment file, indexes and data. For convenience, there are methods to share a single Omnidex environment file across multiple nodes. There are also methods to share data files across multiple nodes.
Before creating an Omnidex Grid, you must first have a partitioning scheme and a distribution plan. These will be needed during the steps of creating an Omnidex Grid.
These instructions describe how to create an Omnidex Grid using Environment Rules. This was the original method to create Omnidex Grids. This method has now been replace by use of the Omnidex Administrator or SQL statements such as the CREATE DATABASE and CREATE TABLE.
Omnidex Grids needs an environment that points to the original database so that the data can be partitioned. This environment is a standard environment that accesses the underlying database.
If the database will be logically partitioned, then create the relational views that correlate with the partitioning strategy. If the database will be physically partitioned using an Omnidex Snapshot, then partition the database using the PARTITION command.
The PARTITION command is available in the Omnidex SQL language, and is callable from the OdxSQL application. Since the command is in the Omnidex SQL language, it may also be called directly from ODBC and JDBC.
The PARTITION command allows the table to be partitioned one of three ways, 1) into a specific number of equally sized partitions, 2) into separate partitions based on a partition qualifier consisting of a column or a SQL expression, or 3) a combination of these two steps. In the third case, Omnidex will evaluate the cardinality of the partition qualifier and group values together to comprise the specific number of partitions in as equal size as possible. The syntax of the PARTITION command is:
PARTITION table [INTO n [PARTITIONS]] [BY column] [IN path] [WITH options]
An example of a command to partition a table is:
PARTITION LIST BY “STATE || SUBSTRING(ZIP from 1 for 2)” IN D:\grid\data
If the database will be logically partitioned, then make sure each grid server has access to the database. If the database will be physically partitioned, then distribute both the partitioned and unpartitioned data to the grid nodes. If each node will be in its own directory, then the data files can be copied or symbolically linked to the correct locations. It is also possible for multiple nodes to share the same physical directories as long as the partitioned data can be differentiated.
The environment file for the Omnidex Grid is a normal environment file; however, it has additional information that describes the locations of the nodes and the partitioning scheme. There are changes that need to be made for the ENVIRONMENT statement, the DATABASE statement and the TABLE statements. Note that older versions of Omnidex Grids had a RULE statement at the bottom, and that is no longer necessary.
The ENVIRONMENT statement is expanded to declare the grid nodes and their connection information. The NODE declarations in the ENVIRONMENT statement are show below. Consult the documentation on the ENVIRONMENT statement for the complete syntax.
ENVIRONMENT environment [NODE node] [<PARTITIONED | UNPARTITIONED>] [HOST "host"] [PORT n] [ENVIRONMENT "filename"] [OPTIONS "options"]
Here is an example of an ENVIRONMENT declaration on an Omnidex Grid.
environment "list_env" maxthreads 4 node NODE01 partitioned node NODE02 partitioned node NODE03 partitioned node NODE04 partitioned
The DATABASE statement is expanded to show the database characteristics for each node. The NODE declarations in the DATABASE statement are show below. Consult the documentation on the DATABASE statement for the complete syntax.
[DATABASE database [NODE node] TYPE database_type_spec [VERSION “version”] [PHYSICAL “filespec”] [INDEXPREFIX “filespec”] [OPTIONS “options”] [USERCLASS database_access_definition [, database_access_definition ...]]
Here is an example of a DATABASE statement on an Omnidex Grid.
database "list" node NODE01 type flatfile indexprefix "{idx/node01/LIST_}" node NODE02 type flatfile indexprefix "{idx/node02/LIST_}" node NODE03 type flatfile indexprefix "{idx/node03/LIST_}" node NODE04 type flatfile indexprefix "{idx/node04/LIST_}"
The TABLE statement is expanded for each partitioned table to show the table characteristics for each node. The NODE declarations in the TABLE statement are show below. Consult the documentation on the TABLE statement for the complete syntax.
[TABLE table [NODE node] [TYPE table_type_spec] [PHYSICAL "filespec"] [OPTIONS "options"] [CARDINALITY n] [INDEXMAINTENANCE <AUTOMATIC | MANUAL | NONE>] [DATA CACHING <NONE | DYNAMIC>] [PARTITIONED BY "sql_predicates"]
Here is an example of TABLE statements on an Omnidex Grid.
table "HOUSEHOLDS" node NODE01 physical "{dat/node01/pro*.dat}" partition by "STATE in ('MA','ME','NH','NY','PR','RI','VT', 'NY','DE','PA','DC','MD','VA','WV') and ZIP between '00000' and '26999'" node NODE02 physical "{dat/node02/pro*.dat}" partition by "STATE in ('NC','SC','GA','FL','AL','MS', 'TN','KY','OH') and ZIP between '27000' and '45999'" node NODE03 physical "{dat/node03/pro*.dat}" partition by "STATE in ('IN','MI','IA','MN','MT','ND','SD', 'WI','IL','KS','MO','NE','AR','LA', 'OK','TX') and ZIP between '46000' and '79999'" node NODE04 physical "{dat/node04/pro*.dat}" partition by "STATE in ('AZ','CO','ID','NM','NV','UT','WY', 'CA','HI','AK','OR','WA') and ZIP between '80000' and '99999'"
As discussed in the partition scheme, it is necessary to partition tables so that referential integrity is maintained. Typically, the parent is partitioned using criteria, as shown for HOUSEHOLDS above, and the child is partitioned along the foreign key. The PARTITION BY clause for a child shows the join criteria, as shown below:
table "INDIVIDUALS" node NODE01 physical "{dat/node01/pro*.dat}" partition by "HOUSEHOLDS.HOUSEHOLD = INDIVIDUALS.HOUSEHOLD" node NODE02 physical "{dat/node02/pro*.dat}" partition by "HOUSEHOLDS.HOUSEHOLD = INDIVIDUALS.HOUSEHOLD" node NODE03 physical "{dat/node03/pro*.dat}" partition by "HOUSEHOLDS.HOUSEHOLD = INDIVIDUALS.HOUSEHOLD" node NODE04 physical "{dat/node04/pro*.dat}" partition by "HOUSEHOLDS.HOUSEHOLD = INDIVIDUALS.HOUSEHOLD"
It is recommended, though not required, that each node be indexed the same. This reduces the likelihood of queries taking too long because of a lack of indexing on one node.
Indexes can be built in parallel. For maximum performance, it is recommended that the temporary directory, indicated by the TMPDIR environment variable, be different between builds whenever possible.
Omnidex network services must be started on every machine that contains grid nodes. No special options are required.
See how to configure and start the Omnidex Network Services.
Applications should connect to the grid controller as though it was a standard Omnidex environment.
(c)Copyright Dynamic Information Systems - This document was last updated November 11, 2009.