DRAFT

Creating an Omnidex Grid using Environment Rules

:!: These instructions show how to create Omnidex Grids using environment files that are compiled with OACOMP in Omnidex Version 5.0. This is an approach that pre-dated the use of SQL statements such as CREATE DATABASE or CREATE TABLE. Environment files that are compiled with OACOMP are still supported for backward compatability. Click here for instructions for Omnidex Version 5.1 and later.

Omnidex Grids are straightforward to administer. Each grid node is a separate Omnidex environment, with an environment file, indexes and data. For convenience, there are methods to share a single Omnidex environment file across multiple nodes. There are also methods to share data files across multiple nodes.

Before creating an Omnidex Grid, you must first have a partitioning scheme and a distribution plan. These will be needed during the steps of creating an Omnidex Grid.

These instructions describe how to create an Omnidex Grid using Environment Rules. This was the original method to create Omnidex Grids. This method has now been replace by use of the Omnidex Administrator or SQL statements such as the CREATE DATABASE and CREATE TABLE.

Steps to create an Omnidex Grid using Environment Rules

1. Create an environment pointing to the original database

Omnidex Grids needs an environment that points to the original database so that the data can be partitioned. This environment is a standard environment that accesses the underlying database.

2. Partition the database

If the database will be logically partitioned, then create the relational views that correlate with the partitioning strategy. If the database will be physically partitioned using an Omnidex Snapshot, then partition the database using the PARTITION command.

The PARTITION command is available in the Omnidex SQL language, and is callable from the OdxSQL application. Since the command is in the Omnidex SQL language, it may also be called directly from ODBC and JDBC.

The PARTITION command allows the table to be partitioned one of three ways, 1) into a specific number of equally sized partitions, 2) into separate partitions based on a partition qualifier consisting of a column or a SQL expression, or 3) a combination of these two steps. In the third case, Omnidex will evaluate the cardinality of the partition qualifier and group values together to comprise the specific number of partitions in as equal size as possible. The syntax of the PARTITION command is:

PARTITION table [INTO n [PARTITIONS]] [BY column] 
                [IN path] 
                [WITH options]

An example of a command to partition a table is:

PARTITION LIST BY “STATE || SUBSTRING(ZIP from 1 for 2)” 
  IN D:\grid\data

3. Distribute both the partitioned and partitioned data across the grid.

If the database will be logically partitioned, then make sure each grid server has access to the database. If the database will be physically partitioned, then distribute both the partitioned and unpartitioned data to the grid nodes. If each node will be in its own directory, then the data files can be copied or symbolically linked to the correct locations. It is also possible for multiple nodes to share the same physical directories as long as the partitioned data can be differentiated. To assist this, it is possible to use the environment variable “OMNIDEX_NODE” in the physical clauses for the table, as well as the index prefix for the database. The variable $OMNIDEX_NODE will be replaced by the node name.

An example of an environment variable used in a DATABASE … INDEXPREFIX declaration is:

database       "list"
  type         flatfile
  indexprefix  "{idx/$OMNIDEX_NODE/LIST_}"

An example of a partitioned table is:

table          "LIST"        
  physical     "{dat/$OMNIDEX_NODE/list_*.dat}"

4. Install and build Omnidex indexes on each node.

It is recommended, though not required, that each node be indexed the same. This reduces the likelihood of queries taking too long because of a lack of indexing on one node.

Indexes can be built in parallel. For maximum performance, it is recommended that the temporary directory, indicated by the TMPDIR environment variable, be different between builds whenever possible.

5. Create an environment file for the grid controller.

The environment file for the grid controller has the same database and table layout as the grid nodes. The physical clauses for the tables can be omitted on the grid controller since the controller only access the tables through the nodes.

At the bottom of the environment file, include a rule that describes the Omnidex Grid. This rule declares the nodes and partitioning scheme used in the Omnidex grid, as well as naming the columns that are guaranteed to have distinct values on each node (discussed earlier).

The syntax for the grid rule is:

  RULE dbname_GRID
  {
    MAX_CONCURRENT_NODES 	n
    UNIQUE_TO_NODE 		“table.column, table.column, …”
    NODE node_name 		[HOST name] [PORT n] [ENVIRONMENT filename] 
      DATABASE database_name
          TABLE table_name PARTITION BY “partition_clause”
  }

An example of a grid rule is:

  rule LIST_GRID
  {
    max_concurrent_nodes = 4
    unique_to_node "HOUSEHOLDS.HOUSEHOLD"
    unique_to_node "HOUSEHOLDS.STATE"
    unique_to_node "HOUSEHOLDS.ZIP"
    unique_to_node "INDIVIDUALS.HOUSEHOLD"
    unique_to_node "INDIVIDUALS.INDIVIDUAL"
    unique_to_node "INDIVIDUALS.SSN"

    node node01
      database LIST 
        table HOUSEHOLDS 
          partition by "STATE in ('MA','ME','NH','NY','PR','RI','VT',
                                  'NY','DE','PA','DC','MD','VA','WV') and 
                        ZIP between '00000' and '26999'"
        table INDIVIDUALS 
          partition by "HOUSEHOLDS.HOUSEHOLD = INDIVIDUALS.HOUSEHOLD"

    node node02
      database LIST 
        table HOUSEHOLDS 
          partition by "STATE in ('NC','SC','GA','FL','AL','MS',
                                  'TN','KY','OH') and 
                        ZIP between '27000' and '45999'"
        table INDIVIDUALS 
          partition by "HOUSEHOLDS.HOUSEHOLD = INDIVIDUALS.HOUSEHOLD"

    node node03
      database LIST 
        table HOUSEHOLDS 
          partition by "STATE in ('IN','MI','IA','MN','MT','ND','SD',
                                  'WI','IL','KS','MO','NE','AR','LA',
                                  'OK','TX') and 
                        ZIP between '46000' and '79999'"
        table INDIVIDUALS 
          partition by "HOUSEHOLDS.HOUSEHOLD = INDIVIDUALS.HOUSEHOLD"
  
    node node04
      database LIST 
        table HOUSEHOLDS 
          partition by "STATE in ('AZ','CO','ID','NM','NV','UT','WY',
                                  'CA','HI','AK','OR','WA') and 
                        ZIP between '80000' and '99999'"
        table INDIVIDUALS 
          partition by "HOUSEHOLDS.HOUSEHOLD = INDIVIDUALS.HOUSEHOLD"
  } 

6. Start Omnidex Network Services on each server.

Omnidex network services must be started on every machine that contains grid nodes. No special options are required.

See how to configure and start the Omnidex Network Services (OdxNet).

7. Perform queries against the controller.

Applications should connect to the grid controller as though it was a standard Omnidex environment.

Quick Links

 
Back to top
admin/grid/creation/rules.txt · Last modified: 2012/10/26 14:27 (external edit)