Home

Getting Started

Utilities

Indexing

Omnidex

Development

Tutorials

Quick Links

 

OMNIDEX Indexing

Indexing Strategies

Comparison

 

Domains

SI Domain

RS Domain

DR Domain

Hybrid-DR Domain

Sorted Multi-Find

Omnidex IDs by Domain

 

Indexing Strategies

Sorted Multi-Find

Sorted Multi-Find was originally developed for star schema designs, however, it is equally effective in some deployments of relational tables. A sorted multi-find is used to join tables in a DR domain.

Because tables are built stand-alone, changes to the index installation or other changes that may require the indexes to be rebuilt, apply to each table individually. In a DR domain using multi-find, only one table will need to be reindexed, where all tables in the MDK domain will be rebuilt in an RS domain.

Both methods are designed to allow indexed joins between table, specifically parent-to-child joins. The following is a breakdown of the characteristics of each method:

Sorted Multi-Find was originally developed for star schema designs, however, it is equally effective in some deployments of relational tables. A sorted multi-find is used to join tables in a DR domain.

Because tables are built stand-alone, changes to the index installation or other changes that may require the indexes to be rebuilt, apply to each table individually. In a DR domain using multi-find, only one table will need to be reindexed, where all tables in the MDK domain will be rebuilt in an RS domain.

Both methods are designed to allow indexed joins between table, specifically parent-to-child joins. The following is a breakdown of the characteristics of each method:

 

RS Indexing
  • All tables in the MDK domain are built when the domain is specified.
  • 8 bytes of the ID (uncompressed) are stored per keyword, per record in the child table.
  • Faster joins on larger parent qualify counts.
  • ODXID column relates parent records to child records.
  • May be preferable where the join field is wide.

 

Sorted Multi-Find
  • MDK indexes are built on each table. Changes to the index installation or a need to rebuild indexes will apply to each table individually.
  • A child table does not have to be built in multiple domains to be used for index joins to multiple parent tables in a query.
  • 4 bytes of the ID (uncompressed) are stored per keyword, per record in the child table.
  • Faster joins on smaller parent qualify counts.
  • MDK parent key in the parent (or dimension) table is built with the option B;IDSORT.
  • MDK foreign key column in the child (or fact) table is indexed with ;NT;NE;NP to enable sorted multi-find.
  • Environment variable DISC_DBL_AMP must be set to any string to enable sorted multi-find.
  • Multiple dimension qualify lists can be merged to the fact in one operation.
  • MDK searches are faster on a child table because of the 4-byte ID's, as opposed to the 8-byte ID's in an RS domain.
  • Bitmap indexes are allowed on a child table. (Not allowed in an RS domain)

 

Comparison

In a recent test on a DISC customer's actual data, each method was found to out-perform the other depending on the number of parent rows qualified with the parent table criteria. The comparison involved a parent table "Customer" with 743,000 rows joined with a child table "Rating" with 6.1 million rows. The query returns a count from the child table using criteria in the parent table.

select count(*) from customer a, rating b where a.patron_number=b.patron_number and a.birthday>19200101

The elapsed times reflect the slow performance of the server (AIX) and would prove considerably faster with either method, on other platforms.

The results of this comparison indicate that parent tables with fewer than 200,000 rows should always use sorted multi-find.

RS should be used when large dimension tables exist and parent-child qualify speed is at a premium. A 10-million row dimension table joined to a 100-million row fact table would be a definite candidate for RS.

The sorted multi-find should also be used if the build window is constrained. A full indexing with RS on the environment used in this comparison too 17 hours. With sorted multi-find, only 5 tables needed to be indexed, which took 2 hours 9 minutes.

Parent Qualifying Count
RS
Multi-Find
CPU time
Elapsed time
CPU time
Elapsed time

683,355

4.8

10.6

8.6

18.4

 

4.7

9.2

8.6

17.6

625,151

4.3

8.6

8.2

18.9

 

4.4

8.8

7.7

15.3

522,923

4.0

7.7

7.7

15.1

 

4.0

7.7

7.7

15.4

374,407

3.4

6.6

5.0

9.6

 

3.5

6.6

4.9

9.5

218,356

2.9

5.8

3.4

6.4

 

2.9

5.4

3.4

6.4

163,645

3.0

5.7

2.8

5.6

 

3.2

7.1

2.8

5.2

90,415

2.9

5.4

2.2

4.0

 

2.9

5.3

2.1

3.8

35,607

2.8

5.6

1.7

3.1

 

2.8

5.4

1.7

3.1

3,448

2.9

7.9

0.8

1.4

 

2.8

5.4

0.9

1.3

 

 

Top