Differences

This shows you the differences between two versions of the page.

admin:indexing:text:home [2011/01/25 17:42]
els
admin:indexing:text:home [2016/06/28 22:38] (current)
Line 6: Line 6:
 ===== Omnidex Text =====
 
-[[admin:indexing:text:home|Overview]] |
+**[[admin:indexing:text:home|Overview]]** |
-**[[admin:indexing:text:clob|Textual Datatypes]]**
+[[admin:indexing:text:clob|Textual Datatypes]] |
 [[admin:indexing:text:retrieve|External Files]] |
 [[admin:indexing:text:proximity|Proximity Searches]] |
+[[admin:indexing:text:advanced|Advanced Searches]] |
 [[admin:indexing:text:results|Displaying Results]] |
 [[admin:indexing:text:relevancy|Relevancy]]
Line 15: Line 16:
 ----
 
-==== Textual Datatypes ====
+==== Overview ====
  
-Databases store text in character columns or in variable character columns (often called VARCHAR columns).  These datatypes are good for storing smaller amounts of text, such as names, addresses and short descriptions.  These datatypes are often limited in size, though.  For larger amounts of text, a database may employ special datatypes designed specifically for this purpose.  As examples, Oracle provides a CLOB (Character Large Object) datatype and SQL Server provides a TEXT datatype.  These datatypes can often store up to 2-4 gigabytes of text.
+Omnidex allows applications to search for both textual and non-textual data using SQL statements.  For most textual data, [[admin:indexing:indexes:types|QuickText]] indexes parse the individual keywords in the column and allow them to be individually qualified.  This feature is well suited for small textual fields, such as names, addresses, descriptions, comments and so forth.  QuickText indexes are simple and efficient, but when working with larger blocks of text, more powerful indexing is needed.
  
 +Omnidex Text provides more sophisticated indexes designed for large amounts of textual data stored in a database, such as articles, knowledgebase entries, forums, lengthy product descriptions and so forth. ​ Omnidex Text focuses on searching text in a database. ​ Omnidex Text is not a document management system and does not parse Microsoft Office, PDF, HTML or XML files. ​ Omnidex Text can index large textual files outside of the database, using a table as a catalog.
  
-^  Datatype          ^ Description    ^
-|CHARACTER|Space-padded data up to 4,095 bytes.|
-|C STRING|Null-terminated data up to 64MB.  If indexed with Omnidex, the extracted text from this column may be up to 16MB.|
-|VARCHAR|Non-terminated and non-padded data up to 4,095 bytes.  This datatype may contain embedded null characters since it is not null-terminated; however, it should not be used to store binary data.  Use of this datatype requires use of oadescribe and oabind so that the data_lengths variables can be used.  This allows applications to know the length of the data in this column for each row returned.\\ \\ This datatype is not appropriate for fixed length Flatfiles since the data length cannot be stored.  Flatfiles should use CHARACTER, C STRING or OMNIDEX VARCHAR datatypes.|
-|CLOB|Non-terminated and non-padded data up to 64MB.  If indexed with Omnidex, the extracted text from this column may be up to 16MB.  This datatype may contain embedded null characters since it is not null-terminated; however, it should not be used to store binary data.  Use of this datatype requires use of oadescribe and oabind so that the data_lengths variables can be used.  This allows applications to know the length of the data in this column for each row returned.\\ \\ The handling of CLOB data may be more expensive than the handling of CHARACTER, C STRING and VARCHAR data.  It is better to use those datatypes if their size limitations will not be exceeded.\\ \\ This datatype is not appropriate for fixed length Flatfiles since the data length cannot be stored.  Flatfiles should use CHARACTER, C STRING, OMNIDEX VARCHAR or OMNIDEX CLOB datatypes.|
-|----------------|-----------------------------------------------------------------------------------------------------------------------------|
+Omnidex Text works seamlessly with the rest of Omnidex and allows large textual fields to be searched alongside regular columns in standard SQL statements, using standard ODBC and JDBC interfaces.  Applications can issue traditional queries with criteria, table joins, aggregations and ordering, while at the same time querying large blocks of text.
  
-The textual datatypes have different characteristics and have different restrictions within Omnidex SQL.  The following table shows the capabilities of each datatype.
+Omnidex Text provides additional features that allow excerpts to be extracted from the text and results to be sorted by relevancy, similar to an Internet search engine.  Omnidex Text also allows [[admin:indexing:powersearch:home|PowerSearch]] to be used against large textual data, providing features such as matching of misspellings, word forms and synonyms.
  
  
-^Characteristics                       ^  CHARACTER  ^  C STRING   ^  VARCHAR    ^  CLOB       ^
-|\\ **Datatype Characteristics**                                                           |||||
-|Character data allowed                |  Yes        |  Yes        |  Yes        |  Yes        |
-|Binary data allowed                   |  No         |  No         |  No         |  No         |
-|Embedded nulls allowed                |  No         |  No         |  Yes        |  Yes        |
-|Null-terminated                       |  No         |  Yes        |  No         |  No         |
-|Data_lengths required                 |  No         |  No         |  Yes        |  Yes        |
-|Max size                              |  4,095      |  64MB       |  4,095      |  64MB       |
-|\\ **Usage Characteristics**                                                              |||||
-|Select item of simple query           |  Yes        |  Yes        |  Yes        |  Yes        |
-|Select item of outer query            |  Yes        |  Yes        |  Yes        |  Yes        |
-|Select item of nested query           |  Yes        |  Yes        |  Yes        |  No         |
-|Select item of INSERT                 |  Yes        |  Yes        |  Yes        |  No         |
-|Select item of set operation          |  Yes        |  Yes        |  Yes        |  Yes        |
-|Table joins                           |  Yes        |  Yes        |  Yes        |  No         |
-|WHERE clause                          |  Yes        |  Yes        |  Yes        |  Limited <sup>1</sup> |
-|GROUP BY clause                       |  Yes        |  Yes        |  Yes        |  No         |
-|ORDER BY clause                       |  Yes        |  Yes        |  Yes        |  No         |
-|HAVING clause                         |  Yes        |  Yes        |  Yes        |  No         |
-|SELECT INTO clause                    |  Yes        |  Yes        |  Yes        |  No         |
-|Aggregate functions                   |  Yes        |  Yes        |  Yes        |  No         |
-|SQL Functions                         |  Yes        |  Yes        |  Yes        |  Limited <sup>2</sup>  |
-|\\ **Update Characteristics**                                                             |||||
-|Inserts, updates and deletes          |  Yes        |  Yes        |  Yes        |  No         |
-|--------------------------------------------------------------------------|----------------|----------------|----------------|----------------|
- 
-<sup>1. Limited to use of LIKE, $CONTAINS and IS NULL operators.</sup>\\
-<sup>2. See individual functions for compatibility with CLOB datatype.</sup>
  
 =====  =====
 
-**[[admin:indexing:text:home|Prev]]** |
-**[[admin:indexing:text:retrieve|Next]]**
+**[[admin:indexing:text:clob|Next]]**
  
 ====== Additional Resources ======
Line 71: Line 39:
 
 {{page>:bottom_add&nofooter&noeditbtn}}
- 
- 
- 
 
admin/indexing/text/home.1295977322.txt.gz · Last modified: 2016/06/28 22:38 (external edit)