1.5 XML Search Applications
An XML search application performs searches over XML documents. A regular document search usually searches across a set of documents to return documents that satisfy a text predicate; an XML search often uses the structure of the XML document to restrict the search. Typically, only the document part that satisfies the search is returned. For example, instead of finding all purchase orders that contain the word electric, the user might need only purchase orders in which the comment field contains electric.
Oracle Text enables you to perform XML searching by using the following approaches:
See Also:
1.5.1 The CONTAINS Operator with XML Search Applications
The CONTAINS
operator is well suited to structured searching, enabling you to perform restrictive searches with the WITHIN
, HASPATH
, and INPATH
operators. If you use a CONTEXT
index, then you can also benefit from the following characteristics of Oracle Text searches:
-
Token-based, whitespace-normalized searches
-
Hitlists ranked by relevance
-
Case-sensitive searching
-
Section searching
-
Linguistic features such as stemming and fuzzy searching
-
Performance-optimized queries for large document sets
WARNING:
Starting with Oracle Database 12c, Oracle XML Database (XML DB) is automatically installed when you install the new Oracle Database software or when you upgrade.
See Also:
1.5.2 Combining Oracle Text Features with Oracle XML DB (XML Search Index)
When you want a full-text retrieval for applications, combine the features of Oracle Text and Oracle XML DB to create an XML Search Index. In this case, leverage the XML structure by entering queries such as "find all nodes that contain the word Pentium." Oracle Database 12c extends Oracle's support for the W3C XQuery specification by adding support for the XQuery full-text extension. This support lets you perform XML-aware, full-text searches on XML content that is stored in the database.
The following topics explain how to use Oracle XML DB with Oracle Text applications:
See Also:
-
Oracle Text Reference for information about the
xml_enable
variable ofSET_SEC_GRP_ATTR
to enable XML awareness -
Oracle XML DB Developer's Guide for more information about XML full-text indexing and XML Search Index
1.5.2.1 Using the xml_enable Method for an XML Search Index
An XML Search Index is an XML-enabled Oracle Text index (CTXSYS.CONTEXT). This index type supports information-retrieval searching and structured searching in one unified index. XML Search Index also stores a Binary Persistent Document Object Model (PDOM) internally within an Oracle Text table, so that XML operations can be functionally evaluated over the Binary PDOM. This XML Search Index is supported for XMLTYPE datastores. XMLEXISTS is seamlessly rewritten to a CONTAINS
query in the presence of such an XML Search Index.
When you create an XML Search Index, a Binary PDOM of the XML document is materialized in an internal table of Oracle Text. Post evaluation from the Oracle Text index is redirected to go against the PDOM stored in this internal table.
See Also:
Oracle Text Reference for information on xml_enable
variable of SET_SEC_GRP_ATTR
to enable XML awareness for XML Search Index
The following example creates an Oracle XML Search Index:
exec CTX_DDL.CREATE_SECTION_GROUP('secgroup','PATH_SECTION_GROUP'); exec CTX_DDL.SET_SEC_GRP_ATTR('secgroup','xml_enable','t'); CREATE INDEX po_ctx_idx on T(X) indextype is ctxsys.context parameters (‘section group SECGROUP');
1.5.2.2 Using the Text-on-XML Method
With Oracle Text, you can create a CONTEXT
index on a column that contains XML data. The column type can be XMLType
or any supported type, provided that you use the correct index preference for XML data.
With the Text-on-XML method, use the standard CONTAINS
query and add a structured constraint to limit the scope of a search to a particular section, field, tag, or attribute. That is, specify the structure inside text operators, such as WITHIN
, HASPATH
, and INPATH.
For example, set up your CONTEXT
index to create sections with XML documents. Consider the following XML document that defines a purchase order:
<?xml version="1.0"?> <PURCHASEORDER pono="1"> <PNAME>Po_1</PNAME> <CUSTNAME>John</CUSTNAME> <SHIPADDR> <STREET>1033 Main Street</STREET> <CITY>Sunnyvalue</CITY> <STATE>CA</STATE> </SHIPADDR> <ITEMS> <ITEM> <ITEM_NAME> Dell Computer </ITEM_NAME> <DESC> Pentium 2.0 Ghz 500MB RAM </DESC> </ITEM> <ITEM> <ITEM_NAME> Norelco R100 </ITEM_NAME> <DESC>Electric Razor </DESC> </ITEM> </ITEMS> </PURCHASEORDER>
To query all purchase orders that contain Pentium within the item description section, use the WITHIN
operator:
SELECT id from po_tab where CONTAINS( doc, 'Pentium WITHIN desc') > 0;
Use the INPATH
operator to specify more complex criteria with XPATH
expressions:
SELECT id from po_tab where CONTAINS(doc, 'Pentium INPATH (/purchaseOrder/items/item/desc') > 0;
1.5.2.3 Indexing JSON Data
JavaScript Object Notation (JSON) is a language-independent data format that is used for serializing structured data and exchanging this data over a network, typically between a server and web applications. JSON provides a text-based way of representing JavaScript object literals, arrays, and scalar data.
See Also:
-
Oracle Text Reference for information about creating a search index on JSON
-
Oracle Database JSON Developer's Guide for more information about JSON