Create and Query DBMS_SEARCH Indexes Using Multiple Tables and Views

Connect to Oracle Database as a local user.

Log in to SQL*Plus as the SYS user, connecting as SYSDBA:

conn sys/password as sysdba

CREATE TABLESPACE tbs1
DATAFILE 'tbs5.dbf' SIZE 20G AUTOEXTEND ON
EXTENT MANAGEMENT LOCAL
SEGMENT SPACE MANAGEMENT AUTO;

SET ECHO ON
SET FEEDBACK 1
SET NUMWIDTH 10
SET LINESIZE 80
SET TRIMSPOOL ON
SET TAB OFF
SET PAGESIZE 10000
SET LONG 10000

Create a local user (docuser) and grant necessary privileges:

DROP USER docuser cascade;

GRANT DB_DEVELOPER_ROLE, DEFAULT TABLESPACE tbs1 quota unlimited on tbs1
TO docuser IDENTIFIED BY password;

Connect as the local user (docuser):
```
CONN docuser/password
```

Create and populate the customers, items, orders, and lineitems tables. You will later add these tables to your ubiquitous search index.

customers:

Here, the schema owner name DOCUSER is explicitly specified.

CREATE TABLE docuser.customers (
    cust_id number PRIMARY KEY,
    first varchar2(30),
    last varchar2(30));

insert into customers values (1, 'Robert', 'Smith');
insert into customers values (2, 'John', 'Doe');
insert into customers values (3, 'James', 'Martin');
insert into customers values (5, 'Jane', 'Y');

items:

CREATE TABLE items (
    item_id number PRIMARY KEY,
    name    varchar2(30),
    price   number(5,2),
    stock_quantity number);

insert into items values (122, 'Potato Gun', 29.99, 10);
insert into items values (232, 'Rubber Christmas Tree', 65.00, 0);
insert into items values (345, 'Border Patrol Costume', 19.99, 20);
insert into items values (845, 'Meteor Impact Survival Kit', 299.00, 0);
insert into items values (429, 'Air Guitar', 9.99, 14);

orders:

CREATE TABLE orders (
    order_id number PRIMARY KEY,
    cust_id number REFERENCES customers(cust_id) ON DELETE CASCADE);

insert into orders values (1, 1);
insert into orders values (2, 1);
insert into orders values (3, 3);
insert into orders values (4, 2);

lineitems:

CREATE TABLE lineitems (
    order_id number REFERENCES orders(order_id) ON DELETE CASCADE,
    item_id  number REFERENCES items(item_id) ON DELETE CASCADE,
    quantity number,
    PRIMARY KEY(order_id, item_id));

insert into lineitems values(1, 845, 1);
insert into lineitems values(2, 232, 1);
insert into lineitems values(2, 429, 4);
insert into lineitems values(3, 122, 1);
insert into lineitems values(4, 345, 1);

Create a view named search_view based on the tables you created. You will later add this view to your ubiquitous search index.

CREATE OR REPLACE VIEW search_view(cust_id, business_object,
  CONSTRAINT search_view_pk
    PRIMARY KEY(cust_id)
      RELY DISABLE NOVALIDATE,
  CONSTRAINT search_view_fk
    FOREIGN KEY(cust_id) REFERENCES customers(cust_id)
      DISABLE NOVALIDATE) AS
SELECT c.cust_id, JSON_OBJECT(
          'id' VALUE c.cust_id,
          'name' VALUE (c.first || ' ' || c.last),
          'num_orders' VALUE (
               SELECT COUNT(*)
               FROM orders o
               WHERE o.cust_id = c.cust_id),
          'orders' VALUE (
               SELECT JSON_ARRAYAGG(
                    JSON_OBJECT(
                         'order_id' VALUE o.order_id,
                         'items' VALUE (
                              SELECT JSON_ARRAYAGG (
                                   JSON_OBJECT(
                                        'id' VALUE l.item_id,
                                        'name' VALUE i.name,
                                        'quantity' VALUE l.quantity,
                                        'single_item_price' VALUE i.price,
                                        'total_price' VALUE (i.price * l.quantity)))
                               FROM lineitems l, items i
                               WHERE l.order_id = o.order_id
                               AND i.item_id = l.item_id)))
          FROM orders o
          WHERE o.cust_id = c.cust_id) ABSENT ON NULL) business_object
FROM customers c;

Create a ubiquitous search index named MY_SEARCH_INDEX.

EXEC DBMS_SEARCH.CREATE_INDEX('DOCUSER.MY_SEARCH_INDEX', NULL, 'JSON');

Note that you can omit tablespace and datatype (defaults to JSON), as follows:

exec dbms_search.create_index('MY_SEARCH_INDEX');

Run the following command to determine the structure of your index, which is created in the DOCUSER schema:

DESC MY_SEARCH_INDEX;

 Name             Null?      Type
 ---------------- -------- ----------------------------
 METADATA         NOT NULL   JSON
 DATA                        JSON
 OWNER                       VARCHAR2(128)
 SOURCE                      VARCHAR2(128)
 KEY                         VARCHAR2(1024)

Add the CUSTOMERS table as data source to MY_SEARCH_INDEX.

EXEC DBMS_SEARCH.ADD_SOURCE('DOCUSER.MY_SEARCH_INDEX', 'DOCUSER.CUSTOMERS');

You can add all other table sources to this index, but it is not required to complete this example scenario.

Examine the DATA and METADATA columns of your index along with the DBMS_SEARCH dictionary views, as shown in the following steps:

Query the METADATA column:

SELECT JSON_SERIALIZE(METADATA FORMAT JSON) META
FROM DOCUSER.MY_SEARCH_INDEX
ORDER BY META;

The METADATA column helps the DBMS_SEARCH index to uniquely identify each row of the table or view that is indexed. You can see that the METADATA column stores a JSON representation of the following form for each indexed row of your customers table:

META
--------------------------------------------------------------------------------
{"OWNER":"DOCUSER","SOURCE":"CUSTOMERS","KEY":{"CUST_ID":1}}
{"OWNER":"DOCUSER","SOURCE":"CUSTOMERS","KEY":{"CUST_ID":2}}
{"OWNER":"DOCUSER","SOURCE":"CUSTOMERS","KEY":{"CUST_ID":3}}
{"OWNER":"DOCUSER","SOURCE":"CUSTOMERS","KEY":{"CUST_ID":5}}

4 rows selected.

Query the DATA column:
```
SELECT DATA
FROM DOCUSER.MY_SEARCH_INDEX;
```
Note that the DATA column does not store actual data. Instead, the data resides in the original base tables. This index references your data source tables to create a virtual indexed JSON document on the fly. After the data is fetched and indexed, this column is effectively emptied to avoid duplication.
```
DATA
--------------------------------------------------------------------------------


4 rows selected.
```

Get a virtual indexed document to examine the contents that are extracted from the customers table source.

SELECT 
JSON_SERIALIZE(
  DBMS_SEARCH.GET_DOCUMENT('DOCUSER.MY_SEARCH_INDEX', METADATA) FORMAT JSON) DOC
FROM DOCUSER.MY_SEARCH_INDEX
ORDER BY JSON_SERIALIZE(METADATA FORMAT JSON);

This document contains a JSON representation for each indexed row of the customers table that is added as data source to your index:

DOC
--------------------------------------------------------------------------------
{"DOCUSER":{"CUSTOMERS":{"CUST_ID":1,"FIRST":"Robert","LAST":"Smith"}}}
{"DOCUSER":{"CUSTOMERS":{"CUST_ID":2,"FIRST":"John","LAST":"Doe"}}}
{"DOCUSER":{"CUSTOMERS":{"CUST_ID":3,"FIRST":"James","LAST":"Martin"}}}
{"DOCUSER":{"CUSTOMERS":{"CUST_ID":5,"FIRST":"Jane","LAST":"Y"}}}

4 rows selected.

Query the USER_DBMS_SEARCH_INDEXES view to display metadata values for the index.

SELECT IDX_NAME FROM USER_DBMS_SEARCH_INDEXES ORDER BY IDX_NAME;

This view shows the index name added in your user schema, DOCUSER:

IDX_NAME
--------------------------------------------------
MY_SEARCH_INDEX

1 row selected.

Query the USER_DBMS_SEARCH_INDEX_SOURCES view to display metadata values for your data source.

SELECT IDX_NAME,SRC_OWNER,SRC_NAME,SRC_TYPE         
FROM USER_DBMS_SEARCH_INDEX_SOURCES
ORDER BY IDX_NAME, SRC_OWNER,SRC_NAME;

This view shows the data source details associated with your index, from your user schema (DOCUSER). Here, the source type T implies a "table" source:

IDX_NAME
------------------------------------------------
SRC_OWNER
------------------------------------------------
SRC_NAME
------------------------------------------------
SRC_TYPE
-
MY_SEARCH_INDEX
DOCUSER
CUSTOMERS
T

1 row selected.

Add a view to your index, then examine the METADATA column and dictionary views again to compare how the changes are reflected in the indexed data.

Add the view that you created (SEARCH_VIEW) as a data source to the index:

EXEC DBMS_SEARCH.ADD_SOURCE('DOCUSER.MY_SEARCH_INDEX', 'DOCUSER.SEARCH_VIEW');

Query the METADATA column:

SELECT JSON_SERIALIZE(METADATA FORMAT JSON) META
FROM DOCUSER.MY_SEARCH_INDEX
ORDER BY META;

The METADATA column additionally shows each row of the view that is indexed:

META
--------------------------------------------------------------------------------
{"OWNER":"DOCUSER","SOURCE":"CUSTOMERS","KEY":{"CUST_ID":1}}
{"OWNER":"DOCUSER","SOURCE":"CUSTOMERS","KEY":{"CUST_ID":2}}
{"OWNER":"DOCUSER","SOURCE":"CUSTOMERS","KEY":{"CUST_ID":3}}
{"OWNER":"DOCUSER","SOURCE":"CUSTOMERS","KEY":{"CUST_ID":5}}
{"OWNER":"DOCUSER","SOURCE":"SEARCH_VIEW","KEY":{"CUST_ID":1}}
{"OWNER":"CUSTOMER","SOURCE":"SEARCH_VIEW","KEY":{"CUST_ID":2}}
{"OWNER":"DOCUSER","SOURCE":"SEARCH_VIEW","KEY":{"CUST_ID":3}}
{"OWNER":"DOCUSER","SOURCE":"SEARCH_VIEW","KEY":{"CUST_ID":5}}

8 rows selected.

Query the USER_DBMS_SEARCH_INDEXES dictionary view.

SELECT IDX_NAME FROM USER_DBMS_SEARCH_INDEXES 
  ORDER BY IDX_NAME;

This view shows the index name present in your user schema, DOCUSER:

IDX_NAME
--------------------------------------------------------------------------------
MY_SEARCH_INDEX

1 row selected.

Query the USER_DBMS_SEARCH_INDEX_SOURCES dictionary view.

SELECT IDX_NAME,SRC_OWNER,SRC_NAME,SRC_TYPE
FROM USER_DBMS_SEARCH_INDEX_SOURCES
ORDER BY IDX_NAME, SRC_OWNER,SRC_NAME;

The output shows an additional row for SEARCH_VIEW, added as a data source to your index. Here, the source types T and V imply "table" and "view" sources, respectively:

IDX_NAME
----------------------------------------------------
SRC_OWNER
----------------------------------------------------
SRC_NAME
----------------------------------------------------
SRC_TYPE
-
MY_SEARCH_INDEX
DOCUSER
CUSTOMERS
T

MY_SEARCH_INDEX
DOCUSER
SEARCH_VIEW
V

2 rows selected.

Run queries against your index using the JSON_EXISTS operator.

Search for documents in the view source of your index, where the DATA column contains a JSON element $.DOCUSER.SEARCH_VIEW.

SELECT JSON_SERIALIZE(METADATA FORMAT JSON) META
FROM DOCUSER.MY_SEARCH_INDEX
WHERE JSON_EXISTS(DATA,'$.DOCUSER.SEARCH_VIEW')
ORDER BY META;

The output returns four rows from the DOCUSER schema with combined customer IDs as 1, 2, 3, and 5:

META
--------------------------------------------------------------------------------
{"OWNER":"DOCUSER","SOURCE":"SEARCH_VIEW","KEY":{"CUST_ID":1}}
{"OWNER":"DOCUSER","SOURCE":"SEARCH_VIEW","KEY":{"CUST_ID":2}}
{"OWNER":"DOCUSER","SOURCE":"SEARCH_VIEW","KEY":{"CUST_ID":3}}
{"OWNER":"DOCUSER","SOURCE":"SEARCH_VIEW","KEY":{"CUST_ID":5}}

4 rows selected.

Search for documents in the view source of you index as a virtual document, where the DATA column contains a JSON element $.DOCUSER.SEARCH_VIEW:

This is a similar query as shown in the previous step. However, here you can view an entire virtual indexed document with a JSON representation of all the metadata values:

SELECT 
JSON_SERIALIZE(
  DBMS_SEARCH.GET_DOCUMENT('DOCUSER.MY_SEARCH_INDEX', METADATA) FORMAT JSON) DOC
FROM DOCUSER.MY_SEARCH_INDEX
WHERE JSON_EXISTS(DATA,'$.DOCUSER.SEARCH_VIEW')
ORDER BY JSON_SERIALIZE(METADATA FORMAT JSON);

The output returns a JSON document with combined metadata values, as indexed in MY_SEARCH_INDEX:

DOC
--------------------------------------------------------------------------------
{"DOCUSER":{"SEARCH_VIEW":{"CUST_ID":1,"BUSINESS_OBJECT":{"id":1,"name":"Robert 
Smith","num_orders":2,"orders":[{"order_id":1,"items":[{"id":845,"name":"Meteor
 Impact Survival Kit","quantity":1,"single_item_price":299,"total_price":299}]},
{"order_id":2,"items":[{"id":232,"name":"Rubber Christmas Tree","quantity":1,
"single_item_price":65,"total_price":65},{"id":429,"name":"Air Guitar","quantity":4
,"single_item_price":9.99,"total_price":39.96}]}]}}}}

{"DOCUSER":{"SEARCH_VIEW":{"CUST_ID":2,"BUSINESS_OBJECT":{"id":2,"name":"John
Doe","num_orders":1,"orders":[{"order_id":4,"items":[{"id":345,"name":"Border
Patrol Costume","quantity":1,"single_item_price":19.99,"total_price":19.99}]
}]}}}}

{"DOCUSER":{"SEARCH_VIEW":{"CUST_ID":3,"BUSINESS_OBJECT":{"id":3,"name":"James 
Martin","num_orders":1,"orders":[{"order_id":3,"items":[{"id":122,"name":"Potato
Gun","quantity":1,"single_item_price":29.99,"total_price":29.99}]}]}}}}

{"DOCUSER":{"SEARCH_VIEW":{"CUST_ID":5,"BUSINESS_OBJECT":{"id":5,"name":"Jane Y",
"num_orders":0}}}}

4 rows selected.

Perform a textual search query on targeted paths using the JSON_TEXTCONTAINS operator.

Query the $.DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.name JSON path in the DATA column for the keywords "Anon or Jane".

SELECT 
JSON_SERIALIZE(
  DBMS_SEARCH.GET_DOCUMENT('DOCUSER.MY_SEARCH_INDEX', METADATA) FORMAT JSON) DOC
FROM DOCUSER.MY_SEARCH_INDEX
WHERE
JSON_TEXTCONTAINS(DATA,'$.DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.name',
                       'Anon or Jane')
ORDER BY JSON_SERIALIZE(METADATA FORMAT JSON);

The output returns a JSON document with the customer ID as 5 and the name as Jane Y, from the SEARCH_VIEW source in the DOCUSER schema:

DOC
---------------------------------------------------------------------------------
{"DOCUSER":{"SEARCH_VIEW":{"CUST_ID":5,"BUSINESS_OBJECT":{"id":5,"name":"Jane Y",
"num_orders":0}}}}

1 row selected.

Use the SCORE operator with JSON_TEXTCONTAINS to obtain a relevance score for your search result.

SELECT METADATA, score(1) from DOCUSER.MY_SEARCH_INDEX
    WHERE JSON_TEXTCONTAINS(
        DATA,'$.DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.name','Anon or Jane',1);

The output returns metadata values and a relevance score of 5 for the matching record of customer ID 5, Jane Y.

 METADATA
--------------------------------------------------------------------------------
  SCORE(1)
----------
{"OWNER":"DOCUSER","SOURCE":"SEARCH_VIEW","KEY":{"CUST_ID":5}}
         5

1 row selected.

Search across the entire schema using the CONTAINS operator.

Query the index to retrieve records that match the keywords "Anon or Jane".

SELECT 
JSON_SERIALIZE(
  DBMS_SEARCH.GET_DOCUMENT('DOCUSER.MY_SEARCH_INDEX', METADATA) FORMAT JSON) DOC
FROM DOCUSER.MY_SEARCH_INDEX
WHERE
CONTAINS(DATA, 'Anon or Jane') > 0
ORDER BY JSON_SERIALIZE(METADATA FORMAT JSON);

The output returns two JSON objects from the DOCUSER schema. One with the customer ID as 5 and the name as Jane Y, from the CUSTOMERS table source. Another also with the customer ID as 5 and the name as Jane Y, but from the SEARCH_VIEW view source.

Note that the business object here has zero orders associated with it.

DOC
--------------------------------------------------------------------------------
{"DOCUSER":{"CUSTOMERS":{"CUST_ID":5,"FIRST":"Jane","LAST":"Y"}}}
{"DOCUSER":{"SEARCH_VIEW":{"CUST_ID":5,"BUSINESS_OBJECT":{"id":5,"name":"Jane Y",
 "num_orders":0}}}}

2 rows selected.

Use the SCORE operator with CONTAINS to obtain a relevance score for your search result.

SELECT METADATA, score(1) as search_score 
    from DOCUSER.MY_SEARCH_INDEX  
    WHERE CONTAINS(DATA, 'Anon or Jane',1)>0;

Here, the output returns the matching records of customer ID 5, Jane Y, from both the table source and view source. It also shows a search score of 5 for both the records.

METADATA
--------------------------------------------------------------------------------
SEARCH_SCORE
------------
{"OWNER":"DOCUSER","SOURCE":"CUSTOMERS","KEY":{"CUST_ID":5}}
           5

{"OWNER":"DOCUSER","SOURCE":"SEARCH_VIEW","KEY":{"CUST_ID":5}}
           5

2 rows selected.

Use the DBMS_SEARCH.FIND procedure to retrieve a hitlist. This also facets an aggregations of JSON documents based on the specified query-by-example (QBE) filter conditions.

SELECT JSON_SERIALIZE(DBMS_SEARCH.FIND('DOCUSER.MY_SEARCH_INDEX',
JSON('{
  "$query" : { "DOCUSER.SEARCH_VIEW.*" : { "$contains" : "Gun or patrol costume" }  },
  "$facet" : [
    { "$sum" : "DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.orders.items.total_price"},
    { "$count" : 
      { "path" : "DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.orders.items.single_item_price",
        "bucket" : [ { "$lt" : 20  }, { "$gte" : 20 } ] } },
    { "$uniqueCount" : "DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.name"  }
  ]
}')) FORMAT JSON PRETTY) AGG FROM DUAL;

The output shows an aggregation result in JSON format.

Here, the query searches for the phrase Gun or patrol costume within DOCUSER.SEARCH_VIEW. The $count indicates two records that match the query criteria. The $facet key groups multiple aggregation results into separate buckets, performing the following aggregations:

DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.orders.items.total_price:

Aggregates the total price of all items in the orders.items.total_price field into a total of 49.98.
DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.orders.items.single_item_price:

Aggregates the prices of individual items in the single_item_price field and groups them into price ranges (buckets).
DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.name:

Counts the unique occurrences of names, John Doe and James Martin, in the BUSINESS_OBJECT.name field. Both the names appear once.

AGG
--------------------------------------------------------------------------------
{
  "$count" : 2,
  "$facet" :
  [
    {
      "DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.orders.items.total_price" :
      {
        "$sum" : 49.98
      }
    },
    {
      "DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.orders.items.single_item_price" :
      [
        {
          "bucket" :
          {
            "$gte" : 19.99,
            "$lt" : 20
          },
          "$count" : 1
        },
        {
          "bucket" :
          {
            "$gte" : 20,
            "$lte" : 29.99
          },
          "$count" : 1
        }
      ]
    },
    {
      "DOCUSER.SEARCH_VIEW.BUSINESS_OBJECT.name" :
      [
        {
          "value" : "John Doe",
          "$uniqueCount" : 1
        },
        {
          "value" : "James Martin",
          "$uniqueCount" : 1
        }
      ]
    }
  ]
}

1 row selected.

13.2.1 Create and Query DBMS_SEARCH Indexes Using Multiple Tables and Views