30.8 Creating Multivalue Function-Based Indexes for JSON_EXISTS
For JSON data that is stored as JSON
data type you can use
a multivalue function-based index for SQL/JSON condition json_exists
. Such
an index targets scalar JSON values, either individually or within a JSON array.
The main use of a multivalue index is to index scalar values within arrays. This includes scalar array elements, but also scalar field values of object array elements.
A multivalue index can also index a single scalar value, but for queries that target a single value it is generally more performant to use a B-tree or bitmap index.
In a query, you use json_exists
in the
WHERE
clause of a SELECT
statement. Condition
json_exists
returns true if the data it targets matches the
SQL/JSON path expression (or equivalent simple dot-notation syntax) in the query.
Otherwise it returns false. It is common for the path expression to include a filter
expression — matching then requires that the targeted data satisfy that filter.
You create a multivalue index using CREATE INDEX
with
keyword MULTIVALUE
, and using either the syntax of SQL/JSON
function json_table
or simple dot-notation that you use in queries
to specify the path to the indexed data. (However, you cannot use a SQL
NESTED
clause in place of
json_table
— a compile-time error is raised if you do
that.)
You can create a composite function-based index, to index more
than one virtual column, that is, more than one JSON field. A composite index acts
like a set of function-based indexes. When used to query, you use function
json_table
to project specified JSON field values as virtual
columns of SQL scalar values. Similarly, when used to define an index, the field
values that json_table
specifies are indexed as a composite
function-based index.
When using json_table
syntax to create a multivalue
index you must use these error-handling clauses: ERROR ON ERROR NULL
ON EMPTY NULL ON MISMATCH
; otherwise, a query compile-time error is
raised. When using simple dot-notation syntax without json_table
,
the behavior of these clauses is provided implicitly.
When using json_table
syntax you can use a FOR
ORDINALITY
clause, to enable use of the index for queries that target
specific array positions. (See COLUMNS Clause of SQL/JSON Function JSON_TABLE.)
For a multivalue index to be picked up by a query, the index must specify the SQL type of the data to be indexed, and the SQL type for the query result must match the type specified by the index.
If you create a non-composite multivalue index, that is, without using
json_table
syntax, then the index specification must
include a data-type conversion item method (other than
binary()
and dateWithTime()
), to indicate the
SQL data type. See SQL/JSON Path Expression Item Methods for information about the data-type conversion item methods.
If the index uses an item method with "only" in its name then only queries that use that same item method can pick up the index. Otherwise (with a non-"only" method or with no method), any query that targets a scalar value (possibly as an array element) that can be converted to the type indicated by the item method can pick up the index.
For example, a multivalue index that uses item method
numberOnly()
can only be picked up for a query that also uses
numberOnly()
. But an index that uses number()
,
or that uses no item method, can be picked up for a query that matches any scalar
(such as the string "3.14"
) that can be converted to a number.
If you create a composite multivalue index then the
json_table
virtual column type specifies the SQL type to use.
This means that queries of data that can be converted to the specified SQL
type can pick up the index.
However, just as in the non-composite index case, you can use a data-type conversion item method with "only" in its name, to override (further constrain) the specified column type. You use the item method in the column path expression.
For example, if the column type is specified as NUMBER
then queries
with matching data (such as the string "3.14"
) that can be
converted to a number can pick up the index. But if the column path expression uses
item method numberOnly()
then only queries that also use
numberOnly()
can pick up the index.
You can create more than one multivalue index for a given target. For
example, you can create one index for a field month
that uses item
method number()
and another for the same field that uses item
method string()
.
The following are not allowed, as ways to create a multivalue index:
-
You cannot specify sibling nested arrays in the
json_table
expression used to create a composite multivalue index. An error is raised if you try. You can index multiple arrays, but they cannot be siblings, that is, they cannot have the same parent field. -
Using a SQL
NESTED
clause (see SQL NESTED Clause Instead of JSON_TABLE).
A type-error mismatch between the type of a scalar JSON value and the
corresponding scalar SQL data type of a json_table
virtual column
can be because of type incompatibility, as put forth in Table 18-2, or because the SQL data type is too constraining — too small to
store the data.
Error-handling ERROR ON ERROR NULL ON EMPTY NULL ON
MISMATCH
returns SQL NULL
for the first kind of
mismatch, but it raises an error for the second kind. For example, type
incompatibility is tolerated when creating an index with SQL type
NUMBER
for JSON string data, but an error is raised if you try
to create an index using SQL type VARCHAR(2)
for data that has a
JSON string value of "hello"
, because the data has more than two
characters.
Example 30-13 Table PARTS_TAB, for Multivalue Index Examples
Table parts_tab
, with JSON
data type
column jparts
, is used in multivalue index examples here. The JSON
data includes field subparts
whose value is an array with scalar
elements.
CREATE TABLE parts_tab (id NUMBER, jparts JSON);
INSERT INTO parts_tab VALUES
(1, '{"parts" : [{"partno" : 3, "subparts" : [510, 580, 520]},
{"partno" : 4, "subparts" : 730}]}');
INSERT INTO parts_tab VALUES
(2, '{"parts" : [{"partno" : 7, "subparts" : [410, 420, 410]},
{"partno" : 4, "subparts" : [710, 730, 730]}]}');
Example 30-14 Creating a Multivalue Index for JSON_EXISTS
The multivalue index created here indexes the value of field
subparts
. The table alias (t
in this case) is
required when using simple dot notation syntax.
If the subparts
value targeted by a query is an
array then the index can be picked up for any array elements that are
numbers. If the value is a scalar then the index can be picked up if the
scalar is a number.
Given the data in table parts_tab
, a
subparts
field in each of the objects of array
parts
in the first row (which has id
1
) is indexed: the field in the first object because its array
value has elements that are numbers (510
, 580
, and
520
) the field in the second object because its value is a
number (730
).
If item method number()
were used in the index
definition, instead of numberOnly()
, then non-number scalar values
(such as the string "730"
) that can be converted to numbers would
also be indexed.
CREATE MULTIVALUE INDEX mvi ON parts_tab t
(t.jparts.parts.subparts.numberOnly());
Example 30-15 Creating a Composite Multivalue Index for JSON_EXISTS
This example creates a composite multivalue index that targets both field
partno
and field subparts
. The composite index
acts like a set of two function-based indexes that target those two fields.
The query uses json_table
syntax with a SQL/JSON path
expression for the row pattern, $.parts[*]
. As must always be the
case for multivalue index creation using json_table
, the error
handling is specified as ERROR ON ERROR NULL ON EMPTY NULL ON
MISMATCH
.
Column PARTNUM
is given SQL data type
NUMBER(10)
here, which means that, for the index to be used for
a query that targets field partno
, the value of that field must be
convertible to that data type.
-
If type conversion is impossible because the types are generally incompatible, as put forth in Table 18-2, then the
NULL ON MISMATCH
error handler causes SQLNULL
to be returned. An example of this would be apartno
string value of"hello"
for the SQLpartNum
column of typeNUMBER(10)
. -
If, on the other hand, the SQL data type storage is too constraining then an error is raised — the index is not created. An example of this would be a
partno
string with more than 10 characters, such as"1234567890123"
.
CREATE MULTIVALUE INDEX cmvi_1 ON parts_tab
(json_table(jparts, '$.parts[*]'
ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH
COLUMNS (partNum NUMBER(10) PATH '$.partno',
NESTED PATH '$.subparts[*]'
COLUMNS (subpartNum NUMBER(20) PATH '$'))));
Example 30-16 Creating a Composite Multivalue Index That Can Target Array Positions
The code in this example is like that in Example 30-15, except that it also specifies virtual column SEQ
for ordinality. That means that values in the column just before it,
SUBPARTNUM
, can be accessed by way of their (one-based)
positions in array subparts
. (The SQL data type of a FOR
ORDINALITY
column is always NUMBER
.)
As always, at most one entry in a COLUMNS
clause can be
a column name followed by FOR ORDINALITY
, which specifies a column
of generated row numbers (SQL data type NUMBER
), starting with one.
Otherwise, an error is raised when creating the index.
In addition to that general rule for json_table
syntax:
-
When
json_table
is used to create a multivalue index, theFOR ORDINALITY
column must be the last column of json_table. (This is not required whenjson_table
is used in queries; it applies only to index creation.) -
In order for a multivalue index created using
json_table
to be picked up for a given query, the query must apply a filter expression to the JSON field corresponding to the first virtual column of thejson_table
expression.
In order for a query that targets array elements by their position
to pick up a multivalue index for array positions, the index column for those array
elements must be the one immediately before the FOR
ORDINALITY
column
(The code here uses simple dot notation for the row pattern; if it instead used a SQL/JSON path expression for the row pattern, the rest of the code would be the same.)
CREATE MULTIVALUE INDEX cmvi_2 ON parts_tab t
(t.jparts.parts[*]
ERROR ON ERROR NULL ON EMPTY NULL ON MISMATCH
COLUMNS (partNum NUMBER(10) PATH '$.partno',
NESTED PATH subparts[*]
COLUMNS (subpartNum NUMBER(20) PATH '$',
seq FOR ORDINALITY))));