MySQL 9.3 Reference Manual Including MySQL NDB Cluster 9.3
MySQL supports indexes on generated columns. For example:
CREATE TABLE t1 (f1 INT, gc INT AS (f1 + 1) STORED, INDEX (gc));
The generated column, gc
, is defined as the
expression f1 + 1
. The column is also indexed
and the optimizer can take that index into account during
execution plan construction. In the following query, the
WHERE
clause refers to gc
and the optimizer considers whether the index on that column
yields a more efficient plan:
SELECT * FROM t1 WHERE gc > 9;
The optimizer can use indexes on generated columns to generate
execution plans, even in the absence of direct references in
queries to those columns by name. This occurs if the
WHERE
, ORDER BY
, or
GROUP BY
clause refers to an expression that
matches the definition of some indexed generated column. The
following query does not refer directly to gc
but does use an expression that matches the definition of
gc
:
SELECT * FROM t1 WHERE f1 + 1 > 9;
The optimizer recognizes that the expression f1 +
1
matches the definition of gc
and
that gc
is indexed, so it considers that
index during execution plan construction. You can see this using
EXPLAIN
:
mysql> EXPLAIN SELECT * FROM t1 WHERE f1 + 1 > 9\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t1
partitions: NULL
type: range
possible_keys: gc
key: gc
key_len: 5
ref: NULL
rows: 1
filtered: 100.00
Extra: Using index condition
In effect, the optimizer has replaced the expression f1
+ 1
with the name of the generated column that matches
the expression. That is also apparent in the rewritten query
available in the extended EXPLAIN
information displayed by SHOW
WARNINGS
:
mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select `test`.`t1`.`f1` AS `f1`,`test`.`t1`.`gc`
AS `gc` from `test`.`t1` where (`test`.`t1`.`gc` > 9)
The following restrictions and conditions apply to the optimizer's use of generated column indexes:
For a query expression to match a generated column
definition, the expression must be identical and it must
have the same result type. For example, if the generated
column expression is f1 + 1
, the
optimizer does not recognize a match if the query uses
1 + f1
, or if f1 + 1
(an integer expression) is compared with a string.
The optimization applies to these operators:
=
,
<
,
<=
,
>
,
>=
,
BETWEEN
, and
IN()
.
For operators other than
BETWEEN
and
IN()
, either operand can be
replaced by a matching generated column. For
BETWEEN
and
IN()
, only the first argument
can be replaced by a matching generated column, and the
other arguments must have the same result type.
BETWEEN
and
IN()
are not yet supported for
comparisons involving JSON values.
The generated column must be defined as an expression that
contains at least a function call or one of the operators
mentioned in the preceding item. The expression cannot
consist of a simple reference to another column. For
example, gc INT AS (f1) STORED
consists
only of a column reference, so indexes on
gc
are not considered.
For comparisons of strings to indexed generated columns that
compute a value from a JSON function that returns a quoted
string, JSON_UNQUOTE()
is
needed in the column definition to remove the extra quotes
from the function value. (For direct comparison of a string
to the function result, the JSON comparator handles quote
removal, but this does not occur for index lookups.) For
example, instead of writing a column definition like this:
doc_name TEXT AS (JSON_EXTRACT(jdoc, '$.name')) STORED
Write it like this:
doc_name TEXT AS (JSON_UNQUOTE(JSON_EXTRACT(jdoc, '$.name'))) STORED
With the latter definition, the optimizer can detect a match for both of these comparisons:
... WHERE JSON_EXTRACT(jdoc, '$.name') = 'some_string
' ... ... WHERE JSON_UNQUOTE(JSON_EXTRACT(jdoc, '$.name')) = 'some_string
' ...
Without JSON_UNQUOTE()
in the
column definition, the optimizer detects a match only for
the first of those comparisons.
If the optimizer picks the wrong index, an index hint can be used to disable it and force the optimizer to make a different choice.