MySQL 8.0 Reference Manual Including MySQL NDB Cluster 8.0
A character string literal, hexadecimal literal, or bit-value
literal may have an optional character set introducer and
COLLATE
clause, to designate it as a string
that uses a particular character set and collation:
[_charset_name
]literal
[COLLATEcollation_name
]
The _
expression is formally called an
introducer. It tells the parser, “the
string that follows uses character set
charset_name
charset_name
.” An introducer
does not change the string to the introducer character set like
CONVERT()
would do. It does not
change the string value, although padding may occur. The
introducer is just a signal.
For character string literals, space between the introducer and the string is permitted but optional.
For character set literals, an introducer indicates the
character set for the following string, but does not change how
the parser performs escape processing within the string. Escapes
are always interpreted by the parser according to the character
set given by
character_set_connection
. For
additional discussion and examples, see
Section 12.3.6, “Character String Literal Character Set and Collation”.
Examples:
SELECT 'abc'; SELECT _latin1'abc'; SELECT _binary'abc'; SELECT _utf8mb4'abc' COLLATE utf8mb4_danish_ci; SELECT _latin1 X'4D7953514C'; SELECT _utf8mb4 0x4D7953514C COLLATE utf8mb4_danish_ci; SELECT _latin1 b'1000001'; SELECT _utf8mb4 0b1000001 COLLATE utf8mb4_danish_ci;
Character set introducers and the COLLATE
clause are implemented according to standard SQL specifications.
Character string literals can be designated as binary strings by
using the _binary
introducer. Hexadecimal
literals and bit-value literals are binary strings by default,
so _binary
is permitted, but normally
unnecessary. _binary
may be useful to
preserve a hexadecimal or bit literal as a binary string in
contexts for which the literal is otherwise treated as a number.
For example, bit operations permit numeric or binary string
arguments in MySQL 8.0 and higher, but treat
hexadecimal and bit literals as numbers by default. To
explicitly specify binary string context for such literals, use
a _binary
introducer for at least one of the
arguments:
mysql>SET @v1 = X'000D' | X'0BC0';
mysql>SET @v2 = _binary X'000D' | X'0BC0';
mysql>SELECT HEX(@v1), HEX(@v2);
+----------+----------+ | HEX(@v1) | HEX(@v2) | +----------+----------+ | BCD | 0BCD | +----------+----------+
The displayed result appears similar for both bit operations,
but the result without _binary
is a
BIGINT
value, whereas the result with
_binary
is a binary string. Due to the
difference in result types, the displayed values differ:
High-order 0 digits are not displayed for the numeric result.
MySQL determines the character set and collation of a character string literal, hexadecimal literal, or bit-value literal in the following manner:
If both _charset_name
and
COLLATE
are
specified, character set
collation_name
charset_name
and collation
collation_name
are used.
collation_name
must be a
permitted collation for
charset_name
.
If _charset_name
is specified but
COLLATE
is not specified, character set
charset_name
and its default
collation are used. To see the default collation for each
character set, use the SHOW CHARACTER
SET
statement or query the
INFORMATION_SCHEMA
CHARACTER_SETS
table.
If _charset_name
is not specified
but COLLATE
is
specified:
collation_name
For a character string literal, the connection default
character set given by the
character_set_connection
system variable and collation
collation_name
are used.
collation_name
must be a
permitted collation for the connection default character
set.
For a hexadecimal literal or bit-value literal, the only
permitted collation is binary
because
these types of literals are binary strings by default.
Otherwise (neither _charset_name
nor COLLATE
is
specified):
collation_name
For a character string literal, the connection default
character set and collation given by the
character_set_connection
and
collation_connection
system variables are used.
For a hexadecimal literal or bit-value literal, the
character set and collation are
binary
.
Examples:
Nonbinary strings with latin1
character
set and latin1_german1_ci
collation:
SELECT _latin1'Müller' COLLATE latin1_german1_ci; SELECT _latin1 X'0A0D' COLLATE latin1_german1_ci; SELECT _latin1 b'0110' COLLATE latin1_german1_ci;
Nonbinary strings with utf8mb4
character
set and its default collation (that is,
utf8mb4_0900_ai_ci
):
SELECT _utf8mb4'Müller'; SELECT _utf8mb4 X'0A0D'; SELECT _utf8mb4 b'0110';
Binary strings with binary
character set
and its default collation (that is,
binary
):
SELECT _binary'Müller'; SELECT X'0A0D'; SELECT b'0110';
The hexadecimal literal and bit-value literal need no introducer because they are binary strings by default.
A nonbinary string with the connection default character set
and utf8mb4_0900_ai_ci
collation (fails
if the connection character set is not
utf8mb4
):
SELECT 'Müller' COLLATE utf8mb4_0900_ai_ci;
This construction (COLLATE
only) does not
work for hexadecimal literals or bit literals because their
character set is binary
no matter the
connection character set, and binary
is
not compatible with the
utf8mb4_0900_ai_ci
collation. The only
permitted COLLATE
clause in the absence
of an introducer is COLLATE binary
.
A string with the connection default character set and collation:
SELECT 'Müller';