MySQL HeatWave User Guide
As of MySQL 9.4.0, you can set external table options using
SQL syntax. The following options apply to
CREATE TABLE and
ALTER TABLE statements. For
ALTER TABLE statements,
ADD COLUMN is not supported for external
tables.
This topic contains the following sections:
The following syntax shows how to specify external table options with SQL syntax.
CREATE EXTERNAL TABLE table_name (col_name data_type ['Col_Options_SQL'], ...)Tbl_Options_SQL;
After creating a table, you can load data into the table in the following ways:
Load
the data automatically with Lakehouse Auto Parallel Load by using the
HEATWAVE_LOAD command. See how to
Load External Data Using Lakehouse Auto Parallel Load with an Existing Table.
Load
the data manually with the
ALTER TABLE statement and
the option to use Guided Load.
Tbl_Options_SQL(table_option[table_option]...)table_option: { | FILE_FORMAT [=]format_spec| FILES [=] (file_spec[,file_spec] ...) | ALLOW_MISSING_FILES [=] {0 | 1 | DEFAULT} | AUTO_REFRESH_SOURCE [=] {'string' | NONE} | VERIFY_KEY_CONSTRAINTS [=] {0 | 1 | DEFAULT} | STRICT_LOAD [=] {0 | 1 | DEFAULT} } format_spec: { [general_options] [field_options] [LINES [TERMINATED BY 'string']] [IGNORE number {LINES | ROWS}] } general_options: {general_option[general_option] ...} general_option: { [FORMAT {csv | avro | json | parquet | delta}] | [COMPRESSION {zip | gzip | bzip2 | auto}] | [CHARACTER SETcharset name] | [HEADER {ON | OFF}] } field_options:{ {FIELDS | COLUMNS}field_option[field_option]...} field_option:{ DATE FORMAT 'date_format' | DATETIME FORMAT 'datetime_format' | TIME FORMAT 'time_format' | ESCAPED BY {'char' | ''} | TERMINATED BY | { [OPTIONALLY] ENCLOSED BY {'char' | ''}} } file_spec: {file_attribute[file_attribute] ...} file_attribute: { {URI | URL} [=] 'uri_string' | FILE_NAME [=] 'object_name' | FILE_PATTERN [=] 'pattern' | FILE_PREFIX [=] 'prefix_name' | ALLOW_MISSING_FILES [=] {0 | 1 | DEFAULT} | STRICT_LOAD [=] {0 | 1 | DEFAULT} }
For a complete list of table options, see
CREATE TABLE.
For FILES options, you can set the
following:
Uniform Resource Identifiers (URI)
Available for MySQL HeatWave on OCI. You can provide an OCIFS URI, PAR URI, or Native URI.
Pre-Authenticated Requests (PAR)
See Object Storage Pre-Authenticated Requests in Oracle Cloud Infrastructure Documentation.
You cannot use resource principals to specify details for Object Storage files. To use resource principals, you must use External Table ENGINE_ATTRIBUTE Options with JSON syntax. Alternatively, you can convert resource principals to URIs.
For versions earlier than MySQL 9.5.0, you can override some
options at the column level by specifying column-level
parameters with the EXTERNAL_FORMAT
statement. For MySQL 9.5.0 and later, you can do this by
loading
data manually with Guided Load disabled. You can set
DATE, TIME,
DATETIME, and
TIMESTAMP as optional column-level
parameters.
As of MySQL 9.5.0, Lakehouse Auto Parallel Load and Guided Load can infer supported
formats for temporal data types by setting the appropriate
dialect-level temporal formats to auto.
The auto setting is the default value if
no temporal format is specified. If you set a specific
format at the column level for a temporal data type when
creating a table, and a different format is inferred when
loading the table to Lakehouse, the specific format used
for the column when creating the table is prioritized and
used for loading the column. See
About Lakehouse Auto Parallel Load Schema Inference to
learn more.
EXTERNAL_FORMAT {'date_format' | 'time_format' | 'timestamp_format'}
The following example sets the
date_format,
time_format, and
timestamp_format at the column level.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 date EXTERNAL_FORMAT '%W %M %Y',
col_2 time EXTERNAL_FORMAT '%H:%i:%s',
col_3 timestamp EXTERNAL_FORMAT '%Y-%m-%d %H:%i:%s',
col_4 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
The following example sets the
EXTERNAL_FORMAT option to
auto, which allows Lakehouse Auto Parallel Load and Guided Load to
infer supported formats for temporal data types.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 date EXTERNAL_FORMAT 'auto',
col_2 time EXTERNAL_FORMAT 'auto',
col_3 timestamp EXTERNAL_FORMAT 'auto',
col_4 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');The following table parameters are optional and specify options at the table level.
ALLOW_MISSING_FILES: Handles missing
files based on the selected option. This overrides the
table parameter STRICT_LOAD for
missing files. The supported options are:
1: An error does not occur for
any missing file, and data loading continues with
the existing files. An error occurs if all files are
not available.
0: If any file is missing, an
error occurs and data is not loaded.
If you set no value or DEFAULT,
the setting is defined by
STRICT_LOAD. If no value is set
for STRICT_LOAD, the setting is
defined by the
sql_mode.
Review Table 4.6, “Combinations of Settings for ALLOW_MISSING_FILES and STRICT_LOAD”.
A missing file is defined as:
With the FILE_NAME parameter:
There is no file with that name.
With the FILE_PATTERN parameter:
There are no files that match the pattern.
With the FILE_PREFIX parameter:
There are no files with that prefix.
The following example sets
ALLOW_MISSING_FILES to
1 in the table options. If any of the
specified files are not available, an error does not
occur and data continues loading for the existing files.
If all files are missing, an error occurs. If a
file-specific setting is set, this may override the
ALLOW_MISSING_FILES setting.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv',
URI = 'oci://mybucket@mynamespace/data_files/data_file_2.csv',
URI = 'oci://mybucket@mynamespace/data_files/data_file_3.csv')
ALLOW_MISSING_FILES = 1;
AUTO_REFRESH_SOURCE: Supported as of
MySQL 9.4.1. Specifies a Stream OCID for event-based
incremental load. This string value is verified as a
valid Stream resource identifier. Learn more about
Resource Identifiers from Oracle Cloud Infrastructure
Documentation. Learn how to
Refresh Data Using Event-Based Incremental Load.
The following example specifies a Stream OCID to set up event-based incremental load.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv')
AUTO_REFRESH_SOURCE = 'ocid1.stream.oc1.iad...';
STRICT_LOAD: Loads data in strict
mode or non-strict mode, based on the selected option.
By default, STRICT_LOAD is set to the
sql_mode value. See
Strict SQL Mode. The
FILES parameter
STRICT_LOAD can override this
setting. The supported options are:
1: Loads the data in strict mode.
The data loading stops if there is an error due to
missing files, empty columns, formatting errors or
parsing errors.
0: Loads the data in non-strict
mode. Missing files, empty columns, formatting
errors, or parsing errors display a warning, and
data is loaded. If there are missing columns,
implicit default values are used in the absence of
user-provided default values. See
Data Type Default Values to learn more.
If you set no value or DEFAULT,
the setting is defined by the
sql_mode.
Review Table 4.6, “Combinations of Settings for ALLOW_MISSING_FILES and STRICT_LOAD”.
This setting overrides the global
sql_mode setting for
handling missing files. The default is the value of
sql_mode. See
Strict SQL Mode. The
FILES parameter
STRICT_LOAD can override this
setting.
The following example sets
STRICT_LOAD to 0
in the table options.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv')
STRICT_LOAD = 0;
VERIFY_KEY_CONSTRAINTS: Lakehouse
validates primary key and unique key constraints during
the initial load based on the selected option. The
supported options are:
1:
Lakehouse validates primary key and unique key
constraints only during the initial load of the
table. If there are subsequent loads or refreshes of
the table, validation does not occur.
0: Lakehouse does not validate
primary key and unique key constraints.
DEFAULT: Lakehouse does not
validate primary key and unique key constraints.
The following example sets
VERIFY_KEY_CONSTRAINTS to
0 in the table options.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv')
VERIFY_KEY_CONSTRAINTS = 0;
Refer to the following table to review how different
combinations of settings for
ALLOW_MISSING_FILES and
STRICT_LOAD affect errors for missing
files.
Table 4.6 Combinations of Settings for ALLOW_MISSING_FILES and STRICT_LOAD
| ALLOW_MISSING_FILES | STRICT_LOAD | Possible Errors |
|---|---|---|
| 1 | 1 | No error generated unless all files are missing. |
| 1 | 0 | |
| 1 | Not set | |
| Not set | 0 | |
| 0 | 1 | Error generated for any missing files. |
| 0 | 0 | |
| 0 | Not set | |
| Not set | 1 | |
| Not set | Not set | Error generation depends on setting for
sql_mode. |
File specifications are enclosed in
FILES.
A FILES clause is required if you specify
any additional external table options. The following
parameters are available:
FILES: Defines the Object Storage
files. You can define the file locations using uniform
resource identifier (URI) or PAR URL.
As of MySQL 9.4.1, Lakehouse supports a maximum of 164
file locations. Before MySQL 9.4.1, Lakehouse supports
a maximum of 170 file locations. To define more than the
maximum number of files, store the files under the same
bucket or use FILE_PREFIX or
FILE_PATTERN.
FILES parameters for
uniform
resource identifiers:
URI: The URL of the URI you
specify. Do not provide a pattern, prefix, or name
parameter with the URI. These
details are inferred by the
object_path you define.
The following examples use the URI syntax for OCIFS:
oci://bucket_name@namespace_name/object_path
Name: If the object_path is
neither a glob pattern nor prefix.
The following example specifies a single file with an OCIFS URI.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
Prefix: If the object_path is
not a glob pattern and ends with an unencoded
/ character, such as a folder
path.
The following example uses a prefix with an OCIFS URI. The prefix specifies files in an Object Storage folder.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/');
Glob pattern: If the
object_path contains at least
one of the following characters:
?, *, or
[. To use these characters as
literals, you need to escape them or encode them
as needed depending on the URI syntax. Regex
patterns are not supported. See
Glob
Patterns from the Oracle Cloud Infrastructure
Documentation to learn more.
The following example uses an unencoded
* character to create a glob
pattern.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_files/data_file_*.csv');
Review additional examples that use URIs to create external tables.
FILES parameters for
pre-authenticated
requests:
URL: The PAR URL. Review
PAR
Recommendations.
The following example specifies a single file with a PAR.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/data_file_1.csv');
FILE_NAME: A specific Object
Storage file name. For
pre-authenticated
requests, the file name provided in the PAR
URL is considered.
The following example uses the
FILE_NAME parameter to specify a
file in an Object Storage bucket. The PAR points to
the Object Storage bucket.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/'
FILE_NAME = 'data_files/data_file_1.csv');
FILE_PATTERN: A regular
expression that defines a set of Object Storage
files.
The following example uses a pattern to load files that have a numerical suffix.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/'
FILE_PATTERN = 'data_files/data_file_\\d+\\.csv');
See the following to learn more:
FILE_PREFIX: The prefix for a set
of Object Storage files. The prefix or bucket name
present in the PAR URL is considered.
The following example uses a prefix to specify files to load in an Object Storage folder.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URL = 'https://objectstorage.us-ashburn-1.oraclecloud.com/p/.../n/tenancy_1/b/bucket_1/o/'
FILE_PREFIX = 'data_files/');
Review additional examples that use PARs to create external tables.
The following parameters enclosed in
FILES are optional:
ALLOW_MISSING_FILES: Handles missing
files based on the selected option. This overrides the
table option parameter STRICT_LOAD
for missing files. To learn about the default value, see
the ALLOW_MISSING_FILES table option
parameter in
Table Parameters.
The supported options are:
1: If any file is missing, no
error occurs and data loading continues with the
existing files. An error occurs if all the files are
not available.
0: If any file is missing, an
error occurs and data is not loaded.
The following example uses a glob pattern specified with
a uniform resource identifier (URI) and sets
ALLOW_MISSING_FILES to
1 in the FILES
parameters. If any of the specified files in the pattern
are not available, an error does not occur and data
continues loading for the existing files. If all files
are missing, an error occurs.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_[1-10].csv'
ALLOW_MISSING_FILES = 1);
The following example sets
ALLOW_MISSING_FILES to
0 for the first file and
ALLOW_MISSING_FILES to
1 for the second file. If the first
file is missing, an error occurs. If the second file is
missing, the load continues with the loading of the
first file.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv' ALLOW_MISSING_FILES = 0,
URI = 'oci://mybucket@mynamespace/data_files/data_file_2.csv' ALLOW_MISSING_FILES = 1);
STRICT_LOAD: Loads data in strict
mode or non-strict mode, based on the selected option.
This overrides the STRICT_LOAD table
option parameter. To learn about the default value, see
the STRICT_LOAD table option
parameter in
Table Parameters.
The supported options are:
1: Loads the data in strict mode.
The data loading stops if there is an error due to
missing files, empty columns, formatting errors or
parsing errors.
0: Loads the data in non-strict
mode. Missing files, empty columns, formatting
errors, or parsing errors display a warning, and
data is loaded. If there are missing columns,
implicit default values are used in the absence of
user-provided default values. See
Data Type Default Values to learn more.
The following example sets
STRICT_LOAD to 0
in the FILES parameters.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv'
STRICT_LOAD = 0);
File format options are enclosed in
FILE_FORMAT.
The following file format options are available:
FORMAT: The file format defined in a
table. You can define only one file format per table.
The supported file formats are:
csv: The default file format.
The following example specifies a CSV file.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
json: Only Newline Delimited JSON
files are supported. Tables created with
json format must only have a
single column that conforms to the
JSON data type.
The following example specifies a JSON file.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 json)
FILE_FORMAT = (FORMAT json)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.json');
parquet: The
Parquet
data type.
The following example specifies a Parquet file.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT parquet)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.parquet');
avro: The
Avro
data type.
The following example specifies an Avro file.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT avro)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.avro');
delta: Supported as of MySQL
9.5.0. Loads delta lake tables. See
Delta
Lake Tables and
Limitations
for Delta Lake Tables to learn more.
The following example specifies a delta lake table.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT delta)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/');
COMPRESSION: Use this to load
compressed files.
Compression is supported for all file formats:
csv: The default value is no
compression. You can set
compression to
zip, gzip, or
bzip2.
The following example specifies a CSV file
compressed in zip format.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv COMPRESSION 'zip')
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv.zip');
json: The default value is no
compression. You can set
compression to
zip, gzip, or
bzip2.
The following example specifies a JSON file
compressed in gzip format.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 json)
FILE_FORMAT = (FORMAT json COMPRESSION 'gzip')
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.json.gz');
avro: The default value is
auto. You cannot set any other
options for COMPRESSION, as Avro
files declare their compression format in the file
metadata.
The following specifies an Avro file with the
auto compression option.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT avro COMPRESSION 'auto')
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.avro');
parquet: The default value is
auto. You cannot set any other
options for COMPRESSION, as
Parquet files declare their compression format in
the file metadata.
The following example specifies a Parquet file with
the auto compression option.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT parquet COMPRESSION 'auto')
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.parquet');Note the following when loading compressed files:
Each compressed file must only contain a single original file. You cannot provide a compressed file with archives or folders with multiple files.
Files in zip format must be
compressed with the Deflate algorithm.
If loading multiple files, all the files must be in the same compression format.
When loading multiple compressed files, it is best to prepare files of similar size, and in the range of 80-120 MB each. Otherwise, you may experience longer loading times.
CHARACTER SET: Defines the character
encoding. The default is 'utf8mb4'.
HEADER: Adds a header row to the
CSV file based on the selected option
(ON or OFF). The
default is OFF.
If HEADER and
IGNORE are both specified,
Lakehouse first skips the number of rows, and then
uses the next row as the header row.
The following field options are available:
The FIELDS and
COLUMNS clauses define the format for
fields and columns. The following parameters are
available:
DATE FORMAT: The date format. See
date_format.
You can also set date formats for each column by
using the EXTERNAL_FORMAT
statement, which overrides the format in the table
parameter.
TIME FORMAT: The time format. See
String and Numeric Literals in Date and Time Context.
You can also set time formats for each column by
using the EXTERNAL_FORMAT
statement, which overrides the format in the table
parameter.
DATETIME FORMAT: The datetime
format. See
date_format. You can
also set datetime formats for each column by using
the EXTERNAL_FORMAT statement,
which overrides the format in the table parameter.
As of MySQL 9.5.0, Lakehouse Auto Parallel Load and Guided Load can infer
supported formats for temporal data types by setting
the appropriate field option to
auto. The auto
setting is the default value if no temporal format
is specified. See
About Lakehouse Auto Parallel Load Schema Inference
to learn more.
The following example sets the DATE
FORMAT, TIME FORMAT,
and DATETIME FORMAT.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 date,
col_2 time,
col_3 timestamp)
FILE_FORMAT = (FORMAT csv FIELDS DATE FORMAT '%W %M %Y' TIME FORMAT '%H:%i:%s' DATETIME FORMAT '%Y-%m-%d %H:%i:%s')
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
The following example sets the DATE
FORMAT, TIME FORMAT,
and DATETIME FORMAT to
auto, which allows Auto Parallel Load and Guided Load
to infer supported formats for temporal data types.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 date,
col_2 time,
col_3 timestamp)
FILE_FORMAT = (FORMAT csv FIELDS DATE FORMAT 'auto' TIME FORMAT 'auto' DATETIME FORMAT 'auto')
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
TERMINATED BY: Specifies one or
more characters used to enclose fields. The maximum
field delimiter length is 64 characters. You can add
any string as a
custom_field_delimiter.
When FIELDS TERMINATED BY is set
to auto, Lakehouse
automatically detects the field delimiters.
The following field delimiters are detected automatically by Lakehouse:
|: Pipe.
,: Comma
\t: Tab
;: Semicolon
The following example sets the FIELDS
TERMINATED BY and LINES
TERMINATED BY options.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv FIELDS TERMINATED BY '\r\n' LINES TERMINATED BY ';')
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
[OPTIONALLY] ENCLOSED BY:
Specifies one or more characters used to enclose
fields. The default is '\''. You
can add any string as a
custom_quotation_marks.
As of MySQL 9.5.0, you have the option to ignore the
quote character by setting its value to an empty
string.
ESCAPED BY: Specifies one or more
characters used to escape special characters. The
default is '\'. You might have to
use an additional escape character depending on the
client you use. You can add any string as a
custom_escape_character.
As of MySQL 9.5.0, you have the option to ignore the
escape character by setting its value to an empty
string.
Consider the following example for setting empty
strings for escape and quotation characters. You
have the following data in a CSV file:
"{""doc1"":{\""doc2\"":22}}".
You create an external table and set an empty
string for the escape character:
ESCAPED BY ''.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 text)
FILE_FORMAT = (FORMAT csv FIELDS ESCAPED BY '')
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
The value loads as:
{"doc1":{\"doc2\":22}}.
You create an external table and set an empty
string for the quotation character:
ENCLOSED BY ''.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 text)
FILE_FORMAT = (FORMAT csv FIELDS ENCLOSED BY '')
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
The value loads as:
"{""doc1"":{""doc2"":22}}".
You create an external table and set empty strings for both the escape character and quotation character.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 text)
FILE_FORMAT = (FORMAT csv FIELDS ESCAPED BY '' OPTIONALLY ENCLOSED BY '')
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');
The value loads as:
"{""doc1"":{\""doc2\"":22}}".
The following example sets the ESCAPED
BY and [OPTIONALLY] ENCLOSED
BY options.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv FIELDS ESCAPED BY '\\' OPTIONALLY ENCLOSED BY '\"')
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');The following additional file format options are available:
The LINES clause defines the format
for each new line. The following parameters are
available:
TERMINATED BY: Specifies one or
more characters used to delimit records. The maximum
record delimiter length is 64 characters. You can
add any string as a delimiter.
The default record delimiter for is
'\n'.
For CSV files, if you set
LINES TERMINATED BY to
auto, Lakehouse automatically
detects the record delimiters and composite record
delimiters with field delimiters as prefixes.
The following record delimiters detected automatically by Lakehouse:
\r: Carriage return.
\n: Line feed. This is the
default.
\r\n : Carriage return and
line feed
IGNORE: The number of rows to skip at
the start of the file. The default value is
0 and the maximum value is
20.
If HEADER and
IGNORE are both specified,
Lakehouse first skips the number of rows, and then
uses the next row as the header row.
The following example sets IGNORE and
HEADER. Since both parameters are
set, the first row is skipped and the second row is set
as the header row.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv HEADER ON IGNORE 1 LINES)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv');The following examples load tables with additional options and show the proper order of options in the commands.
mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 DATE EXTERNAL_FORMAT '%e/%c %Y')
FILE_FORMAT = (FORMAT csv COMPRESSION 'zip' CHARACTER SET utf8mb4 HEADER OFF
FIELDS DATE FORMAT '%Y%M%D' TERMINATED BY ',' LINES TERMINATED BY '\n' IGNORE 2 LINES)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv.zip' ALLOW_MISSING_FILES = 0);mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv HEADER ON FIELDS ESCAPED BY '\\' IGNORE 1 LINES)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_[1-5].csv' STRICT_LOAD = 1)
ALLOW_MISSING_FILES = 1;mysql> CREATE EXTERNAL TABLE table_1(
col_1 int,
col_2 int,
col_3 int)
FILE_FORMAT = (FORMAT csv HEADER ON FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' IGNORE 1 LINES)
FILES = (URI = 'oci://mybucket@mynamespace/data_files/data_file_1.csv')
VERIFY_KEY_CONSTRAINTS = 0;Learn how to create external tables using Auto Parallel Load or manually.
Learn how to Refresh Data Using Event-Based Incremental Load.