2.9 Loading a File
Use the LOAD
command in SQLcl to load a comma-separated
value file from a local directory or cloud storage location into a table.
2.9.1 LOAD Command
Loads a comma-separated value (csv) file from a local directory or cloud storage location into a table.
Syntax
LOAD [TABLE] [schema.]table_name { <file-specification> | <cloud-storage-specification> }
[NEW | SHOW | SHOW_DDL | CREATE |CREATE_DDL]
where
[schema.]table_name identifies the table to load. If the schema is omitted, the table for the connected user schema is loaded.
file-specification has the following syntax:
{ <fully-qualified-file-name> | <file-name> }
-
fully-qualified-file-name: Identifies the full path to the file to load.
-
file-name: Identifies the file to load. The file must be located in the default path.
cloud-storage-specification has the following syntax:
{ CLOUDSTORAGE | CS | CLOUD_STORAGE } [ <url> | <qualified-name> ]
-
url: Complete URL for the cloud storage file if a default cloud storage URL is not set using the Cloud Storage command.
-
qualified-name: Name of the object, optionally qualified by the namespace and bucket. The qualified name combined with the URL specified by the Cloud Storage command must fully identify the object URL. If url and qualified-name are omitted, the default Cloud Storage URL must be set to the object.
NEW
creates a table and loads data.
[SHOW | SHOW_DDL]
executes the DDL
generation phase and shows the DDL.
[CREATE | CREATE_DDL]
executes the DDL
generation phase and creates the table.
Use SET LOAD
and SET LOADFORMAT
to
specify properties for DDL analysis and generation.
Create table DDL generation pre-scans the data file to determine column
properties. Use SET LOAD SCAN
<n> to specify the number of rows to scan for DDL. 100
is the default. To turn off scanning, set to 0.
Use SET LOAD COL_SIZE
to change column
sizes that are generated. Use SET LOAD MAP_NAMES
to
map file column names to table column names.
For more information about the Cloud Storage command, see Using Cloud Storage.
The defaults for the file format are:
-
The columns are delimited by a comma and may optionally be enclosed in double quotes.
-
Lines are terminated with standard line terminators for Windows, UNIX or Mac.
-
File is encoded UTF8.
The default load:
-
Processes with 50 rows per batch.
-
If AUTOCOMMIT is set in SQLcL, a commit is done every 10 batches.
-
The load is terminated if more than 50 errors are found.
Use SET LOADFORMAT
options for reading the file
(delimiter, enclosures).
Use SET LOAD
options for loading the data (rows per
batch, date formats).
Example
The following example shows how to load a file from local storage into a table.
--Create Table "countries"
create table countries(countries_id NUMBER(5),countries_name VARCHAR2(40));
Table COUNTRIES created
--Load file COUNTRIES_DATA_TABLE.csv in local storage to "countries" table
load countries C:\Users\JDOE\SQLcl\COUNTRIES_DATA_TABLE.csv
format csv
column_names on
delimiter ,
enclosure_left "
enclosure_right "
encoding UTF8
row_limit off
row_terminator default
skip_rows 0
skip_after_names
--Number of rows processed: 30
--Number of rows in error:
0 - SUCCESS: Load processed without errors
--Check the number of rows in countries table
select count(*) from countries;
COUNT(*)
–--------
30
The following example shows how to load data into a new table EMP.
load emp empfile.csv new
--Create new table and load data into table HR.EMP
csv
column_names on
delimiter ,
enclosures ""
encoding UTF8
row_limit off
row_terminator default
skip_rows 0
skip_after_names
#INFO DATE format detected: RRRR-MM-DD
CREATE TABLE HR.EMP
(
EMPLOYEE_ID NUMBER(5),
FIRST_NAME VARCHAR2(26),
LAST_NAME VARCHAR2(26),
EMAIL VARCHAR2(26),
PHONE_NUMBER VARCHAR2(26),
HIRE_DATE DATE,
JOB_ID VARCHAR2(26),
SALARY NUMBER(9, 2),
COMMISSION_PCT VARCHAR2(26),
MANAGER_ID NUMBER(5),
DEPARTMENT_ID NUMBER(5)
)
;
#INFO Table created
#INFO Number of rows processed: 21
#INFO Number of rows in error: 0
#INFO Last row processed in final committed batch: 21
SUCCESS: Processed without errors
The following example shows how to create a new table from a local file.
load emp1 empfile.csv create_ddl
--Create new table HR.EMP1
csv
column_names on
delimiter ,
enclosures ""
encoding UTF8
row_limit off
row_terminator default
skip_rows 0
skip_after_names
batch_rows 50
batches_per_commit 10
clean_names transform
column_size rounded
commit on
date_format
errors 50
map_column_names off
method insert
timestamp_format
timestamptz_format
locale English United States
scan_rows 100
truncate off
unknown_columns_fail on
--Pre-scans the date format
#INFO DATE format detected: RRRR-MM-DD
CREATE TABLE SYSTEM.EMP1
(
EMPLOYEE_ID NUMBER(5),
FIRST_NAME VARCHAR2(26),
LAST_NAME VARCHAR2(26),
EMAIL VARCHAR2(26),
PHONE_NUMBER VARCHAR2(26),
HIRE_DATE DATE,
JOB_ID VARCHAR2(26),
SALARY NUMBER(9, 2),
COMMISSION_PCT VARCHAR2(26),
MANAGER_ID NUMBER(5),
DEPARTMENT_ID NUMBER(5)
)
;
#INFO Table created
SUCCESS: Processed without errors
2.9.1.1 SET LOAD Command
SET LOAD
enables you to set options for loading data
when using the LOAD
command.
Syntax
SET LOAD default | [options...]
where
default
means load method properties return to default values.
options represents the following:
-
BATCH_ROWS|BATCHROWS <number_of_rows>
Data loading is done in batches. Specifies the number of rows to include in each batch.
-
BATCHES_PER_COMMIT|BATCHESPERCOMMIT <batches_per_commit>
Commit after processing number_of_batches. If the number is equal to 0, commit happens at the end of the load. If the number is greater than or equal to 0,
COMMIT ON
is set. -
CLEAN_NAMES [ TRANSFORM | TRANSFORM128 | QUOTE | QUOTE128 | UNIQUE ]
Identifies the rule for making table and column names compliant with database identifiers. Names are cleaned before they are mapped to provide consistency with previous releases. If both
CLEAN_NAMES
andMAP_COLUMN_NAMES
are used, then clean names should be specified.The standard identifiers are:
- No longer than 30 or 128 characters.
- Not a reserved word.
- Starts with a letter and contains only letters, digits, or one of _$#.
- Uppercase
- Names that do not comply must be quoted. Length rules always apply.
Note:
Data that is enclosed in quotes will have quotes in the header row removed before names are cleaned.TRANSFORM (default)
Indicates that names are transformed as follows:
- Names are in uppercase.
- If the name starts and ends with the quote character, the quotes are removed.
- Names that are reserved words are appended with a dollar sign ($).
- Names that start with a number or special character is prefixed with an X.
- Spaces and hyphens are replaced with underscores (_). $ and # characters are retained.
- Special characters other than $ and # is replaced with the number sign (#).
- Names are truncated to 30 or 128 characters depending on database MAX_STRING_SIZE.
- After names are cleaned, non-unique names within the column set are appended with a unique sequence number. If truncation is required, the sequence number is maintained.
TRANSFORM (default)
Applies all transform rules. Names may be 128 characters.
QUOTE
Quote non-compliant names and shorten to 30 or 128 characters depending on database MAX_STRING_SIZE.
QUOTE128
Quote non-compliant names. Names may be 128 characters.
UNIQUE
Compatibility option with previous releases of load service. Names that are not unique within the column set are appended with a unique sequential number. Truncation is not provided.
-
COLUMN_SIZE|COLUMNSIZE|COLSIZE {ACTUAL|ROUND|ROUNDED|MAX|MAXIMUM}
Create table column size strategy.
ACTUAL
uses the largest size found during the scan.ROUND|ROUNDED
uses a size a little larger than the largest size found during the scan.MAX|MAXIMUM
uses the database maximum size for the data type that was detected. -
COMMIT {ON|OFF}
Enable or disable data commits.
-
DATE|DATE_FORMAT|DATEFORMAT format_mask
The format of all DATE data type columns loaded. Specify no format_mask or DEFAULT to use database default.
For DATE columns, if format is not set and
SCAN_ROWS = 0
, the data is not scanned for a valid mask.ERRORS {number_of_rows | UNLIMITED}|-1
: Indicates the number of error rows allowed.If this number is exceeded, the load will be terminated.
-1 and UNLIMITED indicate no error limit.
All rows in a batch may be in error if any row fails.
-
LOCALE { <language country> | DEFAULT | "" }
Specify locale language and optionally country.
DEFAULT|""
: Set to default locale. -
MAP_COLUMN_NAMES|MAPCOLUMNNAMES|MAPNAMES { OFF| (<file-col-name>=<table-col-name>,...) }
Provide a mapping from the column names specified in the file to column names in the table.
-
METHOD INSERT
Method to use for data loads.
-
SCAN_ROWS|SCANROWS|SCAN <1-5000>
Identify the number of rows to scan for create table generation. Default is 100 rows.
-
TIMESTAMP|TIMESTAMP_FORMAT|TIMESTAMPFORMAT
The format of all TIMESTAMP data type columns being loaded. Specify no format_mask or DEFAULT to use database default. For TIMESTAMP columns, if format is not set and SCAN_ROWS not equal to 0, the data is scanned for a valid mask.
-
TIMESTAMPTZ|TIMESTAMPTZ_FORMAT|TIMESTAMPTZFORMAT
The format of all TIMESTAMPTZ data type columns being loaded. Specify no format_mask or DEFAULT to use database default. For TIMESTAMPTZ columns, if format is not set and SCAN_ROWS not equal to 0, the data is scanned for a valid mask.
-
TRUNCATE {OFF|ON}
Truncate ON truncates the table before loading
-
UNKNOWN_COLUMNS_FAIL|UNKNOWNCOLUMNSFAIL|UNKNOWNFAIL {ON|OFF}
ON
: Terminates the load if any columns in the file do not map to a column in the table.OFF
: Allows the load to proceed when columns in the file do not map to a column in the table.
2.9.1.2 SET LOADFORMAT Command
SET LOADFORMAT
enables you to set format properties
for loading data when using the LOAD
command.
Syntax
SET LOADFORMAT [ default|csv|delimited|html|insert|json|json-formatted|loader|t2|xml] [options...]
where
default
: Load format properties return to default values.csv
: Comma-separated values.delimited
: (csv synonym) Delimited format, comma separated values by default.-
html
: Hypertext Markup Language. For theUNLOAD
command only. insert
: SQL insert statements. For theUNLOAD
command only.-
json
: JavaScript Object Notation. For theUNLOAD
command only. json-formatted
: Pretty-formatted JSON. For theUNLOAD
command only.loader
: Oracle SQL Loader format. For theUNLOAD
command only.t2
: T2 Metrics. For theUNLOAD
command only.xml
: Extensible Markup Language. For theUNLOAD
command only.
options represent the following clauses:
COLUMN_NAMES|COLUMNNAMES|NAMES {ON|OFF}
: Header row with column names.DELIMITER
{separator}: Delimiter separating fields in the record.DOUBLE [OFF]
: (Import only) Embedded right enclosures are doubled. OFF indicates embedded right enclosures are not doubled and embedded right enclosures can lead to unexpected results.ENCLOSURES {enclosures|OFF}
: Optional left and right enclosures.- OFF indicates no enclosures
- If 1 character is specified, sets left and right enclosures to this value.
- If 2 or more characters are specified, sets left to the first character, right to the second character and ignores the remaining characters.
- To set multiple character enclosures, use Set ENCLOSURE_LEFT and ENCLOSURE_RIGHT.
ENCODING {encoding|OFF|""}
: Encoding of load file. OFF and "" reset to default encoding for environment.LEFT|ENCLOSURE_LEFT|ENCLOSURELEFT {enclosure|OFF}
: Set a 1 or more character left enclosure. If no ENCLOSURE_RIGHT is specified, it is used for both left and right. OFF indicates no enclosures.RIGHT|ENCLOSURE_RIGHT|ENCLOSURERIGHT {enclosure|OFF}
: Set a 1 or more character right enclosure. OFF indicates no right enclosure.ROW_LIMIT|ROWLIMIT|LIMIT} {number_of_rows|OFF|""}
: Maximum number of rows to read, including the header. OFF and "" set to not limit.SKIP|SKIP_ROWS|SKIPROWS {number_of_rows|OFF|""}
: Number of rows to skip.[[SKIP_AFTER_NAMES|SKIPAFTERNAMES|AFTER]|[SKIP_BEFORE_NAMES|SKIPBEFORENAMES|BEFORE]]
: Skip the rows before or after the (header) Column Names row.TERM|ROW_TERMINATOR {terminator|""|DEFAULT|CR|CRLF|LF}
: Character(s) indicating end of row. If the file contains standard line end characters, the line_end does not need to be specified.- "" or DEFAULT specifies the default (any standard
terminator) for the
LOAD
command. - "" or DEFAULT specifies the environment default for the
UNLOAD
command. - CRLF specifies WINDOWS terminator, generally for the
UNLOAD
command. - LF specifies UNIX terminator, generally for the
UNLOAD
command. - CR specifies MAC terminator, generally for the
UNLOAD
command.
- "" or DEFAULT specifies the default (any standard
terminator) for the
Examples
SQL> set loadformat delimited
7369,"SMITH","CLERK",7902,17-DEC-80,800,,20,5555555555554444
SQL> set loadformat delimited enclosures <> line_end {eol}
7369,<SMITH>,<CLERK>,7902,17-DEC-80,800,,20,5555555555554444{eol}
SQL> set loadformat default (restore default settings)
7369,"SMITH","CLERK",7902,17-DEC-80,800,,20,5555555555554444