Query External Tables with Implicit Partitioning
On Autonomous AI Database you can create implicit partitioned external tables from Hive style partitioned data or from simple folder partitioned data stored on your Cloud Object Store.
About External Tables with Implicit Partitioning
On Autonomous AI Database use the DBMS_CLOUD.CREATE_EXTERNAL_TABLE procedure to create implicit partitioned external tables from Hive style partitioned data or from simple folder partitioned data stored on your Cloud Object Store.
By passing the appropriate options to this procedure, the partitions are derived from the source data. The partitioned external tables support runtime discovery of partition columns and their values. Runtime discovery of changes in the underlying object store structure, such as adding or removing objects, simplifies the maintenance process by eliminating the need for additional synchronization procedures required by DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE. This ensures that the data is up to date at query runtime.
With implicit partitioning, Autonomous AI Database automatically determines the columns a table is “partitioned on” based on the object storage source hierarchical file structure. There is no need to explicitly declare a partitioning scheme. Implicit partitioning provides partition-table-like performance benefits without the need to explicitly define a partitioned external table using the DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE procedure.
In order to optimize query response time for queries of very large data sets using implicit partitioning, you can enable optimization by setting strict_column_order to true. The strict_column_order parameter is a sub-option of the implicit_partition_config format option. This optimization only applies when partition_type is hive. See Query External Implicit Partitioned Data with Hive Format Source File Organization for an example and DBMS_CLOUD Package Format Options for more information on the implicit partitioning format parameters.
Implicit partitioned external tables support the following naming styles for partitioned objects in object store:
-
Hive Naming Format: Data in object storage that has been generated from Hive has the following format.
OBJBUCKET/<table>/<partcol1>=<value1>/<partcol2>=<value2>/file.parquetFor an example, See Query External Implicit Partitioned Data with Hive Format Source File Organization
-
Non-Hive “Pathtail” Naming Format: A second format that is often found in data lakes is similar to the hive format, but the folder name does not include the partition columns.
OBJBUCKET/<table>/<value1>/<value2>/file.parquetFor an example, See Query External Implicit Partitioned Non-Hive Style Data.
For more information on naming formats, see About External Tables with Source File Partitioning.
Using the DBMS_CLOUD.CREATE_EXTERNAL_TABLE procedure, implicit partitioned external tables can be created in the following ways:
-
Optimize implicit partitioning. Use the
implicit_partition_configoption to enable implicit partitioning and optimization of elapsed query time.In this example, implicit partitioning is enabled by setting
partition_typeto hive. The only valid value forpartition_typeis hive. Optimization is enabled by settingsetting strict_column_orderto true. Thepartition_columnsoption specifies which columns are partitioned.For example:
BEGIN dbms_cloud.create_external_table ( table_name => 'partitions1', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/*.parquet', column_list => 'c varchar2(100), y number, total_sales number', format => '{"type":"parquet", "implicit_partition_config":{ "partition_type":"hive", "strict_column_order":true, "partition_columns":["org","tenant","dbtype","year","month","day"] } }'); END; / -
Set the partition type to hive with a given list of partition columns
In this case,
implicit_partition_typeis set tohiveandimplicit_partition_columnsprovides a list of partition columns.For example:
BEGIN DBMS_CLOUD.CREATE_EXTERNAL_TABLE ( table_name => 'mysales', credential_name => 'MY_CREDENTIAL', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/*.parquet', column_list => 'product varchar2(100), units number, country varchar2(100),year (number), month varchar2(2)', format => '{"type":"parquet", "implicit_partition_type":"hive", "implicit_partition_columns":["country","year","month"]}'); END; / -
Set the partition type to hive without providing a list of partition columns
In this case,
implicit_partition_typeis set tohiveandimplicit_partition_columnsis not provided. The partition columns are automatically detected by searching for '=' in the path specified byfile_uri_list. Thecolumn_nameis to the left of the '=' and the value is on the right side. If acolumn_nameis not found in the path, an error is thrown.For example:
BEGIN DBMS_CLOUD.CREATE_EXTERNAL_TABLE ( table_name => 'mysales', credential_name => 'MY_CREDENTIAL', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/*.parquet', column_list => 'product varchar2(100), units number, country varchar2(100),year (number), month varchar2(2)', format => '{"type":"parquet", "implicit_partition_type":"hive"'); END; / -
Provide a list of partition columns without specifying the type
In this case,
implicit_partition_typeis not set andimplicit_partition_columnsprovides a list of columns.For example:
BEGIN DBMS_CLOUD.CREATE_EXTERNAL_TABLE ( table_name => 'mysales', credential_name => 'MY_CREDENTIAL', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/*.parquet', column_list => 'product varchar2(100), units number, country varchar2(100),year (number), month varchar2(2)', format => '{"type":"parquet", "implicit_partition_columns":["country","year","month"]}'); END; /
See Query External Implicit Partitioned Data with Hive Format Source File Organization and CREATE_EXTERNAL_TABLE Procedure for further information.
Query External Implicit Partitioned Data with Hive Format Source File Organization
Use the DBMS_CLOUD.CREATE_EXTERNAL_TABLE procedure to create an implicit partitioned external table from data in Object Store that was generated from Hive data.
The sample source files in the follow example, use this naming format:
OBJBUCKET/<table>/<partcol1>=<value1>/<partcol2>=<value2>/file.parquet
Consider the following sample source files:
OBJBUCKET/sales/country=USA/year=2024/month=01/sales-2024-01.parquet
OBJBUCKET/sales/country=USA/year=2024/month=02/sales-2024-02.parquet
OBJBUCKET/sales/country=USA/year=2024/month=02/sales-2024-03.parquet
To create an implicit partitioned external table with data stored in this sample Hive format, do the following:
-
Store Object Store credentials using the procedure
DBMS_CLOUD.CREATE_CREDENTIAL.For example:
BEGIN DBMS_CLOUD.CREATE_CREDENTIAL ( credential_name => 'MY_CREDENTIAL', username => 'adb_user@example.com', password => 'password' ); END; /Creating a credential to access Oracle Cloud Infrastructure Object Store is not required if you enable resource principal credentials. See Use Resource Principal to Access Oracle Cloud Infrastructure Resources for more information.
This operation stores the credentials in the database in an encrypted format. You can use any name for the credential name. Note that this step is required only once unless your object store credentials change. Once you store the credentials you can then use the same credential name for creating external tables.
See CREATE_CREDENTIAL Procedure for information about the
usernameandpasswordparameters for different object storage services. -
Create an implicit partitioned external table on top of your source files using the procedure
DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE.-
Case 1: Enable implicit partitioning.
In this example, since the format option
implicit_partition_columnsis not provided, partition columns are automatically detected and implicit partitioning is enabled by setting the format optionimplicit_partition_typetohive.However, if a list of columns is given, then those columns are used as partition columns and Autonomous AI Database doesn't try to discover the columns.
To detect the partition columns Autonomous AI Database starts searching from the beginning of the path, specified by
file_uri_list, for '='. When found, the left part of '=' up to the last '/' is taken as column (e.g. “country”) and the right part of '=', until the first '/' as value (e.g. “USA”). The search continues for '=' until after the '/' that follows the first partition value, detecting the second '=' in the path, and so on.BEGIN DBMS_CLOUD.CREATE_EXTERNAL_TABLE ( table_name => 'mysales', credential_name => 'MY_CREDENTIAL', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/*.parquet', column_list => 'product varchar2(100), units number, country varchar2(100),year (number), month varchar2(2)', format => '{"type":"parquet", "implicit_partition_type":"hive"'); END; /The parameters are:
-
table_name: is the external table name. -
credential_name: is the name of the credential created in the previous step. -
file_uri_list: is a comma-delimited list of source file URIs. There are two options for this list:-
Specify a comma-delimited list of individual file URIs without wildcarding.
-
Specify a single file URI with wildcards, where the wildcards can only be after the last slash “/”. The character “*” can be used as the wildcard for multiple characters, the character “?” can be used as the wildcard for a single character.
In this example,
namespace-stringis the Oracle Cloud Infrastructure object storage namespace andbucketnameis the bucket name. See Understanding Object Storage Namespaces for more information. -
-
column_list: is a comma delimited list of column names and data types for the external table. The list includes the columns inside the data file and those derived from the object name (from names in the file path specified byfile_uri_list).The
column_listis not required when the data files are structured files (Parquet, Avro, or ORC). -
format: defines the options you can specify to describe the format of the source file. Theimplicit_partition_typeoption specifies the data format type as hive.If the data in your source file is encrypted, decrypt the data by specifying the
encryptionformat option. See Decrypt Data While Importing from Object Storage for more information on decrypting data.See DBMS_CLOUD Package Format Options for more information.
-
-
Case 2: Optimized implicit partitioning
In this example, implicit partitioning is enabled by specifying the
implicit_partition_configformat option, and optimized implicit partitioning is enabled by settingstrict_column_orderto true.Since partition_columns specifies a list of partitioned columns, those columns are used as partition columns and Autonomous AI Database doesn't try to discover the columns.
BEGIN dbms_cloud.create_external_table ( table_name => 'mysales', credential_name => 'MY_CREDENTIAL', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/*.parquet', column_list => 'product varchar2(100), units number, country varchar2(100),year (number), month varchar2(2)', format => '{"type":"parquet", "implicit_partition_config":{ "partition_type":"hive", "strict_column_order":true, "partition_columns":["country","year","month","day"] } }'); END; /The format parameters are:
-
format: defines the options you can specify to describe the format of the source file. Theimplicit_partition_configoption has the following sub-options:-
partition_typeThe only accepted value is
"hive". The default is non-hive filename path. Ifpartition_typeis not specified, a client error is returned. -
partition_columnsList of partitioned columns.
-
strict_column_orderSet to
trueto enable optimized implicit partitioning. Set totrueonly for cases where you know the Object Store path will not change and there are no missing prefixes. Set tofalseor omit to disable.
If the data in your source file is encrypted, decrypt the data by specifying the
encryptionformat option. See Decrypt Data While Importing from Object Storage for more information on decrypting data.See DBMS_CLOUD Package Format Options for more information.
-
-
See CREATE_EXTERNAL_TABLE Procedure for detailed information about the parameters. See DBMS_CLOUD URI Formats for more information on the supported cloud object storage services.
-
-
You can now run queries on the external partitioned table you created in the previous step.
Your Autonomous AI Database takes advantage of the partitioning information of your external partitioned table, ensuring that the query only accesses the relevant data files in Object Store.
For example:
SELECT product, units FROM mysales WHERE year='2024' and month='02'This SQL statement only queries data in the partition for month 02 of year 2024.
Optimize Object Store Query Planning with the strict_column_order option
When a query targets an object-store folder with a very large number of files and subfolders, the planning and listing phase can become the primary cost before any data is actually scanned. This is common with Hive-style folder layouts where partition values are embedded in the path.
Note: - The strict_column_order setting is disabled by default. Enable it only if the path layout is consistent.
- If folder conventions change (such as adding, removing, or reordering partition columns), you must update the
partition_columnsoption and may need to disable this option.
Enable the strict_column_order option by the following:
If your data uses Hive-style partitioned paths and the partition columns always appear in a fixed, consistent order with no missing segments, enable the optimization by the following setting:
strict_column_order = true (with partition_type = "hive")
The database can skip unnecessary directories by following the order of partitions that are already defined. This means it does not have to list every single object, which can significantly speed up the planning time for large datasets.
Notes for using the strict_column_order option:
-
Paths follow Hive naming and order, for example:
.../country=US/year=2025/month=09/... -
The partition columns maintain a fixed set and sequence, with no reordering or skipped prefixes.
-
You need to optimize plan-time listing for folders containing a very large number of objects.
-
You cannot use this option in datasets where not all prefixes are present. For example, some folders include year=, while others begin with month=).
Examples: Use strict_column_order option for Hive-Style Partitions with DBMS_CLOUD.CREATE_EXTERNAL_TABLE
BEGIN
DBMS_CLOUD.CREATE_EXTERNAL_TABLE (
table_name => 'sales_xt',
credential_name => 'MY_CREDENTIAL',
file_uri_list => 'https://objectstorage.example.com/n/tenant/b/lake/o/sales/',
-- Data is stored as: .../country=US/year=2025/month=09/*.parquet
column_list => 'product VARCHAR2(100),
units NUMBER,
amount NUMBER,
country VARCHAR2(30),
year NUMBER,
month NUMBER',
format => '{
"type": "parquet",
"implicit_partition_config": {
"partition_type": "hive",
"partition_columns": ["country","year","month"],
"strict_column_order": true
}
}'
);
END;
/
Verify that the planning logic avoids listing unrelated top-level paths (e.g., paths belonging to other countries or years):
-
Run a selective query that constrains leading partitions, for example:
SELECT COUNT(*) FROM sales_xt WHERE country = ''US'' AND year = 2025;with
strict_column_order = true.
-
If the layout assumptions are not met, disable the option and retry.
Query External Implicit Partitioned Non-Hive Style Data
Use the DBMS_CLOUD.CREATE_EXTERNAL_TABLE procedure to create an implicit partitioned external table from data in object storage that has been generated from non-Hive data.
The sample source files in the follow example, use this naming format:
OBJBUCKET/<table>/<value1>/<value2>/file.parquet
Consider the following sample source files:
OBJBUCKET/sales/USA/2024/01/sales-2024-01.parquet
OBJBUCKET/sales/USA/2024/01/sales-2024-02.parquet
OBJBUCKET/sales/USA/2024/01/sales-2024-03.parquet
To create an implicit partitioned external table with data stored in this sample Hive format, do the following:
-
Store Object Store credentials using the procedure
DBMS_CLOUD.CREATE_CREDENTIAL.For example:
BEGIN DBMS_CLOUD.CREATE_CREDENTIAL ( credential_name => 'MY_CREDENTIAL', username => 'adb_user@example.com', password => 'password' ); END; /Creating a credential to access Oracle Cloud Infrastructure Object Store is not required if you enable resource principal credentials. See Use Resource Principal to Access Oracle Cloud Infrastructure Resources for more information.
This operation stores the credentials in the database in an encrypted format. You can use any name for the credential name. Note that this step is required only once unless your object store credentials change. Once you store the credentials you can then use the same credential name for creating external tables.
See CREATE_CREDENTIAL Procedure for information about the
usernameandpasswordparameters for different object storage services. -
Create an implicit partitioned external table on top of your source files using the procedure
DBMS_CLOUD.CREATE_EXTERNAL_PART_TABLE.In this example, implicit partitioning is enabled by providing the partition columns with the format option
implicit_partition_columns. Since the folder name does not include the partition columns, the partition values in the path, specified byfile_uri_list, can only be discovered if an explicit list of columns is given with the format optionimplicit_partition_columns. In order to detect the respective column values in the path the order of columns must be the same as the order of values.The partition columns are automatically detected.
BEGIN DBMS_CLOUD.CREATE_EXTERNAL_TABLE ( table_name => 'mysales', credential_name => 'MY_CREDENTIAL', file_uri_list => 'https://swiftobjectstorage.us-phoenix-1.oraclecloud.com/n/namespace-string/b/bucketname/*.parquet', column_list => 'product varchar2(100), units number, country varchar2(100),year (number), month varchar2(2)', format => '{"type":"parquet", "implicit_partition_columns":["country","year","month"]}'); END; /The parameters are:
-
table_name: is the external table name. -
credential_name: is the name of the credential created in the previous step. -
file_uri_list: is a comma-delimited list of source file URIs. There are two options for this list:-
Specify a comma-delimited list of individual file URIs without wildcarding.
-
Specify a single file URI with wildcards, where the wildcards can only be after the last slash “/”. The character “*” can be used as the wildcard for multiple characters, the character “?” can be used as the wildcard for a single character.
-
-
column_list: is a comma delimited list of column names and data types for the external table. The list includes the columns inside the data file and those derived from the object name (from names in the file path specified byfile_uri_list).The
column_listis not required when the data files are structured files (Parquet, Avro, or ORC). -
format: defines the options you can specify to describe the format of the source file. Theimplicit_partition_typeoption is unset. Sinceimplicit_partition_columnsis given, the type is automatically detected as non-hive.If the data in your source file is encrypted, decrypt the data by specifying the
encryptionformat option. See Decrypt Data While Importing from Object Storage for more information on decrypting data.See DBMS_CLOUD Package Format Options for more information.
In this example,
namespace-stringis the Oracle Cloud Infrastructure object storage namespace andbucketnameis the bucket name. See Understanding Object Storage Namespaces for more information.See CREATE_EXTERNAL_TABLE Procedure for detailed information about the parameters.
See DBMS_CLOUD URI Formats for more information on the supported cloud object storage services.
-
-
You can now run queries on the external partitioned table you created in the previous step.
Your Autonomous AI Database takes advantage of the partitioning information of your external partitioned table, ensuring that the query only accesses the relevant data files in Object Store.
For example:
SELECT product, units FROM mysales WHERE year='2024'This SQL statement only queries data in the partition for year 2024.