17.5 ORACLE_BIGDATA Accessing Apache Iceberg
See how to use the Apache Iceberg open table format with the Oracle Big Data Access Driver
- Apache Iceberg Tables Overview
TheORACLE_BIGDATA
Access Driver allows users to access data stored in object stores as if the data resided in Oracle Database tables. - Supported Configurations for Apache Iceberg
Oracle supports catalog-based and non-catalog based data, and supports Parquet data formats with the Apache Iceberg table format - Iceberg-Specific Access Parameters
Oracle supportsORACLE_BIGDATA
access parameters that defines Apache Iceberg table access. - Examples of Table Creation and Inline External Table SQL for Iceberg Tables
The following examples demonstrate how to create tables and perform queries on Iceberg tables in Oracle Database, leveraging both manifest files and AWS Glue catalogs.
Parent topic: ORACLE_BIGDATA Access Driver
17.5.1 Apache Iceberg Tables Overview
The ORACLE_BIGDATA
Access Driver allows users to access
data stored in object stores as if the data resided in Oracle Database tables.
The ORACLE_BIGDATA
functionality now extends to include support for
Apache Iceberg, a widely adopted open table format that introduces features such as
schema evolution, time-travel queries, and fast query planning. Iceberg integration
enables efficient data management for external data sets in data lakes.
Key features of Apache Iceberg:
- Updates/Deletes: Serializable isolation of updates and deletes enhances consistency.
- Time-Travel Queries: You can query historical snapshots.
- Schema Evolution: You can manage changes in table schemas without data migration.
- Partition Evolution: You can perform logical partitioning without having to perform physical data movement.
- Extensive Metadata: With enhanced metadata, you can set up advanced optimizations, such as partition pruning and column statistics.
Parent topic: ORACLE_BIGDATA Accessing Apache Iceberg
17.5.2 Supported Configurations for Apache Iceberg
Oracle supports catalog-based and non-catalog based data, and supports Parquet data formats with the Apache Iceberg table format
Catalog-Based
-
AWS Glue Catalog: Metadata and data are stored in Amazon S3, managed by AWS Glue.
Non-Catalog (File-Based)
-
Manifest File: Directly specify the path to the metadata/manifest file for Iceberg tables.
File Formats
Oracle Database supports IParquet data format for Iceberg tables.
Parent topic: ORACLE_BIGDATA Accessing Apache Iceberg
17.5.3 Iceberg-Specific Access Parameters
Oracle supports ORACLE_BIGDATA
access parameters that
defines Apache Iceberg table access.
Table 17-6 Iceberg-Specific Access Parameters
Parameter | Description | Mandatory |
---|---|---|
|
Table protocol defnition. This must be set to
|
Yes |
|
JSON configuration for catalog details (For example: AWS Glue or OCI Hadoop Catalog). |
Optional |
|
File format used in Iceberg tables (For example:
|
Yes |
Parent topic: ORACLE_BIGDATA Accessing Apache Iceberg
17.5.4 Examples of Table Creation and Inline External Table SQL for Iceberg Tables
The following examples demonstrate how to create tables and perform queries on Iceberg tables in Oracle Database, leveraging both manifest files and AWS Glue catalogs.
- Creating a Table Pointing to the Manifest File
This example creates an external table that references a specific manifest file for Iceberg: - Inline External Table Query (Manifest File Reference)
In this example, no database object is created. Instead, the query directly references the manifest file. - Creating a Table Using DBMS_CLOUD
This example uses theDBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure to define an external table. - Creating a Table Using AWS Glue as a Catalog
This example uses theDBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure to define an external table. - Inline External Table Query Using AWS Glue Catalog
This query uses an inline external table referencing the Amazon Web Service data integration service AWS Glue as the catalog. - Creating an External Table Using DBMS_CLOUD with AWS Glue Catalog
This example demonstrates using theDBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure with the Amazon Web Service data integration service AWS Glue as the catalog.
Parent topic: ORACLE_BIGDATA Accessing Apache Iceberg
17.5.4.1 Creating a Table Pointing to the Manifest File
This example creates an external table that references a specific manifest file for Iceberg:
Example 17-14 Creating a Table Pointing to the Manifest File
CREATE TABLE CUSTOMERS_ICEBERG
(
CUSTOMER_ID NUMBER,
FIRST_NAME VARCHAR2(64),
LAST_NAME VARCHAR2(64),
EMAIL VARCHAR2(64),
SIGNUP_DATE DATE
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_BIGDATA
DEFAULT DIRECTORY DATA_PUMP_DIR
ACCESS PARAMETERS
(
com.oracle.bigdata.fileformat=parquet
com.oracle.bigdata.credential.name=AWS_S3_CREDENTIAL
com.oracle.bigdata.access_protocol=iceberg
)
LOCATION ('iceberg:https://sales-data.s3.us-west-2.amazonaws.com/customers/metadata/00001-27da..ef5.metadata.json')
)
PARALLEL;
17.5.4.2 Inline External Table Query (Manifest File Reference)
In this example, no database object is created. Instead, the query directly references the manifest file.
Example 17-15 Inline External Table Query (Manifest File Reference)
SELECT *
FROM EXTERNAL
(
(
CUSTOMER_ID NUMBER,
FIRST_NAME VARCHAR2(64),
LAST_NAME VARCHAR2(64),
EMAIL VARCHAR2(64),
SIGNUP_DATE DATE
)
TYPE ORACLE_BIGDATA
ACCESS PARAMETERS
(
com.oracle.bigdata.fileformat=parquet
com.oracle.bigdata.credential.name=AWS_S3_CREDENTIAL
com.oracle.bigdata.access_protocol=iceberg
)
LOCATION ('iceberg:https://sales-data.s3.us-west-2.amazonaws.com/customers/metadata/00001-27da..ef5.metadata.json')
) t;
17.5.4.3 Creating a Table Using DBMS_CLOUD
This example uses the DBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure to define an external table.
Example 17-16 Creating a Table Using DBMS_CLOUD
BEGIN
DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
table_name => 'CUSTOMERS_ICEBERG',
credential_name => 'AWS_S3_CREDENTIAL',
file_uri_list => 'https://sales-data.s3.us-west-2.amazonaws.com/customers/metadata/00001-27da..ef5.metadata.json',
format =>'{"access_protocol":{"protocol_type":"iceberg"}}'
);
END;
17.5.4.4 Creating a Table Using AWS Glue as a Catalog
This example uses the DBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure to define an external table.
Example 17-17 Creating a Table Using AWS Glue as a Catalog
This example defines a table that uses the Amazon Web Service data integration service AWS Glue as the Iceberg catalog:
CREATE TABLE CUSTOMERS_ICEBERG
(
CUSTOMER_ID NUMBER,
FIRST_NAME VARCHAR2(64),
LAST_NAME VARCHAR2(64),
EMAIL VARCHAR2(64),
SIGNUP_DATE DATE
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_BIGDATA
ACCESS PARAMETERS
(
com.oracle.bigdata.fileformat=parquet
com.oracle.bigdata.credential.name=AWS_S3_CREDENTIAL
com.oracle.bigdata.access_protocol=iceberg
com.oracle.bigdata.access_protocol.config='{"iceberg_catalog_type": "aws_glue", "iceberg_glue_region": "us-west-2", "iceberg_table_path": "sales_db.customers"}'
)
LOCATION ('iceberg:')
)
PARALLEL;
17.5.4.5 Inline External Table Query Using AWS Glue Catalog
This query uses an inline external table referencing the Amazon Web Service data integration service AWS Glue as the catalog.
Example 17-18 Inline External Table Query Using AWS Glue Catalog
SELECT *
FROM EXTERNAL
(
(
CUSTOMER_ID NUMBER,
FIRST_NAME VARCHAR2(64),
LAST_NAME VARCHAR2(64),
EMAIL VARCHAR2(64),
SIGNUP_DATE DATE
)
TYPE ORACLE_BIGDATA
ACCESS PARAMETERS
(
com.oracle.bigdata.fileformat=parquet
com.oracle.bigdata.credential.name=AWS_S3_CREDENTIAL
com.oracle.bigdata.access_protocol=iceberg
com.oracle.bigdata.access_protocol.config='{"iceberg_catalog_type": "aws_glue", "iceberg_glue_region": "us-west-2", "iceberg_table_path": "sales_db.customers"}'
)
LOCATION ('iceberg:')
) t;
17.5.4.6 Creating an External Table Using DBMS_CLOUD with AWS Glue Catalog
This example demonstrates using the
DBMS_CLOUD.CREATE_EXTERNAL_TABLE
procedure with the Amazon Web Service
data integration service AWS Glue as the catalog.
Example 17-19 Creating an External Table Using DBMS_CLOUD with AWS Glue Catalog
BEGIN
DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
table_name => 'CUSTOMERS_ICEBERG',
credential_name => 'AWS_S3_CREDENTIAL',
file_uri_list => '',
format =>'{"access_protocol":{"protocol_type":"iceberg","protocol_config":{"iceberg_catalog_type": "aws_glue", "iceberg_glue_region": "us-west-2", "iceberg_table_path": "sales_db.customers"}}}'
);
END;