9.2.24 Iceberg Event Handler

Iceberg is a high-performance table format for extremely large analytic tables. Iceberg brings the reliability and simplicity of SQL tables to Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA), while making it possible for engines such as Spark, Trino, Flink, Presto, Hive, and Impala to safely work with the same tables at the same time.

9.2.24.1 Detailed Functionality

The Oracle GoldenGate Iceberg Replicat can replicate GoldenGate trail records to Iceberg tables.

The Iceberg open table format files can be written to the local file system, AWS Simple Storage Service (S3), Google Cloud Storage (GCS), or Azure Data Lake Storage (ADLS).

9.2.24.1.1 Replication without a SQL Engine

The Oracle GoldenGate Iceberg Replicat process does not require a SQL engine to replicate data to Iceberg tables.

It uses the Iceberg Java SDK together with the Java SDK of the target object store to write data to Iceberg tables.

9.2.24.1.2 Iceberg File Format

The default file format for Iceberg data files and delete files is Parquet.

Oracle GoldenGate can be configured to write files in any of the following Iceberg supported file formats:

Parquet (default)
Avro
ORC

9.2.24.1.3 Iceberg Catalog

Oracle GoldenGate supports the following Iceberg catalogs:

Hadoop Catalog
Nessie Catalog
AWS Glue Catalog
Polaris Catalog
REST Catalog
JDBC Catalog

9.2.24.1.4 Iceberg Specification

Oracle GoldenGate generates data files and delete files as per the Iceberg specification version 2.

See https://iceberg.apache.org/spec/#version-2-row-level-deletes

9.2.24.1.5 Delete Files and Merge-On-Read (MoR)

Oracle GoldenGate generates Iceberg delete files for the UPDATE and DELETE operations.

Therefore, the Iceberg table property write.update.mode is always set to merge-on-read.

SQL engines should support merge-on-read to query tables replicated by Oracle GoldenGate.

Iceberg supports two types of delete files:

Equality Deletes: The deleted records are identified by the equality of the values in the columns specified in the delete file.
Position Deletes: The deleted records are identified by the position of the records in the Iceberg data file.

In the current release, Oracle GoldenGate uses Iceberg Equality Deletes to delete records from the Iceberg table.

This allows records to be deleted without looking up the position of the rows in the Iceberg data file.
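
The difference between the two delete types can be illustrated with a minimal Python sketch. It is a conceptual model only, not GoldenGate or Iceberg code: a plain list stands in for an Iceberg data file, the function names apply_equality_delete and apply_position_delete are hypothetical, and details such as snapshot sequence numbers are ignored.

```python
# Illustrative model only: a plain list stands in for one Iceberg data file.
data_file = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 3, "name": "carol"},
]


def apply_equality_delete(rows, delete_rows, equality_columns):
    """Drop every row whose values in equality_columns match a delete row."""
    delete_keys = {tuple(d[c] for c in equality_columns) for d in delete_rows}
    return [r for r in rows if tuple(r[c] for c in equality_columns) not in delete_keys]


def apply_position_delete(rows, positions):
    """Drop rows by their ordinal position within the data file."""
    doomed = set(positions)
    return [r for i, r in enumerate(rows) if i not in doomed]


# An equality delete needs only the key value of the deleted row.
after_equality = apply_equality_delete(data_file, [{"id": 2}], ["id"])

# A position delete needs to know that the row sits at index 1 of the file.
after_position = apply_position_delete(data_file, [1])
```

Both calls remove the same row. The equality delete is written from the key value alone, whereas the position delete requires the writer to first find where the row is stored.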

Note:
Contact Oracle support for use cases that require Iceberg Position Deletes.

9.2.24.1.6 Operation Support

The Iceberg event handler supports the following operations:

INSERT: Generates Iceberg data files for the insert operations.
UPDATE: Generates Iceberg data files and delete files for update operations.
DELETE: Generates Iceberg delete files for delete operations.
TRUNCATE: Generates an Iceberg delete file whose condition is always true, which truncates the target table. This operation creates an empty Iceberg snapshot with no data files.

9.2.24.1.7 Compressed Update Handling

A compressed update record in the Oracle GoldenGate trail file contains values for the key columns and the modified columns.

An uncompressed update record contains values for all the columns.

Oracle GoldenGate trails may contain compressed or uncompressed update records. The default Extract configuration writes compressed updates to the trail files.

By default, Replicat will ABEND if an update operation has missing column values.

This behavior can be overridden by setting the parameter gg.eventhandler.iceberg.abendOnMissingColumns=false in the Replicat properties file.

When the parameter is set to false, Replicat will handle compressed updates by querying the previous values of the missing columns from the Iceberg table.
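
The two behaviors can be pictured with the following Python sketch. It is a simplified model under stated assumptions, not the Replicat implementation: the target table is a dictionary keyed by the key-column values, and the names resolve_update and MissingColumnsError are hypothetical.

```python
class MissingColumnsError(Exception):
    """Stands in for an ABEND caused by an update with missing column values."""


def resolve_update(update, table_columns, current_rows, key_columns,
                   abend_on_missing_columns=True):
    """Return a full row image for an update record.

    update        -- column values present in the (possibly compressed) record
    table_columns -- all column names of the target table
    current_rows  -- hypothetical stand-in for the target table, keyed by key tuple
    """
    missing = [c for c in table_columns if c not in update]
    if not missing:
        return dict(update)  # uncompressed update: nothing to look up
    if abend_on_missing_columns:
        raise MissingColumnsError(f"update is missing columns: {missing}")
    # Look up the previous row image and copy over only the missing columns.
    key = tuple(update[c] for c in key_columns)
    previous = current_rows[key]
    full = dict(update)
    for column in missing:
        full[column] = previous[column]
    return full
```

With the flag left at its default, the compressed update raises the error. With the flag set to False, the missing name column is filled from the existing row before the update is applied.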

9.2.24.1.7.1 Lookup Missing values in Sparse Updates

The lookup of the missing values is an expensive operation and may impact the performance of the Replicat process.

By default, Oracle GoldenGate writes records to Iceberg in micro-batches every fifteen minutes, which is the default file roll interval (15m).

Every micro-batch for a table can potentially contain millions of rows.

Micro-batches for the target tables are processed in concurrent threads.

Therefore, it is critical that sufficient JVM heap memory is allocated to the Replicat process.

The lookup is performed only for rows that contain at least one missing value in the update operation.

Oracle GoldenGate will automatically create target tables. During auto-creation of tables, Oracle GoldenGate Replicat will enable creation of Iceberg metrics (min/max values) for all the identifier (key) columns.

The metrics are stored in the Iceberg metadata files.

Iceberg metrics help speed up the lookup of missing values in UPDATE operations.
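
The following Python sketch shows why min/max metrics on key columns reduce lookup cost. It is a conceptual illustration rather than Iceberg's scan planning code; the file names and the function files_to_scan are hypothetical.

```python
def files_to_scan(file_key_ranges, key):
    """Return the data files whose [min, max] key range could contain key."""
    return [name for name, (low, high) in file_key_ranges.items()
            if low <= key <= high]


# Hypothetical per-file min/max values of the key column, as recorded in metadata.
file_key_ranges = {
    "data-001.parquet": (1, 1000),
    "data-002.parquet": (1001, 2000),
    "data-003.parquet": (2001, 3000),
}

candidates = files_to_scan(file_key_ranges, 1500)
```

Only one of the three files has a key range that includes 1500, so the other two can be skipped without being opened.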

9.2.24.1.8 INSERTALLRECORDS Support

The Iceberg event handler supports the INSERTALLRECORDS parameter. See: https://docs.oracle.com/en/middleware/goldengate/core/21.3/reference/insertallrecords.html#GUID-A1019C40-97BE-437B-9D80-7C99A9A6DB8E. Set the INSERTALLRECORDS parameter in the Replicat parameter file (.prm).

Setting this parameter directs the Replicat process to generate Iceberg data files to append operation data into the Iceberg target table.

9.2.24.1.9 Operation Aggregation

Operation aggregation is the process of aggregating (merging/compressing) multiple operations on the same row into a single output operation based on a threshold.

Operation records are aggregated in-memory.

You can tune the apply interval using the gg.handler.iceberg.fileRollInterval property. The default value is 15m (fifteen minutes).

The Replicat process will generate Iceberg data files and delete files for the aggregated operations.
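
As a rough illustration of operation aggregation, the Python sketch below collapses several operations on the same key into one net operation per row. The merge rules, the function name aggregate, and the row layout are assumptions made for this example; the actual rules applied by Replicat are not described here.

```python
def aggregate(operations, key_column="id"):
    """Collapse a sequence of (op_type, row) pairs into one net operation per key."""
    pending = {}  # key value -> (op_type, row values)
    for op_type, row in operations:
        key = row[key_column]
        if key not in pending:
            pending[key] = (op_type, dict(row))
            continue
        prev_type, prev_row = pending[key]
        if op_type == "update" and prev_type != "delete":
            # Later column values win; an earlier insert stays an insert.
            pending[key] = (prev_type, {**prev_row, **row})
        elif op_type == "delete" and prev_type == "insert":
            # Inserted and deleted within the same batch: no net change.
            del pending[key]
        elif op_type == "delete" and prev_type == "update":
            pending[key] = ("delete", {key_column: key})
        elif op_type == "insert" and prev_type == "delete":
            # Assumed policy: a delete followed by a re-insert nets out to an update.
            pending[key] = ("update", dict(row))
        else:
            raise ValueError(f"unexpected {op_type} after {prev_type} for key {key!r}")
    return pending


batch = [
    ("insert", {"id": 1, "name": "a"}),
    ("update", {"id": 1, "name": "b"}),
    ("insert", {"id": 2, "name": "x"}),
    ("delete", {"id": 2}),
    ("update", {"id": 3, "city": "Oslo"}),
    ("update", {"id": 3, "name": "c"}),
]
result = aggregate(batch)
```

Six input operations reduce to two output operations: an insert for key 1 that carries the latest name, and a single merged update for key 3. Key 2 disappears because it was inserted and deleted within the same batch.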

9.2.24.1.10 Automatic Table Creation

Oracle GoldenGate Replicat automatically creates a target table if it does not exist.

9.2.24.1.11 Iceberg Metadata Provider

A new metadata provider for Iceberg is implemented to retrieve the Iceberg target table metadata.

The Iceberg metadata provider is automatically configured and enabled by the Replicat process.

9.2.24.1.12 Iceberg Identifier Fields

The identifier fields in the Iceberg table are used to uniquely identify the rows in the Iceberg table.

During the automatic table creation, Oracle GoldenGate maps the key columns to Iceberg identifier fields.

Note:

Iceberg tables without identifier fields are not supported in the current release.

9.2.24.1.13 Primary Key Updates and Truncates

Primary key updates with missing column values will trigger files to be flushed to the Iceberg table before the flush interval.
This can result in small data files and delete files for the primary key update operation.

For workloads or tables with frequent primary key updates, Oracle recommends generating trail files with uncompressed update records.

Oracle also recommends setting gg.validate.keyupdate=true for trails generated from an Oracle source.

Oracle Extract has a known issue in which it generates primary key update operations even though the key columns are not modified.

A truncate operation will also trigger files to be flushed to the Iceberg table before the flush interval.
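
The kind of check that gg.validate.keyupdate=true enables can be pictured with this Python sketch. It is a conceptual model only: the function name effective_operation and the labels PK_UPDATE and UPDATE are hypothetical, and the real validation happens inside Replicat.

```python
def effective_operation(before, after, key_columns):
    """Classify an update that arrived flagged as a primary key update.

    If none of the key column values differ between the before and after
    images, the record is treated as a normal update.
    """
    key_changed = any(before[c] != after[c] for c in key_columns)
    return "PK_UPDATE" if key_changed else "UPDATE"


# The key column (id) is unchanged, so this is really a normal update.
same_key = effective_operation(
    {"id": 10, "name": "old"}, {"id": 10, "name": "new"}, ["id"]
)

# The key value changes from 10 to 11, so this is a genuine key update.
changed_key = effective_operation(
    {"id": 10, "name": "old"}, {"id": 11, "name": "old"}, ["id"]
)
```

Downgrading a spurious key update to a normal update avoids the early flush, and the resulting small files, that a genuine primary key update can trigger.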

9.2.24.2 Configuration

The configuration of the Iceberg replication properties is stored in the Replicat properties file.

9.2.24.2.1 Automatic Configuration

Iceberg replication involves configuring multiple components, such as the File Writer Handler and the target Iceberg Event Handler.

The Automatic Configuration functionality autoconfigures these components so that the manual configuration is minimal.

The properties modified by autoconfiguration are also logged in the handler log file.

To enable autoconfiguration to replicate to the Iceberg target, set the parameter gg.target=iceberg.

9.2.24.2.1.1 File Writer Configuration

The File Writer Handler name is pre-set to the value iceberg and its properties are automatically set to the required values for Iceberg.

9.2.24.2.1.2 Iceberg Event Handler Configuration

The Iceberg Event Handler name is pre-set to the value iceberg.

This topic details the configuration properties available for the Iceberg Event Handler. The required properties must be changed to match your Iceberg configuration.

9.2.24.2.1.2.1 Common Iceberg Properties

Iceberg can be configured to work with multiple catalogs and object stores.

The following are the common properties:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.warehouseLocation`	Optional	String value.	None	Directory path to the Iceberg warehouse location, excluding the object storage scheme. Example: `/path/to/warehouse`. This property is required when using the `hadoop` catalog. For other Iceberg catalogs, the warehouse location has catalog-specific requirements.
`gg.eventhandler.iceberg.fileRollInterval`	Optional	The default unit of measure is milliseconds. You can specify ms, s, m, or h to signify milliseconds, seconds, minutes, or hours respectively. Examples of legal values include 10000, 10000ms, 10s, 10m, or 1.5h. Values of 0 or less indicate that file rolling on time is turned off.	`15m`	Determines how often data is pushed into the Iceberg warehouse. Note: Use this parameter with caution. The higher the value, the more data is held in the memory of the Replicat process. Increasing it beyond the default (`15m`) can cause out-of-memory errors that stop the Replicat.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	Warehouse scheme to indicate the Iceberg object storage location. Valid values are: `file://`, `gs://`, `s3://`, `s3a://`, `abfss://`. For more information, see File System Scheme.
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	Iceberg catalog type. Valid values are: `hadoop`, `jdbc`, `nessie`, `rest`, `glue`, `polaris`.
`gg.eventhandler.iceberg.fileFormat`	Optional	parquet, orc, or avro.	`parquet`	Iceberg table file format to be used in target tables. Supported file formats: Parquet, Avro, and ORC.
`gg.eventhandler.iceberg.icebergTableProperties`	Optional	String value.	None	Path to a table properties file to specify additional Iceberg table properties to set to the target tables.
`gg.eventhandler.iceberg.abendOnMissingColumns`	Optional	`true` or `false`.	`true`	When set to `true` and the `UPDATE` operation contains a missing value, Replicat will ABEND. When set to `false`, Replicat will not ABEND if `UPDATE` operations have missing column values. The missing columns values will be read by querying the target tables. This lookup may impact the performance of the Replicat process.
`gg.eventhandler.iceberg.abendOnSchemaChanges`	Optional	`true` or `false`	`true`	When set to `true` and schema changes are detected, the Replicat process will ABEND. You can manually update the target schema and set this property to `false` to proceed. When set to `false`, a warning message is logged for schema changes.
`gg.validate.keyupdate`	Optional	`true` or `false`	`false`	If set to `true`, Replicat will validate key update operations (optype 115) and correct to normal update if no key values have changed.
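
The legal values listed for the file roll interval can be interpreted as in the following Python sketch. It only mirrors the rules stated in the table (milliseconds by default, the suffixes ms, s, m, and h, and values of 0 or less meaning that time-based rolling is off). The function name roll_interval_ms is hypothetical, and the parser used by GoldenGate may accept or reject inputs differently.

```python
import re

# Milliseconds per unit suffix; a bare number is taken as milliseconds.
_UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000}


def roll_interval_ms(value):
    """Convert a roll interval string to milliseconds.

    Returns None when the value is 0 or less, meaning time-based rolling is off.
    """
    match = re.fullmatch(r"(-?\d+(?:\.\d+)?)(ms|s|m|h)?", value.strip())
    if match is None:
        raise ValueError(f"not a valid interval: {value!r}")
    number, unit = match.groups()
    millis = float(number) * _UNIT_MS[unit or "ms"]
    return None if millis <= 0 else int(millis)


default_interval = roll_interval_ms("15m")
```

Under these rules the default of 15m corresponds to 900000 milliseconds, and 10000, 10000ms, and 10s all describe the same ten-second interval.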

9.2.24.2.1.2.1.1 File System Scheme

The gg.eventhandler.iceberg.fileSystemScheme property is used to specify the object storage scheme.

The following are the supported object storage schemes:

file://: Local file system
gs://: Google Cloud Storage
s3://: AWS S3
s3a://: AWS S3
abfss://: Azure Data Lake Storage

9.2.24.2.1.2.2 Iceberg Common Dependencies

The following are the common Iceberg dependencies:

<dependencies>
       <!-- Common Iceberg dependencies START -->
       <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>3.4.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
            <version>3.4.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.iceberg</groupId>
            <artifactId>iceberg-arrow</artifactId>
            <version>1.6.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.iceberg</groupId>
            <artifactId>iceberg-core</artifactId>
            <version>1.6.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.iceberg</groupId>
            <artifactId>iceberg-data</artifactId>
            <version>1.6.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.iceberg</groupId>
            <artifactId>iceberg-parquet</artifactId>
            <version>1.6.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.iceberg</groupId>
            <artifactId>iceberg-gcp</artifactId>
            <version>1.6.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.iceberg</groupId>
            <artifactId>iceberg-aws</artifactId>
            <version>1.6.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.iceberg</groupId>
            <artifactId>iceberg-orc</artifactId>
            <version>1.6.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.iceberg</groupId>
            <artifactId>iceberg-nessie</artifactId>
            <version>1.6.1</version>
        </dependency>
       <!-- Common Iceberg dependencies END -->
</dependencies>

You can download the dependencies from Maven Central using the script download_dependencies.sh in the DependencyDownloader directory.

Follow these steps:

Change directory to DependencyDownloader.
Edit config_proxy.sh if proxy configuration is required.
Run the script:
```
./download_dependencies.sh xmls/iceberg-common.xml
```
This script will download the dependencies and store them in the iceberg-common directory.

gg.classpath can be configured to include the dependencies from the iceberg-common directory as follows:

gg.classpath=/path/to/DependencyDownloader/dependencies/iceberg-common/*

9.2.24.2.1.2.3 AWS Java SDK dependencies for Writing to AWS S3 (s3:// Scheme)

The following are the Iceberg dependencies to write to AWS S3 using the s3:// scheme:

<dependencies>
    <!-- s3:// scheme dependencies START -->
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>s3</artifactId>
        <version>2.28.6</version>
    </dependency>
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>sts</artifactId>
        <version>2.28.6</version>
    </dependency>
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>glue</artifactId>
        <version>2.28.6</version>
    </dependency>
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>url-connection-client</artifactId>
        <version>2.28.6</version>
    </dependency>
    <!-- s3:// scheme dependencies END -->
</dependencies>

The dependencies can be downloaded from Maven Central using the script download_dependencies.sh in the DependencyDownloader directory.

Follow these steps:

Change directory to DependencyDownloader.
Edit config_proxy.sh if proxy configuration is required.
Run the script: ./download_dependencies.sh xmls/iceberg-aws-java-sdk.xml

This script will download the dependencies and store them in the iceberg-aws-java-sdk directory.

gg.classpath can be configured to include the dependencies as follows:

gg.classpath=/path/to/DependencyDownloader/dependencies/iceberg-aws-java-sdk/*:/path/to/DependencyDownloader/dependencies/iceberg-common/*

9.2.24.2.1.2.4 Hadoop AWS SDK Dependencies for Writing to AWS S3 (s3a:// Scheme)

The following are the Iceberg dependencies to write to AWS S3 using the s3a:// scheme:

<dependencies>
    <!-- s3a:// scheme dependencies START -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-aws</artifactId>
        <version>3.4.0</version>
    </dependency>
    <!-- s3a:// scheme dependencies END -->
</dependencies>

You can download the dependencies from Maven Central using the script download_dependencies.sh in the DependencyDownloader directory.

Follow these steps:

Change directory to DependencyDownloader.
Edit config_proxy.sh if proxy configuration is required.

Run the script:

./download_dependencies.sh xmls/iceberg-hadoop-aws.xml

This script will download the dependencies and store them in the iceberg-hadoop-aws directory.

gg.classpath can be configured to include the dependencies as follows:

gg.classpath=/path/to/DependencyDownloader/dependencies/iceberg-hadoop-aws/*:/path/to/DependencyDownloader/dependencies/iceberg-common/*

9.2.24.2.1.2.5 Hadoop Google Cloud Storage SDK Dependencies for Writing to Google Cloud Storage (GCS)

The following are the Iceberg dependencies to write to GCS using the Hadoop GCS SDK:

<dependencies>
    <!-- gs:// scheme dependencies START -->
    <dependency>
        <groupId>com.google.cloud.bigdataoss</groupId>
        <artifactId>gcs-connector</artifactId>
        <version>hadoop3-2.2.22</version>
    </dependency>
    <!-- gs:// scheme dependencies END -->
</dependencies>

The dependencies can be downloaded from Maven Central using the script download_dependencies.sh in the DependencyDownloader directory.

Follow these steps:

Change directory to DependencyDownloader.
Edit config_proxy.sh if proxy configuration is required.
Run the script: ./download_dependencies.sh xmls/iceberg-hadoop-gcs.xml

This script will download the dependencies and store them in the iceberg-hadoop-gcs directory.

gg.classpath can be configured to include the dependencies as follows:

gg.classpath=/path/to/DependencyDownloader/dependencies/iceberg-hadoop-gcs/*:/path/to/DependencyDownloader/dependencies/iceberg-common/*

9.2.24.2.1.2.6 Google Cloud Storage SDK Dependencies for Writing to Google Cloud Storage (GCS)

The following are the Iceberg dependencies to write to GCS using the Google Cloud Storage Java SDK:

<dependencies>
    <dependency>
        <groupId>com.google.cloud</groupId>
        <artifactId>google-cloud-storage</artifactId>
        <version>2.37.0</version>
    </dependency>
</dependencies>

The dependencies can be downloaded from Maven Central using the script download_dependencies.sh in the DependencyDownloader directory.

Follow these steps:

Change directory to DependencyDownloader.
Edit config_proxy.sh if proxy configuration is required.

Run the script:

./download_dependencies.sh xmls/iceberg-gcs-java-sdk.xml

This script will download the dependencies and store them in the iceberg-gcs-java-sdk directory.

gg.classpath can be configured to include the dependencies as follows:

gg.classpath=/path/to/DependencyDownloader/dependencies/iceberg-gcs-java-sdk/*:/path/to/DependencyDownloader/dependencies/iceberg-common/*

9.2.24.2.1.2.7 Hadoop Azure SDK Dependencies for Writing to Azure Data Lake (ADLS)

The following are the Iceberg dependencies to write to ADLS using the Hadoop Azure Java SDK:

<dependencies>
    <!-- abfss:// scheme dependencies START -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-azure</artifactId>
        <version>3.4.0</version>
    </dependency>
    <!-- abfss:// scheme dependencies END -->
</dependencies>

The dependencies can be downloaded from Maven Central using the script download_dependencies.sh in the DependencyDownloader directory.

Follow these steps:

Change directory to DependencyDownloader.
Edit config_proxy.sh if proxy configuration is required.

Run the script:

./download_dependencies.sh xmls/iceberg-hadoop-azure.xml

This script will download the dependencies and store them in the iceberg-hadoop-azure directory.

gg.classpath can be configured to include the dependencies as follows:

gg.classpath=/path/to/DependencyDownloader/dependencies/iceberg-hadoop-azure/*:/path/to/DependencyDownloader/dependencies/iceberg-common/*
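
The classpath settings in the dependency sections follow one pattern: the scheme-specific dependency directory, followed by the iceberg-common directory. The Python sketch below assembles that value. The helper gg_classpath and the mapping are illustrative only and are not part of GoldenGate; the gs:// entry assumes the Hadoop GCS connector rather than the Google Cloud Storage Java SDK.

```python
# Download directory per file system scheme, as named in the sections above.
# Assumption: gs:// uses the Hadoop GCS connector directory.
SCHEME_DEPENDENCY_DIR = {
    "s3://": "iceberg-aws-java-sdk",
    "s3a://": "iceberg-hadoop-aws",
    "gs://": "iceberg-hadoop-gcs",
    "abfss://": "iceberg-hadoop-azure",
}


def gg_classpath(scheme, downloader_root):
    """Build the gg.classpath line for a file system scheme."""
    base = f"{downloader_root}/dependencies"
    entries = []
    if scheme != "file://":
        # Object stores need their scheme-specific SDK jars first.
        entries.append(f"{base}/{SCHEME_DEPENDENCY_DIR[scheme]}/*")
    entries.append(f"{base}/iceberg-common/*")
    return "gg.classpath=" + ":".join(entries)


line = gg_classpath("abfss://", "/path/to/DependencyDownloader")
```

For the local file system scheme, only the iceberg-common directory is included.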

9.2.24.2.2 Configuration for Iceberg Nessie Catalog

9.2.24.2.2.1 Configuration for Nessie Catalog and AWS S3 s3:// Scheme

The following are the configuration properties for the Nessie catalog and AWS S3 object store using s3:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	Set to `nessie`.
`gg.eventhandler.iceberg.nessieBranch`	Optional	String value.	`main`	Nessie Catalog branch name where the Iceberg table metadata exists.
`gg.eventhandler.iceberg.nessieUri`	Required	String value.	None	Nessie Catalog endpoint URI. Example: `http://<nessie-server>.com:10001/api/v2`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate AWS S3 object storage location: `s3://`.
`gg.eventhandler.iceberg.awsS3Region`	Required	String value.	None	AWS S3 bucket region. Example: `us-east-2`.
`gg.eventhandler.iceberg.awsS3Bucket`	Required	String value.	None	AWS S3 bucket name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.awsAccessKeyId`	Optional	String value.	None	AWS access key id for authentication.
`gg.eventhandler.iceberg.awsSecretKey`	Optional	String value.	None	AWS secret access key for authentication.
`gg.eventhandler.iceberg.awsSessionToken`	Optional	String value.	None	AWS session token for authentication.
`gg.eventhandler.iceberg.awsRoleArn`	Optional	String value.	None	AWS role ARN for authentication.
`gg.eventhandler.iceberg.awsS3Endpoint`	Optional	String value.	None	AWS S3 endpoint.
`gg.eventhandler.iceberg.proxyServer`	Optional	String value.	None	Proxy server to connect to the AWS S3 object storage.
`gg.eventhandler.iceberg.proxyPort`	Optional	String value.	`80`	Proxy server port to connect to the AWS S3 object storage.

9.2.24.2.2.1.1 Classpath And Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
AWS SDK dependencies for writing to AWS S3 (s3:// scheme)

9.2.24.2.2.1.2 Sample Configuration for Nessie Catalog and AWS S3 s3:// Scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-aws-java-sdk/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=nessie
gg.eventhandler.iceberg.nessieBranch=main
gg.eventhandler.iceberg.nessieUri=http://<nessie-server>:10001/api/v2
gg.eventhandler.iceberg.fileSystemScheme=s3://
gg.eventhandler.iceberg.awsS3Region=us-east-2
gg.eventhandler.iceberg.awsS3Bucket=<s3-bucket>
gg.eventhandler.iceberg.awsAccessKeyId=<access-key-id>
gg.eventhandler.iceberg.awsSecretKey=<secret-key>
gg.eventhandler.iceberg.proxyServer=<proxy-server>
gg.eventhandler.iceberg.proxyPort=<proxy-port>
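
Before starting Replicat, a properties file such as the sample above can be checked for the properties that the table for the Nessie catalog with the s3:// scheme marks as Required. The Python sketch below is a hypothetical pre-flight helper, not a GoldenGate utility; parse_properties handles only simple key=value lines and # comments.

```python
# Properties marked Required for the Nessie catalog with the s3:// scheme.
REQUIRED = (
    "gg.eventhandler.iceberg.nessieUri",
    "gg.eventhandler.iceberg.awsS3Region",
    "gg.eventhandler.iceberg.awsS3Bucket",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_required(props, required=REQUIRED):
    """Return the required property names that are absent or empty."""
    return [name for name in required if not props.get(name)]


sample = """
gg.target=iceberg
gg.eventhandler.iceberg.catalogType=nessie
gg.eventhandler.iceberg.nessieUri=http://<nessie-server>:10001/api/v2
gg.eventhandler.iceberg.fileSystemScheme=s3://
gg.eventhandler.iceberg.awsS3Bucket=<s3-bucket>
"""
problems = missing_required(parse_properties(sample))
```

In this shortened sample the region property is left out on purpose, so the check reports gg.eventhandler.iceberg.awsS3Region as missing.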

9.2.24.2.2.2 Configuration for Nessie Catalog and AWS S3 s3a:// Scheme

The following are the configuration properties for the Nessie catalog and AWS S3 object store using s3a:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	Set to `nessie`.
`gg.eventhandler.iceberg.nessieBranch`	Optional	String value.	`main`	Nessie Catalog branch name where the Iceberg table metadata exists.
`gg.eventhandler.iceberg.nessieUri`	Required	String value.	None	Nessie Catalog endpoint URI. Example: `http://<nessie-server>.com:10001/api/v2`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate AWS S3 object storage location: `s3a://`.
`gg.eventhandler.iceberg.awsS3Bucket`	Required	String value.	None	AWS S3 bucket name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.awsAccessKeyId`	Required	String value.	None	AWS access key id for authentication.
`gg.eventhandler.iceberg.awsSecretKey`	Required	String value.	None	AWS secret access key for authentication.
`gg.eventhandler.iceberg.awsSessionToken`	Optional	String value.	None	AWS session token for authentication.
`gg.eventhandler.iceberg.proxyServer`	Optional	String value.	None	Proxy server to connect to the AWS S3 object storage.
`gg.eventhandler.iceberg.proxyPort`	Optional	String value.	`80`	Proxy server port to connect to the AWS S3 object storage.

9.2.24.2.2.2.1 Classpath and Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Hadoop AWS SDK dependencies for writing to AWS S3 (s3a:// scheme)

9.2.24.2.2.2.2 Sample Configuration for Nessie Catalog and AWS S3 s3a:// scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-aws/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=nessie
gg.eventhandler.iceberg.nessieBranch=main
gg.eventhandler.iceberg.nessieUri=http://<nessie-server>:10001/api/v2
gg.eventhandler.iceberg.fileSystemScheme=s3a://
gg.eventhandler.iceberg.awsS3Region=us-east-2
gg.eventhandler.iceberg.awsS3Bucket=<s3-bucket>
gg.eventhandler.iceberg.awsAccessKeyId=<access-key-id>
gg.eventhandler.iceberg.awsSecretKey=<secret-key>
gg.eventhandler.iceberg.proxyServer=<proxy-server>
gg.eventhandler.iceberg.proxyPort=<proxy-port>

9.2.24.2.2.3 Configuration for Nessie Catalog and GCS gs:// Scheme

The following are the configuration properties for the Nessie catalog and GCS object store using gs:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	Set to `nessie`.
`gg.eventhandler.iceberg.nessieBranch`	Optional	String value.	`main`	Nessie Catalog branch name where the Iceberg table metadata exists.
`gg.eventhandler.iceberg.nessieUri`	Required	String value.	None	Nessie Catalog endpoint URI. Example: `http://<nessie-server>.com:10001/api/v2`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate GCS object storage location: `gs://`.
`gg.eventhandler.iceberg.gcpStorageBucket`	Required	String value.	None	Google Cloud Storage bucket name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.gcpProjectId`	Required	String value.	None	Sets the project-id of the Google Cloud project that houses the GCS bucket.
`gg.eventhandler.iceberg.gcpServiceAccountJsonKeyFile`	Required	String value.	None	Sets the path to the Google Service account key file.
`gg.eventhandler.iceberg.proxyServer`	Optional	String value.	None	Proxy server to connect to the GCS object storage.
`gg.eventhandler.iceberg.proxyPort`	Optional	String value.	`80`	Proxy server port to connect to the GCS object storage.

9.2.24.2.2.3.1 Classpath and Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Hadoop Google Cloud Storage SDK dependencies for writing to Google Cloud Storage (GCS)

9.2.24.2.2.3.2 Sample Configuration for Nessie Catalog and GCS gs:// Scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-gcs/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=nessie
gg.eventhandler.iceberg.nessieBranch=main
gg.eventhandler.iceberg.nessieUri=http://<nessie-server>:10001/api/v2
gg.eventhandler.iceberg.fileSystemScheme=gs://
gg.eventhandler.iceberg.gcpStorageBucket=<gcs-bucket>
gg.eventhandler.iceberg.gcpProjectId=<gcp-project-id>
gg.eventhandler.iceberg.gcpServiceAccountJsonKeyFile=<gcp-service-account-key-file>
gg.eventhandler.iceberg.proxyServer=<proxy-server>
gg.eventhandler.iceberg.proxyPort=<proxy-port>

9.2.24.2.2.4 Configuration for Nessie Catalog and Azure Data Lake Storage abfss:// Scheme

The following are the configuration properties for the Nessie catalog and Azure Data Lake Storage using abfss:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	Set to `nessie`.
`gg.eventhandler.iceberg.nessieBranch`	Optional	String value.	`main`	Nessie Catalog branch name where the Iceberg table metadata exists.
`gg.eventhandler.iceberg.nessieUri`	Required	String value.	None	Nessie Catalog endpoint URI. Example: `http://<nessie-server>.com:10001/api/v2`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate Azure Data Lake Storage location: `abfss://`.
`gg.eventhandler.iceberg.azureAccountName`	Required	String value.	None	Azure storage account name that contains the container for the Iceberg Warehouse.
`gg.eventhandler.iceberg.azureContainer`	Required	String value.	None	Azure storage account container name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.azureAccountKey`	Required	String value.	None	Azure storage account key.
`gg.eventhandler.iceberg.azureBlobEndpoint`	Optional	String value.	`<azureContainer>@<azureAccountName>.dfs.core.windows.net`	Azure Storage service endpoint.
`gg.eventhandler.iceberg.proxyServer`	Optional	String value.	None	Proxy server to connect to the Azure object storage.
`gg.eventhandler.iceberg.proxyPort`	Optional	String value.	`80`	Proxy server port to connect to the Azure object storage.

9.2.24.2.2.4.1 Classpath and Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Hadoop Azure SDK dependencies for writing to Azure Data Lake (ADLS)

9.2.24.2.2.4.2 Sample Configuration for Nessie Catalog and ADLS abfss:// Scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-azure/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=nessie
gg.eventhandler.iceberg.nessieBranch=main
gg.eventhandler.iceberg.nessieUri=http://<nessie-server>:10001/api/v2
gg.eventhandler.iceberg.fileSystemScheme=abfss://
gg.eventhandler.iceberg.azureAccountName=<azure-storage-account-name>
gg.eventhandler.iceberg.azureContainer=<azure-storage-container>
gg.eventhandler.iceberg.azureAccountKey=<azure-storage-account-key>
gg.eventhandler.iceberg.proxyServer=<proxy-server>
gg.eventhandler.iceberg.proxyPort=<proxy-port>

Parent topic: Configuration for Nessie Catalog and Azure Data Lake Storage abfss:// Scheme

9.2.24.2.2.4.3 Nessie Namespace

Nessie namespace is the top-level container for all the tables in the Nessie catalog.

The namespace must already exist before the Replicat process creates or writes to tables, so create it before starting the Replicat process.

Nessie namespace can be created using the nessie command line program (nessie-cli-<version>.jar) as follows: create namespace QASOURCE;

The Nessie namespace is mapped to the GoldenGate schema in the MAP statement.

For example: MAP QASOURCE.TCUSTMER, TARGET QASOURCE.TCUSTMER;

Parent topic: Configuration for Nessie Catalog and Azure Data Lake Storage abfss:// Scheme

9.2.24.2.3 Configuration for Iceberg AWS Glue Catalog

Parent topic: Configuration

9.2.24.2.3.1 Configuration for Iceberg AWS Glue Catalog and AWS S3 s3:// OR s3a:// Scheme

The following are the configuration properties for the AWS Glue catalog and AWS S3 object store using s3:// or s3a:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	`glue`.
`gg.eventhandler.iceberg.awsGlueId`	Required	String value.	None	The Glue catalog ID is your numeric AWS account ID.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate AWS S3 object storage location: `s3://` or `s3a://`.
`gg.eventhandler.iceberg.awsS3Region`	Required	String value.	None	AWS S3 bucket region. Example: `us-east-2`.
`gg.eventhandler.iceberg.awsS3Bucket`	Required	String value.	None	AWS S3 bucket name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.awsAccessKeyId`	Optional	String value.	None	AWS access key id for authentication.
`gg.eventhandler.iceberg.awsSecretKey`	Optional	String value.	None	AWS secret access key for authentication.
`gg.eventhandler.iceberg.awsSessionToken`	Optional	String value.	None	AWS session token for authentication.
`gg.eventhandler.iceberg.awsRoleArn`	Optional	String value.	None	AWS role ARN for authentication.
`gg.eventhandler.iceberg.awsS3Endpoint`	Optional	String value.	None	AWS S3 endpoint.
`gg.eventhandler.iceberg.proxyServer`	Optional	String value.	None	Proxy server to connect to the AWS S3 object storage.
`gg.eventhandler.iceberg.proxyPort`	Optional	String value.	`80`	Proxy server port to connect to the AWS S3 object storage.

Parent topic: Configuration for Iceberg AWS Glue Catalog

9.2.24.2.3.2 Classpath and Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
AWS SDK dependencies for writing to AWS S3 (s3://)

Parent topic: Configuration for Iceberg AWS Glue Catalog

9.2.24.2.3.3 Sample Configuration for Iceberg AWS Glue Catalog and AWS S3 s3:// or s3a:// Scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-aws-java-sdk/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=glue
gg.eventhandler.iceberg.awsGlueId=<aws-account-id>
gg.eventhandler.iceberg.fileSystemScheme=s3://
#gg.eventhandler.iceberg.fileSystemScheme=s3a://
gg.eventhandler.iceberg.awsS3Region=us-east-2
gg.eventhandler.iceberg.awsS3Bucket=<s3-bucket>
gg.eventhandler.iceberg.awsAccessKeyId=<access-key-id>
gg.eventhandler.iceberg.awsSecretKey=<secret-key>
gg.eventhandler.iceberg.proxyServer=<proxy-server>
gg.eventhandler.iceberg.proxyPort=<proxy-port>
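
When temporary AWS credentials are used, a session token normally accompanies the access key pair. The fragment below is a minimal sketch under that assumption, using only properties listed in the table for this catalog. The `<session-token>` placeholder is introduced here for illustration and does not appear in the original sample.

```properties
# Sketch: Glue catalog on S3 with temporary credentials (assumed usage).
gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-aws-java-sdk/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=glue
gg.eventhandler.iceberg.awsGlueId=<aws-account-id>
gg.eventhandler.iceberg.fileSystemScheme=s3://
gg.eventhandler.iceberg.awsS3Region=us-east-2
gg.eventhandler.iceberg.awsS3Bucket=<s3-bucket>
gg.eventhandler.iceberg.awsAccessKeyId=<access-key-id>
gg.eventhandler.iceberg.awsSecretKey=<secret-key>
# Hypothetical placeholder: token issued together with the temporary key pair.
gg.eventhandler.iceberg.awsSessionToken=<session-token>
```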

Parent topic: Configuration for Iceberg AWS Glue Catalog

9.2.24.2.3.4 Table Names and Case Sensitivity

AWS Glue catalog supports only lower case names.

AWS Glue catalog supports only two-part table names.

The target table in the GGDAA Replicat MAP statement should be mapped to the Glue database and table names.

Example: MAP QASOURCE.TCUSTMER, TARGET "glue_database"."tcustmer";

In this example, glue_database is the Glue database name and tcustmer is the Glue table name.

Parent topic: Configuration for Iceberg AWS Glue Catalog

9.2.24.2.4 Configuration for Iceberg Polaris Catalog

Apache Polaris is an open-source, fully-featured catalog for Apache Iceberg.

There are a few options to set up Polaris:

Snowflake hosted Polaris (https://other-docs.snowflake.com/en/opencatalog/overview).
Polaris on your own infrastructure (https://polaris.apache.org/in-dev/unreleased/quickstart/).

Polaris catalog setup includes configuration and authentication to the object stores (S3/GCS/ADLS).

The Iceberg warehouse location and authentication to object stores are not set up by GoldenGate when using Polaris.

Parent topic: Configuration

9.2.24.2.4.1 Polaris Common Configuration

The following are the configuration properties for the Polaris catalog:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Required	String value.	`hadoop`	`polaris`.
`gg.eventhandler.iceberg.polarisCatalogUri`	Required	String value.	None	Polaris Catalog endpoint URI. Example: `https://<polaris-account>.snowflakecomputing.com/polaris/api/catalog`.
`gg.eventhandler.iceberg.polarisCatalogName`	Required	String value.	None	Polaris Catalog name. Catalog name is the entry point to the Polaris catalog namespace and tables.
`gg.eventhandler.iceberg.polarisClientId`	Required	String value.	None	Polaris principal’s client ID used for authentication and authorization to the respective Polaris catalog.
`gg.eventhandler.iceberg.polarisClientSecret`	Required	String value.	None	Polaris principal’s client secret used for authentication and authorization to the respective Polaris catalog.
`gg.eventhandler.iceberg.polarisPrincipalRole`	Optional	String value.	`ALL`	The role to be assumed by the Polaris principal.

Parent topic: Configuration for Iceberg Polaris Catalog

9.2.24.2.4.2 Polaris Catalog with Google Cloud Storage (GCS)

The environment variable GOOGLE_APPLICATION_CREDENTIALS must be set to the path to the Google Service account key file. Add the following to the Replicat parameter file (.prm):

SETENV (GOOGLE_APPLICATION_CREDENTIALS = "/path/to/the/gcp-service-account-json-key.json")

Parent topic: Configuration for Iceberg Polaris Catalog

9.2.24.2.4.3 Polaris Catalog with AWS S3 Storage

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.awsS3Region`	Required	String value.	None	Required only if the Polaris catalog points to AWS S3 Storage. AWS S3 bucket region. Example: `us-east-2`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	Required only if the Polaris catalog points to AWS S3 Storage. File system scheme to indicate AWS S3 object storage location: `s3://`.
`gg.eventhandler.iceberg.awsAccessKeyId`	Optional	String value.	None	Required only if the Polaris catalog points to AWS S3 Storage. AWS access key id for authentication.
`gg.eventhandler.iceberg.awsSecretKey`	Optional	String value.	None	Required only if the Polaris catalog points to AWS S3 Storage. AWS secret access key for authentication.
`gg.eventhandler.iceberg.awsSessionToken`	Optional	String value.	None	Required only if the Polaris catalog points to AWS S3 Storage. AWS session token for authentication.
`gg.eventhandler.iceberg.awsS3Endpoint`	Optional	String value.	None	Required only if the Polaris catalog points to AWS S3 Storage. AWS S3 endpoint.
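
As an illustration, the S3-specific properties in this table could be combined with the common Polaris properties roughly as follows. This is a sketch rather than a verified configuration: the catalog name `<polaris_s3_catalog>` is a hypothetical placeholder, and whether the AWS key pair is needed at all depends on how the Polaris catalog itself authenticates to S3.

```properties
# Sketch: Polaris catalog whose storage is AWS S3 (placeholder values throughout).
gg.target=iceberg
gg.classpath=DependencyDownloader/dependencies/iceberg-aws-java-sdk/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=polaris
gg.eventhandler.iceberg.polarisCatalogUri=https://<polaris-account>.snowflakecomputing.com/polaris/api/catalog
# Hypothetical catalog name, used only for this example.
gg.eventhandler.iceberg.polarisCatalogName=<polaris_s3_catalog>
gg.eventhandler.iceberg.polarisClientId=<clientId>
gg.eventhandler.iceberg.polarisClientSecret=<clientSecret>
# S3-specific properties from the table above.
gg.eventhandler.iceberg.fileSystemScheme=s3://
gg.eventhandler.iceberg.awsS3Region=us-east-2
# Optional: include only if the catalog setup calls for client-side credentials.
gg.eventhandler.iceberg.awsAccessKeyId=<access-key-id>
gg.eventhandler.iceberg.awsSecretKey=<secret-key>
```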

Parent topic: Configuration for Iceberg Polaris Catalog

9.2.24.2.4.4 Polaris Catalog with Azure Data Lake Storage (ADLS)

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	Required only if the Polaris catalog points to Azure Data Lake Storage. Warehouse scheme to indicate Azure Data Lake Storage location: `abfss://`.
`gg.eventhandler.iceberg.azureAccountName`	Required	String value.	None	Required only if the Polaris catalog points to Azure Data Lake Storage. Azure storage account name that contains the container for the Iceberg Warehouse.
`gg.eventhandler.iceberg.azureContainer`	Required	String value.	None	Required only if the Polaris catalog points to Azure Data Lake Storage. Azure storage account container name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.azureAccountKey`	Required	String value.	None	Required only if the Polaris catalog points to Azure Data Lake Storage. Azure storage account key.
`gg.eventhandler.iceberg.azureBlobEndpoint`	Optional	String value.	`<azureContainer>@<azureAccountName>.dfs.core.windows.net`	Required only if the Polaris catalog points to Azure Data Lake Storage. Azure Storage service endpoint.
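
As an illustration, the ADLS-specific properties in this table could be combined with the common Polaris properties roughly as follows. This is a sketch rather than a verified configuration: the catalog name `<polaris_adls_catalog>` is a hypothetical placeholder, and `azureBlobEndpoint` is left out so that its documented default applies.

```properties
# Sketch: Polaris catalog whose storage is Azure Data Lake Storage (placeholder values throughout).
gg.target=iceberg
gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-azure/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=polaris
gg.eventhandler.iceberg.polarisCatalogUri=https://<polaris-account>.snowflakecomputing.com/polaris/api/catalog
# Hypothetical catalog name, used only for this example.
gg.eventhandler.iceberg.polarisCatalogName=<polaris_adls_catalog>
gg.eventhandler.iceberg.polarisClientId=<clientId>
gg.eventhandler.iceberg.polarisClientSecret=<clientSecret>
# ADLS-specific properties from the table above.
gg.eventhandler.iceberg.fileSystemScheme=abfss://
gg.eventhandler.iceberg.azureAccountName=<azure-storage-account-name>
gg.eventhandler.iceberg.azureContainer=<azure-storage-container>
gg.eventhandler.iceberg.azureAccountKey=<azure-storage-account-key>
```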

Parent topic: Configuration for Iceberg Polaris Catalog

9.2.24.2.4.5 Polaris Catalog and GCS Storage Classpath And Dependencies

If the Polaris catalog is set up to write to GCS, then the Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Google Cloud Storage SDK dependencies for writing to Google Cloud Storage (GCS)

Parent topic: Configuration for Iceberg Polaris Catalog

9.2.24.2.4.6 Polaris Catalog and AWS S3 storage Classpath and Dependencies

If the Polaris catalog is set up to write to AWS S3, then the Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
AWS SDK dependencies for writing to AWS S3 (s3://)

Parent topic: Configuration for Iceberg Polaris Catalog

9.2.24.2.4.7 Polaris Catalog and ADLS storage Classpath And Dependencies

If the Polaris catalog is set up to write to ADLS, then the Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Hadoop Azure SDK dependencies for writing to Azure Data Lake Storage (abfss://).

Parent topic: Configuration for Iceberg Polaris Catalog

9.2.24.2.4.8 Sample Configuration for Polaris Catalog

gg.target=iceberg
#For catalog using GCS
gg.classpath=DependencyDownloader/dependencies/iceberg-gcs-java-sdk/*:DependencyDownloader/dependencies/iceberg-common/*
#For catalog using S3
#gg.classpath=DependencyDownloader/dependencies/iceberg-aws-java-sdk/*:DependencyDownloader/dependencies/iceberg-common/*
#For catalog using ADLS
#gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-azure/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=polaris
gg.eventhandler.iceberg.polarisCatalogUri=https://<polaris-account>.snowflakecomputing.com/polaris/api/catalog
gg.eventhandler.iceberg.polarisCatalogName=<polaris_gcs_catalog>
gg.eventhandler.iceberg.polarisClientId=<clientId>
gg.eventhandler.iceberg.polarisClientSecret=<clientSecret>
gg.eventhandler.iceberg.polarisPrincipalRole=ALL

Parent topic: Configuration for Iceberg Polaris Catalog

9.2.24.2.4.9 Polaris Namespace

Polaris namespace is the top-level container for all the tables in the Polaris catalog.

Before starting the Replicat process, the Polaris namespace should be created in the respective Polaris catalog.

The Polaris namespace is mapped to the GoldenGate schema in the MAP statement.

Example: MAP QASOURCE.TCUSTMER, TARGET "polaris_namespace"."tcustmer";

Parent topic: Configuration for Iceberg Polaris Catalog

9.2.24.2.5 Configuration for Iceberg REST Catalog

Iceberg defines a REST specification (https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml) for catalog implementations.

Any REST server that implements the Iceberg REST API can be used as the Iceberg catalog.

For example, Polaris is an implementation of the Iceberg REST API.

Parent topic: Configuration

9.2.24.2.5.1 Configuration for Iceberg REST Catalog

The following are the configuration properties for the REST catalog:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Required	String value.	`hadoop`	`rest`.
`gg.eventhandler.iceberg.restCatalogUri`	Required	String value.	None	REST Catalog endpoint URI. Example: `https://<polaris-account>.snowflakecomputing.com/polaris/api/catalog`.
`gg.eventhandler.iceberg.restCatalogProperties`	Optional	String value.	None	Properties file with additional configuration for the REST catalog.

Parent topic: Configuration for Iceberg REST Catalog

9.2.24.2.5.2 Sample Configuration for REST Catalog based on Polaris

gg.target=iceberg
#For catalog using GCS
gg.classpath=DependencyDownloader/dependencies/iceberg-gcs-java-sdk/*:DependencyDownloader/dependencies/iceberg-common/*
#For catalog using S3
#gg.classpath=DependencyDownloader/dependencies/iceberg-s3/*:DependencyDownloader/dependencies/iceberg-common/*
#For catalog using ADLS
#gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-azure/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=rest
gg.eventhandler.iceberg.restCatalogUri=https://<polaris-account>.snowflakecomputing.com/polaris/api/catalog
gg.eventhandler.iceberg.restCatalogProperties=/path/to/rest/catalog.properties
# Optional configuration for authentication to the object storage. 
# Some REST implementations do not require a separate authentication to the storage layer. 
#gg.eventhandler.iceberg.fileSystemScheme=s3://
#gg.eventhandler.iceberg.awsS3Region=<s3-region>
#gg.eventhandler.iceberg.awsS3Bucket=<s3-bucket>
#gg.eventhandler.iceberg.awsAccessKeyId=<access-key-id>
#gg.eventhandler.iceberg.awsSecretKey=<secret-key>
#gg.eventhandler.iceberg.fileSystemScheme=abfss://
#gg.eventhandler.iceberg.azureAccountName=<azure-storage-account-name>
#gg.eventhandler.iceberg.azureContainer=<azure-storage-container>
#gg.eventhandler.iceberg.azureAccountKey=<azure-storage-account-key>
#gg.eventhandler.iceberg.fileSystemScheme=gs://
#gg.eventhandler.iceberg.gcpStorageBucket=<gcs-bucket>
#gg.eventhandler.iceberg.gcpProjectId=<gcp-project-id>
#gg.eventhandler.iceberg.gcpServiceAccountJsonKeyFile=<gcp-service-account-key-file>

Parent topic: Configuration for Iceberg REST Catalog

9.2.24.2.5.3 Sample REST Catalog Properties File (For Polaris)

warehouse=polaris_s3_catalog
credential=<ClientId>:<ClientSecret>
scope=PRINCIPAL_ROLE:ALL
token-refresh-enabled=true

Parent topic: Configuration for Iceberg REST Catalog

9.2.24.2.6 Configuration for Iceberg JDBC Catalog

Some JDBC compatible databases can be used to store the Iceberg catalog information.

Not all JDBC compatible databases are supported with the Iceberg JDBC Catalog API.

Note:

The Databricks target using the Databricks JDBC driver has been tested internally.

Parent topic: Configuration

9.2.24.2.6.1 Configuration for Iceberg JDBC Catalog and file:// Scheme

The following are the configuration properties for the JDBC catalog and the local file system as the Iceberg storage using file:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	`jdbc`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate local file system as the storage: `file://`.
`gg.eventhandler.iceberg.warehouseLocation`	Required	String value.	None	Local directory path to the Iceberg warehouse.
`gg.eventhandler.iceberg.jdbcUrl`	Required	String value.	None	JDBC URL to connect to the database used as Iceberg catalog.
`gg.eventhandler.iceberg.jdbcUser`	Optional	String value.	None	JDBC user to connect to the database used as Iceberg catalog.
`gg.eventhandler.iceberg.jdbcPassword`	Optional	String value.	None	JDBC password to connect to the database used as Iceberg catalog.

Parent topic: Configuration for Iceberg JDBC Catalog

9.2.24.2.6.1.1 Classpath and Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Path to the JDBC driver to access the database used to store the Iceberg catalog.

Parent topic: Configuration for Iceberg JDBC Catalog and file:// Scheme

9.2.24.2.6.1.2 Sample Configuration for Iceberg JDBC Catalog and Local File Storage file:// Scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=/path/to/the/jdbc/driver/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=jdbc
gg.eventhandler.iceberg.jdbcUrl=<jdbc-url>
gg.eventhandler.iceberg.jdbcUser=<jdbc-user>
gg.eventhandler.iceberg.jdbcPassword=<jdbc-password>

Parent topic: Configuration for Iceberg JDBC Catalog and file:// Scheme

9.2.24.2.6.2 Configuration for Iceberg JDBC Catalog and s3a:// Scheme

The following are the configuration properties for the JDBC catalog and AWS S3 object store using s3a:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	`jdbc`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate AWS S3 object storage location: `s3a://`.
`gg.eventhandler.iceberg.warehouseLocation`	Required	String value.	None	Local directory path to the Iceberg warehouse.
`gg.eventhandler.iceberg.jdbcUrl`	Required	String value.	None	JDBC URL to connect to the database used as Iceberg catalog.
`gg.eventhandler.iceberg.jdbcUser`	Optional	String value.	None	JDBC user to connect to the database used as Iceberg catalog.
`gg.eventhandler.iceberg.jdbcPassword`	Optional	String value.	None	JDBC password to connect to the database used as Iceberg catalog.
`gg.eventhandler.iceberg.awsS3Bucket`	Required	String value.	None	AWS S3 bucket name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.awsAccessKeyId`	Required	String value.	None	AWS access key id for authentication.
`gg.eventhandler.iceberg.awsSecretKey`	Required	String value.	None	AWS secret access key for authentication.
`gg.eventhandler.iceberg.awsSessionToken`	Optional	String value.	None	AWS session token for authentication.
`gg.eventhandler.iceberg.proxyServer`	Optional	String value.	None	Proxy server to connect to the AWS S3 object storage.
`gg.eventhandler.iceberg.proxyPort`	Optional	String value.	`80`	Proxy server port to connect to the AWS S3 object storage.

Parent topic: Configuration for Iceberg JDBC Catalog

9.2.24.2.6.2.1 Classpath and Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Hadoop AWS SDK dependencies for writing to AWS S3 (s3a:// scheme)
Path to the JDBC driver to access the database used to store the Iceberg catalog.

Parent topic: Configuration for Iceberg JDBC Catalog and s3a:// Scheme

9.2.24.2.6.2.2 Sample Configuration for JDBC Catalog and AWS S3 s3a:// scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-aws/*:DependencyDownloader/dependencies/iceberg-common/*:/path/to/the/jdbc/driver/*
gg.eventhandler.iceberg.catalogType=jdbc
gg.eventhandler.iceberg.jdbcUrl=<jdbc-url>
gg.eventhandler.iceberg.jdbcUser=<jdbc-user>
gg.eventhandler.iceberg.jdbcPassword=<jdbc-password>
gg.eventhandler.iceberg.fileSystemScheme=s3a://
gg.eventhandler.iceberg.awsS3Region=us-east-2
gg.eventhandler.iceberg.awsS3Bucket=<s3-bucket>
gg.eventhandler.iceberg.awsAccessKeyId=<access-key-id>
gg.eventhandler.iceberg.awsSecretKey=<secret-key>
gg.eventhandler.iceberg.proxyServer=<proxy-server>
gg.eventhandler.iceberg.proxyPort=<proxy-port>

Parent topic: Configuration for Iceberg JDBC Catalog and s3a:// Scheme

9.2.24.2.6.3 Configuration for Iceberg JDBC Catalog and gs:// Scheme

The following are the configuration properties for the JDBC catalog and GCS object store using gs:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	`jdbc`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate GCS object storage location: `gs://`.
`gg.eventhandler.iceberg.warehouseLocation`	Required	String value.	None	Local directory path to the Iceberg warehouse.
`gg.eventhandler.iceberg.jdbcUrl`	Required	String value.	None	JDBC URL to connect to the database used as Iceberg catalog.
`gg.eventhandler.iceberg.jdbcUser`	Optional	String value.	None	JDBC user to connect to the database used as Iceberg catalog.
`gg.eventhandler.iceberg.jdbcPassword`	Optional	String value.	None	JDBC password to connect to the database used as Iceberg catalog.
`gg.eventhandler.iceberg.gcpStorageBucket`	Required	String value.	None	Google Cloud Storage bucket name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.gcpProjectId`	Required	String value.	None	Sets the project-id of the Google Cloud project that houses the GCS bucket.
`gg.eventhandler.iceberg.gcpServiceAccountJsonKeyFile`	Required	String value.	None	Sets the path to the Google Service account key file.
`gg.eventhandler.iceberg.proxyServer`	Optional	String value.	None	Proxy server to connect to the GCS object storage.
`gg.eventhandler.iceberg.proxyPort`	Optional	String value.	`80`	Proxy server port to connect to the GCS object storage.

Parent topic: Configuration for Iceberg JDBC Catalog

9.2.24.2.6.3.1 Classpath And Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Hadoop Google Cloud Storage SDK dependencies for writing to Google Cloud Storage (GCS)
Path to the JDBC driver to access the database used to store the Iceberg catalog.

Parent topic: Configuration for Iceberg JDBC Catalog and gs:// Scheme

9.2.24.2.6.3.2 Sample Configuration for JDBC Catalog and GCS `gs://` scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-gcs/*:DependencyDownloader/dependencies/iceberg-common/*:/path/to/the/jdbc/driver/*
gg.eventhandler.iceberg.catalogType=jdbc
gg.eventhandler.iceberg.jdbcUrl=<jdbc-url>
gg.eventhandler.iceberg.jdbcUser=<jdbc-user>
gg.eventhandler.iceberg.jdbcPassword=<jdbc-password>
gg.eventhandler.iceberg.fileSystemScheme=gs://
gg.eventhandler.iceberg.gcpStorageBucket=<gcs-bucket>
gg.eventhandler.iceberg.gcpProjectId=<gcp-project-id>
gg.eventhandler.iceberg.gcpServiceAccountJsonKeyFile=<gcp-service-account-key-file>
gg.eventhandler.iceberg.proxyServer=<proxy-server>
gg.eventhandler.iceberg.proxyPort=<proxy-port>

Parent topic: Configuration for Iceberg JDBC Catalog and gs:// Scheme

9.2.24.2.6.4 Configuration for Iceberg JDBC Catalog and abfss:// Scheme

The following are the configuration properties for the JDBC catalog and Azure Data Lake Storage using the abfss:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	`jdbc`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate Azure Data Lake Storage location: `abfss://`.
`gg.eventhandler.iceberg.warehouseLocation`	Required	String value.	None	Local directory path to the Iceberg warehouse.
`gg.eventhandler.iceberg.jdbcUrl`	Required	String value.	None	JDBC URL to connect to the database used as Iceberg catalog.
`gg.eventhandler.iceberg.jdbcUser`	Optional	String value.	None	JDBC user to connect to the database used as Iceberg catalog.
`gg.eventhandler.iceberg.jdbcPassword`	Optional	String value.	None	JDBC password to connect to the database used as Iceberg catalog.
`gg.eventhandler.iceberg.azureAccountName`	Required	String value.	None	Azure storage account name that contains the container for the Iceberg Warehouse.
`gg.eventhandler.iceberg.azureContainer`	Required	String value.	None	Azure storage account container name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.azureAccountKey`	Required	String value.	None	Azure storage account key.
`gg.eventhandler.iceberg.azureBlobEndpoint`	Optional	String value.	`<azureContainer>@<azureAccountName>.dfs.core.windows.net`	Azure Storage service endpoint.
`gg.eventhandler.iceberg.proxyServer`	Optional	String value.	None	Proxy server to connect to the Azure object storage.
`gg.eventhandler.iceberg.proxyPort`	Optional	String value.	`80`	Proxy server port to connect to the Azure object storage.

Parent topic: Configuration for Iceberg JDBC Catalog

9.2.24.2.6.4.1 Classpath And Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Hadoop Azure SDK dependencies for writing to Azure Data Lake (ADLS)
Path to the JDBC driver to access the database used to store the Iceberg catalog.

Parent topic: Configuration for Iceberg JDBC Catalog and abfss:// Scheme

9.2.24.2.6.4.2 Sample Configuration for JDBC Catalog and ADLS abfss:// Scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-azure/*:DependencyDownloader/dependencies/iceberg-common/*:/path/to/the/jdbc/driver/*
gg.eventhandler.iceberg.catalogType=jdbc
gg.eventhandler.iceberg.jdbcUrl=<jdbc-url>
gg.eventhandler.iceberg.jdbcUser=<jdbc-user>
gg.eventhandler.iceberg.jdbcPassword=<jdbc-password>
gg.eventhandler.iceberg.fileSystemScheme=abfss://
gg.eventhandler.iceberg.azureAccountName=<azure-storage-account-name>
gg.eventhandler.iceberg.azureContainer=<azure-storage-container>
gg.eventhandler.iceberg.azureAccountKey=<azure-storage-account-key>
gg.eventhandler.iceberg.proxyServer=<proxy-server>
gg.eventhandler.iceberg.proxyPort=<proxy-port>

Parent topic: Configuration for Iceberg JDBC Catalog and abfss:// Scheme

9.2.24.2.7 Configuration for Iceberg Hadoop Catalog

Hadoop catalog is not recommended for production usage as it has no reliable locking mechanism and would impact concurrent reads and writes.

Use the Hadoop catalog for testing purposes only.

Parent topic: Configuration

9.2.24.2.7.1 Configuration for Iceberg Hadoop Catalog and file:// Scheme

The following are the configuration properties for the Hadoop catalog and the local file system as the Iceberg storage using file:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	`hadoop`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate local file system as the storage: `file://`.
`gg.eventhandler.iceberg.warehouseLocation`	Required	String value.	None	Local directory path to the Iceberg warehouse.

Note:

This configuration is typically used for testing purposes for storing the Iceberg tables on the local file system.

Parent topic: Configuration for Iceberg Hadoop Catalog

9.2.24.2.7.1.1 Classpath and Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies

Parent topic: Configuration for Iceberg Hadoop Catalog and file:// Scheme

9.2.24.2.7.1.2 Sample Configuration for Iceberg Hadoop Catalog and Local File Storage file:// Scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-common/*

Parent topic: Configuration for Iceberg Hadoop Catalog and file:// Scheme

9.2.24.2.7.2 Configuration for Iceberg Hadoop Catalog and s3a:// Scheme

The following are the configuration properties for the Hadoop catalog and AWS S3 object store using s3a:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	`hadoop`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate AWS S3 object storage location: `s3a://`.
`gg.eventhandler.iceberg.awsS3Bucket`	Required	String value.	None	AWS S3 bucket name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.awsAccessKeyId`	Required	String value.	None	AWS access key id for authentication.
`gg.eventhandler.iceberg.awsSecretKey`	Required	String value.	None	AWS secret access key for authentication.
`gg.eventhandler.iceberg.awsSessionToken`	Optional	String value.	None	AWS session token for authentication.
`gg.eventhandler.iceberg.proxyServer`	Optional	String value.	None	Proxy server to connect to the AWS S3 object storage.
`gg.eventhandler.iceberg.proxyPort`	Optional	String value.	`80`	Proxy server port to connect to the AWS S3 object storage.

Parent topic: Configuration for Iceberg Hadoop Catalog

9.2.24.2.7.2.1 Classpath and Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Hadoop AWS SDK dependencies for writing to AWS S3 (s3a:// scheme)

Parent topic: Configuration for Iceberg Hadoop Catalog and s3a:// Scheme

9.2.24.2.7.2.2 Sample Configuration for Hadoop Catalog and AWS S3 s3a:// Scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-aws/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=hadoop
gg.eventhandler.iceberg.fileSystemScheme=s3a://
gg.eventhandler.iceberg.awsS3Region=us-east-2
gg.eventhandler.iceberg.awsS3Bucket=<s3-bucket>
gg.eventhandler.iceberg.awsAccessKeyId=<access-key-id>
gg.eventhandler.iceberg.awsSecretKey=<secret-key>
gg.eventhandler.iceberg.proxyServer=<proxy-server>
gg.eventhandler.iceberg.proxyPort=<proxy-port>

Parent topic: Configuration for Iceberg Hadoop Catalog and s3a:// Scheme

9.2.24.2.7.3 Configuration for Iceberg Hadoop Catalog and gs:// Scheme

The following are the configuration properties for the Hadoop catalog and GCS object store using gs:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	`hadoop`.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme to indicate GCS object storage location: `gs://`.
`gg.eventhandler.iceberg.gcpStorageBucket`	Required	String value.	None	Google Cloud Storage bucket name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.gcpProjectId`	Required	String value.	None	Sets the project-id of the Google Cloud project that houses the GCS bucket.
`gg.eventhandler.iceberg.gcpServiceAccountJsonKeyFile`	Required	String value.	None	Sets the path to the Google Service account key file.
`gg.eventhandler.iceberg.proxyServer`	Optional	String value.	None	Proxy server to connect to the GCS object storage.
`gg.eventhandler.iceberg.proxyPort`	Optional	String value.	`80`	Proxy server port to connect to the GCS object storage.

Parent topic: Configuration for Iceberg Hadoop Catalog

9.2.24.2.7.3.1 Classpath and Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Hadoop Google Cloud Storage SDK dependencies for writing to Google Cloud Storage (GCS)

Parent topic: Configuration for Iceberg Hadoop Catalog and gs:// Scheme

9.2.24.2.7.3.2 Sample Configuration for Hadoop Catalog and GCS gs:// Scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-gcs/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=hadoop
gg.eventhandler.iceberg.fileSystemScheme=gs://
gg.eventhandler.iceberg.gcpStorageBucket=<gcs-bucket>
gg.eventhandler.iceberg.gcpProjectId=<gcp-project-id>
gg.eventhandler.iceberg.gcpServiceAccountJsonKeyFile=<gcp-service-account-key-file>
gg.eventhandler.iceberg.proxyServer=<proxy-server>
gg.eventhandler.iceberg.proxyPort=<proxy-port>

Parent topic: Configuration for Iceberg Hadoop Catalog and gs:// Scheme

9.2.24.2.7.4 Configuration for Iceberg Hadoop Catalog and abfss:// Scheme

The following are the configuration properties for the Hadoop catalog and Azure Data Lake Storage using abfss:// scheme:

Properties	Required/Optional	Legal Values	Default	Explanation
`gg.eventhandler.iceberg.catalogType`	Optional	String value.	`hadoop`	Iceberg catalog type. Set to `hadoop` for the Hadoop catalog.
`gg.eventhandler.iceberg.fileSystemScheme`	Optional	String value.	`file://`	File system scheme. Set to `abfss://` to indicate an Azure Data Lake Storage location.
`gg.eventhandler.iceberg.azureAccountName`	Required	String value.	None	Azure storage account name that contains the container for the Iceberg Warehouse.
`gg.eventhandler.iceberg.azureContainer`	Required	String value.	None	Azure storage account container name that houses the Iceberg Warehouse.
`gg.eventhandler.iceberg.azureAccountKey`	Required	String value.	None	Azure storage account key.
`gg.eventhandler.iceberg.azureBlobEndpoint`	Optional	String value.	`\`	Azure Storage service endpoint.
`gg.eventhandler.iceberg.proxyServer`	Optional	String value.	None	Proxy server to connect to the Azure object storage.
`gg.eventhandler.iceberg.proxyPort`	Optional	String value.	`80`	Proxy server port to connect to the Azure object storage.


9.2.24.2.7.4.1 Classpath and Dependencies

The Java classpath (gg.classpath) should include the following dependencies:

Iceberg common dependencies
Hadoop Azure SDK dependencies for writing to Azure Data Lake (ADLS)


9.2.24.2.7.4.2 Sample Configuration for Hadoop Catalog and ADLS abfss:// Scheme

gg.target=iceberg
gg.eventhandler.iceberg.warehouseLocation=/path/to/iceberg/tables
gg.classpath=DependencyDownloader/dependencies/iceberg-hadoop-azure/*:DependencyDownloader/dependencies/iceberg-common/*
gg.eventhandler.iceberg.catalogType=hadoop
gg.eventhandler.iceberg.fileSystemScheme=abfss://
gg.eventhandler.iceberg.azureAccountName=<azure-storage-account-name>
gg.eventhandler.iceberg.azureContainer=<azure-storage-container>
gg.eventhandler.iceberg.azureAccountKey=<azure-storage-account-key>
gg.eventhandler.iceberg.proxyServer=<proxy-server>
gg.eventhandler.iceberg.proxyPort=<proxy-port>


9.2.24.3 Configuration Templates

Iceberg configuration templates are available in the directory /path/to/AdapterExamples/bigdata/iceberg.

The following template properties files are packaged with Oracle GoldenGate:

iceberg-glue-s3.properties
iceberg-hadoop-adls.properties
iceberg-hadoop-gcs.properties
iceberg-hadoop-localfile.properties
iceberg-hadoop-s3.properties
iceberg-jdbc-localfile.properties
iceberg-jdbc-s3.properties
iceberg-jdbc-adls.properties
iceberg-jdbc-gcs.properties
iceberg-nessie-adls.properties
iceberg-nessie-gcs.properties
iceberg-nessie-s3.properties
iceberg-nessie-s3a.properties
iceberg-polaris-adls.properties
iceberg-polaris-gcs.properties
iceberg-polaris-s3.properties
iceberg-rest.properties
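
The template names above appear to follow the pattern `iceberg-<catalog>-<storage>.properties`, with `iceberg-rest.properties` as the one exception. As a small convenience sketch (the helper name `find_templates` is hypothetical and not part of Oracle GoldenGate), the following Python snippet filters the packaged list by catalog and, optionally, by storage:

```python
# Template file names copied from the list above.
TEMPLATES = [
    "iceberg-glue-s3.properties",
    "iceberg-hadoop-adls.properties",
    "iceberg-hadoop-gcs.properties",
    "iceberg-hadoop-localfile.properties",
    "iceberg-hadoop-s3.properties",
    "iceberg-jdbc-localfile.properties",
    "iceberg-jdbc-s3.properties",
    "iceberg-jdbc-adls.properties",
    "iceberg-jdbc-gcs.properties",
    "iceberg-nessie-adls.properties",
    "iceberg-nessie-gcs.properties",
    "iceberg-nessie-s3.properties",
    "iceberg-nessie-s3a.properties",
    "iceberg-polaris-adls.properties",
    "iceberg-polaris-gcs.properties",
    "iceberg-polaris-s3.properties",
    "iceberg-rest.properties",
]


def find_templates(catalog, storage=None):
    """Return template names for a catalog and, optionally, a storage suffix.

    Hypothetical helper: it only splits the file names listed above.
    """
    matches = []
    for name in TEMPLATES:
        # Strip the fixed "iceberg-" prefix and ".properties" suffix.
        stem = name[len("iceberg-"):-len(".properties")]
        parts = stem.split("-")
        if parts[0] != catalog:
            continue
        if storage is not None and (len(parts) < 2 or parts[1] != storage):
            continue
        matches.append(name)
    return matches
```

For example, `find_templates("hadoop", "gcs")` selects the template that corresponds to the Hadoop catalog with the gs:// scheme.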


9.2.24.4 Limitations

Oracle GoldenGate does not support configuration of partition columns during automatic table creation.
If partitioned tables are required, the Iceberg table should be created manually with the required partition columns.
Altering the partitioning schema of a table is not supported after the Replicat process has started.
If the partitioning schema of a table needs to be changed, the table should be dropped and recreated manually in the target database.

The data in the table will need to be reloaded.

Note:
Contact Oracle Support for assistance with this process.
Pre-existing Iceberg target tables must have identifier columns (key columns) in the schema.
The Replicat process will ABEND if the target table does not have identifier columns.
The following Iceberg data types cannot be used as a key column (Iceberg identifier field):
- binary
- fixed
- uuid


9.2.24.5 Instantiating Oracle GoldenGate with an Initial Load

For more information about the standard steps for instantiation, see: https://docs.oracle.com/en/middleware/goldengate/core/21.3/admin/instantiating-oracle-goldengate-initial-load.html#GUID-7D3BD34D-490B-4E76-A48B-63572D93881A


9.2.24.5.1 Instantiation Steps Specific to Iceberg

Start initial load groups for Extract and Replicat.
Start the change synchronization group for Extract, and write operations to a trail file.

Note:
Do not start the change synchronization group for Replicat yet.
Wait until the initial load Replicat group has finished applying the initial load trail files.
Stop the change synchronization group for Extract.
Configure a change synchronization Replicat group.
Add the parameter UPDATEINSERTS to the change synchronization Replicat group.
Start the change synchronization Replicat group.
Wait until the change synchronization Replicat group has processed all the trails generated by change synchronization Extract group.
The last record’s end offset in the last trail file must match the targetCheckpoint value in the JSON checkpoint file of the change synchronization Replicat group.
Example:
- Run ls -l on the last trail file.
```
-rw-r--r-- 1 username dba 5660 Feb 22 2024 /path/to/trail/tr000000003
```
- Here the last record’s end offset is 5660, and the trail sequence is 3.
- Open the JSON checkpoint file for the change synchronization Replicat group.
  It should contain the following attribute:
```
 "targetCheckpoint" : {
     "trailSequence" : 3,
     "trailOffset" : 5660
  }
```
  This targetCheckpoint must match the last record’s end offset.
Shut down the change synchronization Replicat group and remove the parameter UPDATEINSERTS.
The initial load is now complete. Start the change synchronization Extract and Replicat groups.
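
The checkpoint comparison in the example above can be scripted. The following Python sketch is illustrative only: it assumes, as the `ls -l` example does, that the size of the last trail file equals the last record's end offset, and that the trail sequence is the run of digits at the end of the trail file name. The location of the JSON checkpoint file and the nesting of `targetCheckpoint` inside it are not specified here, so the sketch searches the whole document for that attribute. The function names are hypothetical.

```python
import json
import os
import re


def find_target_checkpoint(node):
    """Search parsed JSON for the first 'targetCheckpoint' object (nesting is assumed unknown)."""
    if isinstance(node, dict):
        if "targetCheckpoint" in node:
            return node["targetCheckpoint"]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_target_checkpoint(child)
        if found is not None:
            return found
    return None


def trail_position(trail_path):
    """Return (sequence, size_in_bytes) for a trail file such as tr000000003."""
    match = re.search(r"(\d+)$", os.path.basename(trail_path))
    if match is None:
        raise ValueError(f"no trailing sequence number in {trail_path!r}")
    return int(match.group(1)), os.path.getsize(trail_path)


def replicat_caught_up(trail_path, checkpoint_path):
    """True when targetCheckpoint matches the last trail file's sequence and size."""
    sequence, size = trail_position(trail_path)
    with open(checkpoint_path, encoding="utf-8") as handle:
        checkpoint = find_target_checkpoint(json.load(handle))
    if checkpoint is None:
        return False
    return (checkpoint.get("trailSequence") == sequence
            and checkpoint.get("trailOffset") == size)
```

Call `replicat_caught_up` with the path of the last trail file and the path of the Replicat group's JSON checkpoint file. Proceed to the shutdown step only when it returns `True`.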


9.2.24.5.2 Iceberg Change Synchronization Replicat Behavior During Instantiation

Run [DELETE+INSERT] for all the INSERT operations, irrespective of whether the base row exists on the target or not.
Run [DELETE+INSERT] for all the UPDATE operations, irrespective of whether the base row exists on the target or not.
Run DELETE for all the DELETE operations, irrespective of whether the base row exists on the target or not.

Note:
No collisions will be logged in the Iceberg Replicat report file.
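
The rules above can be pictured with a toy model. This Python sketch is not GoldenGate code: it represents the target table as a dictionary keyed by the key column value, and it assumes every INSERT and UPDATE carries a full row image. The function name is hypothetical.

```python
def apply_during_instantiation(table, op_type, key, row=None):
    """Apply one trail operation to `table`, a dict keyed by the key column value."""
    if op_type in ("INSERT", "UPDATE"):
        table.pop(key, None)  # DELETE, whether or not the base row exists
        table[key] = row      # INSERT the row image from the operation
    elif op_type == "DELETE":
        table.pop(key, None)  # DELETE, whether or not the base row exists
    else:
        raise ValueError(f"unsupported operation type: {op_type!r}")
    return table


# Row 1 was already written by the initial load; rows 2 and 3 were not.
target = {1: {"id": 1, "name": "a"}}
apply_during_instantiation(target, "INSERT", 1, {"id": 1, "name": "a"})
apply_during_instantiation(target, "UPDATE", 2, {"id": 2, "name": "b"})
apply_during_instantiation(target, "DELETE", 3)
```

In this model, replaying operations that overlap with the initial load leaves one row per key and raises no error for an existing or missing row.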


9.2.24.6 Troubleshooting and Diagnostics

Oracle GoldenGate Replicat supports the Iceberg data types as per the version 2 specification.
Iceberg identifier (key) fields cannot be null. Therefore, the Replicat process will ABEND if the key column value is null.
Schema changes to the table, such as ADD/ALTER/DROP of columns, are not supported while the Replicat process is running.
To apply schema changes, quiesce the replication process, apply the changes, and then resume the replication process.

Note:
Contact Oracle Support for assistance with this process.

The Replicat process will ABEND if there are unmapped columns in the target table.

Replicat ABENDs with the following message:

ICEBERGEH-00060 Operation record at position '00000000030000003318' for the table 'hadoop.oggdb1.types_tab' has missing column values in an UPDATE. Replicat will ABEND. 
To override this behavior set 'gg.eventhandler.iceberg.abendOnMissingColumns=false' and restart the Replicat process. 
Setting this property to false will instruct Replicat to lookup missing columns from the target table and therefore may impact performance.

By default, the Iceberg Replicat process expects trail files in which UPDATE operations have no missing column values. To process compressed trail files, in which UPDATE operations have missing column values, set the property gg.eventhandler.iceberg.abendOnMissingColumns=false.
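
Conceptually, looking up missing columns means completing a sparse UPDATE with values from the current target row. The Python sketch below only illustrates that idea and is not how Replicat is implemented. The function name and the dictionary-based row representation are assumptions.

```python
def fill_missing_columns(update_values, target_row, column_names):
    """Build a full row image from a sparse UPDATE and the current target row.

    update_values: columns present in the trail record (name -> value)
    target_row:    current row read from the target table, or None if absent
    column_names:  all columns of the target table, in order
    """
    merged = {}
    for name in column_names:
        if name in update_values:
            merged[name] = update_values[name]  # value carried by the UPDATE
        elif target_row is not None and name in target_row:
            merged[name] = target_row[name]     # value looked up on the target
        else:
            raise KeyError(name)                # neither source has the column
    return merged
```

The extra read of the target row for each sparse UPDATE is the kind of cost that the ICEBERGEH-00060 message refers to when it mentions a performance impact.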

Replicat ABENDs with the following message:

ICEBERGEH-00057 Detected changes in the partition columns for the table 'hadoop.oggdb1.types_tab'. 
Partition columns in the previous run: '<column list>', partition columns in this run: '<column list>'. 
GoldenGate does not support changing partition columns. 
Alter the table manually to match the partition columns in the previous run and restart the replicat process.

The Iceberg Replicat process does not support changing partition columns.

Replicat ABENDs with the following message:

ICEBERGEH-00067 Invalid state. The column '<column_name>' in the target table '<table_name>' is not mapped. 
The following are the mapped columns: '<column list>'. Iceberg Replicat requires all the columns in the target table to be mapped. 
Please map the column ''<unmapped column>' and restart the Replicat process.

The Iceberg Replicat process requires all the columns in the target table to be mapped.

Replicat ABENDs with the following message:

ICEBERGEH-00068 Key column '<column name>' in the table '<table name>' is of type float or double. 
Iceberg does not support float or double type as identifier (key) fields. Initiating Replicat process shutdown. 
Please modify the table schema to exclude double/float types as key columns and restart the Replicat process.

As per the current Iceberg specification (version 2), the column types double and float cannot be used as identifier (key) columns.

Replicat ABENDs with the following message:

ICEBERGEH-00070 Table '<table_name>' contains a key column '<column_name>' of '<binary/fixed/uuid>' type that is not supported by GoldenGate. 
The following column types are not supported as key: 'binary, fixed, uuid'. To proceed, either use a supported Iceberg key column type by altering the 'KEYCOLS' clause in the Replicat 'MAP' statement as per the following example: 'MAP <sourceSchema>.<sourceTable>, TARGET <targetSchema>.<targetTable>, KEYCOLS("key1", "key2");' or alter the Iceberg target tables's identifier fields to exclude the key column types that are not supported by GoldenGate. 
You can use the following Iceberg SQL statement to alter the table schema: 'ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS key1, key2'.

The Iceberg types binary, fixed and uuid cannot be used as identifier (key) columns.

Replicat ABENDs with the following message:

ICEBERGEH-00071 Table '<table_name>' does not define an Iceberg identifier column. 
Identifier columns are used as key columns by GoldenGate. Initiating Replicat process shutdown. 
Please alter the Iceberg target tables's schema to add identifier columns. 
You can use the following Iceberg SQL statement to alter the table schema: 'ALTER TABLE prod.db.sample SET IDENTIFIER FIELDS key1, key2'.

The Iceberg target table should have identifier columns (key columns) in the schema.
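
The key-column rules behind ICEBERGEH-00068, ICEBERGEH-00070, and ICEBERGEH-00071 can be checked before starting Replicat. The following Python sketch assumes the table schema is already available as a plain mapping of column name to Iceberg type string (for example `"fixed(16)"`). How that mapping is obtained from the catalog is outside the sketch, and the function name is hypothetical.

```python
# Types that cannot be identifier (key) columns, per the messages above.
UNSUPPORTED_KEY_TYPES = {"binary", "fixed", "uuid", "float", "double"}


def check_identifier_fields(columns, identifier_fields):
    """Return a list of problems with a table's identifier (key) fields.

    columns:           dict of column name -> Iceberg type string
    identifier_fields: list of column names declared as identifier fields
    """
    problems = []
    if not identifier_fields:
        problems.append("table does not define an identifier column")
    for name in identifier_fields:
        col_type = columns.get(name)
        if col_type is None:
            problems.append(f"identifier field '{name}' is not a column of the table")
            continue
        # Drop parameters such as the length in "fixed(16)".
        base_type = col_type.split("(")[0].strip().lower()
        if base_type in UNSUPPORTED_KEY_TYPES:
            problems.append(f"key column '{name}' has unsupported type '{base_type}'")
    return problems
```

An empty list means none of the three conditions was detected for the given schema.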

Exceptions in the Replicat handler log file:
- com.google.cloud.storage.StorageException: 401 Unauthorized
- org.apache.iceberg.exceptions.RuntimeIOException: Failed to get file system for path
- org.apache.iceberg.exceptions.RuntimeIOException: Failed to create file
- org.apache.iceberg.exceptions.ForbiddenException: Forbidden
  These exceptions are commonly caused by incorrect configuration of the object storage authentication properties.
  
  Ensure that the following properties are set:
  - gg.eventhandler.iceberg.fileSystemScheme, gg.eventhandler.iceberg.proxyServer, gg.eventhandler.iceberg.proxyPort
  - gg.eventhandler.iceberg.awsAccessKeyId, gg.eventhandler.iceberg.awsSecretKey, gg.eventhandler.iceberg.awsS3Region
  - gg.eventhandler.iceberg.azureAccountKey
  - gg.eventhandler.iceberg.gcpProjectId, gg.eventhandler.iceberg.gcpServiceAccountJsonKeyFile.
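
A quick way to catch an unset authentication property is to scan the Replicat properties file before starting the process. The Python sketch below is a rough aid, not an Oracle utility. The grouping of properties by file system scheme is an assumption based on the property names listed above, and values that still look like `<placeholder>` text are treated as unset.

```python
PREFIX = "gg.eventhandler.iceberg."

# Assumed grouping of the authentication properties listed above by scheme.
AUTH_PROPERTIES = {
    "s3://": ["awsAccessKeyId", "awsSecretKey", "awsS3Region"],
    "s3a://": ["awsAccessKeyId", "awsSecretKey", "awsS3Region"],
    "abfss://": ["azureAccountKey"],
    "gs://": ["gcpProjectId", "gcpServiceAccountJsonKeyFile"],
}


def load_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_auth_properties(props):
    """Return the authentication properties that are unset for the configured scheme."""
    scheme = props.get(PREFIX + "fileSystemScheme", "file://")
    missing = []
    for name in AUTH_PROPERTIES.get(scheme, []):
        value = props.get(PREFIX + name, "")
        # Treat empty values and leftover <placeholder> text as unset.
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(PREFIX + name)
    return missing
```

Pass the contents of the properties file to `load_properties` and the result to `missing_auth_properties`. An empty list only means the properties are present, not that the credentials themselves are valid.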

