11 Using the S3 Event Handler
Learn how to use the S3 Event Handler, which provides the interface to Amazon S3 web services.
11.1 Overview
Amazon S3 is object storage hosted in the Amazon cloud; see https://aws.amazon.com/s3/. The purpose of the S3 Event Handler is to load data files generated by the File Writer Handler into Amazon S3.
You can use any format that the File Writer Handler supports; see Using the File Writer Handler.
11.2 Detailing Functionality
The S3 Event Handler requires the Amazon Web Services (AWS) Java SDK to transfer files to S3 object storage. Oracle GoldenGate for Big Data does not include the AWS Java SDK. AWS Java SDK 1.x versions are no longer supported; version 2.28.11 or higher is recommended. Download and install the AWS Java SDK from:
https://aws.amazon.com/sdk-for-java/
Then configure the gg.classpath variable to include the JAR files of the AWS Java SDK. The JAR files are divided into two directories, and both directories must be in gg.classpath, for example:
gg.classpath=/usr/var/aws_sdk_2.28.11/*:/usr/var/aws_sdk_2.28.11/third-party/lib/
- Configuring the Client ID and Secret
- About the AWS S3 Buckets
- Using Templated Strings
- Troubleshooting
11.2.1 Configuring the Client ID and Secret
A client ID and secret are the required credentials for the S3 Event Handler to interact with Amazon S3. A client ID and secret are generated using the Amazon AWS website. The retrieval of these credentials and presentation to the S3 server are performed on the client side by the AWS Java SDK. The AWS Java SDK provides multiple ways that the client ID and secret can be resolved at runtime.
The client ID and secret can be set as Java properties, on one line, in the Java Adapter properties file as follows:
javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar -Daws.accessKeyId=your_access_key -Daws.secretKey=your_secret_key
This sets environment variables using the Amazon Elastic Compute Cloud (Amazon EC2) AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variables on the local machine.
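As a hedged sketch of another option, Table 11-1 lists handler-level properties that set the same credentials explicitly. The handler name s3 and the placeholder values below are assumptions for illustration only. Per the table, these properties have no effect when gg.eventhandler.name.enableSTS is set to true, and if they are not set, credentials resolution falls back to the AWS default credentials provider chain.

```properties
# Illustrative only: "s3" is an assumed event handler name.
# Replace the placeholder values with your own credentials.
gg.eventhandler.s3.accessKeyId=your_access_key
gg.eventhandler.s3.secretKey=your_secret_key
```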
11.2.2 About the AWS S3 Buckets
AWS divides S3 storage into separate file systems called buckets. The S3 Event Handler can write to pre-created buckets. Alternatively, if the S3 bucket does not exist, the S3 Event Handler attempts to create the specified S3 bucket. AWS requires that S3 bucket names are lowercase. Amazon S3 bucket names must be globally unique. If you attempt to create an S3 bucket that already exists in any Amazon account, it causes the S3 Event Handler to abend.
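Because AWS requires lowercase bucket names, one way to keep a templated bucket name lowercase is the ${toLowerCase[]} keyword listed under Supported Templated Strings. The following is a minimal sketch, not a definitive configuration: the handler name s3 and the yourcompany-ogg- prefix are hypothetical, and nesting ${schemaName} inside the square brackets is an assumption about how the keyword is used.

```properties
# Illustrative only: "s3" is an assumed event handler name and
# "yourcompany-ogg-" is a hypothetical prefix chosen to help keep the name unique.
# Assumption: the keyword inside the square brackets is resolved first,
# then converted to lowercase.
gg.eventhandler.s3.bucketMappingTemplate=yourcompany-ogg-${toLowerCase[${schemaName}]}
```

The resolved name must still be globally unique and satisfy the other S3 bucket naming rules, so a static, pre-created bucket name may be the simpler choice.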
11.2.3 Using Templated Strings
Templated strings can contain a combination of string constants and keywords that are dynamically resolved at runtime. The S3 Event Handler makes extensive use of templated strings to generate the S3 directory names, data file names, and S3 bucket names. This gives you the flexibility to select where to write data files and the names of those data files.
Supported Templated Strings
Keyword | Description |
---|---|
${fullyQualifiedTableName} | The fully qualified source table name delimited by a period. |
${catalogName} | The individual source catalog name. |
${schemaName} | The individual source schema name. |
${tableName} | The individual source table name. |
${groupName} | The name of the Replicat process (with the thread number appended if you’re using coordinated apply). |
${emptyString} | Evaluates to an empty string. |
${operationCount} | The total count of operations in the data file. It must be used either on rename or by the event handlers or it will be zero. |
${insertCount} | The total count of insert operations in the data file. It must be used either on rename or by the event handlers or it will be zero. |
${updateCount} | The total count of update operations in the data file. It must be used either on rename or by the event handlers or it will be zero. |
${deleteCount} | The total count of delete operations in the data file. It must be used either on rename or by the event handlers or it will be zero. |
${truncateCount} | The total count of truncate operations in the data file. It must be used either on rename or by the event handlers or it will be zero. |
${currentTimestamp} | The current timestamp, written in a default date-time format whose syntax is defined in Java. |
${toUpperCase[]} | Converts the contents inside the square brackets to uppercase. |
${toLowerCase[]} | Converts the contents inside the square brackets to lowercase. |
Configuration of template strings can use a mix of keywords and static strings to assemble path and data file names at runtime.
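As a minimal sketch of mixing static strings with keywords, the following fragment combines a constant prefix with two keywords from the table above. The handler name s3, the bucket name, and the ogg/data prefix are assumptions chosen for illustration.

```properties
# Illustrative only: "s3" is an assumed event handler name.
# Static bucket name; it must be lowercase.
gg.eventhandler.s3.bucketMappingTemplate=yourbucketname
# Static prefix "ogg/data" followed by keywords that are resolved at runtime.
gg.eventhandler.s3.pathMappingTemplate=ogg/data/${schemaName}/${tableName}
```

For a hypothetical source table TCUSTMER in schema QASOURCE, the path would be expected to resolve to ogg/data/QASOURCE/TCUSTMER, so files for each table land under their own path.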
11.2.4 Troubleshooting
Connectivity Issues
If the S3 Event Handler is unable to connect to the S3 object storage when running on premises, it’s likely that your connectivity to the public internet is protected by a proxy server. Proxy servers act as a gateway between the private network of a company and the public internet. Contact your network administrator to get the URLs of your proxy server, and then configure the handler to use the proxy server.
Oracle GoldenGate can be used with a proxy server. Use the following parameters to enable the proxy server:
gg.eventhandler.name.proxyServer=
gg.eventhandler.name.proxyPort=80
Access to the proxy servers can be secured using credentials and the following configuration parameters:
gg.eventhandler.name.proxyUsername=username
gg.eventhandler.name.proxyPassword=password
Sample configuration:
gg.eventhandler.s3.type=s3
gg.eventhandler.s3.region=us-west-2
gg.eventhandler.s3.proxyServer=www-proxy.us.oracle.com
gg.eventhandler.s3.proxyPort=80
gg.eventhandler.s3.proxyProtocol=HTTP
gg.eventhandler.s3.bucketMappingTemplate=yourbucketname
gg.eventhandler.s3.pathMappingTemplate=thepath
gg.eventhandler.s3.finalizeAction=none
goldengate.userexit.writers=javawriter
11.3 Configuring the S3 Event Handler
You can configure the S3 Event Handler operation using the properties file. These properties are located in the Java Adapter properties file (not in the Replicat properties file).
To enable the selection of the S3 Event Handler, you must first configure the handler type by specifying gg.eventhandler.name.type=s3 and the other S3 Event properties as follows:
Table 11-1 S3 Event Handler Configuration Properties
| Properties | Required/ Optional | Legal Values | Default | Explanation |
|---|---|---|---|---|
| gg.eventhandler.name.type | Required | s3 | None | Selects the S3 Event Handler for use with Replicat. |
| gg.eventhandler.name.region | Required | The AWS region name that is hosting your S3 instance. | None | Setting the legal AWS region name is required. |
| gg.eventhandler.name.cannedACL | Optional | Accepts one of the following values: | None | Amazon S3 supports a set of predefined grants, known as canned Access Control Lists. Each canned ACL has a predefined set of grantees and permissions. For more information, see Managing access with ACLs. |
| gg.eventhandler.name.proxyServer | Optional | The host name of your proxy server. | None | Sets the host name of your proxy server if connectivity to AWS requires a proxy server. |
| gg.eventhandler.name.proxyPort | Optional | The port number of the proxy server. | None | Sets the port number of the proxy server if connectivity to AWS requires a proxy server. |
| gg.eventhandler.name.proxyUsername | Optional | The username of the proxy server. | None | Sets the user name of the proxy server if connectivity to AWS requires a proxy server and the proxy server requires credentials. |
| gg.eventhandler.name.proxyPassword | Optional | The password of the proxy server. | None | Sets the password for the user name of the proxy server if connectivity to AWS requires a proxy server and the proxy server requires credentials. |
| gg.eventhandler.name.bucketMappingTemplate | Required | A string with resolvable keywords and constants used to dynamically generate the S3 bucket name. | None | Use resolvable keywords and constants to dynamically generate the S3 bucket name at runtime. The handler attempts to create the S3 bucket if it does not exist. AWS requires bucket names to be all lowercase. A bucket name with uppercase characters results in a runtime exception. |
| gg.eventhandler.name.pathMappingTemplate | Required | A string with resolvable keywords and constants used to dynamically generate the path in the S3 bucket to write the file. | None | Use keywords interlaced with constants to dynamically generate unique S3 path names at runtime. |
| | Optional | A string with resolvable keywords and constants used to dynamically generate the S3 file name at runtime. | None | Use resolvable keywords and constants to dynamically generate the S3 data file name at runtime. If not set, the upstream file name is used. |
| | Optional | | None | Set to |
| | Optional | A unique string identifier cross referencing a child event handler. | No event handler configured. | Sets the event handler that is invoked on the file roll event. Event handlers can do file roll event actions like loading files to S3, converting to Parquet or ORC format, or loading files to HDFS. |
| | Optional (unless Dell ECS, then required) | A legal URL to connect to cloud storage. | None | Not required for Amazon AWS S3. Required for Dell ECS. Sets the URL to connect to cloud storage. |
| gg.eventhandler.name.proxyProtocol | Optional | | | Sets the protocol for the connection to the proxy server for an additional level of security. The client first performs an SSL handshake with the proxy server, and then an SSL handshake with Amazon AWS. This feature was added into the Amazon SDK in version 1.11.396 so you must use at least that version to use this property. |
| | Optional | | Empty | Set only if you are enabling S3 server side encryption. Use this parameter to set the algorithm for server side encryption in S3. |
| | Optional | A legal AWS key management system server side management key or the alias that represents that key. | Empty | Set only if you are enabling S3 server side encryption and the S3 algorithm is |
| gg.eventhandler.name.enableSTS | Optional | | | Set to |
| gg.eventhandler.name.STSAssumeRole | Optional | AWS user and role in the following format: {user arn}:role/{role name} | None | Set configuration if you want to assume a different user/role. Only valid with STS enabled. |
| gg.eventhandler.name.STSAssumeRoleSessionName | Optional | Any string. | AssumeRoleSession1 | The assumed role requires a session name for session logging. However, this can be any value. Only valid if both gg.eventhandler.name.enableSTS=true and gg.eventhandler.name.STSAssumeRole are configured. |
| gg.eventhandler.name.STSRegion | Optional | Any legal AWS region specifier. | The region is obtained from the | Use to resolve the region for the STS call. It's only valid if the |
| gg.eventhandler.name.enableBucketAdmin | Optional | | | Set to |
| gg.eventhandler.name.accessKeyId | Optional | A valid AWS access key. | None | Set this parameter to explicitly set the access key for AWS. This parameter has no effect if gg.eventhandler.name.enableSTS is set to true. If this property is not set, then the credentials resolution falls back to the AWS default credentials provider chain. |
| gg.eventhandler.name.secretKey | Optional | A valid AWS secret key. | None | Set this parameter to explicitly set the secret key for AWS. This parameter has no effect if gg.eventhandler.name.enableSTS is set to true. If this property is not set, then credentials resolution falls back to the AWS default credentials provider chain. |
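To show how the STS-related properties in Table 11-1 might fit together, here is a minimal sketch rather than a definitive configuration. The handler name s3, the region, the bucket and path values, and the account ID and role name in the role string are placeholders assumed for illustration; the role string follows the {user arn}:role/{role name} format given in the table.

```properties
# Illustrative only: "s3" is an assumed event handler name; the region,
# bucket, path, account ID, and role name are placeholders.
gg.eventhandler.s3.type=s3
gg.eventhandler.s3.region=us-west-2
gg.eventhandler.s3.bucketMappingTemplate=yourbucketname
gg.eventhandler.s3.pathMappingTemplate=thepath
# STS must be enabled for the assume-role properties to be valid.
gg.eventhandler.s3.enableSTS=true
gg.eventhandler.s3.STSAssumeRole=arn:aws:iam::123456789012:role/your_role_name
# Optional; any string is accepted (the value shown is the documented default).
gg.eventhandler.s3.STSAssumeRoleSessionName=AssumeRoleSession1
gg.eventhandler.s3.STSRegion=us-west-2
```

With enableSTS set to true, the accessKeyId and secretKey handler properties have no effect, so they are omitted from this sketch.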