This chapter describes how to work with Oracle GoldenGate in order to capture changes on source transactional systems and replicate them in a staging server for consumption by Oracle Data Integrator interfaces.
This chapter includes the following sections:
Oracle GoldenGate (OGG) product offers solutions that provide key business applications with continuous availability and real-time information. It provides guaranteed capture, routing, transformation and delivery across heterogeneous databases and environments in real-time.
Using the Oracle GoldenGate knowledge modules requires that you know and understand Oracle GoldenGate concepts and architecture. See the Oracle GoldenGate Documentation on OTN for more information:
http://www.oracle.com/technetwork/middleware/goldengate/overview/index.html
Oracle Data Integrator uses Oracle GoldenGate to replicate online data from a source database to a staging database. This staging database contains a copy of the source tables and the ODI Changed Data Capture (CDC) infrastructure, both loaded using Oracle GoldenGate.
The staging database can be stored in an Oracle or Teradata schema. The source database can be Oracle, Microsoft SQL Server, DB2 UDB, or Sybase ASE. In this chapter, <database> refers to any of these source database technologies.
Setting up CDC with GoldenGate is done using the following process:
A replica of the source tables is created in the staging database, using, for example, the Oracle Data Integrator Common Format Designer feature.
Oracle Data Integrator Changed Data Capture (CDC) is activated on these replicated tables using either the JKM <database> to Oracle Consistent (OGG) or the JKM <database> to Teradata Consistent (OGG). Starting the journals creates Oracle GoldenGate configuration files and sets up a CDC infrastructure in the staging database. Note that no active process is started for capturing source data at that stage.
Using the generated configuration files, an Oracle GoldenGate Extract process is configured and started to capture changes from the source database, and corresponding Replicat processes are configured and started to replicate these changes into the staging database. Changes are replicated into both the replicated source table and the CDC infrastructure. GoldenGate can optionally be configured to perform the initial load of the source data into the staging tables.
ODI interfaces can source from the replicated tables and use captured changes seamlessly within any ODI scenario.
Oracle Data Integrator provides the Knowledge Modules listed in Table 29-1 for replicating online data from a source to a staging database. Unlike other CDC JKMs, the Oracle GoldenGate JKMs journalize data in the staging Oracle or Teradata database and not in the source server.
The JKM <database> to Oracle Consistent (OGG) and the JKM <database> to Teradata Consistent (OGG) perform the same tasks:
Create and manage the ODI CDC framework infrastructure on the replicated tables
Generate the parameter files to setup the Oracle GoldenGate capture (Extract) and Delivery (Replicat) processes
Provide extra steps to check the configuration of the source database and proposes tips to correct the configuration
Generate a readme file explaining how to complete the setup
Table 29-1 Oracle GoldenGate Knowledge Modules
| Knowledge Module | Description | 
|---|---|
| JKM Oracle to Oracle Consistent (OGG) | Creates the infrastructure for consistent set journalizing on an Oracle staging server and generates the Oracle GoldenGate configuration for replicating data from an Oracle source to this staging server. | 
| JKM DB2 UDB to Oracle Consistent (OGG) | Creates the infrastructure for consistent set journalizing on an Oracle staging server and generates the Oracle GoldenGate configuration for replicating data from an IBM DB2 UDB source to this staging server. | 
| JKM Sybase ASE to Oracle Consistent (OGG) | Creates the infrastructure for consistent set journalizing on an Oracle staging server and generates the Oracle GoldenGate configuration for replicating data from a Sybase ASE source to this staging server. | 
| JKM MSSQL to Oracle Consistent (OGG) | Creates the infrastructure for consistent set journalizing on an Oracle staging server and generates the Oracle GoldenGate configuration for replicating data from a Microsoft SQL Server source to this staging server. | 
| JKM Oracle to Teradata Consistent (OGG) | Creates the infrastructure for consistent set journalizing on a Teradata staging server and generates the Oracle GoldenGate configuration for replicating data from an Oracle source to this staging server. | 
| JKM DB2 UDB to Teradata Consistent (OGG) | Creates the infrastructure for consistent set journalizing on a Teradata staging server and generates the Oracle GoldenGate configuration for replicating data from an IBM DB2 UDB source to this staging server. | 
| JKM Sybase ASE to Teradata Consistent (OGG) | Creates the infrastructure for consistent set journalizing on a Teradata staging server and generates the Oracle GoldenGate configuration for replicating data from a Sybase ASE source to this staging server. | 
| JKM MSSQL to Teradata Consistent (OGG) | Creates the infrastructure for consistent set journalizing on a Teradata staging server and generates the Oracle GoldenGate configuration for replicating data from a Microsoft SQL Server source to this staging server. | 
Make sure you have read the information in this section before you start using the Oracle GoldenGate Knowledge Modules:
Before performing any installation you should read the system requirements and certification documentation to ensure that your environment meets the minimum installation requirements for the products you are installing.
The list of supported platforms and versions is available on Oracle Technical Network (OTN):
http://www.oracle.com/technology/products/oracle-data-integrator/index.html.
See also the Oracle GoldenGate documentation on OTN for source and staging database version platform support.
In order to run the Extract and Replicat processes, Oracle GoldenGate must be installed on both the source and staging servers. Installing Oracle GoldenGate installs all of the components required to run and manage GoldenGate processes.
Oracle GoldenGate Manager Process must be running on each system before Extract or Replicat can be started, and must remain running during their execution for resource management.
Oracle GoldenGate has specific requirement and installation instructions that must be performed before starting the Extract and Replicat processes configured with the Oracle GoldenGate JKMs. See the Oracle GoldenGate Documentation on OTN for more information.
If the source database is Oracle, there are no connectivity requirements for using Oracle GoldenGate data in Oracle Data Integrator.
If the source database is IBM DB2 UDB, Microsoft SQL Server, or Sybase ASE, Oracle GoldenGate uses the ODBC driver to connect to the source database. You need to install the ODBC driver and to declare the data source in your system. You also need to set the data source name (DSN) in the KM option SRC_DSN.
To use the JKM <database> to Oracle Consistent (OGG) or the JKM <database> to Teradata Consistent (OGG) in your Oracle Data Integrator integration projects, you need to perform the following steps:
This step consists in declaring in Oracle Data Integrator the staging data server, the source data server, as well as the physical and logical schemas attached to these servers.
To define the topology in this configuration, perform the following tasks:
Create a data server for the Oracle or Teradata technology. For more information, see:
Create an Oracle or Teradata physical schema using the standard procedure, as described in "Creating a Physical Schema" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.
Note:
The physical schema defined in the staging server will contain in the data schema the changed records captured and replicated by the Oracle GoldenGate processes. The work schema will be used to store the ODI CDC infrastructure.
Create for this physical schema a logical schema using the standard procedure, as described in "Creating a Logical Schema" of the the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator and associate it in a given context.
You have to define a source data server from which Oracle GoldenGate will capture changes.
Create a data server for your source technology using the standard procedure. For more information, see the chapter corresponding to your source technology in this guide:
This data server represents the source database instance.
Create a physical schema under the data server that you have created in Section 29.3.1.3, "Define the Source Data Server". Use the standard procedure, as described in "Creating a Physical Schema" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.
Create for this physical schema a logical schema using the standard procedure, as described in "Creating a Logical Schema" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator and associate it in a given context.
Oracle GoldenGate will replicate in the staging server the records changed in the source. In order to perform this replication, the source table structures must be replicated in the staging server.
To replicate these source tables:
Create a new Data Model using the Oracle or Teradata technology. This model must use the logical schema created using the instructions in Section 29.3.1.2, "Create the Staging Physical Schema".
See "Creating a Model" in the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator for more information on model creation.
Note that you do not need to reverse-engineer this data model.
Create a new diagram for this model and add to this diagram the source tables that you want to replicate.
Generate the DDL Scripts and run these scripts for creating the tables in the staging data server.
An initial load of the source data can be made to replicate this data into the staging tables. You can perform this initial load with ODI using the Generate Interface IN feature of Common Format Designer. Alternately, you can use Oracle GoldenGate to perform this initial load, by setting the USE_OGG_FOR_INIT JKM option to Yes when you Configure CDC for the Replicated Tables.
Note:
See "Working with Common Format Designer" in the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator for more information on diagrams, generating DDL, and generating Interface IN features.
Setting up a project using Oracle GoldenGate features follows the standard procedure. See "Creating an Integration Project" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.
Depending on the technology of your source data server and staging server, import one of the following KMs into your project:
JKM Oracle to Oracle Consistent (OGG)
JKM DB2 UDB to Oracle Consistent (OGG)
JKM Sybase ASE to Oracle Consistent (OGG)
JKM MSSQL to Oracle Consistent (OGG)
JKM Oracle to Teradata Consistent (OGG)
JKM DB2 UDB to Teradata Consistent (OGG)
JKM Sybase ASE to Teradata Consistent (OGG)
JKM MSSQL to Teradata Consistent (OGG)
Changed Data Capture must be configured for the replicated tables. This configuration is similar to setting up consistent set journalizing and is performed using the following steps.
Edit the data model that contains the replicated tables. In the Journalizing tab of the data model, set the Journalizing Mode to Consistent Set and select the appropriate JKM <database> to Oracle Consistent (OGG) or JKM <database> to Teradata Consistent (OGG) .
Set the KM options as follows:
LOCAL_TEMP_DIR: Full path to a temporary folder into which the Oracle GoldenGate configuration files will be generated
SRC_OGG_OBJECT_GROUP: Name of the Oracle GoldenGate source object group.
Note:
This name should be unique across the entire information system.
This name must not be longer than 5 characters and must contain only upper case letters and/or digits.
When SRC_SETUP_OGG_PROCESSES is set to No, this option is required and the value of this option must match the value that was provided when setting up the capture on the other Model.
SRC_LSCHEMA: Name of the logical schema of the source model
SRC_DB_USER: Source schema or database user name
SRC_DB_PASSWORD: Source schema or database user password
SRC_OGG_PATH: Oracle GoldenGate installation path on the source server
SRC_DSN: Name of the data source. This KM option is required when the ODBC driver is used. Note that this option does not exist in the JKM Oracle to Oracle Consistent (OGG).
Note:
For Sybase users only: When defining the data source name, you have to add the database server name to the datasource name as follows:
DSN_name@SYBASE_DBSERVER
STG_OGG_OBJECT_GROUP: Name of the Oracle GoldenGate staging object group
Note:
This name should be unique across the entire information system.
This name must not be longer than 5 characters and must contain only upper case letters and/or digits.
STG_HOST_NAME: Name or IP address of the staging machine
STG_MANAGER_PORT: TCP port on which the Oracle GoldenGate Manager process is listening on the staging machine
STG_OGG_PATH: Oracle GoldenGate installation path on the staging server
USE_OGG_FOR_INIT: Generate the Oracle GoldenGate processes to perform the initial load of the replicated tables. If you have performed this initial load using Oracle Data Integrator while Creating the Replicated Tables, you can leave this option to NO.
Select the tables that you want to replicate or the model if want to replicate all tables, right-click then select Changed Data Capture > Add to CDC.
Select the model, right-click then select Changed Data Capture > Subscriber > Subscribe. Add subscribers for this model.
Select the model, right-click then select Changed Data Capture > Start Journal. The JKM creates the CDC infrastructure and generates the configuration for Oracle GoldenGate.
You can review the result of the journal startup action:
The Oracle GoldenGate configuration files, as well as a Readme.txt file are generated in the directory that is specified in the LOCAL_TEMP_DIR KM option. You can use these files to Configure and Start Oracle GoldenGate Processes.
The CDC infrastructure is set up correctly. The journalized datastores appear in the Models accordion with a Journalizing Active flag. You can right-click the model and select Changed Data Capture > Journal Data… to access the journalized data for these datastores.
See "Working with Changed Data Capture" in the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator for more conceptual information and detailed instructions on CDC.
Note:
Although this CDC configuration supports consistent set journalizing, it is not required to order datastores in the Journalized Table tab of the model after adding them to CDC.
The JKM generates in the LOCAL_TEMP_DIR a folder named after the source and target object groups. This folder contains the following:
The Readme.txt file that contains detailed instructions for configuring and starting the Oracle GoldenGate processes.
The src folder that contains configuration files to upload on the source server, in the Oracle GoldenGate installation directory.
The stg folder that contains configuration files to upload on the staging server, in the Oracle GoldenGate installation directory.
The detailed instructions, customized for your configuration, are provided in the readme file.
These instructions include:
Uploading or copying files from the src folder to the source server.
Uploading or copying files from the stg folder to the staging server.
Running on the source server the OBEY file generated by the JKM for starting the Extract process, using the ggsci command line.
Generating on the source server definition file using the defgen command line.
Copying this definition file to the staging server.
If the initial load option is used:
Running on the staging server the OBEY file generated by the JKM for the initial load, using the ggsci command line.
Running on the source server the OBEY file generated by the JKM for the initial load, using the ggsci command line.
Finally Running on the staging server the OBEY file generated by the JKM for the starting the Replicat processes, using the ggsci command line.
See the Oracle GoldenGate documentation on OTN for more information on OBEY files, the ggsci and defgen utilities.
You can use the data in the replicated data as a source in your integration interfaces. This process is similar to using a source datastore journalized in consistent set mode. See "Using Changed Data: Consistent Set Journalizing" in the the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator for more information.
This section includes the following advanced configuration topics:
The staging tables contain a replica of the structure and data from the source tables. The Oracle GoldenGate processes capture changes on the source tables and apply them to the target. Yet the staging tables must be initially loaded with the original content of the source tables. You can use the following methods to perform the initial load:
Using Oracle GoldenGate: A specific GoldenGate process loads the whole content of the source tables into the staging tables.
Using Oracle Data Integrator: The Generate Interfaces IN option of Oracle Data Integrator's Common Format Designer. This method uses ODI interfaces to transfer the data.
Using database backup/restore tools to copy data and structures.
The following KM options can be used to improve replication performances:
COMPATIBLE: This Oracle-specific option affects the use of the PURGE key word and the way statistics (using DBMS_STATS or ANALYZE) are collected. Set this value to the database version of your staging server.
NB_APPLY_PROCESS: Number of Oracle GoldenGate Apply processes created on the staging server.
TRAIL_FILE_SIZE: Size of the Oracle GoldenGate trail file in Megabytes.
For the NB_APPLY_PROCESS and TRAIL_FILE_SIZE parameters, see the Oracle GoldenGate Documentation on OTN for more information on performance tuning.
It is possible to set up a configuration where changes are captured on a single source and replicated to several staging servers. The example below illustrates how to set this up in a typical configuration.
Replication should source from source server SRC and replicate in both STG1 and STG2 staging servers.
Configure CDC for STG1 with the following configuration:
SRC_OGG_OBJECT_GROUP = SRC
SRC_SETUP_OGG_PROCESSES = YES
STG_OGG_OBJECT_GROUP = STG1
STG_SETUP_OGG_PROCESSES = YES
ENABLE_ODI_CDC= YES
Start the journal and follow the instructions in the readme to set up the Oracle GoldenGate processes in SRC and STG1.
Configure CDC for STG2 with the following configuration:
SRC_OGG_OBJECT_GROUP = SRC (Use the same name as for STG1)
SRC_SETUP_OGG_PROCESSES = NO (The processes have been set up with STG1)
STG_OGG_OBJECT_GROUP = STG2
STG_SETUP_OGG_PROCESSES = YES
ENABLE_ODI_CDC= YES
Start the journal and follow the instructions in the readme to set up the Oracle GoldenGate processes in SRC and STG2. Note that playing the configuration on SRC again will not recreate a capture process, trail files, or definition files. It will simply create a new Oracle GoldenGate Datapump process to push data to STG2.