Using Oracle GoldenGate to Replicate Data Between Distributed Databases and Non-Distributed Databases
You can use Oracle GoldenGate to migrate data from a non-distributed database to a distributed database. The migration is performed in two phases:
Extraction on source database
- All of the tables are captured by a single Extract on the source database.
Replication on target database
- Replication into sharded tables is performed on the shards, and replication into duplicated tables is performed on the shard catalog.
Oracle GoldenGate Replication Prerequisites
Make sure that your source and target databases and your Oracle GoldenGate environments meet the following prerequisites before attempting Oracle GoldenGate replication to a distributed database.
Assumptions
- The tables to be migrated from the non-distributed database to the distributed database have already been classified into sharded and duplicated tables.
- The sharding keys for all of the tables to be migrated to sharded tables have already been identified.
- Sharded and duplicated tables have been pre-created in the target distributed database.
- Oracle GoldenGate software is already installed on the source and target systems.
Source and Target Databases
- Oracle Database version: 19c (19.15.0.0.0) or later, in a Multitenant architecture
- Target database sharding type: System-managed
Oracle GoldenGate Configuration
- Oracle GoldenGate version: 19c Classic Architecture in a hub configuration
Replicating Data from a Non-Distributed Database to a Distributed Database
Example environment
The examples in the steps below use the following topology:
| System/Object | Source Environment | Target Environment |
|---|---|---|
| CDB Name | srccdb | sdbcdb |
| PDB Name | srcpdb | Shards: sdbpdb1, sdbpdb2, sdbpdb3; Shard catalog: scpdb |
| Application Schema | app_schema | app_schema |
| Sharded Tables | Customers, Orders, LineItems | Customers, Orders, LineItems |
| Duplicated Tables | Products | Products |
High-Level Steps
1) Create an Extract on the source database to capture transactions from the source tables, and start it.
2) Capture data from the source database for the initial load using `expdp` with `flashback_scn`.
3) Perform the initial load into the sharded tables on the target shards using `impdp`.
4) Perform the initial load into the duplicated tables on the target shard catalog using `impdp`.
5) Create the same number of Replicats as there are target shards to replicate the sharded tables.
6) Create one Replicat for the shard catalog to replicate the duplicated tables.
7) Start the Replicats on the target shards using `atcsn`.
8) Start the Replicat on the shard catalog using `atcsn`.
9) Validate the data replication from the source to the target tables.
Configure the source (non-distributed) database

Create an Extract on the source database to capture transactions from the source tables, and start it:

```
$ ggsci
GGSCI> dblogin useridalias ggadmin
GGSCI> add extract extnshd, integrated tranlog, begin now
GGSCI> register extract extnshd, database container (SRCPDB)
GGSCI> add exttrail ./dirdat/tr, extract extnshd
```

Add the following parameters to the Extract parameter file:

```
GGSCI> edit params extnshd

extract extnshd
useridalias ggadmin
TranlogOptions IntegratedParams (max_sga_size 256)
extTrail ./dirdat/tr
DiscardFile ./dirrpt/extnshd.dsc, Append Megabytes 50
REPORTCOUNT EVERY 2 HOURS, RATE
Table SRCPDB.app_schema.customers;
Table SRCPDB.app_schema.orders;
Table SRCPDB.app_schema.lineitems;
Table SRCPDB.app_schema.products;
```

Then start the Extract:

```
GGSCI> start extract extnshd
```
Capture data from the source database for the initial load using `expdp`:

```
SQL> select current_scn from v$database;

$ expdp app_schema/xxxxx@SRCPDB flashback_scn=<current_scn_from_previous_step> \
    directory=DATA_PUMP_DIR dumpfile=app_schema_exp.dmp logfile=app_schema_exp.log
```

Record the SCN returned by the query; it is supplied later when starting the Replicats with `atcsn`.
Configure the target distributed database

Perform the initial load on the target shards and the shard catalog using `impdp`.

Import into the shards:

```
$ impdp app_schema/xxxxx@SDBPDB1 directory=DATA_PUMP_DIR dumpfile=app_schema_exp.dmp \
    logfile=app_schema_imp.log tables=CUSTOMERS,ORDERS,LINEITEMS CONTENT=DATA_ONLY
$ impdp app_schema/xxxxx@SDBPDB2 directory=DATA_PUMP_DIR dumpfile=app_schema_exp.dmp \
    logfile=app_schema_imp.log tables=CUSTOMERS,ORDERS,LINEITEMS CONTENT=DATA_ONLY
$ impdp app_schema/xxxxx@SDBPDB3 directory=DATA_PUMP_DIR dumpfile=app_schema_exp.dmp \
    logfile=app_schema_imp.log tables=CUSTOMERS,ORDERS,LINEITEMS CONTENT=DATA_ONLY
```

Import into the shard catalog:

```
$ impdp app_schema/xxxxx@SCPDB directory=DATA_PUMP_DIR dumpfile=app_schema_exp.dmp \
    logfile=app_schema_imp.log tables=PRODUCTS CONTENT=DATA_ONLY
```
Create the Replicats (one per shard) on the target database.

Replicat for sharded tables on shard 1:

```
GGSCI> dblogin useridalias ggadmin_shd1
GGSCI> add replicat repshd1, INTEGRATED, exttrail ./dirdat/tr CHECKPOINTTABLE ggadmin.GGCHKPT
```

Add the following parameters to the Replicat parameter file for shard 1:

```
GGSCI> edit params repshd1

replicat repshd1
useridalias ggadmin_shd1
HANDLECOLLISIONS
SOURCECATALOG SRCPDB
MAP SRCPDB.APP_SCHEMA.CUSTOMERS, target APP_SCHEMA.CUSTOMERS, &
SQLEXEC (ID chunklookup1, QUERY 'select count(*) count FROM gsmadmin_internal.chunks WHERE ora_hash(:CODE_IN_PARAM) >= low_key and ora_hash(:CODE_IN_PARAM) < high_key', &
PARAMS (CODE_IN_PARAM = CUSTID), BEFOREFILTER), &
FILTER (chunklookup1.COUNT = 1);

MAP SRCPDB.APP_SCHEMA.ORDERS, target APP_SCHEMA.ORDERS, &
SQLEXEC (ID chunklookup2, QUERY 'select count(*) count FROM gsmadmin_internal.chunks WHERE ora_hash(:CODE_IN_PARAM) >= low_key and ora_hash(:CODE_IN_PARAM) < high_key', &
PARAMS (CODE_IN_PARAM = CUSTID), BEFOREFILTER), &
FILTER (chunklookup2.COUNT = 1);

MAP SRCPDB.APP_SCHEMA.LINEITEMS, target APP_SCHEMA.LINEITEMS, &
SQLEXEC (ID chunklookup3, QUERY 'select count(*) count FROM gsmadmin_internal.chunks WHERE ora_hash(:CODE_IN_PARAM) >= low_key and ora_hash(:CODE_IN_PARAM) < high_key', &
PARAMS (CODE_IN_PARAM = CUSTID), BEFOREFILTER), &
FILTER (chunklookup3.COUNT = 1);
```

The Replicats for shards 2 and 3 are identical except for the group name and login alias. For shard 2:

```
GGSCI> dblogin useridalias ggadmin_shd2
GGSCI> add replicat repshd2, INTEGRATED, exttrail ./dirdat/tr CHECKPOINTTABLE ggadmin.GGCHKPT
GGSCI> edit params repshd2
```

In the repshd2 parameter file, use `replicat repshd2` and `useridalias ggadmin_shd2`; the remaining parameters are the same as for shard 1. Likewise for shard 3 (`repshd3`, `ggadmin_shd3`):

```
GGSCI> dblogin useridalias ggadmin_shd3
GGSCI> add replicat repshd3, INTEGRATED, exttrail ./dirdat/tr CHECKPOINTTABLE ggadmin.GGCHKPT
GGSCI> edit params repshd3
```

Note:
1. Remove the HANDLECOLLISIONS parameter and restart the Replicats after the deltas have been applied on the target shards.
2. If the sharding key column is a NUMBER data type, wrap the bind variable in TO_NUMBER inside the ora_hash call in each SQLEXEC filter:

```
SQLEXEC (ID chunklookup, QUERY 'select count(*) count FROM gsmadmin_internal.chunks WHERE ora_hash(to_number(:CODE_IN_PARAM)) >= low_key and ora_hash(to_number(:CODE_IN_PARAM)) < high_key', &
```

Replicat for duplicated tables on the shard catalog:

```
GGSCI> dblogin useridalias ggadmin_cat
GGSCI> add replicat repcat, INTEGRATED, exttrail ./dirdat/tr CHECKPOINTTABLE ggadmin.GGCHKPT
```

Add the following parameters to the Replicat parameter file for the shard catalog:

```
GGSCI> edit params repcat

replicat repcat
useridalias ggadmin_cat
HANDLECOLLISIONS
SOURCECATALOG SRCPDB
MAP SRCPDB.APP_SCHEMA.PRODUCTS, target APP_SCHEMA.PRODUCTS;
```
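The chunk-lookup FILTER is what lets every shard Replicat read the same trail: a Replicat applies a row only when the hash of its sharding key falls inside a chunk range owned by its shard, so each row is applied by exactly one Replicat. A minimal Python sketch of the idea, with an invented hash function and invented chunk ranges standing in for Oracle's ora_hash and the gsmadmin_internal.chunks table:

```python
# Sketch: how the per-shard SQLEXEC/FILTER routes each row to exactly one
# Replicat. fake_ora_hash and the CHUNKS ranges below are invented stand-ins
# for ora_hash and gsmadmin_internal.chunks, purely for illustration.
import zlib

HASH_BUCKETS = 12  # pretend key space: hash values 0..11

# each shard owns a list of [low_key, high_key) chunk ranges
CHUNKS = {
    "shard1": [(0, 4)],
    "shard2": [(4, 8)],
    "shard3": [(8, 12)],
}

def fake_ora_hash(key):
    """Deterministic hash of the sharding key (stand-in for ora_hash)."""
    return zlib.crc32(str(key).encode()) % HASH_BUCKETS

def passes_filter(shard, cust_id):
    """True when this shard's chunk lookup would return COUNT = 1,
    i.e. the row belongs to a chunk stored on this shard."""
    h = fake_ora_hash(cust_id)
    return any(low <= h < high for low, high in CHUNKS[shard])

# every customer id passes the filter on exactly one shard
for cust_id in range(1, 10):
    owners = [s for s in CHUNKS if passes_filter(s, cust_id)]
    assert len(owners) == 1
    print(cust_id, "->", owners[0])
```

Because the chunk ranges partition the whole hash space, the three filters are mutually exclusive and collectively exhaustive, which is why running all shard Replicats against one trail neither drops nor duplicates rows.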
Start the Replicats on the target shards and the shard catalog using `atcsn`, supplying the SCN captured on the source before the export:

```
GGSCI> start replicat repshd1, atcsn <SCN captured on source>
GGSCI> start replicat repshd2, atcsn <SCN captured on source>
GGSCI> start replicat repshd3, atcsn <SCN captured on source>
GGSCI> start replicat repcat, atcsn <SCN captured on source>

GGSCI> info all

Program     Status      Group       Lag at Chkpt    Time Since Chkpt
MANAGER     RUNNING
EXTRACT     RUNNING     EXTNSHD     00:00:00        00:00:05
REPLICAT    RUNNING     REPCAT      00:00:00        00:00:00
REPLICAT    RUNNING     REPSHD1     00:00:00        00:00:03
REPLICAT    RUNNING     REPSHD2     00:00:00        00:00:06
REPLICAT    RUNNING     REPSHD3     00:09:01        00:00:09
```
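Using the same SCN for the `expdp` snapshot and the `atcsn` start point is what makes the cutover lossless: the export contains everything committed at or before the SCN, the Replicat applies transactions from that SCN onward, and HANDLECOLLISIONS absorbs the small overlap at the boundary. A Python sketch of the accounting, with invented SCN values:

```python
# Sketch: why "expdp flashback_scn=<SCN>" plus "start replicat ... atcsn <SCN>"
# covers every committed change exactly once. SCNs and keys are invented.

EXPORT_SCN = 1000  # SCN captured from v$database before running expdp

# (commit_scn, key) pairs in commit order on the source
txns = [(990, "A"), (995, "B"), (1000, "C"), (1005, "D"), (1010, "E")]

# expdp flashback_scn: snapshot of all commits at or below the SCN
target = {key for scn, key in txns if scn <= EXPORT_SCN}

# atcsn starts at the transaction with that SCN (inclusive), so the boundary
# commit "C" arrives twice; HANDLECOLLISIONS turns the duplicate insert into
# an overwrite instead of an abend
collisions = []
for scn, key in txns:
    if scn >= EXPORT_SCN:          # what the Replicat applies
        if key in target:
            collisions.append(key)  # duplicate absorbed by HANDLECOLLISIONS
        target.add(key)

assert target == {key for _, key in txns}  # no gaps, no lost changes
print("collisions absorbed:", collisions)  # ['C']
```

This is also why the note above says to remove HANDLECOLLISIONS once the deltas have been applied: it is only needed to cover the overlap window around the export SCN.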
Validate the data replication from the source to the target tables.

To validate that rows are replicated from the non-sharded tables to the shards, compare row counts. For example, if a source table has 9000 rows and there are three target shards, about 3000 rows should be distributed to each shard, because system-managed sharding spreads rows evenly across shards by hashing the sharding key.
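The check can be scripted by comparing the source row count with the sum of the per-shard counts. The counts below are illustrative stand-ins for the results of running `select count(*)` against the source PDB and each shard PDB:

```python
# Sketch: validate that shard row counts add up to the source row count and
# are roughly even. The numbers are hypothetical query results, not real data.
source_count = 9000
shard_counts = {"sdbpdb1": 3012, "sdbpdb2": 2981, "sdbpdb3": 3007}

total = sum(shard_counts.values())
assert total == source_count, f"row count mismatch: {total} != {source_count}"

# system-managed sharding hashes rows evenly across shards, so expect each
# shard to hold about 1/N of the rows; flag anything more than 10% off
expected = source_count / len(shard_counts)
for shard, count in shard_counts.items():
    skew = abs(count - expected) / expected
    print(f"{shard}: {count} rows ({skew:.1%} from even split)")
    assert skew < 0.10, f"{shard} is unexpectedly skewed"
```

A count that is exactly right in total but heavily skewed toward one shard usually points at a chunk-filter problem (for example, a NUMBER sharding key without the TO_NUMBER wrapper described in the note above).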