8.2.23 MongoDB

Learn how to use the MongoDB Handler, which can replicate transactional data from Oracle GoldenGate to a target MongoDB database and to Oracle Autonomous databases (AJD and ATP).

8.2.23.1 Overview

The MongoDB Handler can be used to replicate data from relational databases as well as document-based databases, such as MongoDB or Cassandra, to the target databases listed in Supported Target Types, using the MongoDB wire protocol.

8.2.23.2 MongoDB Wire Protocol

The MongoDB Wire Protocol is a simple socket-based, request-response style protocol. Clients communicate with the database server through a regular TCP/IP socket, see https://docs.mongodb.com/manual/reference/mongodb-wire-protocol/.

8.2.23.3 Supported Target Types

The MongoDB Handler supports the following target types:

  • MongoDB

  • Oracle Autonomous JSON Database (AJD)

  • Oracle Autonomous Transaction Processing (ATP)

8.2.23.4 Detailed Functionality

The MongoDB Handler takes operations from the source trail file and creates corresponding documents in the target MongoDB or Autonomous databases (AJD and ATP).

A record in MongoDB is a Binary JSON (BSON) document, which is a data structure composed of field and value pairs. A BSON data structure is a binary representation of JSON documents. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.
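For example, a minimal illustrative document (the field names and values are invented for this sketch) containing an embedded document and an array:

{
  "_id": "1001",
  "name": "Jane Doe",
  "address": { "city": "Bangalore", "country": "IN" },
  "phones": [ "555-0100", "555-0101" ]
}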

A collection is a grouping of MongoDB or AJD/ATP documents and is the equivalent of an RDBMS table. In MongoDB or AJD/ATP databases, a collection holds a set of documents. Collections do not enforce a schema, so documents within a collection can have different fields.

8.2.23.4.1 Document Key Column

MongoDB or AJD/ATP databases require every document (row) to have a column named _id whose value should be unique in a collection (table). This is similar to a primary key for RDBMS tables. If a document does not contain a top-level _id column during an insert, the MongoDB driver adds this column.

The MongoDB Handler builds custom _id field values for every document based on the primary key column values in the trail record. This custom _id is built using all the key column values concatenated by a : (colon) separator. For example:

KeyColValue1:KeyColValue2:KeyColValue3

The MongoDB Handler enforces uniqueness based on these custom _id values. This means that every record in the trail must be unique based on the primary key column values. The existence of non-unique records for the same table results in a MongoDB Handler failure and in Replicat abending with a duplicate key error.
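For example, a hypothetical source row with key columns CUST_ID and ORDER_ID (the names and values are illustrative) would be replicated as a document similar to:

{
  "_id": "1001:ORD-42",
  "CUST_ID": "1001",
  "ORDER_ID": "ORD-42",
  "AMOUNT": "99.50"
}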

The behavior of the _id field is:

  • By default, MongoDB creates a unique index on the column during the creation of a collection.

  • It is always the first column in a document.

  • It may contain values of any BSON data type except an array.

8.2.23.4.2 Primary Key Update Operation

MongoDB or AJD/ATP databases do not allow the _id column to be modified. This means a primary key update operation record in the trail needs special handling. The MongoDB Handler converts a primary key update operation into a combination of a DELETE (with the old key) and an INSERT (with the new key). To perform the INSERT, a complete before image of the update operation in the trail is recommended. You can generate trails that populate a complete before image for update operations by enabling the Oracle GoldenGate GETUPDATEBEFORES and NOCOMPRESSUPDATES parameters, see Reference for Oracle GoldenGate.
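A minimal Extract parameter file sketch that enables a complete before image for updates (the group name, alias, trail, and table names are illustrative):

EXTRACT exta
USERIDALIAS src_alias
GETUPDATEBEFORES
NOCOMPRESSUPDATES
EXTTRAIL ./dirdat/aa
TABLE schema1.*;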

8.2.23.4.3 MongoDB Trail Data Types

The MongoDB Handler supports delivery to the BSON data types as follows:

  • 32-bit integer

  • 64-bit integer

  • Double

  • Date

  • String

  • Binary data

8.2.23.5 Setting Up and Running the MongoDB Handler

The following topics provide instructions for configuring the MongoDB Handler components and running the handler.

8.2.23.5.1 Classpath Configuration

The MongoDB Java Driver is required for Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) to connect and stream data to MongoDB. If the GG for DAA version is 21.7.0.0.0 or below, use the 3.x driver (MongoDB Java Driver 3.12.8). If the GG for DAA version is 21.8.0.0.0 or above, use MongoDB Java Driver 4.6.0. The MongoDB Java Driver is not included in the GG for DAA product; you must download it from the Maven central repository (mongo-java-driver).

Select mongo-java-driver and the version to download the recommended driver JAR file.

You must configure the gg.classpath variable to load the MongoDB Java Driver JAR at runtime. For example: gg.classpath=/home/mongodb/mongo-java-driver-3.12.8.jar.

GG for DAA supports the MongoDB Decimal128 data type that was added in MongoDB 3.4. Use of a MongoDB Java Driver earlier than 3.12.8 results in a ClassNotFoundException.
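When multiple driver JARs are required (as with the 4.6.0 driver and its dependent libraries listed in MongoDB Handler Client Dependencies), a sketch using the wildcard form that gg.classpath accepts, assuming the JARs were downloaded to /opt/mongodb-driver (an illustrative path):

gg.classpath=/opt/mongodb-driver/*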

8.2.23.5.2 MongoDB Handler Configuration

You configure the MongoDB Handler operation using the properties file. These properties are located in the Java Adapter properties file (not in the Replicat properties file).

To enable the selection of the MongoDB Handler, you must first configure the handler type by specifying gg.handler.name.type=mongodb and the other MongoDB properties as follows:

Table 8-31 MongoDB Handler Configuration Properties

gg.handler.name.type
Required. Legal values: mongodb. Default: none.
Selects the MongoDB Handler for use with Replicat.

gg.handler.name.bulkWrite
Optional. Legal values: true | false. Default: true.
When set to true, the handler caches operations until a transaction commit event is received. On commit, all of the cached operations are written to the target MongoDB, AJD, or ATP database in a single batch, which provides improved throughput. When set to false, there is no caching within the handler, and operations are immediately written to the MongoDB, AJD, or ATP database.

gg.handler.name.WriteConcern
Optional. Legal values: {"w": "value", "wtimeout": "number"}. Default: none.
Sets the required write concern for all of the operations performed by the MongoDB Handler. The property value is in JSON format and accepts only the keys w and wtimeout, see https://docs.mongodb.com/manual/reference/write-concern/.

gg.handler.name.clientURI
Optional. Legal values: a valid MongoDB client URI. Default: none.
Sets the MongoDB client URI. A client URI can also be used to set other MongoDB connection properties, such as authentication and write concern.

gg.handler.name.CheckMaxRowSizeLimit
Optional. Legal values: true | false. Default: false.
When set to true, the handler verifies that the size of the BSON document being inserted or modified is within the limit defined by the MongoDB database. Calculating the size involves using a default codec to generate a RawBsonDocument, leading to a small degradation in the throughput of the MongoDB Handler. If the size of the document exceeds the MongoDB limit, an exception occurs and Replicat abends.

gg.handler.name.upsert
Optional. Legal values: true | false. Default: false.
When set to true, a new MongoDB document is inserted if there is no match for the query filter when performing an UPDATE operation.

gg.handler.name.enableDecimal128
Optional. Legal values: true | false. Default: true.
MongoDB version 3.4 added support for a 128-bit decimal data type called Decimal128. This data type is needed because Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) supports both integer and decimal data types that do not fit into a 64-bit Long or Double. Setting this property to true enables mapping into the Decimal128 data type for source data types that require it. Set to false to process these source data types as 64-bit Doubles.

gg.handler.name.enableTransactions
Optional. Legal values: true | false. Default: false.
Set to true to enable transactional processing in MongoDB 4.0 and higher.

Note:

MongoDB added support for transactions in MongoDB version 4.0. Additionally, the minimum version of the MongoDB client driver is 3.10.1.
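The following is a sketch of a handler configured with several of these optional properties; the values are illustrative, not recommendations:

gg.handlerlist=mongodb
gg.handler.mongodb.type=mongodb
gg.handler.mongodb.clientURI=mongodb://localhost:27017/
gg.handler.mongodb.bulkWrite=true
gg.handler.mongodb.WriteConcern={"w": "majority", "wtimeout": "5000"}
gg.handler.mongodb.upsert=true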

8.2.23.5.3 Using Bulk Write

Bulk write is enabled by default. For better throughput, Oracle recommends that you use bulk write.

You can also enable or disable bulk write by using the bulkWrite handler property: gg.handler.name.bulkWrite=true | false. The MongoDB Handler does not use the gg.handler.name.mode=op | tx property that is used by other Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) handlers.

With bulk write, the MongoDB Handler uses the GROUPTRANSOPS parameter to retrieve the batch size. The handler converts a batch of trail records to MongoDB documents, which are then written to the database in one request.
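For example, a sketch pairing the Replicat batch size with bulk write (the Replicat group name, properties file path, and GROUPTRANSOPS value are illustrative):

-- Replicat parameter file
REPLICAT rmongo
TARGETDB LIBFILE libggjava.so SET property=dirprm/mongodb.props
GROUPTRANSOPS 1000
MAP schema1.*, TARGET schema1.*;

# Java Adapter properties file
gg.handler.mongodb.bulkWrite=true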

8.2.23.5.4 Using Write Concern

Write concern describes the level of acknowledgement that is requested from MongoDB for write operations to a standalone MongoDB instance, replica sets, or sharded clusters. In sharded clusters, mongos instances pass the write concern on to the shards, see https://docs.mongodb.com/manual/reference/write-concern/.

Use the following configuration:

w: value
wtimeout: number
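For example, a sketch that requests acknowledgement from a majority of replica set members with a five-second timeout (the values are illustrative):

gg.handler.mongodb.WriteConcern={"w": "majority", "wtimeout": "5000"}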

8.2.23.5.5 Using Three-Part Table Names

An Oracle GoldenGate trail may have data for sources that support three-part table names, such as Catalog.Schema.Table. MongoDB only supports two-part names, such as DBName.Collection. To support the mapping of source three-part names to MongoDB two-part names, the source catalog and schema are concatenated with an underscore delimiter to construct the MongoDB database name.

For example, catalog1.schema1.table1 would become catalog1_schema1.table1.

8.2.23.5.6 Using Undo Handling

The MongoDB Handler can recover from bulk write errors using a lightweight undo engine. This engine does not work like typical RDBMS undo engines; rather, it makes a best effort to assist you in error recovery. Error recovery works well when there are primary key violations or any other bulk write error for which the MongoDB database provides information about the point of failure through a BulkWriteException.

Table 8-32 lists the requirements to make the best use of this functionality.

Table 8-32 Undo Handling Requirements

Operation to Undo    Requires Full Before Image in the Trail?

INSERT               No

DELETE               Yes

UPDATE               No (only the before image of fields in the SET clause)

If there are errors during undo operations, it may not be possible to get the MongoDB collections to a consistent state. In this case, you must manually reconcile the data.

8.2.23.6 Security and Authentication

The MongoDB Handler uses the Oracle GoldenGate credential store to manage user IDs and their encrypted passwords (together known as credentials) that are used by Oracle GoldenGate processes to interact with the MongoDB database. The credential store eliminates the need to specify user names and clear-text passwords in the Oracle GoldenGate parameter files.

An optional alias can be used in the parameter file instead of the user ID to map to a user ID and password pair in the credential store.

In Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA), you specify the alias and domain in the property file and not the actual user ID or password. User credentials are maintained in secure wallet storage.

To add the credential store and DBLOGIN, run the following commands in the Admin Client:

adminclient> add credentialstore
adminclient> alter credentialstore add user <userid> password <pwd> alias mongo

Example value of <userid>:

mongodb://myUserAdmin@localhost:27017/admin?replicaSet=rs0

adminclient> dblogin useridalias mongo

To test DBLOGIN, run the following command:

adminclient> list tables tcust*

After the credentials are successfully added to the credential store, add the alias in the Extract parameter file.

Example:
SOURCEDB USERIDALIAS mongo
The MongoDB Handler uses a connection URI to connect to a MongoDB deployment. Authentication and security options are passed as a query string as part of the connection URI. See SSL Configuration Setup to configure SSL.

To specify access control, use the user ID:

mongodb://<user>@<hostname1>:<port>,<hostname2>:<port>,<hostname3>:<port>/?replicaSet=<replicaSetName>

To specify TLS/SSL:

Using the connection string prefix mongodb+srv (instead of mongodb://) automatically sets the tls option to true.

mongodb+srv://server.example.com/

To disable TLS, add tls=false in the query string:

mongodb://<user>@<hostname1>:<port>/?replicaSet=<replicaSetName>&tls=false

To specify authentication:

authSource:

mongodb://<user>@<hostname1>:<port>,<hostname2>:<port>,<hostname3>:<port>/?replicaSet=<replicaSetName>&authSource=admin

authMechanism:

mongodb://<user>@<hostname1>:<port>,<hostname2>:<port>,<hostname3>:<port>/?replicaSet=<replicaSetName>&authSource=admin&authMechanism=GSSAPI

For more information about security and authentication using the connection URI, see the MongoDB documentation.

8.2.23.6.1 SSL Configuration Setup

To configure SSL between the MongoDB instance and Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) MongoDB Handler, do the following:

Create the certificate authority (CA):

openssl req -passout pass:password -new -x509 -days 3650 -extensions v3_ca \
  -keyout ca_private.pem -out ca.pem \
  -subj "/CN=CA/OU=GOLDENGATE/O=ORACLE/L=BANGALORE/ST=KA/C=IN"

Create keys and certificate signing requests (CSRs) for the client and all server nodes:

openssl req -newkey rsa:4096 -nodes -out client.csr -keyout client.key \
  -subj '/CN=certName/OU=OGGBDCLIENT/O=ORACLE/L=BANGALORE/ST=AP/C=IN'
openssl req -newkey rsa:4096 -nodes -out server.csr -keyout server.key \
  -subj '/CN=slc13auo.us.oracle.com/OU=GOLDENGATE/O=ORACLE/L=BANGALORE/ST=TN/C=IN'

Sign the certificate signing requests with the CA:

openssl x509 -passin pass:password -sha256 -req -days 365 -in client.csr \
  -CA ca.pem -CAkey ca_private.pem -CAcreateserial -out client-signed.crt

openssl x509 -passin pass:password -sha256 -req -days 365 -in server.csr \
  -CA ca.pem -CAkey ca_private.pem -CAcreateserial -out server-signed.crt \
  -extensions v3_req -extfile <(cat << EOF
[ v3_req ]
subjectAltName = @alt_names
[ alt_names ]
DNS.1 = 127.0.0.1
DNS.2 = localhost
DNS.3 = hostname
EOF
)
Create the privacy enhanced mail (PEM) files for mongod:
cat client-signed.crt client.key > client.pem
cat server-signed.crt server.key > server.pem

Create trust store and keystore

openssl pkcs12 -export -out server.pkcs12 -in server.pem
openssl pkcs12 -export -out client.pkcs12 -in client.pem

bash-4.2$ ls
ca.pem  ca_private.pem     client.csr  client.pem     server-signed.crt  server.key  server.pkcs12
ca.srl  client-signed.crt  client.key  client.pkcs12  server.csr         server.pem

Start instances of mongod with the following options:

--tlsMode requireTLS --tlsCertificateKeyFile ../opensslKeys/server.pem --tlsCAFile ../opensslKeys/ca.pem

Credential store connection string:

alter credentialstore add user mongodb://myUserAdmin@localhost:27017/admin?ssl=true&tlsCertificateKeyFile=../mcopensslkeys/client.pem&tlsCertificateKeyFilePassword=password&tlsCAFile=../mcopensslkeys/ca.pem password root alias mongo

Note:

The length of the connection string must not exceed 256 characters.

For CDC Extract, add the key store and trust store as part of the JVM options.

JVM options

-Xms512m -Xmx4024m -Xss32m
-Djavax.net.ssl.trustStore=../mcopensslkeys/server.pkcs12
-Djavax.net.ssl.trustStorePassword=password
-Djavax.net.ssl.keyStore=../mcopensslkeys/client.pkcs12
-Djavax.net.ssl.keyStorePassword=password
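For the Replicat side, a comparable sketch (the paths and passwords are illustrative) appends the same trust store and key store settings to javawriter.bootoptions in the Java Adapter properties file:

javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=.:ggjava/ggjava.jar:./dirprm -Djavax.net.ssl.trustStore=../mcopensslkeys/server.pkcs12 -Djavax.net.ssl.trustStorePassword=password -Djavax.net.ssl.keyStore=../mcopensslkeys/client.pkcs12 -Djavax.net.ssl.keyStorePassword=password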

8.2.23.7 Reviewing Sample Configurations

Basic Configuration

The following is a sample configuration for the MongoDB Handler from the Java adapter properties file:

gg.handlerlist=mongodb
gg.handler.mongodb.type=mongodb

#The following handler properties are optional.
#Refer to the Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) documentation
#for details about the configuration.
#gg.handler.mongodb.clientURI=mongodb://localhost:27017/
#gg.handler.mongodb.WriteConcern={w:value, wtimeout: number }
#gg.handler.mongodb.BulkWrite=false
#gg.handler.mongodb.CheckMaxRowSizeLimit=true

goldengate.userexit.timestamp=utc
goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE
gg.log=log4j
gg.log.level=INFO
gg.report.time=30sec

#Path to MongoDB Java driver.
# maven co-ordinates
# <dependency>
# <groupId>org.mongodb</groupId>
# <artifactId>mongo-java-driver</artifactId>
# <version>3.10.1</version>
# </dependency>
gg.classpath=/path/to/mongodb/java/driver/mongo-java-driver-3.10.1.jar
javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=.:ggjava/ggjava.jar:./dirprm

Oracle or MongoDB Database Source to MongoDB, AJD, and ATP Target

You can map an Oracle or MongoDB Database source table name in uppercase to a table in MongoDB that is in lowercase. This applies to both table names and schemas. There are two methods that you can use:

Create a Data Pump

You can create a data pump before the Replicat, which translates names to lowercase. Then you configure a MongoDB Replicat to use the output from the pump:

extract pmp 
exttrail ./dirdat/le 
map RAMOWER.EKKN, target "ram"."ekkn"; 
Convert When Replicating

You can convert table column names to lowercase when replicating to the MongoDB table by adding this parameter to your MongoDB properties file:

gg.schema.normalize=lowercase

8.2.23.8 MongoDB to AJD/ATP Migration

8.2.23.8.1 Overview

Oracle Autonomous JSON Database (AJD) and Oracle Autonomous Database for transaction processing (ATP) also connect using the MongoDB wire protocol, which provides the same MongoDB CRUD APIs.

8.2.23.8.2 Configuring MongoDB handler to Write to AJD/ATP

The basic configuration remains the same, including the optional properties described in this chapter.

The handler uses the same protocol (the MongoDB wire protocol) and the same driver JAR for Autonomous databases as it does for MongoDB, performing all operations in a target-agnostic manner. The properties can be used for any of the supported targets.

The following is a sample configuration for the MongoDB Handler for AJD/ATP from the Java adapter properties file:
gg.handlerlist=mongodb
gg.handler.mongodb.type=mongodb
#URL mentioned below should be an AJD instance URL
gg.handler.mongodb.clientURI=mongodb://[username]:[password]@[url]?authSource=$external&authMechanism=PLAIN&ssl=true
#Path to MongoDB Java driver. Maven co-ordinates
# <dependency>
# <groupId>org.mongodb</groupId>
# <artifactId>mongo-java-driver</artifactId>
# <version>3.10.1</version>
# </dependency>
gg.classpath=/path/to/mongodb/java/driver/mongo-java-driver-3.10.1.jar
javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=.:ggjava/ggjava.jar:./dirprm

8.2.23.8.3 Steps for Migration

To migrate from MongoDB to AJD, you must first run an initial load. The initial load comprises insert operations only. After running the initial load, start CDC, which keeps the source and target databases synchronized.

  1. Start the CDC Extract and generate trails. Do not start a Replicat to consume these trail files yet.
  2. Start the initial load Extract and wait for the initial load to complete.
  3. Create a new Replicat to consume the initial load trails generated in step 2. Wait for completion, and then stop the Replicat.
  4. Create a new Replicat to consume the CDC trails. Configure this Replicat to use HANDLECOLLISIONS (see the sketch after this list), and then start the Replicat.
  5. Wait for the CDC Replicat (step 4) to consume all of the trails; check the Replicat lag and the Replicat RBA to ensure that the CDC Replicat has caught up. At this point, the source and target databases should be in sync.
  6. Stop the CDC Replicat, remove the HANDLECOLLISIONS parameter, and then restart the CDC Replicat.
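A minimal sketch of the CDC Replicat parameter file for step 4 (the group name, properties file path, and mapping are illustrative; INSERTMISSINGUPDATES is included because the best practices below recommend it):

REPLICAT rmongo
TARGETDB LIBFILE libggjava.so SET property=dirprm/mongodb.props
HANDLECOLLISIONS
INSERTMISSINGUPDATES
MAP mydb.*, TARGET mydb.*;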

8.2.23.8.4 Best Practices

For migration from MongoDB to Oracle Autonomous Database (AJD/ATP), the following are the best practices:
  1. Before running CDC, run the initial load, which loads the initial data using insert operations.
  2. Run the MongoDB Handler in bulk mode to achieve better throughput.
  3. Enable HANDLECOLLISIONS during the migration so that Replicat handles any collision errors automatically.
  4. To insert missing updates, add the INSERTMISSINGUPDATES parameter to the .prm file.

8.2.23.9 Configuring an Initial Synchronization of Extract for a MongoDB Source Database using Precise Instantiation

Data synchronization from a source MongoDB database to a target MongoDB database can be effectively achieved using the MongoDB dump utility through the method of precise instantiation.

The method of precise instantiation eliminates the need for collision handling in the target replicat, which is crucial for maintaining performance. Collision handling can negatively impact performance due to the necessary conversion of records into appropriate operations to prevent conflicts and ensure consistency.

This precise instantiation approach involves creating a database snapshot with the MongoDB dump utility to capture the current data state from the source and transferring it to the target using the MongoDB restore utility. The first operation in the oplog dump is aligned as the initial operation on the Extract side, marking the starting point of the Change Data Capture (CDC) process. This alignment guarantees that there is no operation loss or duplication between the initially dumped data and the CDC trail produced by the Extract.

8.2.23.9.1 Synchronization of MongoDB dump with Change Data Capture (CDC) Extract

The MongoDB dump utility enables the extraction of documents, metadata, and index definitions from a specified collection within a designated database, saving them as a binary archive file in a chosen directory. When the --oplog option is utilized, a noop entry is added at the start of the dump. Any operations that take place while the dump is being executed are recorded directly into the oplog.bson file located in the dump folder. This includes all operations, such as the noop and any incoming actions that occur during the dump, each accompanied by a timestamp and additional details. By analyzing the oplog.bson file, one can determine the first operation that took place during the dump process, along with its timestamp, provided at least one operation occurred. If no operations were recorded, the analysis will reveal the first noop entry and its corresponding timestamp.

Once the dump completes, the dumped records, metadata, and index definitions can be applied to the target MongoDB instance using the MongoDB restore utility. This applies all of the data captured during the dump process.

When initiating the Change Data Capture (CDC) Extract, use the timestamp recorded in the oplog.bson file as previously indicated. The Extract then begins capturing operations from the first event that occurs after the completion of the dump process, because the noted timestamp corresponds to the first operation recorded during the dump, or to a no-operation (noop) if no operations took place. This approach guarantees that there are no missing operations or duplicates between the dumped data and the CDC trail file.

8.2.23.9.2 Steps with Example

  1. Run the MongoDB dump utility by executing the mongodump executable with the --oplog option from the bin folder of the MongoDB tools as follows:
    $ ./mongodump --uri="mongodb://localhost:27021" --oplog -v
    Sample Output:
    ./bin/mongodump --uri="mongodb://localhost:27021" --oplog -v
    2024-12-12T15:10:50.666+0000	getting most recent oplog timestamp
    2024-12-12T15:10:50.694+0000	writing admin.system.version to dump/admin/system.version.bson
    2024-12-12T15:10:50.697+0000	done dumping admin.system.version (1 document)
    2024-12-12T15:10:50.697+0000	dumping up to 4 collections in parallel
    2024-12-12T15:10:50.698+0000	writing mydb.myColl2 to dump/mydb/myColl2.bson
    2024-12-12T15:10:50.699+0000	writing mydb.myColl3 to dump/mydb/myColl3.bson
    2024-12-12T15:10:50.699+0000	writing mydb.myColl0 to dump/mydb/myColl0.bson
    2024-12-12T15:10:50.699+0000	writing mydb.myColl4 to dump/mydb/myColl4.bson
    2024-12-12T15:10:50.739+0000	done dumping mydb.myColl3 (10000 documents)
    2024-12-12T15:10:50.740+0000	done dumping mydb.myColl2 (10000 documents)
    2024-12-12T15:10:50.741+0000	writing mydb.myColl6 to dump/mydb/myColl6.bson
    2024-12-12T15:10:50.742+0000	writing mydb.myColl1 to dump/mydb/myColl1.bson
    2024-12-12T15:10:50.748+0000	done dumping mydb.myColl0 (10000 documents)
    2024-12-12T15:10:50.748+0000	done dumping mydb.myColl4 (10000 documents)
    2024-12-12T15:10:50.748+0000	writing mydb.myColl8 to dump/mydb/myColl8.bson
    2024-12-12T15:10:50.748+0000	writing mydb.myColl5 to dump/mydb/myColl5.bson
    2024-12-12T15:10:50.770+0000	done dumping mydb.myColl1 (10000 documents)
    2024-12-12T15:10:50.773+0000	writing mydb.myColl9 to dump/mydb/myColl9.bson
    2024-12-12T15:10:50.786+0000	done dumping mydb.myColl6 (10000 documents)
    2024-12-12T15:10:50.786+0000	writing mydb.myColl7 to dump/mydb/myColl7.bson
    2024-12-12T15:10:50.793+0000	done dumping mydb.myColl8 (10000 documents)
    2024-12-12T15:10:50.801+0000	done dumping mydb.myColl5 (10000 documents)
    2024-12-12T15:10:50.806+0000	done dumping mydb.myColl9 (10000 documents)
    2024-12-12T15:10:50.810+0000	done dumping mydb.myColl7 (10000 documents)
    2024-12-12T15:10:50.811+0000	writing captured oplog to 
    2024-12-12T15:10:50.812+0000		dumped 1 oplog entry
    
    This creates a dump directory containing a binary archive data file and an oplog.bson file for all databases and collections. You can specify a database and collection name if you want a specific database and collection. For more information, see https://www.mongodb.com/docs/database-tools/mongodump/.

    Note:

    You can use the --numParallelCollections option to specify the number of collections to back up in parallel. The default value is 4.
    For example:
    $ ./mongodump --uri="mongodb://localhost:27021" --oplog -v --numParallelCollections 8
    
  2. To analyze the generated oplog.bson file, use the MongoDB bsondump utility to convert it into a human-readable JSON format by executing the following command:
    $ ./bsondump --pretty --outFile path/to/oplog.json path/to/oplog.bson
    After the conversion, review the oplog.json file to carry out the next steps.
  3. If no incoming operations occurred during the dump process, the oplog.json file will contain only a noop entry with a timestamp. Note down the timestamp of the noop entry; it can be used as the starting point of the MongoDB CDC Extract.
    {
    	"op": "n",
    	"ns": "",
    	"o": {
    		"msg": "periodic noop"
    	},
    	"ts": {
    		"$timestamp": {
    			"t": 1726486546,
    			"i": 1
    		}
    	},
    	"t": {
    		"$numberLong": "1"
    	},
    	"v": {
    		"$numberLong": "2"
    	},
    	"wall": {
    		"$date": {
    			"$numberLong": "1726486546549"
    		}
    	}
    }
    
  4. If one or more incoming operations occurred during the dump process, the oplog.json file will contain entries for all of those incoming operations (with timestamps) along with the noop. Note down the timestamp of the first incoming operation logged in the oplog.json file; it is used as the starting position for the MongoDB CDC Extract as follows:
    {
    	"lsid": {
    		"id": {
    			"$binary": {
    				"base64": "teT9VByFTI2COKwsVbp8/g==",
    				"subType": "04"
    			}
    		},
    		"uid": {
    			"$binary": {
    				"base64": "47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=",
    				"subType": "00"
    			}
    		}
    	},
    	"txnNumber": {
    		"$numberLong": "1"
    	},
    	"op": "i",
    	"ns": "mydb.myColl",
    	"ui": {
    		"$binary": {
    			"base64": "BFAoef89RNC1kObCDV+8SA==",
    			"subType": "04"
    		}
    	},
    	"o": {
    		"_id": {
    			"$numberInt": "10000006"
    		},
    		"Name": "TEST DATA 006",
    		"EmployeeID": {
    			"$numberInt": "1006"
    		},
    		"Designation": "Sr. Software Engineer",
    		"Level": "MGR",
    		"Age": {
    			"$numberInt": "75"
    		},
    		"Qualification": "Masters",
    		"Address": {
    			"Street": "Street_65",
    			"City": "City_35",
    			"Nationality": "German"
    		}
    	},
    	"ts": {
    		"$timestamp": {
    			"t": 1726486553,
    			"i": 2
    		}
    	},
    	"t": {
    		"$numberLong": "1"
    	},
    	"v": {
    		"$numberLong": "2"
    	},
    	"wall": {
    		"$date": {
    			"$numberLong": "1726486553788"
    		}
    	},
    	"stmtId": {
    		"$numberInt": "1"
    	},
    	"prevOpTime": {
    		"ts": {
    			"$timestamp": {
    				"t": 1726486553,
    				"i": 1
    			}
    		},
    		"t": {
    			"$numberLong": "1"
    		}
    	}
    }
    
  5. Use the MongoDB restore utility to restore the data from the backup created with mongodump. Run the mongorestore executable under the bin folder of the MongoDB tools. This restores all of the data along with the metadata and index definitions to the target MongoDB instance as follows:
    $ ./mongorestore --uri="mongodb://localhost:27021"
    
    Sample outcome:
    2024-12-12T15:17:56.593+0000	using write concern: &{majority <nil> 0s}
    2024-12-12T15:17:56.598+0000	using default 'dump' directory
    2024-12-12T15:17:56.598+0000	preparing collections to restore from
    2024-12-12T15:17:56.598+0000	found collection admin.system.version bson to restore to admin.system.version
    2024-12-12T15:17:56.598+0000	found collection metadata from admin.system.version to restore to admin.system.version
    2024-12-12T15:17:56.598+0000	found collection mydb.myColl0 bson to restore to mydb.myColl0
    2024-12-12T15:17:56.598+0000	found collection metadata from mydb.myColl0 to restore to mydb.myColl0
    2024-12-12T15:17:56.598+0000	found collection mydb.myColl1 bson to restore to mydb.myColl1
    2024-12-12T15:17:56.598+0000	found collection metadata from mydb.myColl1 to restore to mydb.myColl1
    2024-12-12T15:17:56.598+0000	found collection mydb.myColl2 bson to restore to mydb.myColl2
    2024-12-12T15:17:56.598+0000	found collection metadata from mydb.myColl2 to restore to mydb.myColl2
    2024-12-12T15:17:56.598+0000	found collection mydb.myColl3 bson to restore to mydb.myColl3
    2024-12-12T15:17:56.598+0000	found collection metadata from mydb.myColl3 to restore to mydb.myColl3
    2024-12-12T15:17:56.598+0000	found collection mydb.myColl4 bson to restore to mydb.myColl4
    2024-12-12T15:17:56.598+0000	found collection metadata from mydb.myColl4 to restore to mydb.myColl4
    2024-12-12T15:17:56.598+0000	found collection mydb.myColl5 bson to restore to mydb.myColl5
    2024-12-12T15:17:56.598+0000	found collection metadata from mydb.myColl5 to restore to mydb.myColl5
    2024-12-12T15:17:56.598+0000	reading metadata for mydb.myColl0 from dump/mydb/myColl0.metadata.json
    2024-12-12T15:17:56.598+0000	reading metadata for mydb.myColl4 from dump/mydb/myColl4.metadata.json
    2024-12-12T15:17:56.598+0000	reading metadata for mydb.myColl3 from dump/mydb/myColl3.metadata.json
    2024-12-12T15:17:56.599+0000	reading metadata for mydb.myColl5 from dump/mydb/myColl9.metadata.json
    2024-12-12T15:17:56.599+0000	reading metadata for mydb.myColl1 from dump/mydb/myColl1.metadata.json
    2024-12-12T15:17:56.599+0000	reading metadata for mydb.myColl2 from dump/mydb/myColl2.metadata.json
    2024-12-12T15:17:56.605+0000	creating collection mydb.myColl3 with no metadata
    2024-12-12T15:17:56.608+0000	creating collection mydb.myColl0 with no metadata
    2024-12-12T15:17:56.656+0000	restoring mydb.myColl3 from dump/mydb/myColl3.bson
    2024-12-12T15:17:56.667+0000	restoring mydb.myColl0 from dump/mydb/myColl0.bson
    2024-12-12T15:17:56.885+0000	creating collection mydb.myColl2 with no metadata
    2024-12-12T15:17:56.913+0000	restoring mydb.myColl2 from dump/mydb/myColl2.bson
    2024-12-12T15:17:56.947+0000	finished restoring mydb.myColl0 (10000 documents, 0 failures)
    2024-12-12T15:17:56.947+0000	creating collection mydb.myColl1 with no metadata
    2024-12-12T15:17:56.949+0000	finished restoring mydb.myColl3 (10000 documents, 0 failures)
    2024-12-12T15:17:56.949+0000	creating collection mydb.myColl4 with no metadata
    2024-12-12T15:17:56.976+0000	restoring mydb.myColl1 from dump/mydb/myColl1.bson
    2024-12-12T15:17:56.980+0000	restoring mydb.myColl4 from dump/mydb/myColl4.bson
    2024-12-12T15:17:57.214+0000	creating collection mydb.myColl5 with no metadata
    2024-12-12T15:17:57.229+0000	finished restoring mydb.myColl2 (10000 documents, 0 failures)
    2024-12-12T15:17:57.230+0000	restoring mydb.myColl5 from dump/mydb/myColl5.bson
    2024-12-12T15:17:57.269+0000	finished restoring mydb.myColl4 (10000 documents, 0 failures)
    2024-12-12T15:17:57.269+0000	finished restoring mydb.myColl1 (10000 documents, 0 failures)
    2024-12-12T15:17:57.445+0000	finished restoring mydb.myColl5 (10000 documents, 0 failures)
    2024-12-12T15:17:57.474+0000	100000 document(s) restored successfully. 0 document(s) failed to restore.
    
  6. After the MongoDB restore completes, the initial load is done. Then configure the MongoDB CDC Extract to begin at Position in Log with the Log Sequence Number (LSN) set to the timestamp captured in step 3 or step 4. For example, if the first timestamp captured from the oplog.json file is $timestamp: { "t": 1726173148, "i": 1 }, the format to be provided is t.i, which is 1726173148.1. This ensures that precise instantiation is configured so that no documents are duplicated or missed.

8.2.23.10 MongoDB Handler Client Dependencies

What are the dependencies for the MongoDB Handler to connect to MongoDB databases?

Oracle GoldenGate requires the MongoDB Reactive Streams driver version 4.6.0 for integration with MongoDB. You can download this driver from: https://search.maven.org/artifact/org.mongodb/mongodb-driver-reactivestreams

Note:

If the Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) version is 21.7.0.0.0 and below, the driver version is MongoDB Java Driver 3.12.8. For Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) versions 21.8.0.0.0 and above, the driver version is MongoDB Java Driver 4.6.0.

8.2.23.10.1 MongoDB Java Driver 4.6.0

The required dependent client libraries are:

  • bson-4.6.0.jar
  • bson-record-codec-4.6.0.jar
  • mongodb-driver-core-4.6.0.jar
  • mongodb-driver-legacy-4.6.0.jar
  • mongodb-driver-sync-4.6.0.jar

The Maven coordinates of the third-party libraries that are needed to run the MongoDB Replicat are:

<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongodb-driver-legacy</artifactId>
    <version>4.6.0</version>
</dependency>

<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongodb-driver-sync</artifactId>
    <version>4.6.0</version>
</dependency>


Example

Download the latest version from Maven central at: https://central.sonatype.com/artifact/org.mongodb/mongodb-driver-reactivestreams/4.6.0.
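For Maven-based builds, the corresponding dependency coordinates are:

<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongodb-driver-reactivestreams</artifactId>
    <version>4.6.0</version>
</dependency>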

8.2.23.10.2 MongoDB Java Driver 3.12.8

You must include the path to the MongoDB Java driver in the gg.classpath property. To automatically download the Java driver from the Maven central repository, add the following lines in the pom.xml file, substituting your correct information:
<!-- https://mvnrepository.com/artifact/org.mongodb/mongo-java-driver -->
<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongo-java-driver</artifactId>
    <version>3.12.8</version>
</dependency>