Migrate from DynamoDB JSON file to Oracle NoSQL Database

This example shows how to use Oracle NoSQL Database Migrator to copy a DynamoDB JSON file to Oracle NoSQL Database.

Use Case:

After evaluating multiple options, an organization selects Oracle NoSQL Database over DynamoDB. The organization wants to migrate its tables and data from DynamoDB to Oracle NoSQL Database (on-premises).

See Mapping of DynamoDB table to Oracle NoSQL table for more details.

You can migrate a file or directory containing exported DynamoDB JSON data from a file system by specifying the path in the source configuration template.

A sample DynamoDB-formatted JSON file is as follows:
{"Item":{"Id":{"N":"101"},"Phones":{"L":[{"L":[{"S":"555-222"},{"S":"123-567"}]}]},"PremierCustomer":{"BOOL":false},"Address":{"M":{"Zip":{"N":"570004"},"Street":{"S":"21 main"},"DoorNum":{"N":"201"},"City":{"S":"London"}}},"FirstName":{"S":"Fred"},"FavNumbers":{"NS":["10"]},"LastName":{"S":"Smith"},"FavColors":{"SS":["Red","Green"]},"Age":{"N":"22"},"ttl": {"N": "1734616800"}}}
{"Item":{"Id":{"N":"102"},"Phones":{"L":[{"L":[{"S":"222-222"}]}]},"PremierCustomer":{"BOOL":false},"Address":{"M":{"Zip":{"N":"560014"},"Street":{"S":"32 main"},"DoorNum":{"N":"1024"},"City":{"S":"Wales"}}},"FirstName":{"S":"John"},"FavNumbers":{"NS":["10"]},"LastName":{"S":"White"},"FavColors":{"SS":["Blue"]},"Age":{"N":"48"},"ttl": {"N": "1734616800"}}}

You copy the exported DynamoDB table data from AWS S3 storage to a locally mounted file system.

Example:

For this demonstration, you will learn how to migrate a DynamoDB JSON file to Oracle NoSQL Database (on-premises). You will use a manually created configuration file for this example.

Prerequisites

  • Identify the source and sink for the migration.
    • Source: DynamoDB JSON File
    • Sink: Oracle NoSQL Database (on-premises)
  • To import DynamoDB table data into Oracle NoSQL Database, you must first export the DynamoDB table to Amazon S3. See the steps provided in Exporting DynamoDB table data to Amazon S3 to export your table. When exporting, select DynamoDB JSON as the format. The exported data contains the DynamoDB table data in multiple gzip files, as shown below. (Sample AWS CLI commands for the export and download are provided after this list.)
    / 01639372501551-bb4dd8c3 
    |-- 01639372501551-bb4dd8c3 ==> exported data prefix
    |----data
    |------sxz3hjr3re2dzn2ymgd2gi4iku.json.gz  ==>table data
    |----manifest-files.json
    |----manifest-files.md5
    |----manifest-summary.json
    |----manifest-summary.md5
    |----_started
  • You must download the files from AWS S3. The structure of the files after the download is as shown below.
    download-dir/01639372501551-bb4dd8c3     
    |----data    
    |------sxz3hjr3re2dzn2ymgd2gi4iku.json.gz  ==>table data   
    |----manifest-files.json   
    |----manifest-files.md5   
    |----manifest-summary.json   
    |----manifest-summary.md5   
    |----_started
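    For reference, both the export and the download can be performed with the AWS CLI, as sketched below. This is a minimal sketch, not part of the original export steps; the region, account ID, table name, bucket name, and export prefix are placeholders you must replace, and the S3 key prefix can vary depending on how the export was configured.
    # Export the DynamoDB table to Amazon S3 in DynamoDB JSON format
    aws dynamodb export-table-to-point-in-time \
        --table-arn arn:aws:dynamodb:<region>:<account-id>:table/<table-name> \
        --s3-bucket <bucket-name> \
        --export-format DYNAMODB_JSON

    # Download the exported files to a local directory
    aws s3 cp s3://<bucket-name>/AWSDynamoDB/<export-prefix>/ download-dir/<export-prefix>/ --recursive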

Procedure

To migrate the DynamoDB JSON data to the Oracle NoSQL Database:
  1. Prepare the configuration file (in JSON format) with the identified source and sink details. For details, see Source Configuration Templates and Sink Configuration Templates.

    Note:

    If your exported DynamoDB table items contain a TTL attribute and you want to import the TTL values, specify the attribute name in the ttlAttributeName configuration parameter of the source configuration template and set the includeTTL configuration parameter to true in the sink configuration template.
    You can choose one of the following two options.
    • Option 1: Importing the DynamoDB table as a JSON document using the default schema configuration.

      Here, you set the defaultSchema configuration parameter to true, so NoSQL Database Migrator creates the default schema at the sink. You must specify the DDBPartitionKey parameter and the corresponding NoSQL column type; otherwise, an error is displayed.

      For details on the default schema for a DynamoDB exported JSON source, see Identify the Source and Sink topic in Workflow for Oracle NoSQL Database Migrator.
      {
        "source" : {
          "type" : "file",
          "format" : "dynamodb_json",
          "ttlAttributeName" : "ttl",
          "dataPath" : "<complete/path/to/the/DynamoDB/Formatted/JSON/file>"
        },
        "sink" : {
          "type" : "nosqldb",
          "storeName" : "kvstore",
          "helperHosts" : ["<hostname>:5000"],
          "table" : "sampledynDBImp",
          "includeTTL" : true,
          "schemaInfo" : {
            "DDBPartitionKey" : "Id:INTEGER",
            "defaultSchema" : true
          },
          "overwrite" : true,
          "requestTimeoutMs" : 5000
        },
        "abortOnError" : true,
        "migratorVersion" : "1.6.5"
      }
      The following default schema is used in this example:
      CREATE TABLE IF NOT EXISTS sampledynDBImp (Id INTEGER, document JSON, PRIMARY KEY(SHARD(Id)))
    • Option 2: Importing the DynamoDB table as fixed columns using a user-supplied schema file.

      Here, you set the defaultSchema configuration parameter to false, so you must specify the file containing the sink table's DDL statement in the schemaPath parameter. See Mapping of DynamoDB types to Oracle NoSQL types for more details.

      The following user-defined schema is used in this example:
      CREATE TABLE IF NOT EXISTS sampledynDBImp (Id INTEGER, document JSON, PRIMARY KEY(SHARD(Id)))

      NoSQL Database Migrator uses the schema file to create the table at the sink as part of the migration. As long as the primary key data is provided, the input JSON record is inserted; otherwise, an error is displayed.

      Note:

      • If the DynamoDB table has a data type that is not supported in NoSQL Database, the migration fails.
      • If the input data does not contain a value for a particular column (other than the primary key), the column's default value is used. The default value must be part of the column definition when you create the table, for example, id INTEGER not null default 0. If the column definition does not include a default, SQL NULL is inserted when no value is provided for the column. A sample schema file with a default value is sketched after this note.
      • If you are modeling the DynamoDB table as a JSON document, ensure that you use the aggregateFields transform to aggregate non-primary-key data into a JSON column. For details, see aggregateFields.
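      For illustration only, a hypothetical fixed-column schema file with a default value might contain a DDL statement such as the following; the table name, column names, and default shown here are assumptions based on the sample data, not part of this example:
      CREATE TABLE IF NOT EXISTS sampleFixedCols (Id INTEGER, FirstName STRING, LastName STRING, Age INTEGER not null default 0, PRIMARY KEY(SHARD(Id)))

      The following sample configuration file is used for this option: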
      {
        "source" : {
          "type" : "file",
          "format" : "dynamodb_json",
          "ttlAttributeName" : "ttl",
          "dataPath" : "<complete/path/to/the/DynamoDB/Formatted/JSON/file>"
        },
        "sink" : {
          "type" : "nosqldb",
          "storeName" : "kvstore",
          "helperHosts" : ["<hostname>:5000"],
          "table" : "sampledynDBImp",
          "includeTTL" : true,
          "schemaInfo" : {
            "schemaPath" : "<full path of the schema file with the DDL statement>"
          },
          "overwrite" : true,
          "requestTimeoutMs" : 5000
        },
        "transforms": {
          "aggregateFields" : {
            "fieldName" : "document",
            "skipFields" : ["Id"]
          }
        },
        "abortOnError" : true,
        "migratorVersion" : "1.6.5"
      }
  2. Open the command prompt and navigate to the directory where you extracted the NoSQL Database Migrator utility.
  3. Run the runMigrator command, passing the configuration file with the --config or -c option. Use separate configuration files for Option 1 and Option 2.
    ./runMigrator --config <complete/path/to/the/JSON/config/file>
  4. The utility proceeds with the data migration as illustrated in the following sample:
    [INFO] creating source from given configuration:
    [INFO] source creation completed
    [INFO] creating sink from given configuration:
    [INFO] sink creation completed
    [INFO] creating migrator pipeline
    [INFO] [nosqldb sink] : start loading DDLs
    [INFO] [nosqldb sink] : executing DDL: CREATE TABLE IF NOT EXISTS sampledynDBImp (Id INTEGER, document JSON, PRIMARY KEY(SHARD(Id)))
    [INFO] [nosqldb sink] : completed loading DDLs
    [INFO] migration started
    [INFO] Start writing data to OnDB Sink
    [INFO] executing for source:DynamoSample
    [INFO] [DDB file source] : start parsing JSON records from file: DynamoSample.json.gz
    [INFO] Writing data to OnDB Sink completed.
    [INFO] migration completed.
    Records provided by source=2, Records written to sink=2, Records failed=0,Records skipped=0.
    Elapsed time: 0min 0sec 45ms
    Migration completed.

Validation

Start the SQL prompt in your data store.
java -jar lib/sql.jar -helper-hosts localhost:5000 -store kvstore
Verify that the new table is created with the source data:
SELECT * FROM sampledynDBImp

Output

Notice that the TTL information is included in the _metadata JSON object for each imported item.
{"Id":102,"document":{"Address":{"City":"Wales","DoorNum":1024,"Street":"32 main","Zip":560014},"Age":48,"FavColors":["Blue"],"FavNumbers":[10],"FirstName":"John","LastName":"White","Phones":[["222-222"]],"PremierCustomer":false,"_metadata":{"expiration":1734616196000}}}
{"Id":101,"document":{"Address":{"City":"London","DoorNum":201,"Street":"21 main","Zip":570004},"Age":22,"FavColors":["Red","Green"],"FavNumbers":[10],"FirstName":"Fred","LastName":"Smith","Phones":[["555-222","123-567"]],"PremierCustomer":false,"_metadata":{"expiration":1734616196000}}}