Workflow for Oracle NoSQL Database Migrator

Learn about the various steps involved in using the Oracle NoSQL Database Migrator utility for migrating your NoSQL data.

The high-level flow of tasks involved in using NoSQL Database Migrator is depicted in the figure below.

Download the NoSQL Data Migrator Utility

The Oracle NoSQL Database Migrator utility is available for download from the Oracle NoSQL Downloads page. Once you download and unzip it on your machine, you can access the runMigrator command from the command line interface.

Note:

The Oracle NoSQL Database Migrator utility requires Java 11 or a later version to run.

Identify the Source and Sink

Before using the migrator, you must identify the data source and sink. For instance, if you want to migrate a NoSQL table from on-premises Oracle NoSQL Database to a JSON-formatted file, your source is Oracle NoSQL Database and your sink is the JSON file. Ensure that the identified source and sink are supported by the Oracle NoSQL Database Migrator by referring to Supported Sources and Sinks. This is also the appropriate phase to decide the schema for your NoSQL table in the target or sink, and to create it.
  • Identify Sink Table Schema: If the sink is Oracle NoSQL Database on-premises or cloud, you must identify the schema for the sink table and ensure that the source data matches the target schema. If required, use transformations to map the source data to the sink table.
    • Default Schema: NoSQL Database Migrator provides an option to create a table with the default schema without the need to predefine the schema for the table. This is useful primarily when loading JSON source files into Oracle NoSQL Database.
      If the source is a MongoDB-formatted JSON file, the default schema for the table will be as follows:
      CREATE TABLE IF NOT EXISTS <tablename>(ID STRING, DOCUMENT JSON, PRIMARY KEY(SHARD(ID)))

      Where:

      — tablename = value provided for the table attribute in the configuration.

      — ID = _id value from each document of the MongoDB-exported JSON source file.

      — DOCUMENT = For each document in the MongoDB-exported file, the contents excluding the _id field are aggregated into the DOCUMENT column.
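      As an illustration with hypothetical data, a document such as the following in the MongoDB-exported JSON file:

```json
{ "_id": "64f1a2b3c4d5e6f7a8b9c0d1", "name": "Ann", "age": 30 }
```

      would be stored with ID set to "64f1a2b3c4d5e6f7a8b9c0d1" and the remaining fields aggregated into the DOCUMENT column as {"name": "Ann", "age": 30}.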

      If the source is a DynamoDB-formatted JSON file, the default schema for the table will be as follows:
      CREATE TABLE IF NOT EXISTS <TABLE_NAME>(DDBPartitionKey_name DDBPartitionKey_type, 
      [DDBSortKey_name DDBSortKey_type],DOCUMENT JSON,
      PRIMARY KEY(SHARD(DDBPartitionKey_name),[DDBSortKey_name]))

      Where:

      — TABLE_NAME = value provided for the sink table in the configuration

      — DDBPartitionKey_name = value provided for the partition key in the configuration

      — DDBPartitionKey_type = value provided for the data type of the partition key in the configuration

      — DDBSortKey_name = value provided for the sort key in the configuration if any

      — DDBSortKey_type = value provided for the data type of the sort key in the configuration if any

      — DOCUMENT = All attributes of a DynamoDB table item, except the partition key and sort key, aggregated into a NoSQL JSON column
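      For illustration, assume an item exported in DynamoDB JSON format with a partition key named pk of type S; the attribute names and values here are hypothetical:

```json
{ "Item": { "pk": { "S": "user#1" }, "email": { "S": "ann@example.com" }, "age": { "N": "30" } } }
```

      With pk configured as the partition key, the key column of the row holds "user#1" and the remaining attributes (email, age) are aggregated into the DOCUMENT column.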

      If the source format is a CSV file, a default schema is not supported for the target table. You can create a schema file with a table definition containing the same number of columns and data types as the source CSV file. For more details on schema file creation, see Providing Table Schema.

      For all the other sources, the default schema will be as follows:
      CREATE TABLE IF NOT EXISTS <tablename> (ID LONG GENERATED ALWAYS AS IDENTITY, DOCUMENT JSON, PRIMARY KEY(ID))

      Where:

      — tablename = value provided for the table attribute in the configuration.

      — ID = An auto-generated LONG value.

      — DOCUMENT = The JSON record provided by the source is aggregated into the DOCUMENT column.

      Note:

      If the _id value is not provided as a string in the MongoDB-formatted JSON file, NoSQL Database Migrator converts it into a string before inserting it into the default schema.
  • Providing Table Schema: NoSQL Database Migrator allows the source to provide schema definitions for the table data using the schemaInfo attribute. The schemaInfo attribute is available in all the data sources that do not have an implicit schema already defined. Sink data stores can choose any one of the following options.
    • Use the default schema defined by the NoSQL Database Migrator.
    • Use the source-provided schema.
    • Override the source-provided schema by defining its own schema. For example, if you want to transform the data from the source schema to another schema, you need to override the source-provided schema and use the transformation capability of the NoSQL Database Migrator tool.


    The table schema file, for example, mytable_schema.ddl, can include table DDL statements. The NoSQL Database Migrator tool executes this table schema file before starting the migration. The migrator tool supports only one DDL statement per line in the schema file. For example:
    CREATE TABLE IF NOT EXISTS mytable(id INTEGER, name STRING, age INTEGER, PRIMARY KEY(SHARD(id)))

    Note:

    Migration will fail if a table already exists at the sink and the DDL in the schemaPath does not match the table's definition.
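    For instance, a schema file that creates a table and an index, with one DDL statement per line, might look like the following; the table, column, and index names are illustrative:

```sql
CREATE TABLE IF NOT EXISTS mytable(id INTEGER, name STRING, age INTEGER, PRIMARY KEY(SHARD(id)))
CREATE INDEX IF NOT EXISTS idx_name ON mytable(name)
```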
  • Create Sink Table: Once you identify the sink table schema, create the sink table either through the Admin CLI or using the schemaInfo attribute of the sink configuration file. See Sink Configuration Templates.

    Note:

    If the source is a CSV file, create a file with the DDL commands for the schema of the target table. Provide the file path in schemaInfo.schemaPath parameter of the sink configuration file.
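    As a sketch, the fragment of a sink configuration file that points to such a schema file could look as follows; the path and table name are placeholders, and the exact attribute names are described in Sink Configuration Templates:

```json
{
  "sink": {
    "type": "nosqldb",
    "table": "mytable",
    "schemaInfo": {
      "schemaPath": "/path/to/mytable_schema.ddl"
    }
  }
}
```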

Run the runMigrator command

The runMigrator executable file is available in the extracted NoSQL Database Migrator files. You must have Java 11 or a later version and bash installed on your system to run the runMigrator command successfully.

You can run the runMigrator command in two ways:
  1. By creating the configuration file using the runtime options of the runMigrator command as shown below.
    [~]$ ./runMigrator
    configuration file is not provided. Do you want to generate configuration? (y/n)
    [n]: y
    ...
    ...
    • When you invoke the runMigrator utility, it provides a series of run time options and creates the configuration file based on your choices for each option.
    • After the utility creates the configuration file, you have a choice to either proceed with the migration activity in the same run or save the configuration file for a future migration.
    • Whether you proceed with or defer the migration activity, the generated configuration file remains available to edit or customize for your future requirements. You can use the customized configuration file for migrations later.
  2. By passing a manually created configuration file (in the JSON format) as a runtime parameter using the -c or --config option. You must create the configuration file manually before running the runMigrator command with the -c or --config option. For any help with the source and sink configuration parameters, see Sources and Sinks.
    [~]$ ./runMigrator -c </path/to/the/configuration/json/file>
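    For reference, a minimal hand-written configuration file for loading a JSON file into an on-premises table might look like the following sketch; treat the attribute names and values as placeholders and verify them against Sources and Sinks:

```json
{
  "source": {
    "type": "file",
    "format": "json",
    "dataPath": "/path/to/data.json"
  },
  "sink": {
    "type": "nosqldb",
    "table": "mytable",
    "storeName": "kvstore",
    "helperHosts": ["localhost:5000"]
  },
  "abortOnError": true
}
```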

Logging Migrator Progress

The NoSQL Database Migrator tool provides options that enable trace, debugging, and progress messages to be printed to standard output or to a file. These options can be useful for tracking the progress of a migration operation, particularly for very large tables or data sets.

  • Log Levels

    To control the logging behavior through the NoSQL Database Migrator tool, pass the --log-level or -l run time parameter to the runMigrator command. You can specify the amount of log information to write by passing the appropriate log level value.

    $./runMigrator --log-level <loglevel>
    Example:
    $./runMigrator --log-level debug

    Table 1-1 Supported Log Levels for NoSQL Database Migrator

    Log Level        Description
    ---------        -----------
    warning          Prints errors and warnings.
    info (default)   Prints the progress status of the data migration, such as validating the source, validating the sink, creating tables, and the count of data records migrated.
    debug            Prints additional debug information.
    all              Prints everything. This level turns on all levels of logging.
  • Log File:
    You can specify the name of the log file using the --log-file or -f parameter. If --log-file is passed as a runtime parameter to the runMigrator command, the NoSQL Database Migrator writes all the log messages to that file; otherwise, it writes them to the standard output.
    $./runMigrator --log-file <log file name>
    Example:
    $./runMigrator --log-file nosql_migrator.log
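    The logging parameters can be combined with a configuration file in a single invocation; for example (the file names are placeholders):

```shell
./runMigrator -c migrator-config.json --log-level debug --log-file nosql_migrator.log
```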

Limitation

Oracle NoSQL Database Migrator does not lock the database during migration, nor does it block other users. Therefore, it is highly recommended that you do not perform the following activities while a migration task is running:

  • Any DML/DDL operations on the source table.
  • Any topology-related modification on the data store.