Migrate from Oracle NoSQL Database to a JSON file

This example shows how to use the Oracle NoSQL Database Migrator to copy data and the schema definition of a NoSQL table from Oracle NoSQL Database to a JSON file.

Use Case

An organization decides to train a model using the Oracle NoSQL Database data to predict future behaviors and provide personalized recommendations. They can take a periodic copy of the NoSQL Database tables data to a JSON file and apply it to the analytic engine to analyze and train the model. Doing this helps them separate the analytical queries from the low-latency critical paths.

Example

For the demonstration, let us look at how to migrate the data and schema definition of a NoSQL table called myTable from NoSQL Database to a JSON file.

Prerequisites
  • Identify the source and sink for the migration.
    • Source: Oracle NoSQL Database
    • Sink: JSON file
  • Identify the following details for the on-premise KVStore:
    • storeName: kvstore
    • helperHosts: <hostname>:5000
    • table: myTable

Procedure

To migrate the data and schema definition of your table from Oracle NoSQL Database to a JSON file, you can use one of the following options:

  1. Open the command prompt and navigate to the directory where you extracted the NoSQL Database Migrator utility.
  2. To generate the configuration file using the Migrator utility, run the runMigrator command without any runtime parameter.
    [~/nosqlMigrator]$./runMigrator
  3. As you did not provide the configuration file as a runtime parameter, the utility prompts if you want to generate the configuration now. Type y.
    Configuration file is not provided. Do you want to generate configuration? (y/n) [n]: y
    Generating a configuration file interactively.
  4. Based on the prompts from the utility, choose your options for the Source configuration.
    Enter a location for your config [./migrator-config.json]: /home/<user>/nosqlMigrator/
    Select the source: 
    1) nosqldb
    2) nosqldb_cloud
    3) file
    4) object_storage_oci
    5) aws_s3
    #? 1
    
    Configuration for source type=nosqldb
    
    Enter store name of the Oracle NoSQL Database: kvstore
    
    Enter comma separated list of host:port of Oracle NoSQL Database: <hostname>:5000
    
    Enter fully qualified table name: myTable
    
    Include TTL data? If you select 'yes' TTL of rows will also 
    be included in the exported data.(y/n) [n]: 
    
    Is the store secured? (y/n) [y]:
    
    Enter store operation timeout in milliseconds. (1-30000) [5000]:
  5. Based on the prompts from the utility, choose your options for the Sink configuration.
    Select the sink:
    1) nosqldb
    2) nosqldb_cloud
    3) file
    #? 3
    
    Configuration for sink type=file
    Select the sink file format: 
    1) json
    2) parquet
    #? 1
    
    Enter path to a directory to store JSON data: /home/<user>/nosqlMigrator
    
    would you like to export data to multiple files for each source?(y/n) [y]: n
    
    Would you like to store JSON in pretty format? (y/n) [n]: y
    
    Would you like to migrate the table schema also? (y/n) [y]: y
    
    Enter path to a file to store table schema: /home/<user>/nosqlMigrator/myTableSchema.ddl
  6. Based on the prompts from the utility, choose your options for the source data transformations. The default value is n.
    Would you like to add transformations to source data? (y/n) [n]:
  7. Enter your choice to determine whether to proceed with the migration in case any record fails to migrate.
    Would you like to continue migration in case of any record/row is failed to migrate?: (y/n) [n]:
  8. The utility displays the generated configuration on the screen.
    Generated configuration is:
    {
     "source" : {
      "type" : "nosqldb",
      "storeName" : "kvstore",
      "helperHosts" : ["<hostname>:5000"],
      "table" : "myTable",
      "includeTTL" : true,
      "requestTimeoutMs" : 5000
     },
     "sink" : {
      "type" : "file",
      "format" : "json",
      "useMultiFiles" : false,
      "schemaPath" : "/home/<username>/nosqlMigrator/myTableSchema.ddl",
      "pretty" : true,
      "dataPath" : "/home/<username>/nosqlMigrator/"
     },
     "abortOnError" : true,
     "migratorVersion" : "1.8.0"
    }
  9. The utility prompts for your choice to decide whether to proceed with the migration with the generated configuration file or not. The default option is y.

    Note:

    If you select n, you can use the generated configuration file to run the migration using the ./runMigrator -c or the ./runMigrator --config option.
    would you like to run the migration with above configuration?
    If you select no, you can use the generated configuration file to run the migration using
    ./runMigrator --config /home/<user>/nosqlMigrator/migrator-config.json
    (y/n) [y]:
  10. The NoSQL Database Migrator migrates your data and schema from Oracle NoSQL Database to a JSON file.
    Records provided by source=10, Records written to sink=10, Records failed=0.
    Elapsed time: 0min 10sec 426ms
    Migration completed.

Verification

To verify the migration, you can navigate to the specified sink directory and view the schema and data.

-- Exported myTable Data. JSON files are created in the supplied data path
[~/nosqlMigrator]$cat myTable_1_5.json

{
  "id" : 2,
  "document" : "{\"course\":\"Bio-Technology\",\"name\":\"Raja\",\"studentid\":108}"
}
{
  "id" : 7,
  "document" : "{\"course\":\"Computer Science\",\"name\":\"Ruby\",\"studentid\":100}"
}
{
  "id" : 4,
  "document" : "{\"course\":\"Computer Science\",\"name\":\"Ruby\",\"studentid\":100}"
}
{
  "id" : 10,
  "document" : "{\"course\":\"Computer Science\",\"name\":\"Neena\",\"studentid\":105}"
}
{
  "id" : 5,
  "document" : "{\"course\":\"Journalism\",\"name\":\"Rani\",\"studentid\":106}"
}

[~/nosqlMigrator]$cat myTable_6_10.json

{
  "id" : 8,
  "document" : "{\"course\":\"Computer Science\",\"name\":\"Tom\",\"studentid\":103}"
}
{
  "id" : 3,
  "document" : "{\"course\":\"Computer Science\",\"name\":\"John\",\"studentid\":107}"
}
{
  "id" : 9,
  "document" : "{\"course\":\"Computer Science\",\"name\":\"Peter\",\"studentid\":109}"
}
{
  "id" : 6,
  "document" : "{\"course\":\"Bio-Technology\",\"name\":\"Rekha\",\"studentid\":104}"
}
{
  "id" : 1,
  "document" : "{\"course\":\"Journalism\",\"name\":\"Tracy\",\"studentid\":110}"
}
-- Exported myTable Schema
 
[~/nosqlMigrator]$cat myTableSchema
CREATE TABLE IF NOT EXISTS myTable (id INTEGER, document JSON, PRIMARY KEY(SHARD(id)))
  1. Prepare the configuration file (in JSON format) with Oracle NoSQL Database source and JSON sink details. See Source Configuration Templates and Sink Configuration Templates.

    A users table is used with the following data in this example:
    {"id":10,"firstName":"John","lastName":"Smith","age":22,"income":45000,"address":{"city":"Santa Cruz","number":101,"phones":[{"area":408,"kind":"work","number":4538955},{"area":831,"kind":"home","number":7533341},{"area":831,"kind":"mobile","number":7533382}],"state":"CA","street":"Pacific Ave","zip":95008}}
    {"id":20,"firstName":"Jane","lastName":"Smith","age":22,"income":55000,"address":{"city":"San Jose","number":201,"phones":[{"area":608,"kind":"work","number":6538955},{"area":931,"kind":"home","number":9533341},{"area":931,"kind":"mobile","number":9533382}],"state":"CA","street":"Atlantic Ave","zip":95005}}
    {"id":30,"firstName":"Adam","lastName":"Smith","age":45,"income":75000,"address":{"city":"Houston","number":301,"phones":[{"area":618,"kind":"work","number":6618955},{"area":951,"kind":"home","number":9613341},{"area":981,"kind":"mobile","number":9613382}],"state":"TX","street":"Indian Ave","zip":95075}}
    {"id":40,"firstName":"Joanna","lastName":"Smith","age":null,"income":75000,"address":{"city":"Houston","number":401,"phones":[{"area":null,"kind":"work","number":1618955},{"area":451,"kind":"home","number":4613341},{"area":481,"kind":"mobile","number":4613382}],"state":"TX","street":"Tex Ave","zip":95085}}

    Ensure that you include the queryFilter parameter with appropriate query predicate in the source configuration template to export only the required rows from your table. For details on how to create query predicates, see the Sample Query Predicates table - Table 1-3.

    In this example, the query predicate exports rows with the city field in address JSON column = 'Houston' from the users table.

    {
      "source" : {
      "type" : "nosqldb",
        "storeName" : "kvstore",
        "helperHosts" : ["hostname:5000"],
        "table" : "users",
        "queryFilter" : "$row.address.city='Houston'",
        "includeTTL" : true,
        "requestTimeoutMs" : 5000
      },
      "sink" : {
        "type" : "file",
        "format" : "json",
        "useMultiFiles" : true,
        "schemaPath" : "/scratch/<user>/nosqlMigrator/tableschema.ddl",
    
        "pretty" : false,
        "dataPath" : "/scratch/<user>/nosqlMigrator"
      },
      "abortOnError" : true,
      "migratorVersion" : "1.8.0"
    }
  2. Open the command prompt and navigate to the directory where you extracted the NoSQL Database Migrator utility.
  3. Run the runMigrator command by passing the configuration file. Use the --config or -c option.
    [~/nosqlMigrator]$./runMigrator --config <complete/path/to/the/JSON/config/file>

    Note:

    You can also run the command with the --dump-filter|df option to view and verify the generated query before running the migration task as follows. For more details, see Dump Filter.
    [~/nosqlMigrator]$./runMigrator --dump-filter <complete/path/to/the/JSON/config/file>
    The utility proceeds with the data migration as follows:
    [INFO] creating source from given configuration:
    [INFO] [nosqldb source] : query = 'SELECT $row,expiration_time_millis($row) AS expiration FROM users $row where $row.address.city='Houston''
    [INFO] source creation completed
    [INFO] creating sink from given configuration:
    [INFO] sink creation completed
    [INFO] creating migrator pipeline
    [INFO] [json file sink] : writing table schema to /scratch/<user>/nosqlMigrator/tableschema.ddl
    [INFO] migration started
    [INFO] Migration success for source users_1_5. read=0,written=0,failed=0
    [INFO] Migration success for source users_6_10. read=2,written=2,failed=0
    [INFO] Migration is successful for all the sources.
    [INFO] migration completed.
    Records provided by source=2, Records written to sink=2, Records failed=0,Records skipped=0.
    Elapsed time: 0min 0sec 615ms
    Migration completed.

Verification

To verify the migration, you can navigate to the specified sink directory and view the schema and data. Only the rows in address JSON column with city field value 'Houston' are exported.
-- Exported users data. Schema and JSON files are created in the supplied data paths. 

[~/nosqlMigrator]: cat tableschema.ddl 

CREATE TABLE IF NOT EXISTS users (id INTEGER, firstName STRING, lastName STRING, age INTEGER, income INTEGER, address JSON, PRIMARY KEY(SHARD(id)))
[~/nosqlMigrator]: cat users_6_10.json

{"id":30,"firstName":"Adam","lastName":"Smith","age":45,"income":75000,"address":{"city":"Houston","number":301,"phones":[{"area":618,"kind":"work","number":6618955},{"area":951,"kind":"home","number":9613341},{"area":981,"kind":"mobile","number":9613382}],"state":"TX","street":"Indian Ave","zip":95075}}
{"id":40,"firstName":"Joanna","lastName":"Smith","age":null,"income":75000,"address":{"city":"Houston","number":401,"phones":[{"area":null,"kind":"work","number":1618955},{"area":451,"kind":"home","number":4613341},{"area":481,"kind":"mobile","number":4613382}],"state":"TX","street":"Tex Ave","zip":95085}}
bash-4.4$