Migrate Data with DBMS_CLOUD_IMPORT

DBMS_CLOUD_IMPORT enables you to import data from Oracle and non-Oracle databases into Oracle Autonomous AI Database using a simple, consistent approach.

When working across multiple database systems, data movement can be complex due to differences in technologies, formats, and tools. DBMS_CLOUD_IMPORT simplifies this process by providing a unified approach to import data across platforms, eliminating the need to manage multiple tools or workflows.

DBMS_CLOUD_IMPORT also supports high-performance and reliable data transfer. The import process uses parallel execution to improve throughput, and if an import is interrupted, such as during maintenance or a loss of connection to the source database, it automatically resumes from where it left off, ensuring completion without having to re-initiate the import.

You can import data from all Oracle databases running in OCI or other cloud providers such as Amazon RDS for Oracle, and all non-Oracle databases supported by Oracle Managed Heterogeneous connectivity. For Oracle sources, both data and database objects are imported. For non-Oracle sources, the import focuses on data movement with automatic data type conversion to Oracle-compatible formats.

The capability supports flexible data import. You can import an entire database or a subset of data, such as selected schemas or tables, based on your requirements. Oracle also provides data dictionary views to monitor progress and track import status.

Key Benefits

Unified data import across platforms: Import data from Oracle and non-Oracle databases without using multiple tools.
High-performance data transfer: Move large volumes of data using parallel execution.
Built-in resiliency: Resume import operations automatically after interruptions, including maintenance or a loss of connection to the source database.
Flexible data selection: Import a full database or a subset of data, including specific schemas or tables.

Prerequisites

Lists the prerequisites for performing the import into the Autonomous AI Database.

Before starting the import, ensure that the following prerequisites are met.

You must be logged in as the ADMIN user.
DBMS_CLOUD_IMPORT subprograms access the source database using a database link created implicitly as part of the import task. To establish this connection, you must create a credential object with the remote database credentials and reference it when creating the import task. See CREATE_CREDENTIAL Procedure for more information.
For non-Oracle source databases, specify longtovarchar => 'true' in the gateway_params parameter when calling DBMS_CLOUD_IMPORT.CREATE_IMPORT_TASK. This is required because metadata in non-Oracle databases consists of LONG columns. For example: gateway_params => JSON_OBJECT('db_type' VALUE <db_type>, 'longtovarchar' VALUE 'true').
Ensure that the database specified by service_name can connect to and access the target schema.
Import jobs resume from the point of interruption, such as after a planned maintenance event or an unexpected outage.

In this case, parallel processing is not supported, and if the job is interrupted, it restarts from the beginning of the table instead of resuming from the point of interruption.

Additional prerequisites may be required to support parallel processing and reliable restart ability, depending on the source database type.

Parallel processing and automatic resume are currently supported for Oracle, MySQL, PostgreSQL, and Amazon Redshift source databases. For these databases, the following additional prerequisites are required to enable parallel processing and automatic resume (additional prerequisites may apply depending on the source database type):

The source table is range-partitioned.
Histogram statistics exist for at least one numeric column.
A numeric index or primary key is present.

If these prerequisites are not met, the table is copied without any parallel processing, and if the job is interrupted, it restarts from the beginning of the table instead of resuming from the point of interruption.

The following are database specific prerequisites:

PostgreSQL Source Database
- Create the required views on the source PostgreSQL database. To support parallel processing and reliable restartability when migrating from PostgreSQL, create the required views on the source database. If the views are not present, the import uses CREATE TABLE AS SELECT(CTAS), in that mode, parallel processing is not supported, and if a table load is interrupted, it restarts from the beginning. See CREATE TABLE AS SELECT for more information.
- You can create these views using Oracle-provided scripts available on GitHub. Review and run the scripts on the source PostgreSQL database before starting the import.
MySQL Source Database
- schema_list is not supported for MySQL sources because MySQL does not use schemas. When you invoke CREATE_IMPORT_TASK, specify an empty array for schema_list ([]).
- The value you specify for service_name is used as the schema name.
- Create the required views on the source MySQL database. To support parallel processing and reliable restart ability when migrating from MySQL, create the required views on the source database. If the views are not present, the import will not use parallel copy and if a table load is interrupted, it restarts from the beginning.
- You can create these views using Oracle-provided scripts available on Github. Review and run the scripts on the source MySQL database before starting the import.
Amazon Redshift Source Database
- Grant privileges for database link access. To read data from remote tables, you must ensure the account you use to create the database link has the required privileges, for example:
  - GRANT USAGE ON SCHEMA <remote_schema> TO <dblink_user>;
  - GRANT SELECT ON <remote_schema>.<remote_table> TO <dblink_user>;
- Ensure metadata visibility for non-public schemas. If the remote schema is not public and the import needs to query metadata (for example, via pg_table_def), you must include the schema in the database link user’s search_path, for example:
  - ALTER USER <dblink_user> SET search_path TO <new_value>;

Import Data using DBMS_CLOUD_IMPORT

Describes how to import data into the Autonomous AI Database using DBMS_CLOUD_IMPORT.

To import data into Autonomous AI Database, use the DBMS_CLOUD_IMPORT.CREATE_IMPORT_TASK procedure. This procedure lets you import data from Oracle and supported non-Oracle databases by specifying the required connection details and, optionally, the objects to import.

You can import an entire database or a subset of data, such as specific schemas or tables, by specifying parameters such as schema_list or table_list. When you run this procedure, Oracle AI Database creates an Oracle Scheduler job to manage the import task. The job uses the provided connection details and credentials to create a database link to the source database and transfer data to Autonomous AI Database.

The behavior of the import depends on the source database type:

Oracle-to-Oracle imports: Both data and database objects are migrated. Indexes, constraints, and partitions are automatically created in the target Autonomous AI Database. If the import is interrupted (for example, due to maintenance), it resumes from the point of interruption.
Imports from non-Oracle databases: Only data is migrated. Keys, indexes, constraints, and other dependent objects are not created. Partitions are created only if the source table is range-partitioned. Resume capability depends on prerequisites; otherwise, the import restarts from the beginning.

You must run the import as the ADMIN user. The target Autonomous AI Database remains available during the import; however, Oracle recommends avoiding other activity on the target database until the import completes. If an import job is interrupted (for example, due to planned maintenance or an unexpected outage), it may resume from the point of interruption or restart, depending on the source database type and whether prerequisites are satisfied. You can also use suspend and resume to temporarily pause and then continue the job.

Create an Import Task

Describes the steps to create an import task.

Before you create and start the import task, set up authentication to the source database. DBMS_CLOUD_IMPORT uses a credential object to connect securely to the source (and to create the required database link for the task).

Follow these steps to create an import task:

Create credentials for the source database. These credentials are used to securely authenticate and connect to the source system.

For example:
```
BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'db1_cred',
    username        => '<username>',
    password        => '<password>'
  );
END;
/
```
See CREATE_CREDENTIAL Procedure for more information.
Create an Import Task.

After you create the credential, use the DBMS_CLOUD_IMPORT.CREATE_IMPORT_TASK procedure to create and start an import task. This procedure imports data from Oracle and supported non-Oracle databases, such as MySQL, PostgreSQL, and Amazon Redshift.

When you run this procedure, specify the required connection details and credentials. You can also control what data is imported by using parameters such as schema_list or table_list.

Oracle AI Database runs the import as an Oracle Scheduler job, which manages the operation and tracks progress. The job maintains the state of the import at the table level. If the import is interrupted, it automatically resumes by continuing from the last incomplete tables, rather than restarting the entire operation.

The following examples show how to create import tasks for different source database types by providing the required connection details and credentials.

Example: Creates an import task that connects to an Oracle source and migrates only the specified tables (for example, ADMIN.TABLE1 and ADMIN.TABLE2) into the target Autonomous AI Database.
```
BEGIN
  DBMS_CLOUD_IMPORT.CREATE_IMPORT_TASK(
    task_name          => '<orcl_import_job>',
    hostname           => '<example1.oraclecloud.com>',
    port               => '<port>',
    service_name       => '<service_name>',
    ssl_server_cert_dn => '<cn=remote_db_cn_name>',
    directory_name     => '<walletdb1>',
    credential_name    => 'db1_cred',
    table_list         => '[{"schema_name": "admin", "table_name": "table1"},
                           {"schema_name": "admin",  "table_name": "table2"}]'
  );
END;
/
```
Example: Creates an import task that connects to the source Oracle database using the specified connection details and credential, and migrates the entire ADMIN schema. Use schema_list when you want to migrate one or more schemas, instead of specifying individual tables.
```
BEGIN
  DBMS_CLOUD_IMPORT.CREATE_IMPORT_TASK(
    task_name          => 'orcl_import_job',
    hostname           => '<remote_db_hostname>',
    port               => '<remote_db_port>',
    service_name       => '<remote_db_service_name>',
    ssl_server_cert_dn => '<cn=remote_db_cn_name>',
    directory_name     => '<walletdb>',
    credential_name    => 'db1_cred',
    schema_list        => '["admin"]'
  );
END;
/
```
Note:

To restrict the scope of the import, use one of the following parameters:
- schema_list: Imports one or more schemas.
- table_list: Imports specific tables across multiple schemas.
The table_list parameter is supported only when the source database is Oracle and is not supported for non-Oracle databases.

When you use table_list:
- Schema metadata for the associated schemas is imported as needed.
- Data is imported only for the tables explicitly specified in table_list.
Do not specify overlapping objects in both schema_list and table_list; use only one option to select a given schema or table.

Example: Creates an import task for a MySQL source by specifying gateway_params with db_type set to mysql, and identifies what to migrate using schema_list (as applicable for the gateway configuration).
```
BEGIN
  DBMS_CLOUD_IMPORT.CREATE_IMPORT_TASK(
    task_name        => 'mysql_import_job',
    hostname         => '<mysql_hostname>',
    port             => '<mysql_port>',
    service_name     => '<mysql_service>',
    credential_name  => '<mysql_cred>',
    gateway_params   => JSON_OBJECT('db_type' VALUE 'mysql'),
    schema_list      => '["dg4odbc"]'
  );
END;
/
```
Example: Creates a import task for a PostgreSQL source by specifying gateway_params with db_type set to postgres. The task connects to PostgreSQL and migrates data into the target Autonomous AI Database based on the task configuration.
```
BEGIN
  DBMS_CLOUD_IMPORT.CREATE_IMPORT_TASK(
    task_name        => '<postgres_import_job>',
    hostname         => '<postgres_hostname>',
    port             => '<5432>',
    service_name     => '<postgres_serv>',
    credential_name  => '<postgres_cred>',
    gateway_params   => JSON_OBJECT('db_type' VALUE 'postgres')
  );
END;
/
```
In these examples, hostname, port, and service_name identify the source database endpoint and are used by the task to establish connectivity (including creating the required database link). credential_name authenticates to the source database so the link can be created. For Oracle sources, use table_list to migrate specific tables, or schema_list to migrate one or more schemas. For non-Oracle databases, gateway_params is also required to specify the source database type (for example, MySQL or PostgreSQL) and enable the appropriate gateway-based connectivity for the import.

See CREATE_IMPORT_TASK Procedure for more information.

Suspend and Resume Import

Describes how to suspend and resume an import operation.

If needed, you can temporarily pause the import using the SUSPEND_IMPORT_TASK procedure:

BEGIN
  DBMS_CLOUD_IMPORT.SUSPEND_IMPORT_TASK(
    task_name => 'orcl_import_job'
 );
END;
/

This stops the scheduler job while preserving all progress and metadata. See SUSPEND_IMPORT_TASK Procedure for more information.

To continue the import, use the RESUME_IMPORT_TASK procedure:

BEGIN
  DBMS_CLOUD_IMPORT.RESUME_IMPORT_TASK(
    task_name => 'orcl_import_job'
  );
END;
/

The job resumes from the last checkpoint, skipping completed tables and continuing remaining work.

Note: Resume behavior depends on the source database type and prerequisites. When prerequisites for parallel processing and automatic resume are met, the import continues from the point of interruption. Otherwise, the import will not use parallel copy and if the job is interrupted, it restarts from the beginning of the table.

See RESUME_IMPORT_TASK Procedure for more information.

Monitor Import Progress

After you start an import task, the scheduler job runs in the background and the Autonomous AI Database records progress at both the task and table levels.

You can use the following data dictionary views to monitor import progress and status:

DBA_DATA_IMPORT_TASK_STATUS: Shows task-level progress, including overall status, percent complete, and a summary of activity (for example, tables currently in progress).
DBA_DATA_IMPORT_TABLE_STATUS: Shows table-level progress for each table being imported, including status and error details for failed tables.

For example, query DBA_DATA_IMPORT_TABLE_STATUS to view table-level progress and errors. The STATUS column indicates the current state of each table, such as SUCCEEDED, FAILED, LOADING, or STOPPED.

SELECT t.import_task_name, s.schema_name, s.schema_object, s.status
FROM dba_data_import_task_status t
     JOIN dba_data_import_table_status s
       ON s.import_task_name = t.import_task_name
WHERE t.import_task_name = 'IMPORT_JOB1'
ORDER BY s.schema_name, s.schema_object;

The output is similar to the following:

IMPORT_TASK_NAME   SCHEMA_NAME   SCHEMA_OBJECT   STATUS
------------------ ------------- --------------- ----------
IMPORT_JOB1        SALES         ORDERS          SUCCEEDED
IMPORT_JOB1        SALES         CUSTOMERS       SUCCEEDED
IMPORT_JOB1        HR            EMPLOYEES       LOADING
IMPORT_JOB1        HR            DEPARTMENTS     FAILED

When the import completes, the scheduler job is automatically disabled. See Monitor and Diagnose Heterogeneous Import for more information.

Drop an Import Task

Describes the steps to drop an import task.

After the import is complete and no longer needed, you can remove the import task. Dropping the task removes the scheduler job, database links, and associated metadata. It does not roll back objects or data that have already been created in the target schema and can leave an in-progress table partially loaded.

If you rerun the import for the same schema, you may need to manually clean up partially loaded tables. To identify incomplete tables, query data dictionary views such as DBA_DATA_IMPORT_TABLE_STATUS and review the table status before performing cleanup.

BEGIN
  DBMS_CLOUD_IMPORT.DROP_IMPORT_TASK(
    task_name => 'orcl_import_job'
  );
END;
/

This example deletes the ORCL_IMPORT_JOB import job. See DROP_IMPORT_TASK Procedure for more information.

Monitor and Diagnose Heterogeneous Import

To diagnose issues and monitor the progress of data import operations, you can query the following data dictionary views. These views provide detailed information about task status, execution progress, parallel operations, and related objects.

View Name	Description
DBA_DATA_IMPORT_TASK_STATUS View	Displays the status of each table within an import task, including any errors encountered during processing.
DBA_DATA_IMPORT_TABLE_STATUS View	Provides overall task-level details, including progress information such as the number of tables processed, loaded, and successfully completed.