2 Getting Started with the DMU
This chapter introduces the Database Migration Assistant for Unicode (DMU) and describes the basic workflow of migrating a character set using this utility.
2.1 Using the Database Migration Assistant for Unicode: A Roadmap
This section provides an overview of the tasks involved in working with the DMU.
To work with the Database Migration Assistant for Unicode:
2.2 Introduction to the DMU Interface and Navigation
From the DMU home page, you can perform the basic tasks associated with migrating a database to Unicode.
DMU Home Page
This is the page that loads when you click the DMU icon. See "Introduction to the DMU User Interface".
Online Help
You can access context-sensitive online help by clicking the Help link displayed at the top of every page. On any help page, click Help or press the F1 key.
Navigation
Navigational features of the DMU include the following:
-
Subpage links at the top of the page. These links take you to the various subpages that organize management tasks into distinct categories.
-
Drill-down links that provide increasing levels of detail.
2.3 Overview of Requirements and Security Considerations
When using the Database Migration Assistant for Unicode, you must consider several requirements and restrictions.
This section also describes the first steps you must take to start the utility.
2.3.1 Overview of Database Requirements
To be supported by the DMU, a database must meet several requirements, including release version, type of character set, open mode, and so on.
The requirements are as follows:
-
The release of Oracle Database must be 10.2.0.4, 10.2.0.5, 11.1.0.7, 11.2.0.1, or later. Check the latest version of the DMU release notes for any additional patches that must be installed in the database. The release notes also list further database releases that are supported with appropriate patches.
-
The database character set must be ASCII-based. For example, databases running on the EBCDIC-based platforms IBM z/OS and Fujitsu BS2000 are not supported.
-
The
SYS.DBMS_DUMA_INTERNAL
package must be installed in the database.This package is available as part of the database installation. You must create it manually by running the script
?/rdbms/admin/prvtdumi.plb
from the Oracle home. You must log in asSYSDBA
to run the script. -
You must disable Oracle Database Vault before starting the migration. The DMU has not been certified to work with Database Vault enabled.
-
The database must be opened in read/write mode.
Additional requirements pertain to databases that the DMU should convert. Without meeting these requirements, you can still use the DMU for scanning and cleansing the database. These requirements are as follows:
-
All database objects, including objects created by standard PL/SQL packages, such as
DBMS_RULE
,DBMS_DATA_MINING
, orDBMS_WM
, must be named using only standard characters from the ASCII character set. Similarly, expressions ofCHECK
constraints, virtual columns and other database features must be specified only using those characters. In other words, the data dictionary of the database cannot contain non-ASCII characters. However, there is some character data in the data dictionary in which non-ASCII characters are allowed.See "Data Dictionary Tables That Are Converted" for more details.
-
No OLAP analytical workspaces, other than predefined system workspaces and certain predefined Oracle Applications workspaces, can exist in the database.
-
No flashback data archives can exist in the database.
-
No data to be converted can reside in a read-only or offline tablespace.
-
Neither cluster key columns nor partitioning key columns can be defined with character length semantics.
-
No convertible data can be present in tables in the recycle bin.
-
No convertible data can be present in a reference partitioning key column.
This release of the DMU supports the migration of pluggable databases (PDBs). To convert a PDB using the DMU in a container database (CDB), the root container's character set must be the same as the migration target character set (AL32UTF8 or UTF8). When this requirement is not met, you can still use the DMU for scanning and cleansing PDBs, but the conversion operation is disallowed.
This release of the DMU also supports the migration of Oracle E-Business Suite and PeopleSoft databases to Unicode. The migration workflow is mostly the same as that of a regular database instance. Note that you should not perform any data cleansing actions that require modifying the database schema of Oracle E-Business Suite or PeopleSoft applications because such modifications will not be supported by Oracle. In addition, this release of the DMU does not support migrating databases for PeopleSoft Application versions older than 9.0 or PeopleTools versions older than 8.48.
2.3.2 Overview of Java Runtime Requirements
The Java SE Development Kit (JDK) version 8 or version 11 must be available on the system running the DMU.
When it is first started, the DMU asks for the location of the Java Runtime executable file. Depending on the platform, this file is either java.exe
or java
.
From the Java SE download page on http://www.oracle.com
, you can download the JDK for Linux, Solaris, and Microsoft Windows operating systems for Intel x86, Intel x64, Intel Itanium, and SPARC processors. Look for the installation instructions for your platform on the relevant Oracle Technology Network documentation page at https://www.oracle.com/java/technologies/downloads/
.
For other platforms, such as Hewlett-Packard HP-UX or IBM AIX, visit the vendor's website.
2.3.3 Overview of DMU Security Considerations
Before you start the migration, be aware of the DMU security considerations listed in this section.
-
Protect DMU software integrity
In this release, the DMU requires that you connect to a database with the
SYSDBA
privilege. This highest privilege is needed to analyze and convert all data in the database. But this also means that once a connection is established, the program—if maliciously modified—could read and change all database content, create database users and grant privileges to them, and so on.To ensure that this power is not abused, the DMU software files must be modifiable only by users who are also authorized to connect as
SYSDBA
. This implies that all files in the DMU installation directory must be owned and writable only by an operating system user to which only the trusted administrators can connect. If writable by a malicious user, the DMU program code in JAR files, the repository maintenance SQL scripts, or configuration files could be abused to inject malicious SQL statements.Unless you install the DMU on a host to which only authorized users have access, you must protect the DMU installation and configuration files using the operating system privileges. After you have uncompressed the archive file with the DMU installation, ensure that all uncompressed files and directories are writable only by you and other authorized operating system users. The DMU does not have an installer that can automatically set file permissions.
-
Protect connection details
If you select the Save Password check box when creating a database connection, the password you specify is saved in an obfuscated form in a password file named
cwallet.sso
in your user directory. Because obfuscation is a reversible operation, use this feature only for passwords to test databases with no production data or only if the DMU is installed on a very well-protected host. Ensure that the password file is readable only by you.The
cwallet.sso
file is in the$HOME/.dmu/
directory on Linux and UNIX systems, and in the%APPDATA%\DMU\
directory on Microsoft Windows systems. -
Protect DMU repository objects
To avoid unauthorized access to potentially sensitive metadata and functionality, do not grant any privileges for the DMU repository database objects to any database users. This guideline applies to any object with a name starting with
DUM$
orDBMS_DUMA
. -
Avoid information leakage
To avoid unauthorized access to potentially sensitive information, ensure that the DMU configuration, log, and report files are readable and modifiable only by authorized operating system users. By default, these files are in the
$HOME/.dmu/
directory on Linux and UNIX systems, and in the%APPDATA%\DMU\
directory on Microsoft Windows systems. However, DMU users can often select a file path for generated logs and reports. -
Protect operating system files
The DMU lets its users specify file paths to access specific files. In a remote terminal setup where a DMU user does not have full shell access to the shared host, this functionality could be abused to read or overwrite sensitive files. The host's operating system administrator should protect all sensitive operating system files from inadvertent or malicious overwriting. This precaution is not specific to the DMU.
2.3.4 Review Your Preparations for Migration
Before you start the migration, collect information such as whether your database is supported, the target character set, which languages can be stored in the character set, and so on.
To help you set up the DMU properly, answer the following questions about your database and its use by applications:
-
Is your database supported by the DMU?
Verify that the version of your database software is supported by the DMU as described in "Overview of Database Requirements". The platform on which the database is installed must be ASCII-based (that is, not IBM z/OS or Fujitsu BS2000). Your database must not have Oracle Database Vault installed.
-
What is the target character set?
You have to decide if the target character set of the migration is AL32UTF8 or UTF8. Oracle recommends AL32UTF8, which is a proper implementation of the UTF-8 encoding form of the Unicode Standard. The DMU supports migration to the older character set UTF8 for databases that must support database client software based on Oracle 8i Client libraries, or that must support applications certified with UTF8 but not with AL32UTF8, for example, Oracle Applications Release 11i. The UTF8 character set, despite its name, is an implementation of the CESU-8 compatibility encoding form of the Unicode Standard. CESU-8 is very similar to UTF-8 except for the way the supplementary characters are stored (that is, characters with Unicode code points U+010000 and higher). Oracle does not generally update the definition of UTF8 to synchronize it with the newest versions of the Unicode Standard, while Oracle does update the definition of AL32UTF8.
-
Which languages and character sets can be stored in the database?
The very nature of character set encodings and the very nature of character storage in an Oracle database make it impossible to automatically and precisely recognize what language in what character set is represented by a given sequence of bytes in a column value. The DMU scanning process can tell if there are bytes in a character value that are invalid in the declared character set of this value, but not more. If almost all bytes of the declared character set are assigned to single-byte codes of some characters, such as the case of CL8MSWIN1251, which defines all bytes except 0x98, then virtually any sequence of bytes declared as encoded in this character set will appear as valid to the scanning process. Therefore, even if an application uses the pass-through configuration to store data in the database that it interprets in a character set different from the database character set (see "Invalid Binary Storage Representation of Data") the DMU might not see any invalid codes and it might not signal the mismatch between the real (application) character set of the data and the declared database character set. Hence, it is very important that you research, for example, by asking application developers and administrators, what languages or character sets are most probably stored in your database. This will help you to manually analyze the contents of the database, if necessary, to supplement the automatic analysis by the DMU.
-
Which columns contain data in foreign languages?
If your database contains data mainly in one language and only a few table columns might contain other languages, for example foreign customer names and street addresses, a list of all such columns facilitates the manual search for invalid data stored in the pass-through configuration, if the existence of such data becomes probable after analysis of the information gathered in the previous list item.
-
Which character columns might contain binary data?
Some applications might connect to your database in the pass-through configuration and use character columns to store data that is binary in nature, such as text encrypted without using the Transparent Data Encryption feature of the database, images, text in a binary format of a word processor, and so on. Obtain a list of those applications from application developers or administrators to further help you analyze the contents of your database.
See "Cleansing Scenario 3: Cleansing Invalid Representation Issues" for information about resolving invalid binary representation issues using the collected information.
-
What is the real character set of your database?
Determine if the character data in your database is actually to be interpreted in a particular common character set of the clients connecting to database and not in the declared database character set. To improve the effectiveness and accuracy of the analysis that you perform with the help of the DMU to verify that the database can be converted to Unicode.
One way to determine if the real character set of your database differs from the declared database character set is to look for the following characteristics:
-
The database (declared) character set is US7ASCII, or WE8ISO8859P1, or WE8ISO8859P15.
-
The client character set, declared in the
NLS_LANG
client setting, is the same as the database character set. If theNLS_LANG
is not specified at all, the client character set defaults to US7ASCII.NLS_LANG
affects only C or C++ clients connecting through an OCI API. Java clients connecting through a JDBC API are always UTF8 clients. -
All database client software runs on the Microsoft Windows platform and works in one of the Windows character sets, which are also known as ANSI Code Pages. This character set depends on the language version of Windows: WE8MSWIN1252 – US and Western European versions, EE8MSWIN1250 – Central European versions (for example, Polish), CL8MSWIN1251 – Cyrillic versions (for example, Russian), AR8MSWIN1256 – Arabic version, and so on.
This Microsoft Windows character set common to all clients is the real database character set that was to be identified.
If your database is used in a correct character set configuration, that is, the
NLS_LANG
setting always correctly corresponds to the real character set of the database clients, the real character set of the database is its declared character set. -
-
What is the connection information for your database?
To migrate a database with the DMU, you must have the
SYSDBA
privilege and the required connection credentials (user name and password). Starting in Oracle Database 12c, to perform scanning and cleansing operations on a PDB, you can connect as a user who has theSYSDBA
privilege in the local PDB. However, to convert a PDB to Unicode, you must connect as theSYS
user.You need to know the host name or IP address of the database server, the port number on which the database listener listens for connection requests, and the service name of the database. For performance reasons, Oracle recommends that you connect to a service that is configured for dedicated connections, that it, not using the a shared server configuration.
After you have collected the preceding information, you can install the DMU.
See Also:
Oracle Database Security Guide for further security considerations
2.4 Performing First Tasks with the DMU
Before you can begin using the DMU, you must perform installation and creation tasks.
2.4.1 Installing the DMU
Oracle recommends that you run the DMU on the database host or on a workstation connected to the database host with a fast and reliable local area network.
Before installing the DMU, always download the current release notes available for the DMU version you are using.
To install the DMU:
The DMU does not come with an installer. Install the utility by uncompressing the installation file into a directory of your choice.
- To install the DMU, uncompress the downloaded archive file to any directory on the host on which you want to run the DMU.
- After you have uncompressed the archive file, ensure that the DMU files are writable only to you and other authorized operating system users. This is very important because unprivileged users with access to the DMU host could modify the DMU files to make the DMU execute arbitrary SQL statements when the DMU is later started with
SYSDBA
credentials. Such SQL statements could compromise database security. - After you have installed the DMU, initialize the database for the migration process with the DMU.
See Also:
-
Oracle Database Migration Assistant for Unicode: Getting Started page on the Oracle Technology Network (OTN) website for the prerequisites and the detailed steps for installing the DMU software:
https://www.oracle.com/database/technologies/dmu-gettingstarted.html
-
Oracle Database Migration Assistant for Unicode Release Notes
2.4.2 Creating a Database Connection
You must create a connection to the database that you want to analyze or migrate.
To create a database connection:
You can test the connection by clicking Test Connection. For the Basic connection type, if the database administrator password is not already stored in the DMU, then you will be prompted to enter the database administrator password. For the SSH connection type, if a key file is not used, then you will be prompted to enter password for the SSH server user. For the SSH connection type, if a key file is used and a passphrase is set for the key, then you will be prompted to enter the passphrase.
If the connection can be established, the Status box will show "Success". Otherwise, it will show an error message describing the connection problem. You can create the connection and close the dialog box by clicking Save.
After a connection has been created, a new database node with the name of the connection is added to the Navigator pane. You can open the node, that is, connect to the database, by right-clicking it and choosing Connect from the context menu. Only one connection can be open in the DMU at the same time, thus the DMU will ask you for permission to close any currently open connection before it will open a new one.
You can change the details of an existing connection, such as user name or host name, by right-clicking the corresponding database node and choosing Connection Details from the context menu. The Modify Database Connection dialog box that is shown has the same elements as the Create Database Connection dialog box.
You can rename a connection by selecting Rename Connection or delete a connection by selecting Delete from the context menu of the corresponding database node. The Connection Details, Rename Connection, and Delete menu items are available in the context menu of a database node only if the node is not currently connected.
2.4.3 Installing the DMU Repository
The DMU repository manages the information necessary for each step of the migration.
The DMU repository contents include items such as objects to be processed, details on data that had an error flagged, and the progress of a scan or a conversion. For any database without a DMU repository, an automatic repository creation wizard begins each time you connect to that database. You can also start the wizard from the DMU user interface.
To install the DMU repository:
After you have the repository installed, you can begin working on the data.
2.4.4 Following the Status of a Migration
After a database connection has been established, you can follow the progress of the migration by looking at the Migration Status tab.
This tab, which is displayed in the Client pane of the DMU main window, shows the important milestones in the DMU process:
-
Installing the repository
-
Scanning the database for problems
-
Resolving any problems encountered
-
Converting the data
The Migration Status tab shows the completion status of each milestone, suggests a next action, and lists issues that prevented a successful conversion. All this information indicates the progress in the workflow. Return to the Migration Status tab periodically to enable the Database Migration Assistant for Unicode to guide you through the migration process.
An example of Migration Status tab content is shown in Figure 2-6. The Migration Status tab describes each of the four main steps of the migration:
-
Installation of the DMU repository
-
Scanning of the database
-
Resolution of migration issues
-
Conversion of the database
The Migration Status tab shows the status of each of these steps and suggests the next action to take. You can click the More links to open help pages that expand the presented information.
-
Step 1: Install the DMU Repository
This section informs you if the connected database contains a DMU repository. It can also report that the installed repository is in an incompatible version. Until the repository is installed in the correct version, you cannot proceed to the following steps of the migration process. Follow the advice given in the Next Action area. If the text in this area is a link, click the link to start the recommended action. See "Installing the DMU Repository".
-
Step 2: Scan the Database
This section shows the scan status of the database. The database might be in any of the following conditions:
-
Not yet scanned – no scan has been executed yet after the DMU repository has been created
-
Being scanned – a scan is running in the database just now
-
Partially scanned – only part of the database has been scanned
-
Entirely scanned – all tables in the database have valid scan results
-
Contained invalidated results – scanning results for some tables have been invalidated because the table structure has been modified since the last scan
If the database has not been entirely scanned, see "Scanning the Database" and scan all those tables that have missing or invalid scan results.
-
-
Step 3: Resolve Migration Issues
This section shows the list of issues that must be resolved before you can start the actual conversion of the database content. Click the plus icon to the left of the text "Unresolved convertibility issues found" to see the issues. The issues are classified as warnings and blocking issues.
Warnings, marked with a yellow color, describe features and configuration details of the database that might cause problems during conversion in some situations and not cause any problems in other situations. The DMU is not able to automatically analyze the probability of these problems to arise. Some warnings describe configuration details of the database that might decrease the performance of the database conversion, but which the DMU cannot change itself because of the possible side effects. You might see the following warnings:
-
User-defined OLAP analytical workspaces are present in the database.
-
The database is a standby database.
-
There are external tables with convertible data in the database.
-
There are convertible primary key-based object identifiers (OIDs) in the database.
-
The current setting rules out the CTAS conversion method for tables with row movement disabled.
-
Turning off the
FORCE
LOGGING
mode of the database or tablespaces might improve conversion performance, though at the expense of the ability to perform media recovery. -
The conversion of some partitioned tables might fail due to error ORA-14402, because the tables do not have the row movement option enabled.
-
Some tables have been excluded from conversion.
-
Convertibility issues in some of the columns will be ignored and the columns will be converted despite the resulting data.
You can choose to ignore some of the warnings and start the conversion, but ensure that you understand all the implications.
Blocking issues, marked with red color, are issues that prevent the DMU to enter the conversion phase. They are known to cause problems in the database, if ignored. You must resolve all blocking issues before you can proceed with converting the database. After all issues are resolved, the status area in this section shows the message "No unresolved convertibility issue found" and proceeding with the conversion process is considered safe.
-
-
Step 4: Convert the Database
This section provides the information about the status of the database conversion process.
To view the status of a migration:
- Click the Migration Status tab in the Client pane.
- If you have modified the database outside of the DMU to resolve some of the convertibility issues, click Retest to let the DMU recheck the database and update the status information.
- To reopen the tab at any time, select Migration Status Panel from the Migration menu.
2.5 Introduction to the DMU User Interface
The default layout of the main window of the DMU user interface contains a menu bar and a toolbar at the top, and a status bar at the bottom. The remaining area of the window is divided into panes.
The default layout is shown in Figure 2-7.
The Navigator pane in the top left-hand part of the window shows a tree of database objects on which the DMU operates. The first level of nodes in the tree represents databases to which the DMU can connect. Only one of the database nodes can be associated with an open connection. For further information about connections, see "Creating a Database Connection".
A database node with a currently open connection can be expanded to a subtree of schemas, tables, and columns contained in the database that are relevant to the migration process. Relevant columns are columns that contain character data and might, therefore, require character set conversion when migrated to the Unicode character set. Due to differences in the way the DMU processes them, predefined Oracle schemas comprising the data dictionary are grouped together and displayed separately under the Oracle-Maintained Schemas group node. Remaining schemas are displayed under the Application Schemas group node. Similarly, tables in a schema are grouped into materialized views, under the Materialized Views group node, and other tables, under the Tables group node.
A status icon might be displayed to the left of a node name in the Navigator tree. Table 2-1 shows the possible icons and their meanings for particular node types.
The Client pane in the upper right-hand part of the window is an area in which various tabs are opened in the process of migration. These tabs display object properties, scanning results, cleansing tools, and progress status of tasks. All the tabs are described in the remaining sections of this chapter. When you start the DMU, before a database connection is made, the client pane contains only the Start page, which is a collection of links to various sources of information about the DMU.
The Properties pane in the bottom left-hand part of the window displays selected properties of the most recently clicked node in the Navigator pane or in the Database Scan Report (see "Scanning the Database"). The properties displayed depend on the type of node.
The Log pane in the lower right-hand part of the window displays error and warning messages reported by the DMU during the migration process.
You can customize the layout of the main DMU window by dragging panes and tabs to new positions, but the customized layout is not preserved across program runs.
Table 2-1 Database Migration Assistant for Unicode Icons
Icon | Column Node | Table, Schema, Database, and Grouping Nodes |
---|---|---|
No icon |
The column has never been scanned. |
No column in the table/schema/database/group has ever been scanned. |
Check mark ![]() |
The column has been successfully scanned and no issues have been found. |
All columns in the table/schema/database/group have been successfully scanned and no issues have been found. |
Red circle with a white X ![]() |
The last attempt to scan the column failed. |
A recent attempt to scan one or more columns of the table/schema/database/group failed. |
Warning triangle in yellow ![]() |
The column has been successfully scanned, but some convertibility issues have been found. |
All columns in the table/schema/database/ group have either been successfully scanned, or they have never been scanned or their scan results have been invalidated; some convertibility issues have been found in the scanned columns. |
Round red circle with diagonal line ![]() |
The column has been previously scanned, but the scan results have been invalidated due to the containing table being altered or a cleansing action being applied. |
One or more columns in the table/schema/database/group have had their scan results invalidated while all other columns in this table/schema/database/group have never been scanned. |
Magnifying glass ![]() |
Not applicable. |
Some columns in the table/schema/database/ group have been successfully scanned, and no issues have been found, but other columns have never been scanned or their scan results have been invalidated. |
Moving magnifying glass ![]() |
The column is being scanned; this status overrides all other statuses. |
One or more columns in the table/schema/ database/group are being scanned; this status overrides all other statuses. |
2.6 Overview of Data Preparation
Data preparation ensures that no data to be migrated causes problems during or after conversion. The elements of data preparation are scanning and cleansing.
2.6.1 Data Preparation: Scanning
In this step, the DMU scans tables for problems, assessing the feasibility of migrating the data to Unicode.
The most common types of problems are with your data. Issues include values expanding during conversion beyond column or data type limits, data in a mislabeled character set, or binary data stored in character data types.
During scanning, the DMU reads specified character columns and performs a test conversion of each column value to the target character set, AL32UTF8 or UTF8. Depending on the result of this conversion, the DMU classifies data as follows:
-
Needs no conversion
The data is fine, because the binary representation of the data does not change in the conversion.
-
Needs conversion
The data must be converted, because the binary representation of the data does change, but no other data issues have been found.
-
Invalid binary representation
The binary representation of the data is invalid under the current database character set. If you do the conversion in this state, the resulting data will usually not make sense to applications and users.
-
Exceeds column limit
The data will not fit into a column after migration.
-
Exceeds data type limit
The data will exceed a data type limit after migration.
Each column value is assigned to only one of the preceding categories. Values that have invalid binary representation are classified only as such even if their lengths exceed column or data type limit after conversion. As conversion of invalid character codes usually yields the default replacement character, which has a three-byte representation in AL32UTF8 and UTF8, the length expansion issues are not rare among values with invalid representation. Because the DMU let you ignore invalid representation issues and force conversion of a column, you should be aware that forcefully converted values with invalid binary representation may be additionally truncated. You can compare the value of the Maximum Post-Conversion Length property of the column with the column and data type length limits to see if the truncation will take place.
This release of the DMU supports Oracle Database 12c extended data type limit where the maximum size limit for VARCHAR2
has been increased to 32767 bytes. The extended type limit guarantees that VARCHAR2
data created in earlier releases up to the original type limit of 4000 bytes can be migrated to Unicode on Database 12.1 with no over data type limit expansion exceptions.
After you have run a scan, the DMU creates a Database Scan Report, which can be found under the Migration drop-down menu. This scan report shows the statistics for the current data under each of the preceding categories. The counts of values are presented for each character data type column, and also summed up at the table, schema, and database levels. The scan report enables you to filter and customize the output, for example, so that only columns with selected potential problems such as exceeding the column limit are displayed.
In addition, you can use the report to iteratively go through the steps of taking the data that is not clean, reviewing this data in the Cleansing Editor, fixing the data, either immediately or at your convenience later, and rescanning to verify that the data is now clean. While scanning the data, you can generate an index check report, which can be found on the Indexes subtab of the Database Scan Report. The index check report lists the indexes defined on character columns whose key lengths will exceed the maximum allowed key size due to column modifications made with the Cleansing Editor or the increased post-conversion data size of the columns. This enables you to fix any problematic indexes before proceeding to the data conversion phase.
You can perform scans on the following objects:
-
All tables in the database
-
All tables in the data dictionary of a database
-
All application schema tables in a database
-
All tables in an application schema
-
A table column in an application schema
-
An arbitrary set of application schema tables and columns in a database
As mentioned in "Overview of Character Set Migration Considerations", certain data issues cannot be discovered automatically. For example, the DMU does not analyze character data in binary data types. Also, some single-byte character sets define almost all byte values as valid codes. For example, the only byte value that is not a valid CL8MSWIN1251 character code is 0x98. Therefore, if the incorrectly stored data happens not to contain any of the undefined byte values, then the test conversion during a scan might not show invalid binary representation problems. Even if the database scan reports no issues, collect and analyze information about use of the database by applications and try to identify these types of hidden problems before attempting the database conversion. Creating a test copy of the database, migrating it to Unicode, and thoroughly testing it with your applications is also a way to discover many problems.
2.6.2 Data Preparation: Cleansing
In this step, the DMU cleans the data in tables based on identified issues.
You can define cleansing actions for immediate or for delayed execution (immediate versus scheduled cleansing mode). You can perform certain types of cleansing in a production environment with no side effects to applications. Other cleansing operations involve metadata changes and require that you adapt applications to these changes. The scheduled mode enables you to define a cleansing action at any time but delay its execution to the conversion phase of the migration process. This is usually the most convenient moment to introduce metadata changes because the database is not used in production and new application versions can be easily deployed at the same time.
Immediate changes to metadata are defined in the Modify Column and Modify Attribute dialog boxes. Scheduled changes are defined in the Schedule Column Modification and Schedule Attribute Modification dialog boxes. All these dialog boxes can be invoked from the context menu of a column or attribute in the Cleansing Editor. Immediate editing changes of user data are performed directly in the Cleansing Editor.
In the immediate mode, any change entered is performed immediately after an appropriate Save button is selected. In this case, all SQL statements are issued to the database and the transaction is committed.
In the scheduled mode, clicking Save puts the cleansing action into the DMU repository. The corresponding statements are executed during the conversion step. You can change the scheduled action or remove it, provided the conversion step has not yet been started, by reopening the Schedule Column Modification or Schedule Attribute Modification dialog box and selecting No Modification.
You apply the editing changes performed in the Cleansing Editor on table data to the database by clicking Save on the Cleansing Editor tab. You can revert the changes that have not yet been saved by clicking the Revert button.
The Cleansing Editor can also be used to set the assumed character set of a column. This property tells the DMU to interpret the contents of the column in a character set different from the database character set. The selected character set is applied to the test conversion during scanning, actual conversion occurs in the conversion step, and the set is used to interpret data for display in the Cleansing Editor. If column data in the Cleansing Editor is not legible, the character set of the column might be selected incorrectly.
After cleansing, rescan the database to verify that your changes have successfully handled the potential problems. This is an iterative process where you scan and cleanse until there are no more conversion issues before you proceed to the conversion phase.
See Also:
"Using the DMU to Cleanse Data" for details
2.7 Overview of Data Conversion
The data conversion phase is where the actual modification of the database contents occurs. After the conversion is complete, the database character set will be Unicode.
2.7.1 Data Conversion: Preparing
When you select Convert Database in the Migration menu, the DMU first executes a conversion feasibility test.
This test checks that:
-
All database convertibility requirements listed in "Overview of Database Requirements" are met
-
All data in the database has been scanned and has valid scan results
Cleansing actions might invalidate scan results, so cleansed tables might need to be rescanned.
-
No data in the database has binary representation or length issues
If the test succeeds, the DMU presents a plan for conversion. This plan shows all SQL statements that will be executed to convert the database, including statements to handle auxiliary objects, such as indexes, constraints, and triggers. You can customize the plan as desired. There are various database-level and table-level options that can be used to influence the way DMU converts the database. For example, you can set the conversion method for a given table or the number of processes participating in the conversion step. The conversion plan is generated and displayed in the Conversion Progress tab. You can then modify the available options by clicking Edit Table Conversion Plan or Edit Database Conversion Parameters.
2.7.2 Data Conversion: Converting
This step converts the data into Unicode.
You accept the conversion plan and initiate the conversion by clicking the Convert button on the Conversion Progress tab. The DMU repeats the conversion feasibility test to ensure that no issues have been introduced while you worked on the conversion plan. It verifies also that no other sessions are connected to the database and that the database is mounted in exclusive mode. Then, the DMU begins to migrate the data, executing the statements from the conversion plan.
You can monitor the conversion progress on the Conversion Progress tab.
As an overview, the process the DMU uses to convert data is:
-
Put the database into restricted mode.
-
Disable various job queue processes.
-
Drop or disable selected indexes.
-
Disable selected triggers and constraints.
-
For tables assigned the "Convert using Data Pump" conversion method, export the tables to a database directory using Oracle Data Pump.
-
Convert the data in user tables (except blockchain and immutable tables) and in selected data dictionary tables to Unicode.
-
Convert
CLOB
columns in the data dictionary. -
Issue the
ALTER
DATABASE
CHARACTER
SET
statement, and convert the data in blockchain and immutable tables to Unicode. -
For tables assigned the "Convert using Data Pump" conversion method, import the data into the tables using Oracle Data Pump.
-
Enable triggers and constraints; and re-create indexes and constraints.
-
Restore the database instance parameters.
The conversion of a table is performed by updating its columns with an UPDATE
statement, by converting the columns while re-creating the table using the CREATE TABLE AS SELECT
statement, or by exporting and importing the table with Oracle Data Pump. The re-creation of a table is faster than an update if most of the table rows must be converted. Oracle Data Pump export and import may be faster than the other conversion methods if the table is large and contains CLOB
data.
After the conversion has finished, the DMU will re-create or re-enable any objects that were dropped or disabled earlier. You can check that no errors have been generated by looking at the Log pane.
Note:
After the database is successfully converted to Unicode, you must restart the database instance to synchronize any in-memory character set metadata information with the new Unicode database character set.See Also:
"Converting the Database" for more information about the conversion GUI