11 Netezza
This chapter includes the following sections:
11.1 Introduction
Oracle Data Integrator (ODI) seamlessly integrates data in a Netezza database. Oracle Data Integrator features are designed to work best with Netezza, including reverse-engineering, data integrity check, and mappings.
11.1.1 Concepts
The Netezza database concepts map the Oracle Data Integrator concepts as follows: A Netezza cluster corresponds to a data server in Oracle Data Integrator. Within this server, a database/owner pair maps to an Oracle Data Integrator physical schema.
Oracle Data Integrator uses Java Database Connectivity (JDBC) to connect to a Netezza database.
11.1.2 Knowledge Modules
Oracle Data Integrator provides the Knowledge Modules (KM) listed in Table 11-1 for handling Netezza data. These KMs use Netezza specific features. It is also possible to use the generic SQL KMs with the Netezza database. See Generic SQL for more information.
Table 11-1 Netezza KMs
Knowledge Module | Description |
---|---|
CKM Netezza |
Checks data integrity against constraints defined on a Netezza table. Rejects invalid records in the error table created dynamically. Can be used for static controls as well as flow controls. |
IKM Netezza Control Append |
Integrates data in a Netezza target table in replace/append mode. When flow data needs to be checked using a CKM, this IKM creates a temporary staging table before invoking the CKM. |
IKM Netezza Incremental Update |
Integrates data in a Netezza target table in incremental update mode. |
IKM Netezza To File (EXTERNAL TABLE) |
Integrates data in a target file from a Netezza staging area. It uses the native EXTERNAL TABLE feature of Netezza. |
LKM File to Netezza (EXTERNAL TABLE) |
Loads data from a File to a Netezza Server staging area using the EXTERNAL TABLE feature (dataobject). |
LKM File to Netezza (NZLOAD) |
Loads data from a File to a Netezza Server staging area using NZLOAD. |
RKM Netezza |
Retrieves JDBC metadata from a Netezza database. This RKM may be used to specify your own strategy to convert Netezza JDBC metadata into Oracle Data Integrator metadata. Consider using this RKM if you encounter problems with the standard JDBC reverse-engineering process due to some specificities of the Netezza JDBC driver. |
11.2 Installation and Configuration
Make sure you have read the information in this section before you start using the Netezza Knowledge Modules:
11.2.1 System Requirements and Certifications
Before performing any installation you should read the system requirements and certification documentation to ensure that your environment meets the minimum installation requirements for the products you are installing.
The list of supported platforms and versions is available on Oracle Technical Network (OTN):
http://www.oracle.com/technetwork/middleware/data-integrator/documentation/index.html.
11.2.2 Technology Specific Requirements
Some of the Knowledge Modules for Netezza use the NZLOAD utility.
The following requirements and restrictions apply for these Knowledge Modules:
-
The source file must be accessible by the ODI agent executing the mapping.
-
The run-time agent machine must have Netezza Performance Server client installed. And the NZLOAD install directory needs to be in the PATH variable when the agent is started.
-
All mappings need to be on the staging area.
-
All source fields need to be mapped, and must be in the same order as the target table in Netezza.
-
Date, Time, Timestamp and Numeric formats should be specified in consistent with Netezza Data Type definition.
For KMs using the EXTERNAL TABLE feature: Make sure that the file is accessible by the Netezza Server.
To know more about the supported data types, refer to the Netezza documentation.
11.3 Setting up the Topology
Setting up the Topology consists of:
11.3.1 Creating a Netezza Data Server
A Netezza data server corresponds to a Netezza cluster connected with a specific Netezza user account. This user will have access to several databases in this cluster, corresponding to the physical schemas in Oracle Data Integrator created under the data server.
11.3.1.1 Creation of the Data Server
Create a data server for the Netezza technology using the standard procedure, as described in Creating a Data Server of Administering Oracle Data Integrator. This section details only the fields required or specific for defining a Netezza data server:
Note:
Note that Oracle Data Integrator will have write access only on the database specified in the URL.
11.3.2 Creating a Netezza Physical Schema
Create a Netezza physical schema using the standard procedure, as described in Creating a Physical Schema in Administering Oracle Data Integrator.
Note:
When performing this configuration, the work and data databases names must match. Note also that the dollar sign ($) is an invalid character for names in Netezza. Remove the dollar sign ($) from work table and journalizing elements prefixes.
Create for this physical schema a logical schema using the standard procedure, as described in Creating a Logical Schema in Administering Oracle Data Integrator and associate it in a given context.
11.4 Setting Up an Integration Project
Setting up a project using the Netezza database follows the standard procedure. See Creating an Integration Project of Developing Integration Projects with Oracle Data Integrator.
It is recommended to import the following knowledge modules into your project for getting started with Netezza:
-
CKM NetezzaIKM Netezza Control AppendIKM Netezza Incremental UpdateIKM Netezza To File (EXTERNAL TABLE)LKM File to Netezza (EXTERNAL TABLE)LKM File to Netezza (NZLOAD)RKM Netezza
11.5 Creating and Reverse-Engineering a Netezza Model
This section contains the following topics:
11.5.1 Create a Netezza Model
Create a Netezza Model using the standard procedure, as described in Creating a Model of Developing Integration Projects with Oracle Data Integrator.
11.5.2 Reverse-engineer a Netezza Model
Netezza supports both Standard reverse-engineering - which uses only the abilities of the JDBC driver - and Customized reverse-engineering.
In most of the cases, consider using the standard JDBC reverse engineering for starting.
Consider switching to customized reverse-engineering if you encounter problems with the standard JDBC reverse-engineering process due to some specificities of the Netezza JDBC driver.
Standard Reverse-Engineering
To perform a Standard Reverse-Engineering on Netezza use the usual procedure, as described in Reverse-engineering a Model of Developing Integration Projects with Oracle Data Integrator.
Customized Reverse-Engineering
To perform a Customized Reverse-Engineering on Netezza with a RKM, use the usual procedure, as described in Reverse-engineering a Model of Developing Integration Projects with Oracle Data Integrator:
- In the Reverse Engineer tab of the Netezza Model, select the
KM: RKM Netezza.<project name>
.
The reverse-engineering process returns tables, views, attributes, Keys and Foreign Keys.
11.6 Setting up Data Quality
Oracle Data Integrator provides the CKM Netezza for checking data integrity against constraints defined on a Netezza table. See Flow Control and Static Control in Developing Integration Projects with Oracle Data Integrator for details.
11.7 Designing a Mapping
You can use Netezza as a source, staging area, or a target of a mapping.
The KM choice for a mapping or a check determines the abilities and performance of this mapping or check. The recommendations in this section help in the selection of the KM for different situations concerning a Netezza data server.
11.7.1 Loading Data from and to Netezza
Netezza can be used as a source, target or staging area of a mapping. The LKM choice in the Loading Knowledge Module tab to load data between Netezza and another type of data server is essential for the performance of a mapping.
11.7.1.1 Loading Data from Netezza
Use the Generic SQL KMs or the KMs specific to the other technology involved to load data from a Netezza database to a target or staging area database.
For extracting data from a Netezza staging area to a file, use the IKM Netezza to File (EXTERNAL TABLE). See Integrating Data in Netezza for more information.
11.7.1.2 Loading Data to Netezza
Oracle Data Integrator provides Knowledge Modules that implement optimized methods for loading data from a source or staging area into a Netezza database. These optimized Netezza KMs are listed in Table 11-2. In addition to these KMs, you can also use the Generic SQL KMs or the KMs specific to the other technology involved.
Table 11-2 KMs for loading data to Netezza
Source or Staging Area Technology | KM | Notes |
---|---|---|
File |
LKM File to Netezza (EXTERNAL TABLE) |
Loads data from a File to a Netezza staging area database using the Netezza External table feature. |
File |
LKM File to Netezza (NZLOAD) |
Loads data from a File to a Netezza staging area database using the NZLOAD bulk loader. |
11.7.2 Integrating Data in Netezza
Oracle Data Integrator provides Knowledge Modules that implement optimized data integration strategies for Netezza. These optimized Netezza KMs are listed in Table 11-3. In addition to these KMs, you can also use the Generic SQL KMs.
The IKM choice in the Integration Knowledge Module tab determines the performances and possibilities for integrating.
Table 11-3 KMs for integrating data to Netezza
KM | Notes |
---|---|
IKM Netezza Control Append |
Integrates data in a Netezza target table in replace/append mode. |
IKM Netezza Incremental Update |
Integrates data in a Netezza target table in incremental update mode. This KM implements a DISTRIBUTE_ON option to define the processing distribution. It is important that the chosen column has a high cardinality (many distinct values) to ensure evenly spread data to allow maximum processing performance. Please follow Netezza's recommendations on choosing a such a column.Valid options are:
If no value is set (empty), no index will be created. This KM also uses an ANALYZE_TARGET option to generate statistics on the target after integration. |
IKM Netezza to File (EXTERNAL TABLE) |
Integrates data from a Netezza staging area to a file using external tables. This KM implements an optional BASE_TABLE option to specify the name of a table that will be used as a template for the external table. |