Machine Learning (ML) Models

This section contains instructions for creating and using Machine Learning (ML) Models in data flows.

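Topics:

  • Create an ML Model Data Entity in the Data Flow editor
  • ML Model Data Entity Properties
  • Use ML Model in a Data Flow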

Create an ML Model Data Entity in the Data Flow editor

To use ML models in Data Transforms, you need to create two data flows. First, build the ML model data entity using the Data Flow editor. Then use that data entity in a second data flow to mine data from a source connection and load the results into a target server.

To build an ML Model data entity in the Data Flow editor:

  1. Drag the data entity on which you want to build the ML Model onto the Design Canvas.
  2. Select the component and click the Add Data Entity icon in the top right corner of the target component.
  3. The Add Data Entity page appears, where you can configure the following details of the target component:

    General tab

    • In the Name text box, enter the name of the newly created Data Entity.
    • From the Entity Type drop-down, select ML Model as the data entity type.
      When you select this entity type the user interface changes as follows:
      • The Connection drop-down lists only the Oracle connections that you have created.
      • The Add Data Entity wizard displays the Properties tab where you can select the Type of Learning, Function, Algorithm, and configure parameters to define the ML Model. See ML Model Data Entity Properties for more information.
    • From the Connection Type drop-down, select the connection type for the newly created Data Entity. For ML Model data entities, Oracle is the only option listed in the Connection Type drop-down.
    • The Connection drop-down is populated with the connections you have created with the associated connection type. From the Connection drop-down, select the server name where you wish to keep the ML model data entity.
    • In the Schema drop-down, all schemas corresponding to the selected connection are listed in two groups:
      • New Database Schema (schemas that you have not imported from before)
      • Existing Database Schema (schemas that you have imported from before and whose data entities you may be replacing)
      From the Schema drop-down, select the required schema.
    • In the Tags text box, enter a tag of your choice. You can use tags to filter the Data Entities displayed in the Data Entities page.
    • If you want to mark this data entity as a feature group, expand Advanced Options and select the Treat as Feature Group checkbox.
    • Click Next.

    Properties tab

    • Select the Type of Learning, Function, and Algorithm you want to use to build this data entity. For more information about the options, see ML Model Data Entity Properties.
    • Based on the options selected, the Parameters section is populated with the list of parameters whose Importance is marked as High. You can add other required parameters using the Add Parameters icon.

      You must specify a value for each parameter so that the data flow can run successfully.

    Columns tab

    • Click the Add Columns icon to add new columns to the newly created Data Entity.

      A new column is added to the displayed table.

    • The table displays the following columns:
      • Name
      • Data Type - Click the cell to configure the required Data Type.
      • Scale
      • Length
      • Actions - Click the cross icon to delete the created column.
    • To delete columns in bulk, select the columns and click the Delete icon.
    • To search for a column, enter the column name in the Search text box and press Enter. The details of the matching column are displayed.
    • Click Next.

    Preview Data Entity tab

    This tab displays a preview of all the created columns and their configured details. If the data entity belongs to an Oracle database, you can also view statistics of the table. See View Statistics of Data Entities for more information.

  4. Click Save to save the configuration and exit the wizard.
  5. Save and execute the data flow.

    The new Data Entity is created and displayed in the Data Entities page.
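
The values collected by the wizard can be pictured as a small record with a few consistency rules. The following is a minimal Python sketch and is not part of Data Transforms. The class `MLModelEntity`, its field names, and the sample values (including the placeholder parameter name) are hypothetical. It only mirrors the rules stated above: an ML Model data entity uses the Oracle connection type, and every listed parameter needs a value before the data flow can run.

```python
from dataclasses import dataclass, field


@dataclass
class MLModelEntity:
    """Hypothetical record of values entered in the Add Data Entity wizard."""
    name: str
    connection_type: str
    schema: str
    parameters: dict = field(default_factory=dict)  # parameter name -> value

    def problems(self) -> list:
        """Return readable issues; an empty list means nothing is missing."""
        issues = []
        if not self.name.strip():
            issues.append("Name is required")
        if self.connection_type != "Oracle":
            issues.append("ML Model data entities require an Oracle connection type")
        if not self.schema.strip():
            issues.append("Schema is required")
        for key, value in self.parameters.items():
            # A parameter counts as unset when it is None or blank text.
            if value is None or str(value).strip() == "":
                issues.append(f"Parameter '{key}' has no value")
        return issues


entity = MLModelEntity(
    name="CHURN_MODEL",            # illustrative name only
    connection_type="Oracle",
    schema="ML_DEMO",              # illustrative schema only
    parameters={"EXAMPLE_PARAMETER": None},  # placeholder, deliberately unset
)
print(entity.problems())
```

Running a check of this kind before saving would flag the unset parameter, which is the situation the Properties tab warns about.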

ML Model Data Entity Properties

The Properties tab of the Add Data Entity wizard provides data mining options that you can use to define the ML Model data entity.

This topic assumes prior knowledge of Oracle Machine Learning concepts such as data mining functions and algorithms. For more information, see Oracle Machine Learning for SQL API Guide.

Type of Learning   Function               Algorithm
Supervised         Classification         Decision Tree
                                          Explicit Semantic Analysis
                                          Generalized Linear Models
                                          Naive Bayes
                                          Random Forest
                                          Neural Network
                                          Support Vector Machines
                   Regression             Generalized Linear Models
                                          Neural Network
                                          Support Vector Machines
                   Time Series            Exponential Smoothing
                   Attribute Importance   Minimum Description Length
Unsupervised       Association            Apriori
                   Attribute Importance   CUR matrix decomposition
                   Anomaly Detection      One Class Support Vector Machines
                   Clustering             Expectation Maximization
                                          k-Means
                                          Orthogonal Partitioning Clustering
                   Feature Extraction     Explicit Semantic Analysis
                                          Non-Negative Matrix Factorization
                                          Singular Value Decomposition
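
The options above form a three-level hierarchy: each Type of Learning offers a set of Functions, and each Function offers a set of Algorithms. As an illustration only, the hierarchy can be transcribed into a nested Python dictionary and used to check whether a combination is listed. The names `ML_MODEL_OPTIONS`, `algorithms_for`, and `is_listed_choice` are hypothetical and do not correspond to any Data Transforms API.

```python
# Type of Learning -> Function -> Algorithms, transcribed from the table above.
ML_MODEL_OPTIONS = {
    "Supervised": {
        "Classification": [
            "Decision Tree",
            "Explicit Semantic Analysis",
            "Generalized Linear Models",
            "Naive Bayes",
            "Random Forest",
            "Neural Network",
            "Support Vector Machines",
        ],
        "Regression": [
            "Generalized Linear Models",
            "Neural Network",
            "Support Vector Machines",
        ],
        "Time Series": ["Exponential Smoothing"],
        "Attribute Importance": ["Minimum Description Length"],
    },
    "Unsupervised": {
        "Association": ["Apriori"],
        "Attribute Importance": ["CUR matrix decomposition"],
        "Anomaly Detection": ["One Class Support Vector Machines"],
        "Clustering": [
            "Expectation Maximization",
            "k-Means",
            "Orthogonal Partitioning Clustering",
        ],
        "Feature Extraction": [
            "Explicit Semantic Analysis",
            "Non-Negative Matrix Factorization",
            "Singular Value Decomposition",
        ],
    },
}


def algorithms_for(learning_type, function):
    """Return the algorithms listed for a learning type and function, or []."""
    return ML_MODEL_OPTIONS.get(learning_type, {}).get(function, [])


def is_listed_choice(learning_type, function, algorithm):
    """True if the combination appears in the table."""
    return algorithm in algorithms_for(learning_type, function)


print(algorithms_for("Supervised", "Regression"))
print(is_listed_choice("Unsupervised", "Clustering", "k-Means"))
```

Note that Attribute Importance appears under both learning types with a different algorithm in each, so the learning type has to be part of the lookup.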

Use ML Model in a Data Flow

Data Transforms supports the use of ML Models in a data flow. You can use the Prediction Model database function to run ML Model algorithms on source data and load the output to a target database.

Before you use an ML Model in a data flow, you need to build the ML Model. For instructions on how to create an ML model, see Create an ML Model Data Entity in the Data Flow editor.

To use an ML Model in a data flow:

  1. Follow the instructions in Create a Data Flow to create a new data flow.
  2. In the Data Flow Editor, drag the tables that you want to use as a source in the data flow and drop them on the design canvas.
  3. From the Database Functions toolbar, click Machine Learning, then drag the Prediction Model transformation component and drop it on the design canvas.
  4. Click the Prediction Model transformation component to view its properties.
  5. In the General tab, specify the following:
    • Connection - The drop-down lists all the available Oracle connections. Select the Oracle connection that you want to use.
    • Schema - Select the schema.
    • ML Model - The drop-down lists all the available ML models. See Create an ML Model Data Entity in the Data Flow editor for instructions on how to build an ML Model.
  6. In the Column Mapping tab, map the source column that you want to embed to the INPUT attribute of the operator. The only column available in the column mappings is prediction parameters. Drag a text column from the available columns to the Expression column.
  7. Drag the table that you want to use as a target in the data flow and drop it on the design canvas.
  8. Save and execute the data flow.

    Data Transforms will run the prediction model on the source data and write the output to the target table.
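
For readers who think in SQL: scoring rows with an Oracle Machine Learning model conceptually resembles applying the Oracle SQL `PREDICTION` function to each source row and inserting the result into the target table. The Python sketch below only assembles such a statement as text. It is an assumption made for illustration and is not the code that Data Transforms generates. The function `build_scoring_sql` and every schema, model, table, and column name are hypothetical.

```python
import re

# Conservative check for simple, unquoted identifiers (letters, digits, _ $ #).
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def _checked(name):
    """Reject anything that is not a simple identifier to avoid SQL injection."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"Not a simple identifier: {name!r}")
    return name


def build_scoring_sql(schema, model, source, target, key_column,
                      prediction_column="PREDICTED_VALUE"):
    """Build an INSERT ... SELECT that scores every source row with the model."""
    schema, model, source, target, key_column, prediction_column = (
        _checked(n) for n in
        (schema, model, source, target, key_column, prediction_column)
    )
    return (
        f"INSERT INTO {target} ({key_column}, {prediction_column})\n"
        f"SELECT {key_column}, PREDICTION({schema}.{model} USING *)\n"
        f"FROM {source}"
    )


sql = build_scoring_sql(
    schema="ML_DEMO",          # illustrative names only
    model="CHURN_MODEL",
    source="CUSTOMERS",
    target="CUSTOMER_SCORES",
    key_column="CUST_ID",
)
print(sql)
```

The `USING *` clause passes all columns of the source row to the model as predictors. In the data flow, the Column Mapping tab of the Prediction Model component is where the inputs are chosen.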