Classification Use Case

3.2 Classification Use Case

A retail store has information about its customers' behavior and the purchases they make. Now with the available data, they would like you to analyze and identify the type of customers they should target which would result in an increase in the volume of the most profitable product sold, and an increase in profit. In this use case, you will demonstrate how to identify such customers using the Random Forest algorithm.

Related Contents

Topic	Link
OML4Py GitHub Example	Classification Random Forest
About Random Forest	About Random Forest
Random Forest	Random Forest Algorithm
Shared Settings	Shared Settings

Before you start your OML4Py use case journey, ensure that you have the following:

Data Set
The data set used for this use case is from the SH schema. The SH schema can be readily accessed in Oracle Autonomous Database. For on-premises databases, the schema is installed during the installation or can be manually installed by downloading the scripts. See Installing the Sample Schemas.
Database
Select or create database out of the following options:
- Get your FREE cloud account. Go to https://cloud.oracle.com/database and select Oracle Database Cloud Service (DBCS), or Oracle Autonomous Database. Create an account and create an instance. See Autonomous Database Quick Start Workshop.
- Download the latest version of Oracle Database (on premises).
Machine Learning Tools
Depending on your database selection,
- Use OML Notebooks for Oracle Autonomous Database.
- Install and use Oracle SQL Developer connected to an on-premises database or DBCS. See Installing and Getting Started with SQL Developer.
Other Requirements
Data Mining Privileges (this is automatically set for ADW). See System Privileges for Oracle Machine Learning for SQL.

Load Data
Access the data set from the SH Schema and explore the data to understand the attributes.
Explore Data
Explore the data to understand and assess the quality of the data. At this stage assess the data to identify data types and noise in the data. Look for missing values and numeric outlier values.
Build Model
Build your model using the training data set. Use the oml.rf function to build your model and specify the model settings.
Evaluate
Before you make predictions using your model on new data, you should first evaluate model accuracy. You can evaluate the model using different methods.

Related Topics

Parent topic: Use Cases