3.1 Regression Use Case
The Brooklyn housing dataset contains the sale prices of homes in the borough of Brooklyn, along with factors that influence those prices, such as the area of the house, its location, and the type of dwelling. You are tasked with analyzing years of historical home sales data to estimate sale prices, which will help optimize real estate operations. In this case study, you will learn how to predict sale prices using regression and the Generalized Linear Model (GLM) algorithm.
Related Contents
Topic | Link |
---|---|
OML4Py GitHub Example | OML4Py Regression GLM |
About Generalized Linear Model | About Generalized Linear Model |
About Machine Learning Classes and Algorithms | About Machine Learning Classes and Algorithms |
Shared Settings | Shared Settings |
- Data Set
  Download the data set from Brooklyn housing dataset.
- Database
  Select or create a database from the following options:
  - Get your FREE cloud account. Go to https://cloud.oracle.com/database and select Oracle Database Cloud Service (DBCS) or Oracle Autonomous Database. Create an account and create an instance. See Autonomous Database Quick Start Workshop.
  - Download the latest version of Oracle Database (on premises).
- Machine Learning Tools
  Depending on your database selection:
  - Use OML Notebooks for Oracle Autonomous Database.
  - Install and use Oracle SQL Developer connected to an on-premises database or DBCS. See Installing and Getting Started with SQL Developer.
- Other Requirements
  Data Mining Privileges (these are set automatically for ADW). See System Privileges for Oracle Machine Learning for SQL.
- Load Data
  Load the data in your database and examine the data set and its attributes.
- Explore Data
  Explore the data to understand and assess its quality. At this stage, identify data types and noise in the data. Look for missing values and numeric outliers.
- Build Model
  Build your model using the training data set. Use the oml.glm function to build your model and specify model settings.
- Evaluate
  Before you make predictions using your model on new data, first evaluate model accuracy. You can evaluate the model using different methods.
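
The Build Model and Evaluate steps above can be sketched roughly as follows. This is a minimal illustration, not the case study's actual code: the names GLM_SETTINGS, build_glm, rmse, and r_squared are hypothetical, the solver setting is an assumed starting point rather than a recommendation for this data set, and build_glm assumes an active OML4Py session with train_x and train_y already available as OML proxy objects. The two metric helpers are plain Python and show only two of the possible ways to measure regression accuracy.

```python
import math

# Assumed model setting for illustration; choose settings that suit your data.
GLM_SETTINGS = {"GLMS_SOLVER": "dbms_data_mining.GLMS_SOLVER_QR"}


def build_glm(train_x, train_y, settings=None):
    """Fit a regression GLM on OML proxy objects.

    Assumes an active OML4Py session (for example, inside OML Notebooks).
    """
    import oml  # imported lazily: only available in an OML4Py environment

    model = oml.glm("regression", **(settings or GLM_SETTINGS))
    return model.fit(train_x, train_y)


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy numbers for illustration only; they are not taken from the Brooklyn data.
toy_actual = [100.0, 200.0, 300.0]
toy_predicted = [110.0, 190.0, 300.0]
print(rmse(toy_actual, toy_predicted), r_squared(toy_actual, toy_predicted))
```

A lower RMSE and an R-squared closer to 1 both indicate a better fit. Comparing them on held-out test data, rather than on the training data, gives a more honest estimate of how the model will behave on new sales.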