4.5.5 Visualize Data in a Scatter Plot
A scatter plot represents the relationship between two numeric variables in a data set. It plots the data points on a two-dimensional plane and shows how one variable changes with the other. The independent variable is plotted on the X-axis, and the dependent variable is plotted on the Y-axis. You can display points by one or more grouping variables, so that each group has a distinct color and shape.
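Outside the notebook's built-in chart settings, the same grouping idea can be sketched in plain Python. The following is a minimal illustration, not the notebook's own implementation: the function names, the color and marker lists, and the use of matplotlib for drawing are all assumptions made for this example. Rows are assumed to be (x, y, group) tuples, with the independent variable as x and the dependent variable as y.

```python
from collections import defaultdict
from itertools import cycle

# Illustrative style lists (assumption); any distinct colors and shapes would do.
COLORS = ["tab:blue", "tab:orange", "tab:green", "tab:red"]
MARKERS = ["o", "s", "^", "D"]


def group_points(rows):
    """Split (x, y, group) rows into per-group lists of x and y values."""
    grouped = defaultdict(lambda: ([], []))
    for x, y, group in rows:
        xs, ys = grouped[group]
        xs.append(x)
        ys.append(y)
    return dict(grouped)


def assign_styles(groups):
    """Map each group to a (color, marker) pair.

    Styles repeat once there are more groups than entries in the lists above.
    """
    styles = zip(cycle(COLORS), cycle(MARKERS))
    return {group: style for group, style in zip(sorted(groups), styles)}


def plot_grouped_scatter(rows, x_label, y_label, path):
    """Draw one scatter series per group and save the figure to `path`.

    Requires matplotlib, which is imported here so the helpers above
    remain usable without it.
    """
    import matplotlib
    matplotlib.use("Agg")  # render off-screen; no display needed
    import matplotlib.pyplot as plt

    grouped = group_points(rows)
    styles = assign_styles(grouped)
    fig, ax = plt.subplots()
    for group, (xs, ys) in grouped.items():
        color, marker = styles[group]
        ax.scatter(xs, ys, color=color, marker=marker, label=str(group))
    ax.set_xlabel(x_label)  # independent variable
    ax.set_ylabel(y_label)  # dependent variable
    ax.legend(title="Group")
    fig.savefig(path)
    plt.close(fig)
```

Keeping the grouping and style assignment separate from the drawing step means the same helpers could feed any plotting library; only `plot_grouped_scatter` depends on matplotlib.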
When to use this chart: Use a scatter plot when you have paired numerical data and want to determine the relationship between the two variables. Typical uses include identifying correlations and trends (linear and non-linear relationships), detecting outliers, understanding the data distribution, and identifying groupings or clusters in the data. Scatter plots are also useful for comparing multiple datasets, where each dataset's values are represented as a different group, and for evaluating regression models, for example by plotting actual versus predicted values.
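A numeric check can complement what the chart shows visually. The sketch below computes the Pearson correlation coefficient for paired data, applied here to actual versus predicted values. It measures linear association only, the function name `pearson_r` is chosen for this example, and the sample values are hypothetical rather than taken from any dataset.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient for paired numeric data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, for illustration only.
actual = [10.0, 12.0, 15.0, 18.0, 20.0]
predicted = [11.0, 12.5, 14.0, 18.5, 19.0]

r = pearson_r(actual, predicted)
print(f"Pearson r between actual and predicted: {r:.3f}")
```

A value near 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. A non-linear pattern can still be obvious in the scatter plot even when the coefficient is small, which is one reason to look at the chart as well as the number.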
Dataset: CUSTOMER_INSURANCE_LTV. This example uses the example template notebook OML-Run-me-first.
To visualize data in a scatter plot:
This completes the task of visualizing your data in a scatter plot. The scatter
plot shows a strong correlation between Income and Mortgage amount in the income range
50k to 80k.