MySQL HeatWave User Guide
After running the ML_TRAIN routine, use
the ML_EXPLAIN routine to train
model explainers for MySQL HeatWave AutoML. By default, the
ML_TRAIN routine trains the
Permutation Importance model explainer.
Explanations help you understand which features have the most influence on a prediction. Feature importance is presented as an attribution value. A positive value indicates that a feature contributed toward the prediction. A negative value can have different interpretations depending on the specific model explainer used for the model. For example, a negative value for the permutation importance explainer means that the feature is not important.
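To make the attribution values more concrete, the following is a minimal sketch of the general permutation importance technique: shuffle one feature column and measure how much the model score drops. It is a generic illustration, not the MySQL HeatWave AutoML implementation. The toy data, the `model` function, and the helper names are all hypothetical.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows that the model labels correctly."""
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(rows)

def permutation_importance(model, rows, labels, n_features, repeats=20, seed=0):
    """Mean drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            # Shuffle only column j and leave every other column untouched.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            shuffled = [row[:j] + (value,) + row[j + 1:]
                        for row, value in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances

# Hypothetical toy data: the label depends only on feature 0.
data_rng = random.Random(42)
rows = [(data_rng.random(), data_rng.random()) for _ in range(200)]
labels = [int(row[0] > 0.5) for row in rows]

def model(row):
    """Hypothetical classifier that looks only at feature 0."""
    return int(row[0] > 0.5)

importances = permutation_importance(model, rows, labels, n_features=2)
print(importances)
```

In this sketch, feature 0 receives a clearly positive value because shuffling it breaks the predictions, while feature 1 receives 0.0 because the model ignores it. On noisy data, shuffling an unimportant feature can improve the score slightly by chance, which produces a small negative value.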
Model explainers are used when you run the
ML_EXPLAIN routine to explain
what the model learned from the training dataset. The model
explainer provides a list of feature importances to show what
features the model considered important based on the entire
training dataset. The
ML_EXPLAIN routine can train
these model explainers:
The Permutation Importance model explainer, specified as
permutation_importance, is the default
model explainer. ML_TRAIN
generates this model explainer when it runs.
The Partial Dependence model explainer, specified as
partial_dependence, shows how changing
the values of one or more columns changes the value that
the model predicts. When you train this model explainer,
you need to specify some additional options. See
ML_EXPLAIN to
learn more.
The SHAP model explainer, specified as
shap, produces feature importance
values based on Shapley values.
The Fast SHAP model explainer, specified as
fast_shap, is a subsampling version of
the SHAP model explainer, which usually has a faster
runtime.
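The Partial Dependence idea described above (changing the values of a column and observing how the prediction changes) can be sketched as follows. This is a generic illustration under stated assumptions, not the MySQL HeatWave AutoML implementation. The linear `model`, the sample `rows`, and the `partial_dependence` helper are hypothetical.

```python
def partial_dependence(model, rows, feature_index, grid):
    """Average model output when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        # Overwrite the chosen feature in every row, keep the others as-is.
        modified = [row[:feature_index] + (value,) + row[feature_index + 1:]
                    for row in rows]
        curve.append(sum(model(row) for row in modified) / len(modified))
    return curve

def model(row):
    """Hypothetical model: prediction = 2 * x0 + x1."""
    return 2.0 * row[0] + row[1]

# Hypothetical dataset with two features per row.
rows = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

curve = partial_dependence(model, rows, feature_index=0, grid=[0.0, 1.0, 2.0])
print(curve)
```

Each point on the curve is the mean prediction with feature 0 fixed to one grid value, so the slope of the curve shows how strongly the prediction responds to that feature.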
The model explanation is stored in the
model_explanation column of the model catalog,
along with the machine learning model. See
The Model
Catalog. If you run
ML_EXPLAIN again for the same
model handle and model explainer, the field is overwritten
with the new result.
You cannot generate model explanations for the following model types:
Forecasting
Recommendation
Anomaly detection
Anomaly detection for logs
Topic modeling
Before running ML_EXPLAIN, you
must train, and then load the model you want to use.
The following example trains a model for the classification machine learning task.
mysql> CALL sys.ML_TRAIN('census_data.census_train', 'revenue', JSON_OBJECT('task', 'classification'), @census_model);
The following example loads the trained model.
mysql> CALL sys.ML_MODEL_LOAD(@census_model, NULL);
For more information about training and loading models, see Train a Model and Load a Model.
After training and loading the model, you can generate model explanations. For option and parameter descriptions, see ML_EXPLAIN.
After training and loading a model, you can retrieve the
default model explanation, which is generated with the
permutation_importance explainer, from the
model catalog. See The
Model Catalog.
mysql> SELECT column FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=model_handle;
The following example retrieves the model_explanation column
from the model catalog for the previously trained model. The
JSON_PRETTY function displays the output
in an easily readable format.
mysql> SELECT JSON_PRETTY(model_explanation) FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=@census_model;
+---------------------------------------------------------------------------------------------------------------------------------+
| JSON_PRETTY(model_explanation) |
+---------------------------------------------------------------------------------------------------------------------------------+
| {
  "permutation_importance": {
    "age": 0.0292,
    "sex": 0.0023,
    "race": 0.0019,
    "fnlwgt": 0.0038,
    "education": 0.0008,
    "workclass": 0.0068,
    "occupation": 0.0223,
    "capital-gain": 0.0479,
    "capital-loss": 0.0117,
    "relationship": 0.0234,
    "education-num": 0.0352,
    "hours-per-week": 0.0148,
    "marital-status": 0.024,
    "native-country": 0.0
  }
} |
+---------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0427 sec)
Replace user1 and
@census_model with your own user name and
session variable.
The explanation displays values of permutation importance for each column.
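Once the explanation has been retrieved, a client application can rank the features by importance. The following sketch assumes the JSON text from the query output above has already been fetched into a Python string. The variable names are hypothetical, and how you fetch the value (for example, through a MySQL connector) is outside the scope of this sketch.

```python
import json

# JSON text copied from the example output above.
explanation_json = """
{
  "permutation_importance": {
    "age": 0.0292, "sex": 0.0023, "race": 0.0019, "fnlwgt": 0.0038,
    "education": 0.0008, "workclass": 0.0068, "occupation": 0.0223,
    "capital-gain": 0.0479, "capital-loss": 0.0117, "relationship": 0.0234,
    "education-num": 0.0352, "hours-per-week": 0.0148,
    "marital-status": 0.024, "native-country": 0.0
  }
}
"""

explanation = json.loads(explanation_json)
scores = explanation["permutation_importance"]

# Sort features from most to least important.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

for feature, score in ranked:
    print(f"{feature:<16}{score:.4f}")
```

For this example output, capital-gain, education-num, and age rank highest, and native-country ranks lowest with a value of 0.0.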
To generate a model explanation, run the
ML_EXPLAIN routine.
mysql> CALL sys.ML_EXPLAIN ('table_name', 'target_column_name', model_handle, [options]);
The following example generates a model explanation on the
trained and loaded model with the shap
model explainer.
mysql> CALL sys.ML_EXPLAIN('census_data.census_train', 'revenue', @census_model, JSON_OBJECT('model_explainer', 'shap'));
Where:
census_data.census_train is the fully
qualified name of the table that contains the training
dataset
(schema_name.table_name).
revenue is the name of the target
column, which contains ground truth values.
@census_model is the session variable
for the trained model.
model_explainer is set to
shap for the SHAP model explainer.
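If you generate ML_EXPLAIN statements from application code, a small helper can validate the explainer name before the statement is sent. The following is a minimal sketch. The helper names are hypothetical, the list of explainer names is taken from this topic, and the sketch deliberately rejects partial_dependence because that explainer needs additional options that are described in ML_EXPLAIN.

```python
import re

# Explainer names listed in this topic.
EXPLAINERS = ("permutation_importance", "partial_dependence", "shap", "fast_shap")

def quote(value):
    """Wrap a simple name in single quotes; reject anything unexpected."""
    if not re.fullmatch(r"[A-Za-z0-9_.\-]+", value):
        raise ValueError(f"unsupported characters in {value!r}")
    return f"'{value}'"

def build_ml_explain_call(table, target, handle_variable, explainer):
    """Return a CALL sys.ML_EXPLAIN statement for a single-option explainer."""
    if explainer not in EXPLAINERS:
        raise ValueError(f"unknown model explainer: {explainer!r}")
    if explainer == "partial_dependence":
        raise ValueError("partial_dependence needs additional options")
    if not re.fullmatch(r"@[A-Za-z0-9_]+", handle_variable):
        raise ValueError("expected a session variable such as @census_model")
    options = f"JSON_OBJECT('model_explainer', {quote(explainer)})"
    return (f"CALL sys.ML_EXPLAIN({quote(table)}, {quote(target)}, "
            f"{handle_variable}, {options});")

statement = build_ml_explain_call(
    "census_data.census_train", "revenue", "@census_model", "shap")
print(statement)
```

The helper only builds the statement text and does not connect to a database. The strict character check is a design choice of this sketch: it avoids having to escape quotes, at the cost of rejecting unusual table or column names.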
After running ML_EXPLAIN, you
can view the model explanation in the model catalog. See
The Model
Catalog. The following example retrieves the model
explanation that the previous command generated. The output
includes an importance value for each column under the
shap key, alongside the
permutation_importance entry.
mysql> SELECT JSON_PRETTY(model_explanation) FROM ML_SCHEMA_user1.MODEL_CATALOG WHERE model_handle=@census_model;
+---------------------------------------------------------------------------------------------------------------------------------+
| JSON_PRETTY(model_explanation) |
+---------------------------------------------------------------------------------------------------------------------------------+
| {
  "shap": {
    "age": 0.0467,
    "sex": 0.033,
    "race": 0.0155,
    "fnlwgt": 0.0185,
    "education": 0.016,
    "workclass": 0.0255,
    "occupation": 0.0001,
    "capital-gain": 0.0217,
    "capital-loss": 0.0001,
    "relationship": 0.0426,
    "education-num": 0.0186,
    "hours-per-week": 0.0148,
    "marital-status": 0.024,
    "native-country": 0.0
  },
  "permutation_importance": {
    "age": -0.0057,
    "sex": 0.0002,
    "race": 0.0001,
    "fnlwgt": 0.0103,
    "education": 0.0108,
    "workclass": 0.0189,
    "occupation": 0.0,
    "capital-gain": 0.0304,
    "capital-loss": 0.0,
    "relationship": 0.0195,
    "education-num": 0.0152,
    "hours-per-week": 0.0235,
    "marital-status": 0.0099,
    "native-country": 0.0
  }
} |
+---------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.0427 sec)
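Because the model explanation can hold results from more than one explainer, it can be useful to compare how each explainer ranks the features. The following sketch uses the values from the example output above. The helper name is hypothetical, and treating a zero value as "not important" for permutation importance is an assumption of this sketch (this topic states that only for negative values).

```python
# Values copied from the example output above.
explanation = {
    "shap": {
        "age": 0.0467, "sex": 0.033, "race": 0.0155, "fnlwgt": 0.0185,
        "education": 0.016, "workclass": 0.0255, "occupation": 0.0001,
        "capital-gain": 0.0217, "capital-loss": 0.0001, "relationship": 0.0426,
        "education-num": 0.0186, "hours-per-week": 0.0148,
        "marital-status": 0.024, "native-country": 0.0,
    },
    "permutation_importance": {
        "age": -0.0057, "sex": 0.0002, "race": 0.0001, "fnlwgt": 0.0103,
        "education": 0.0108, "workclass": 0.0189, "occupation": 0.0,
        "capital-gain": 0.0304, "capital-loss": 0.0, "relationship": 0.0195,
        "education-num": 0.0152, "hours-per-week": 0.0235,
        "marital-status": 0.0099, "native-country": 0.0,
    },
}

def top_features(scores, n=3):
    """Names of the n features with the largest importance values."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]

shap_top = top_features(explanation["shap"])
perm_top = top_features(explanation["permutation_importance"])
shared = sorted(set(shap_top) & set(perm_top))

# Assumption: zero or negative permutation importance means "not important".
not_important = sorted(
    name for name, value in explanation["permutation_importance"].items()
    if value <= 0
)

print("SHAP top 3:       ", shap_top)
print("Permutation top 3:", perm_top)
print("Shared:           ", shared)
print("Not important:    ", not_important)
```

In this example the two explainers agree only on relationship among their top three features, which illustrates that attribution values from different explainers are not directly interchangeable.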
Review ML_EXPLAIN for parameter descriptions and options.
Learn how to Generate Prediction Explanations.
Learn more about The Model Catalog.