Examples of Using ONNX Models
The following examples use the Iris data set to showcase loading and inference from ONNX format machine learning models for machine learning techniques such as Classification, Regression, and Clustering in your Oracle Database instance.
ONNX Classification Examples
The following examples showcase various JSON metadata parameters that can be defined for ONNX models.
Example: Specifying JSON Metadata for Classification Models
The following example illustrates JSON metadata parameters with
Classification as the function. Assume the model has an output named
probabilities
for the probability of the prediction. To use the
PREDICTION_PROBABILITY
scoring function, you must set the field
classificationProbOutput
to the name of the model output that
holds the probability.
BEGIN
DBMS_VECTOR.LOAD_ONNX_MODEL('classification_model.onnx', 'doc_model',
JSON('{"function" : "classification",
"classificationProbOutput": "probabilities"}'));
END;
/
Example: Specifying labels in JSON Metadata for Classification Models
The following example illustrates how you can specify custom labels in the JSON metadata.
BEGIN
DBMS_VECTOR.LOAD_ONNX_MODEL('classification_model.onnx', 'doc_model',
JSON('{"function" : "classification",
"classificationProbOutput": "probabilities",
"labels": ["Setosa", "Versicolour", "Virginica"]}'));
END;
/
You can use the PREDICTION
and
PREDICTION_PROBABILITY
functions for inference or scoring:
SELECT
iris.*,
PREDICTION(doc_model USING *) as predicted_species_id,
PREDICTION_PROBABILITY(doc_model, 'setosa' USING *) as setosa_probability
FROM iris;
The query predicts iris
species and the probability of
setosa species using the iris
data set. The data from
iris
table is used in a SELECT
query to
predict a species ID and the probability that the species is setosa using a
machine learning model named doc_model
. The
PREDICTION
function predicts the species based on the
attributes in the table, and the PREDICTION_PROBABILITY
function
computes the probability that the predicted species is setosa. The result
includes all columns from the iris
view along with the predicted
species ID and the probability of the species being setosa.
Example: Specifying input in JSON Metadata for Classification Models
The following example illustrates how you can specify input attribute
names that map to the actual ONNX model input names. This example assumes a model
with four inputs named SEPAL_LENGTH
, SEPAL_WIDTH
,
PETAL_LENGTH
, and PETAL_WIDTH
. You can specify
alternative input attribute names using the JSON metadata as shown in this example.
Here, each input is assumed to be a tensor with a dimension of 1. The
input
field must be a JSON object where each field is a model
input name (For example, SEPAL_LENGTH
), and its value is a JSON
array sized according to the tensor's dimension (here, 1) with one attribute name
per element in the array.
BEGIN DBMS_VECTOR.LOAD_ONNX_MODEL('classification_model.onnx', 'doc_model',
JSON('{"function" : "classification",
"classificationProbOutput": "probabilities",
"input": { "SEPAL_LENGTH": ["SEPAL_LENGTH_CM"],
"SEPAL_WIDTH": ["SEPAL_WIDTH_CM"],
"PETAL_LENGTH": ["PETAL_LENGTH_CM"],
"PETAL_WIDTH": ["PETAL_WIDTH_CM"] } }'));
END;
/
You can also have a different order of the columns as input.
BEGIN DBMS_VECTOR.LOAD_ONNX_MODEL('classification_model.onnx', 'doc_model',
JSON('{"function" : "classification",
"classificationProbOutput": "probabilities",
"input": { "SEPAL_WIDTH": ["SEPAL_WIDTH_CM"],
"PETAL_LENGTH": ["PETAL_LENGTH_CM"],
"PETAL_WIDTH": ["PETAL_WIDTH_CM"],
"SEPAL_LENGTH": ["SEPAL_LENGTH_CM"] } }'));
END;
/
Example: Specifying a Single input With Four Dimensions
Here is an example where the model has a single input tensor named
x
with four dimensions. The corresponding JSON metadata for
this scenario is:
JSON('{"function" : "classification",
"classificationProbOutput": "probabilities",
"input": { "x": ["SEPAL_LENGTH_CM",
"SEPAL_WIDTH_CM",
"PETAL_LENGTH_CM",
"PETAL_WIDTH_CM"]
}'));
You can use PREDICTION
and
PREDICTION_PROBABILITY
functions for inference or scoring.
WITH
dummy_iris AS (
SELECT
4.5 as petal_length_cm,
1.5 as petal_width_cm,
4.3 as sepal_length_cm,
2.9 as sepal_width_cm
FROM iris
)
SELECT
dummy_iris.*,
PREDICTION(doc_model USING *) as predicted_species_id,
PREDICTION_PROBABILITY(doc_model 'setosa' USING *) as setosa_probability
FROM dummy_iris;
The query predicts iris
species and the probability of
setosa species using specified attributes in a temporary data set. The
query creates a temporary dummy_iris
view with attributes values
set. This temporary view is then used in a SELECT
query to predict
a species ID and the probability that the species is setosa using a machine
learning model named doc_model
. The PREDICTION
function predicts the species based on the attributes provided, and the
PREDICTION_PROBABILITY
function computes the probability that
the predicted species is setosa. The result includes all columns from the
dummy_iris
view along with the predicted species ID and the
probability of the species being setosa.
Example: Specifying defaultOnNull in JSON Metadata for Classification Models
The following examples illustrates how you can specify
defaulOnNull
provides default values to be used for specific
attributes when their values are NULL in the data set. Use the names
SEPAL_LENGTH
, SEPAL_WIDTH
,
PETAL_LENGTH
, and PETAL_WIDTH
as fields in the
defaultOnNull
object, which are the assumed input attribute
names for a ONNX model with four inputs. These names serve as the default input
attribute names, so you can use them as fields in the
defaultOnNull
.
BEGIN DBMS_VECTOR.LOAD_ONNX_MODEL('classification_model.onnx', 'doc_model',
JSON('{"function" : "classification",
"classificationProbOutput": "probabilities",
"defaultOnNull": {"SEPAL_LENGTH": "5.1",
"SEPAL_WIDTH": "3.5",
"PETAL_LENGTH": "1.4",
"PETAL_WIDTH": "0.2"}}'));
END;
/
-
"SEPAL_LENGTH": "5.1"
: If the sepal length is null, use 5.1 as the default value. -
"SEPAL_WIDTH": "3.5"
: If the sepal width is null, use 3.5 as the default value. -
"PETAL_LENGTH": "1.4"
: If the petal length is null, use 1.4 as the default value. -
"PETAL_WIDTH": "0.2"
: If the petal width is null, use 0.2 as the default value.
Example: Specifying input and defaultOnNull JSON Metadata for Classification Models
Here is a combined example of specifying input
and
defaultOnNull
values. This example uses the values that were
illustrated in the earlier examples where input
and
defaultOnNull
values are specified:
JSON('{"function" : "classification",
"classificationProbOutput": "probabilities",
"input": { "SEPAL_WIDTH": ["SEPAL_WIDTH_CM"],
"PETAL_LENGTH": ["PETAL_LENGTH_CM"],
"PETAL_WIDTH": ["PETAL_WIDTH_CM"],
"SEPAL_LENGTH": ["SEPAL_LENGTH_CM"] },
"defaultOnNull": {"SEPAL_LENGTH_CM": "5.1",
"SEPAL_WIDTH_CM": "3.5"}}')
ONNX Clustering Examples
The following examples showcase various JSON metadata parameters that can be defined for ONNX models.
Example: Specifying JSON Metadata for Clustering Models
The following example illustrates JSON metadata parameters with
Clustering as the function. Assume the model has an output named
probabilities
for the probability of the prediction. To use the
CLUSTER_PROBABILITY
scoring function, you must set the field
clusteringProbOutput
to the name of the model output that holds
the probability.
BEGIN
DBMS_VECTOR.LOAD_ONNX_MODEL('clustering_model.onnx','doc_model',
JSON('{"function": "clustering",
"clusteringProbOutput": "probabilities"
}
')
);
END;
/
You can use CLUSTER_ID
and
CLUSTER_PROBABILITY
functions for inference or scoring.
SELECT
iris.*,
CLUSTER_ID(doc_model USING *) as cluster_id,
CLUSTER_PROBABILITY(doc_model, 1 USING *) as cluster_1_probability
FROM iris;
This query predicts the cluster assignments and the probabilities of
belonging to a specific cluster for each record of the iris
data
set. The query retrieves all columns of each record (iris.*
) and
applies the clustering model named doc_model
to each record of the
iris
data set and predicts the cluster ID. The USING
*
clause tells the model to use all available columns in the
iris
table for this prediction. The
CLUSTER_PROBABILITY(doc_model, 1 USING *) as
cluster_1_probability
part of the query calculates the probability that
each record belongs to cluster 1, according to the doc_model
from
the iris
data set. This provides insights into how likely each
record is to be part of cluster 1, giving a quantitative measure of membership
strength.
Example: Specifying clusteringDistanceOutput in JSON Metadata for Clustering Models
The following example illustrates how you can specify
clusteringDistanceOutput
and for ONNX Clustering models.
In this model, an output tensor named distances
provides distances for the input, which is a single tensor named
float_input
with a dimension of 4. The JSON metadata
input
field must map attribute names to entries of the tensor,
such as "SEPAL_LENGTH", "SEPAL_WIDTH", "PETAL_LENGTH",
"PETAL_WIDTH"
.
BEGIN
DBMS_VECTOR.LOAD_ONNX_MODEL('clustering_model.onnx', 'doc_model',
JSON('{"function" : "clustering",
"clusteringDistanceOutput": "distances",
"normalizeProb": "softmax",
"input": { "float_input": ["SEPAL_LENGTH", "SEPAL_WIDTH", "PETAL_LENGTH", "PETAL_WIDTH"] }
}')
);
END;
/
You can use
CLUSTER_DISTANCE
function for inference or scoring. These SQL queries utilize clustering models to
predict cluster distances from the IRIS
data set.
SELECT CLUSTER_DISTANCE(doc_model USING *) AS predicted_target_value,
CLUSTER_DISTANCE (doc_model,1 USING *) AS dist1,
CLUSTER_DISTANCE (doc_model,2 USING *) AS dist2,
CLUSTER_DISTANCE (doc_model,3 USING *) AS dist3
FROM IRIS
ORDER BY ID
FETCH NEXT 10 ROWS ONLY;
Here, the query focuses on understanding the physical distance of data
points from cluster centroids, which is particularly useful for identifying outliers
or for performing detailed cluster analysis. The query calculates the distance of
each record in the IRIS
data set from the centroids of different
clusters using the doc_model
. The USING *
syntax
indicates that the model must use all available columns of the IRIS
data set for making the prediction. CLUSTER_DISTANCE(doc_model, n USING
*)
computes the distance from cluster n
(n
being 1, 2, and 3 in this query). Each distance is selected
as a separate column (dist1
, dist2
,
dist3
).
The output is limited to the first 10 rows of the result set ordered by
the ID
column of the IRIS
table.
Example: Specifying clusteringProbOutput and normalizeProb in JSON Metadata for Clustering Models
The following example illustrates how you can specify
clusteringProbOutput
and normalizeProb
for
ONNX Clustering models.
BEGIN
DBMS_VECTOR.LOAD_ONNX_MODEL('clustering_model.onnx', 'doc_model',
JSON('{"function" :
"clustering",
"clusteringProbOutput": "probabilities",
"normalizeProb" : "softmax",
"input": { "float_input": ["SEPAL_LENGTH", "SEPAL_WIDTH", "PETAL_LENGTH", "PETAL_WIDTH"] } }')
);
END;
/
You can use CLUSTER_PROBABILITY
and
CLUSTER_SET
functions for inference or
scoring:
SELECT CLUSTER_ID (doc_model USING *) AS predicted_target_value,
CLUSTER_PROBABILITY (doc_model,1 USING *) AS prob1,
CLUSTER_PROBABILITY (doc_model,2 USING *) AS prob2,
CLUSTER_PROBABILITY (doc_model,3 USING *) AS prob3
FROM IRIS
ORDER BY ID
FETCH NEXT 10 ROWS ONLY;
In this case, a clustering model is used to predict the cluster IDs and
associated probabilities for records from the IRIS
data set.
Because the JSON metadata specifies softmax
for the
normalizeProb
field, the model applies softmax normalization to
the probabilities before returning them as the result of the
CLUSTER_PROBABILITY
scoring operator.
The
SQL query selects CLUSTER_ID
column from the IRIS
table and adds a new column, predicted_target_value
, which contains
predictions made by the doc_model
. The USING *
syntax means that all columns of the current row are used as input features for the
doc_model
model to predict the value as
predicted_target_value
. The result of this prediction is then
included as a new column in the output of the query.
CLUSTER_PROBABILITY(model, n USING *)
: Computes the
probability that the record belongs to cluster n
(n
being 1, 2, and 3 in this query). This is done for three
different clusters, and each probability is selected as a separate column
(prob1
, prob2
, prob3
).
The output is limited to the first 10 rows of the result set ordered by
the ID
column of the IRIS
table.
SELECT S.CLUSTER_ID, S.PROBABILITY
FROM (SELECT CLUSTER_SET(doc_model USING *) pset
FROM IRIS ORDER BY ID) T,
TABLE(T.pset) S
FETCH NEXT 10 ROWS ONLY;
The CLUSTER_SET
query generates a set of cluster data
using the doc_model
. The resultant column pset
represents all possible cluster assignments for each record, which includes cluster
IDs and their respective probabilities ordered by the ID
column.
The SELECT S.CLUSTER_ID, S.PROBABILITY
part of the query selects
the cluster ID and probability from the resultant column set. The output is limited
to the first 10 rows of the result set.
ONNX Regression Examples
The following examples showcase various JSON metadata parameters that can
be defined for ONNX Regression models. All examples assume an ONNX model that has
one output named regressionOutput
and four input tensors of
dimension 1 whose name match exactly the name of the IRIS
table
columns, namely, SEPAL_LENGTH, SEPAL_WIDTH, PETAL_LENGTH,
PETAL_WIDTH
.
Example: Specifying JSON Metadata for Regression Models
The following is a simple example illustrating JSON metadata parameters
with Regression as the function. Assume the ONNX model features one output named
regressionOutput
and four input tensors of dimension 1, whose
names match exactly after the IRIS
table columns
("SEPAL_LENGTH", "SEPAL_WIDTH", "PETAL_LENGTH", "PETAL_WIDTH"
).
The JSON metadata can be as simple as the following:
BEGIN DBMS_VECTOR.LOAD_ONNX_MODEL(
'regression_model.onnx',
'doc_model',
JSON('{"function": "regression"}
')
);
END;
/
You can use the PREDICTION
function for inference or
scoring:
SELECT
iris.*,
PREDICTION(doc_model USING *) as predicted_petal_width_cm
FROM iris;
In this case, the SQL query selects all columns from the
iris
table and adds a new column,
predicted_petal_width_cm
, which contains predictions made by
the doc_model
. The USING *
syntax means that all
columns of the current row are used as input features for the
doc_model
model to predict the value of
PETAL_WIDTH
as predicted_petal_width_cm
. The
result of this prediction is then included as a new column in the output of the
query.
Example: Specifying input and defaultOnNull in JSON Metadata for Regression Models
The following example illustrates how you can specify input attribute
names that map to the actual ONNX model input names. The
defaulOnNull
providing default values to be used for specific
attributes when their values are NULL in the data set.
BEGIN DBMS_VECTOR.LOAD_ONNX_MODEL('regression_model.onnx','doc_model',
JSON('{"function": "regression",
"input": {
"SEPAL_LENGTH": ["dummy_sepal_length_cm"],
"SEPAL_WIDTH": ["dummy_sepal_width_cm"]
},
"defaultOnNull": {
"dummy_sepal_length_cm": "5.1",
"dummy_sepal_width_cm": "3.5",
}
}
')
);
END;
/
You can use the PREDICTION
function for inference or
scoring.
WITH
dummy_iris AS (
SELECT
(CASE WHEN petal_length > 5 THEN 4.9 ELSE NULL END)
as dummy_sepal_length_cm,
(CASE WHEN petal_length < 4 THEN 2.5 ELSE NULL END)
as dummy_sepal_width_cm,
petal_length
petal_width
FROM iris
)
SELECT
dummy_iris.*,
PREDICTION(doc_model USING *) as predicted_petal_width_cm
FROM dummy_iris;
In this case, a temporary dummy_iris
table is created
with three columns: dummy_sepal_length_cm
,
dummy_sepal_width_cm
, and petal_length
. The
values of the dummy_sepal_length_cm
and
dummy_sepal_width_cm
are based on petal_length
values of the iris
table. If petal_length
is
greater than 5, dummy_sepal_length_cm
is set to 4.9, otherwise it
is NULL. If petal_length
is less than 4,
dummy_sepal_width_cm
is set to 2.5, otherwise it remains NULL.
Then the SELECT
query retrieves all columns from the
dummy_iris
table and uses the doc_model
to
predict petal_width
, adding this prediction as a new column named
predicted_petal_width_cm
. The model uses the derived dummy
columns, petal_length
and petal_width
for its
predictions.
See Also:
-
LOAD_ONNX_MODEL in Oracle Database PL/SQL Packages and Types Reference