13.6.5 pyqGroupEval Function (On-Premises Database)
This topic describes the pyqGroupEval
function when used
in an on-premises Oracle Database. The pyqGroupEval
function groups data by one
or more columns and runs a user-defined Python function on each group.
The pyqGroupEval
function runs the user-defined Python
function specified by the SCR_NAME
parameter. Pass data to the Python function with the INP_NAM
parameter. Pass arguments to the Python function with the
PAR_LST
parameter. Specify one or
more grouping columns with the GRP_COL
parameter.
The Python function can return a boolean
, a
dict
, a float
, an int
, a
list
, a str
, a tuple
or a
pandas.DataFrame
object. Define the form of the returned value with the
OUT_FMT
parameter.
Syntax
pyqGroupEval(
inp_nam VARCHAR2,
par_lst VARCHAR2,
out_fmt VARCHAR2,
grp_col VARCHAR2,
scr_name VARCHAR2,
scr_owner VARCHAR2 DEFAULT NULL)
Parameters
Parameter | Description |
---|---|
|
The name of a table or view that specifies the data to pass to the
Python function specified by the |
|
A JSON string that contains additional parameters to pass to the
user-defined Python function specified by the For example, to specify the input data type as
|
|
The format of the output returned by the function. It can be one of the following:
|
GRP_COL |
The names of the grouping columns by which to partition the data. Use
commas to separate multiple columns. For example, to group by
|
|
The name of a user-defined Python function in the OML4Py script repository. |
|
The owner of the registered Python script. The default value is NULL. If NULL, will search for the Python script in the user’s script repository. |
Returns
Function pyqGroupEval
returns a table that has the structure
specified by the OUT_FMT
parameter value.
Example 13-20 Using the pyqGroupEval Function
This example defines the Python function create_iris_table
and
stores it with the name create_iris_table in the OML4Py script
repository. It then invokes pyqEval
, which invokes the user-definded Python
function and creates the IRIS database table. The example creates the package
irisPkg
and uses that package in specifying the data cursor to pass to
the irisGroupEval
function, which is a user-defined
pyqGroupEval
function. It defines another Python function,
group_count
and stores it in the script repository with the name
mygroupcount. The example then invokes the irisGroupEval
function and
passes it the Python function saved with the name mygroupcount.
In a PL/SQL block, define the Python function
create_iris_table
and store in the script repository with the name
create_iris_table.
BEGIN
sys.pyqScriptCreate('create_iris_table',
'def create_iris_table():
from sklearn.datasets import load_iris
import pandas as pd
iris = load_iris()
x = pd.DataFrame(iris.data, columns = ["Sepal_Length",\
"Sepal_Width","Petal_Length","Petal_Width"])
y = pd.DataFrame(list(map(lambda x: {0:"setosa", 1: "versicolor",\
2: "virginica"}[x], iris.target)),\
columns = ["Species"])
return pd.concat([y, x], axis=1)');
END;
/
Invoke the pyqEval
function to create the database table
IRIS
, using the Python function stored with the name
create_iris_table
in the script repository.
CREATE TABLE IRIS AS
(SELECT * FROM pyqEval(
NULL,
'{"Species":"VARCHAR2(10)","Sepal_Length":"number",
"Sepal_Width":"number","Petal_Length":"number",
"Petal_Width":"number"}',
'create_iris_table'
));
Define the Python function group_count
and store it with the
name mygroupcount
in the script repository. The function returns a
pandas.DataFrame
generated on each group of data
dat
.
BEGIN
sys.pyqScriptCreate('mygroupcount',
'def group_count(dat):
import pandas as pd
return pd.DataFrame([(dat["Species"][0], dat.shape[0])],\
columns = ["Species", "CNT"]) ');
END;
/
Issue a query that invokes the pyqGroupEval
function. In the
function, the INP_NAM
argument specifies
the data in the IRIS table to pass to the function.
The PAR_LST
argument
specifies the special control argument oml_input_type
.
The OUT_FMT
argument
specifies a JSON string that contains the column names and data types of the table returned
by pyqGroupEval
.
The GRP_COL
parameter
specifies the column to group by.
The SCR_NAME
parameter
specifies the user-defined Python function stored with the name
mygroupcount
in the script repository.
SELECT *
FROM table(
pyqGroupEval(
'IRIS',
'{"oml_input_type":"pandas.DataFrame"}',
'{"Species":"varchar2(10)", "CNT":"number"}',
'Species',
'mygroupcount'));
The output is the following.
Species CNT
---------- ----------
setosa 50
versicolor 50
virginica 50