13.7.2.5 pyqGroupEval Function (Autonomous Database)
The function pyqGroupEval
when used in Oracle Autonomous Database, groups data by one or more columns and runs a user-defined Python function on each group.
The user-defined Python function can return a boolean
, a dict
, a float
, an int
, a list
, a str
, a tuple
or a pandas.DataFrame
object. Define the form of the returned value with the OUT_FMT
parameter.
Syntax
FUNCTION PYQSYS.pyqGroupEval(
INP_NAM VARCHAR2,
PAR_LST VARCHAR2,
OUT_FMT VARCHAR2,
GRP_COL VARCHAR2,
ORD_COL VARCHAR2,
SCR_NAME VARCHAR2,
SCR_OWNER VARCHAR2 DEFAULT NULL,
ENV_NAME VARCHAR2 DEFAULT NULL
)
RETURN SYS.AnyDataSet
Parameters
Parameter | Description |
---|---|
|
The name of a table or view that specifies the data
to pass to the Python function specified by the
|
|
A JSON string that contains additional parameters to
pass to the user-defined Python function specified by the
For example, to specify the input data type as
|
|
The format of the output returned by the function. It can be one of the following:
See also: Output Formats (Autonomous Database). |
GRP_COL |
The names of the grouping columns by which to partition the data. Use commas to separate multiple columns. For example, to group by
|
ORD_COL |
Comma-separated column names to order the input data. For example to order by
If specified, the input data will first be ordered by the |
|
The name of a user-defined Python function in the OML4Py script repository. |
|
The owner of the registered Python script. The
default value is |
|
The name of the conda environment that should be used when running the named user-defined Python function. |
Example
This example calls the pyqGroupEval function, which runs the specified Python script on each partition of data in the specified data set.
The INP_NAM
argument specifies the data in the IRIS table to pass to the Python function.
The PAR_LST
argument specifies capturing images rendered in the script with the special control argument oml_graphics_flag.
The OUT_FMT
arguments specifies returning a table with BLOB containing the images generated by the Python function.
The GRP_COL
argument specifies to group the specified data by the 'Species' column.
The SCR_NAME
parameter specifies the 'test_seaborn_inp' script, which is created in pyqTableEval Function (Autonomous Database).
The ENV_NAME
parameter specifies 'seaborn', which is a Conda environment created in pyqEval Function (Autonomous Database) .
select *
from table(pyqGroupEval(
inp_nam => 'IRIS',
par_lst => '{"oml_graphics_flag":true}',
out_fmt => 'PNG',
grp_col => 'Species',
ord_col => NULL,
scr_name => 'test_seaborn_inp',
scr_owner => NULL,
env_name => 'seaborn'
));
The output is the following.
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
GROUP_setosa
1
"hello world"
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
Iris plot
89504E470D0A1A0A0000000D4948445200000280000001E0080600000035D1DCE400000039744558
74536F667477617265004D6174706C6F746C69622076657273696F6E332E332E332C206874747073
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
3A2F2F6D6174706C6F746C69622E6F72672FC897B79C000000097048597300000F6100000F6101A8
3FA76900006AC649444154789CEDDD797854E5DD3EF0FBCC9EC92CD9F7B02624EC088A804B444005
AAD2AAB5D4166DD5F7ADDAB75A972AD60D375C706B2D2E588BED5B6A5FFD556AA91B555114549045
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
GROUP_versicolor
1
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
"hello world"
Iris plot
89504E470D0A1A0A0000000D4948445200000280000001E0080600000035D1DCE400000039744558
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
74536F667477617265004D6174706C6F746C69622076657273696F6E332E332E332C206874747073
3A2F2F6D6174706C6F746C69622E6F72672FC897B79C000000097048597300000F6100000F6101A8
3FA76900009E9149444154789CECDD77785CE5993FFCEF2973CE548D7AB124F76E6C03A69966AAC1
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
B0244E42360B01872CC96F43CC2659922C980D9B42884902A9FB420229904D1CD22064E90EC1A619
GROUP_virginica
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
1
"hello world"
Iris plot
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
89504E470D0A1A0A0000000D4948445200000280000001E0080600000035D1DCE400000039744558
74536F667477617265004D6174706C6F746C69622076657273696F6E332E332E332C206874747073
3A2F2F6D6174706C6F746C69622E6F72672FC897B79C000000097048597300000F6100000F6101A8
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
3FA7690000838E49444154789CEDDD797854E5D906F07BF6996496EC7B2010F64D14C1065450C105
8A625B6B2D0A56ED82B8B756B12AA245D4AAD5B61FA8B8606BA9AD56DC91E2120465DFF73D044242
Example
This example uses the IRIS table created in the example shown in pyqEval Function (Autonomous Database).
Define the Python function group_count
and store it with the name mygroupcount
in the script repository. The function returns a pandas.DataFrame
generated on each group of data dat
. The function also plots the sepal length with the petal length values on each group.
BEGIN
sys.pyqScriptCreate('mygroupcount',
'def group_count(dat):
import pandas as pd
import matplotlib.pyplot as plt
plt.plot(dat[["Sepal_Length"]], dat[["Petal_Length"]], ".")
plt.xlabel("Sepal Length")
plt.ylabel("Petal Length")
plt.title("{}".format(dat["Species"][0]))
return pd.DataFrame([(dat["Species"][0], dat.shape[0])],\
columns = ["Species", "CNT"]) ',
FALSE, TRUE); -- V_GLOBAL, V_OVERWRITE
END;
/
Issue a query that invokes the pyqGroupEval
function. In the function, the INP_NAM
argument specifies the data in the IRIS table to pass to the function.
The PAR_LST
argument specifies the special control argument oml_input_type
.
The OUT_FMT
argument specifies a JSON string that contains the column names and data types of the table returned by pyqGroupEval
.
The GRP_COL
parameter specifies the column to group by.
The SCR_NAME
parameter specifies the user-defined Python function stored with the name mygroupcount
in the script repository.
SELECT *
FROM table(
pyqGroupEval(
inp_nam => 'IRIS',
par_lst => '{"oml_input_type":"pandas.DataFrame"}',
out_fmt => '{"Species":"varchar2(10)", "CNT":"number"}',
grp_col => 'Species',
ord_col => NULL,
scr_name => 'mygroupcount'));
The output is the following:
Species CNT
---------- ----------
virginica 50
setosa 50
versicolor 50
3 rows selected.
Run the same script with IRIS data and return the XML output. The PAR_LST
argument specifies the special control argument oml_graphics_flag
to capture images rendered in the script. Both structured data and images are included in the XML output. The XML output is a CLOB; you can call set long [length]
to get more output.
set long 300
SELECT *
FROM table(
pyqGroupEval(
inp_nam => 'IRIS',
par_lst => '{"oml_input_type":"pandas.DataFrame", "oml_graphics_flag":true, "oml_parallel_flag":true", "oml_service_level":"MEDIUM"}',
out_fmt => 'XML',
grp_col => 'Species',
ord_col => NULL,
scr_name => 'mygroupcount'));
The output is the following.
NAME VALUE
virginica <root><Py-data><pandas_dataFrame><ROW-pandas_dataFrame><Species>virginica</Species><CNT>50</CNT></ROW-pandas_dataFrame></pandas_dataFrame></Py-data><images><image><img src="data:image/pngbase64"><![CDATA[iVBORw0KGgoAAAANSUhEUgAAAoAAAAHgCAYAAAA10dzkAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMu
setosa <root><Py-data><pandas_dataFrame><ROW-pandas_dataFrame><Species>setosa</Species><CNT>50</CNT></ROW-pandas_dataFrame></pandas_dataFrame></Py-data><images><image><img src="data:image/pngbase64"><![CDATA[iVBORw0KGgoAAAANSUhEUgAAAoAAAAHgCAYAAAA10dzkAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMyw
versicolor <root><Py-data><pandas_dataFrame><ROW-pandas_dataFrame><Species>versicolor</Species><CNT>50</CNT></ROW-pandas_dataFrame></pandas_dataFrame></Py-data><images><image><img src="data:image/pngbase64"><![CDATA[iVBORw0KGgoAAAANSUhEUgAAAoAAAAHgCAYAAAA10dzkAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjM
Run the same script with IRIS data and get the PNG output. The PAR_LST
argument specifies the special control argument oml_graphics_flag
to capture images.
column name format a7
column value format a15
column title format a16
column image format a15
SELECT *
FROM table(
pyqGroupEval(
inp_nam => 'IRIS',
par_lst => '{"oml_input_type":"pandas.DataFrame", "oml_graphics_flag":true}',
out_fmt => 'PNG',
grp_col => 'Species',
ord_col => NULL,
scr_name => 'mygroupcount'));
The output is the following:
NAME ID VALUE TITLE IMAGE
------- ---------- --------------- ---------------- ---------------
GROUP_s 1 [{"Species":"se setosa 89504E470D0A1A0
etosa tosa","CNT":50} A0000000D494844
] 520000028000000
1E0080600000035
D1DCE4000000397
4455874536F6674
77617265004D617
4706C6F746C6962
2076657273696F6
E332E332E332C20
6874747073
NAME ID VALUE TITLE IMAGE
------- ---------- --------------- ---------------- ---------------
GROUP_v 1 [{"Species":"ve versicolor 89504E470D0A1A0
ersicol rsicolor","CNT" A0000000D494844
or :50}] 520000028000000
1E0080600000035
D1DCE4000000397
4455874536F6674
77617265004D617
4706C6F746C6962
2076657273696F6
E332E332E332C20
NAME ID VALUE TITLE IMAGE
------- ---------- --------------- ---------------- ---------------
6874747073
GROUP_v 1 [{"Species":"vi virginica 89504E470D0A1A0
irginic rginica","CNT": A0000000D494844
a 50}] 520000028000000
1E0080600000035
D1DCE4000000397
4455874536F6674
77617265004D617
4706C6F746C6962
2076657273696F6