11.7.2.4 pyqRowEval Function (Autonomous Database)
The function pyqRowEval
when used in Oracle Autonomous Database, chunks data into sets of rows and then runs a user-defined Python function on each chunk.
The ROW_NUM
parameter specifies the maximum number of rows to pass to each invocation of the user-defined Python function. The last set of rows may have fewer rows than the number specified.
The user-defined Python function can return a boolean
, a dict
, a float
, an int
, a list
, a str
, a tuple
or a pandas.DataFrame
object. You can define the form of the returned value with the OUT_FMT
parameter.
Syntax
FUNCTION PYQSYS.pyqRowEval(
INP_NAM VARCHAR2,
PAR_LST VARCHAR2,
OUT_FMT VARCHAR2,
ROW_NUM NUMBER,
SCR_NAME VARCHAR2,
SCR_OWNER VARCHAR2 DEFAULT NULL,
ENV_NAME VARCHAR2 DEFAULT NULL
)
RETURN SYS.AnyDataSet
Parameters
Parameter | Description |
---|---|
|
The name of a table or view that specifies the data
to pass to the Python function specified by the
|
|
A JSON string that contains additional parameters to
pass to the user-defined Python function specified by the
For example, to specify the input data type as
|
|
The format of the output returned by the function. It can be one of the following:
See also: Output Formats (Autonomous Database). |
ROW_NUM |
The number of rows in a chunk. The Python script is executed in each chunk. |
|
The name of a user-defined Python function in the OML4Py script repository. |
|
The owner of the registered Python script. The
default value is |
|
The name of the conda environment that should be used when running the named user-defined Python function. |
Example
This example calls the pyqRowEval function, which runs the specified Python script on each chunk of rows in the specified data set.
The INP_NAM
argument specifies the data in the IRIS table to pass to the Python function.
The PAR_LST
argument specifies capturing images rendered in the script with the special control argument oml_graphics_flag.
The OUT_FMT
arguments specifies returning a table with BLOB containing the images generated by the Python function.
The ROW_NUM
argument specifies that 50 rows are included in each invocation of the function specified by SCR_NAME.
The SCR_NAME
parameter specifies the 'test_seaborn_inp' script, which is created in pyqTableEval Function (Autonomous Database).
The ENV_NAME
parameter specifies 'seaborn', which is a Conda environment created in pyqEval Function (Autonomous Database) .
select *
from table(pyqRowEval(
inp_nam => 'IRIS',
par_lst => '{"oml_graphics_flag":true}',
out_fmt => 'PNG',
row_num => 50,
scr_name => 'test_seaborn_inp',
scr_owner => NULL,
env_name => 'seaborn'
));
The output is the following.
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
CHUNK_1
1
"hello world"
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
Iris plot
89504E470D0A1A0A0000000D4948445200000280000001E0080600000035D1DCE400000039744558
74536F667477617265004D6174706C6F746C69622076657273696F6E332E332E332C206874747073
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
3A2F2F6D6174706C6F746C69622E6F72672FC897B79C000000097048597300000F6100000F6101A8
3FA7690000812549444154789CEDDD7774D5F5FD3FF0E767DC99BD13C82081B0041450101C888A0A
54455B57ADA3167F75B46AF5EBB7DA6FADB5D6E2AAA3B52A6A155A6B69B56A97685DE0420501058A
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
CHUNK_2
1
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
"hello world"
Iris plot
89504E470D0A1A0A0000000D4948445200000280000001E0080600000035D1DCE400000039744558
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
74536F667477617265004D6174706C6F746C69622076657273696F6E332E332E332C206874747073
3A2F2F6D6174706C6F746C69622E6F72672FC897B79C000000097048597300000F6100000F6101A8
3FA7690000ABB149444154789CECDD79985C65993EFEFB2C55A7BABBBA7A4D6F49670F21210B1002
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
84C5B093804846C5AF8E4C4445470467181CD1388A0A32114401C71FA88802A31807075019092092
CHUNK_3
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
1
"hello world"
Iris plot
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
89504E470D0A1A0A0000000D4948445200000280000001E0080600000035D1DCE400000039744558
74536F667477617265004D6174706C6F746C69622076657273696F6E332E332E332C206874747073
3A2F2F6D6174706C6F746C69622E6F72672FC897B79C000000097048597300000F6100000F6101A8
NAME
--------------------------------------------------------------------------------
ID
----------
VALUE
--------------------------------------------------------------------------------
TITLE
--------------------------------------------------------------------------------
IMAGE
--------------------------------------------------------------------------------
3FA76900008C7149444154789CEDDD77945BD5D536F0E7AA774D2F9EE25EB14D316D3060209862E2
50120204B009900487040229E004028418432881242FA663E7230402015E9AF14B33C5543730B671
Example
This example loads the Python model linregr
to predict row chunks of sample iris data. The model is created and saved in the datastore pymodel
, which is shown in the example for pyqTableEval Function (Autonomous Database).
The example defines a Python function and stores it in the OML4Py script repository. It uses the user-defined Python function to create a database table as the result of the pyqEval
function. It defines a Python function that runs a prediction function on a model loaded from the OML4Py datastore. It then invokes the pyqTableEval
function to invoke the function on chunks of rows from the database table.
In a PL/SQL block, define the function sample_iris_table
and store it in the script repository. The function loads the iris data set, creates two pandas.DataFrame
objects, and then returns a sample of the concatenation of those objects.
BEGIN
sys.pyqScriptCreate('sample_iris_table',
'def sample_iris_table(size):
from sklearn.datasets import load_iris
import pandas as pd
iris = load_iris()
x = pd.DataFrame(iris.data, columns = ["Sepal_Length",\
"Sepal_Width","Petal_Length","Petal_Width"])
y = pd.DataFrame(list(map(lambda x: {0:"setosa", 1: "versicolor",\
2: "virginica"}[x], iris.target)),\
columns = ["Species"])
return pd.concat([y, x], axis=1).sample(int(size))',
FALSE, TRUE); -- V_GLOBAL, V_OVERWRITE
END;
/
Create the SAMPLE_IRIS
table in the database as the result of a SELECT
statement, which invokes the pyqEval
function on the sample_iris_table
user-defined Python function saved in the script repository with the same name. The sample_iris_table
function returns an iris data sample of size size
.
CREATE TABLE sample_iris AS
SELECT *
FROM TABLE(pyqEval(
'{"size":20}',
'{"Species":"varchar2(10)","Sepal_Length":"number",
"Sepal_Width":"number","Petal_Length":"number",
"Petal_Width":"number"}',
'sample_iris_table'));
Define the Python function predict_model
and store it with the name linregrPredict
in the script repository. The function predicts the data in dat
with the Python model specified by the modelName
argument, which is loaded from the datastore specified by the datastoreName
argument. The function also plots the actual petal width values with the predicted values. The predictions are finally concatenated and returned with dat
as the object that the function returns.
BEGIN
sys.pyqScriptCreate('linregrPredict',
'def predict_model(dat, modelName, datastoreName):
import oml
import pandas as pd
objs = oml.ds.load(name=datastoreName, to_globals=False)
pred = objs[modelName].predict(dat[["Sepal_Length",\
"Sepal_Width","Petal_Length"]])
return pd.concat([dat, pd.DataFrame(pred, \
columns=["Pred_Petal_Width"])], axis=1)',
FALSE, TRUE); -- V_GLOBAL, V_OVERWRITE
END;
/
Run a SELECT
statement that invokes the pyqRowEval
function, which runs the specified Python function on each chunk of rows in the specified data set.
The INP_NAM
argument specifies the data in the SAMPLE_IRIS
table to pass to the Python function.
The PAR_LST
argument specifies passing the input data as a pandas. DataFrame with the special control argument oml_input_type
, along with values for the function arguments modelName
and datastoreName
.
In the OUT_FMT
argument, the JSON string specifies the column names and data types of the structured table output.
The ROW_NUM
argument specifies that five rows are included in each invocation of the function specified by SCR_NAME
.
The SCR_NAME
parameter specifies linregrPredict
, which is the name in the script repository of the user-defined Python function to invoke.
SELECT *
FROM table(pyqRowEval(
inp_nam => 'SAMPLE_IRIS',
par_lst => '{"oml_input_type":"pandas.DataFrame",
"modelName":"linregr", "datastoreName":"pymodel"}',
out_fmt => '{"Species":"varchar2(12)", "Petal_Length":"number", "Pred_Petal_Width":"number"}',
row_num => 5,
scr_name => 'linregrPredict'));
The output is the following.
Species Petal_Length Pred_Petal_Width
setosa 1.2 0.0653133202
versicolor 4.5 1.632087234
setosa 1.3 0.2420812759
setosa 1.9 0.5181904241
setosa 1.4 0.2162518989
setosa 1.4 0.1732424372
setosa 1.5 0.2510460971
setosa 1.3 0.1907951829
versicolor 3.9 1.1999981051
versicolor 4.2 1.4017887483
versicolor 4 1.2332360562
versicolor 4.8 1.765473067
virginica 5.6 2.0095892178
versicolor 4.7 1.5824801232
Species Petal_Length Pred_Petal_Width
virginica 5.4 2.0623088225
versicolor 4.7 1.6524411804
virginica 5.6 1.9919751044
virginica 5.8 2.1206308288
virginica 5.1 1.7983383572
versicolor 4.4 1.3677441077
20 rows selected.
Run a SELECT
statement that invokes the pyqRowEval
function and return the XML output. Each invocation of script linregrPredict
is applied to 10 rows of data in the SAMPLE_IRIS
table. The XML output is a CLOB; you can call set long [length]
to get more output.
set long 300
SELECT *
FROM table(pyqRowEval(
inp_nam => 'SAMPLE_IRIS',
par_lst => '{"oml_input_type":"pandas.DataFrame",
"modelName":"linregr", "datastoreName":"pymodel", "oml_parallel_flag":true", "oml_service_level":"MEDIUM"}',
out_fmt => 'XML',
row_num => 10,
scr_name => 'linregrPredict'));
The output is the following:
NAME VALUE
<root><pandas_dataFrame><ROW-pandas_dataFrame><Species>setosa</Species><Sepal_Length>5</Sepal_Length><Sepal_Width>3.2</Sepal_Width><Petal_Length>1.2</Petal_Length><Petal_Width>0.2</Petal_Width><Pred_Petal_Width>0.0653133201897007</Pred_Petal_Width></ROW-pandas_dataFrame><ROW-pandas_dataFrame><Species>