13.5.1 About Python API for Embedded Python Execution
You can run your functions in a data-parallel or task-parallel manner in one or more of these Python engines. In data-parallel processing, the data is partitioned and the same user-defined Python function is invoked on each data subset using one or more Python engines. In task-parallel processing, a user-defined function is invoked multiple times in one or more Python engines, with a unique index passed in as an argument. For example, you can use task parallelism for Monte Carlo simulations in which the index sets a random seed.
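The Monte Carlo case can be sketched locally with the standard library only. This is an illustration of the task-parallel idea, not OML4Py code: run_task_parallel is a hypothetical stand-in that uses threads instead of database-managed Python engines, and passing indices 1 through times is an assumption of the sketch.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def estimate_pi(index, samples=10_000):
    """Monte Carlo estimate of pi; the invocation index seeds the generator."""
    rng = random.Random(index)  # a reproducible random stream per index
    inside = sum(
        1
        for _ in range(samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * inside / samples


def run_task_parallel(func, times, workers=2):
    """Hypothetical local stand-in: call func once per index 1..times."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, range(1, times + 1)))


estimates = run_task_parallel(estimate_pi, times=4)
mean_estimate = sum(estimates) / len(estimates)
print(estimates, mean_estimate)
```

Because each invocation derives its seed from its index, rerunning any single index reproduces that invocation's result regardless of which worker runs it.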
The following table lists the Python functions for Embedded Python Execution.
Function | Description |
---|---|
oml.do_eval | Runs a user-defined Python function in a Python engine spawned and managed by the database environment. |
oml.group_apply | Partitions a database table by the values in one or more columns and runs the provided user-defined Python function on each partition. |
oml.index_apply | Runs a Python function multiple times, passing in a unique index of the invocation to the user-defined function. |
oml.row_apply | Partitions a database table into sets of rows and runs the provided user-defined Python function on the data in each set. |
oml.table_apply | Runs a Python function on data in the database as a single |
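The two partitioning styles in the table, by column values and by sets of rows, can be illustrated with plain Python. The following is a minimal local sketch, not the OML4Py implementation: partition_by_columns, partition_by_row_count, and the sample rows are hypothetical, and the data lives in a Python list rather than a database table.

```python
from collections import defaultdict


def partition_by_columns(rows, columns):
    """Group rows (dicts) by the values of the given columns."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in columns)
        groups[key].append(row)
    return dict(groups)


def partition_by_row_count(rows, size):
    """Split rows into consecutive sets of at most `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def mean_petal_length(subset):
    """Example user-defined function applied to each data subset."""
    return sum(r["petal_length"] for r in subset) / len(subset)


rows = [
    {"species": "setosa", "petal_length": 1.4},
    {"species": "setosa", "petal_length": 1.6},
    {"species": "virginica", "petal_length": 5.5},
    {"species": "virginica", "petal_length": 6.5},
    {"species": "versicolor", "petal_length": 4.0},
]

# Same function, one call per partition defined by column values.
by_group = {
    key: mean_petal_length(part)
    for key, part in partition_by_columns(rows, ["species"]).items()
}

# Same function, one call per fixed-size set of rows.
by_chunk = [mean_petal_length(part) for part in partition_by_row_count(rows, 2)]
```

In both cases the user-defined function is unchanged; only the way the data is split differs.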
About Special Control Arguments
Special control arguments control what happens before or after the running of the function that you pass to an Embedded Python Execution function. You specify a special control argument with the **kwargs parameter of a function such as oml.do_eval. The control arguments are not passed to the function specified by the func argument of that function.
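The separation between control arguments and ordinary arguments can be pictured with a small local sketch. It is an illustration under stated assumptions, not OML4Py code: run_embedded and split_control_args are hypothetical helpers, and the set of control names contains only the two arguments listed in Table 13-1.

```python
# The two control arguments named in Table 13-1.
CONTROL_ARGS = {"oml_input_type", "oml_na_omit"}


def split_control_args(kwargs):
    """Separate control arguments from the arguments meant for func."""
    control = {k: v for k, v in kwargs.items() if k in CONTROL_ARGS}
    user = {k: v for k, v in kwargs.items() if k not in CONTROL_ARGS}
    return control, user


def run_embedded(func, **kwargs):
    """Hypothetical stand-in for an Embedded Python Execution function."""
    control, user = split_control_args(kwargs)
    # Only the non-control arguments reach the user-defined function.
    result = func(**user)
    return result, control


def scale(value, factor=1):
    """Example user-defined function; it accepts no control arguments."""
    return value * factor


result, seen_controls = run_embedded(scale, value=3, factor=2, oml_na_omit=True)
```

Here scale would raise a TypeError if oml_na_omit were forwarded to it, which is why the wrapper filters the keyword arguments first.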
Table 13-1 Special Control Arguments
Argument | Description |
---|---|
oml_input_type | Identifies the type of input data object that you are supplying to the The input types are the following: If all columns are numeric, then default type is a 2-dimensional |
oml_na_omit | Controls the handling of missing values in the input data. If you specify |
About Output
When a user-defined Python function runs in OML4Py, by default it returns the Python objects returned by the function. Also, OML4Py captures all matplotlib.figure.Figure objects created by the user-defined Python function and converts them into PNG format.
If graphics = True, the Embedded Python Execution functions return oml.embed.data_image._DataImage objects. The oml.embed.data_image._DataImage class contains Python objects and PNG images. Calling the method __repr__() displays the PNG images and prints out the Python object. By default, .dat returns the Python object that the user-defined Python function returned; .img returns a list containing PNG image data for each figure.
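The shape of such a result, one returned Python object plus a list of PNG images, can be sketched with a small stand-in class. ResultWithImages, is_png, and the placeholder bytes are hypothetical and exist only for this illustration; this is not the oml.embed.data_image._DataImage class, and no real figure is rendered.

```python
from dataclasses import dataclass, field
from typing import Any, List

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first eight bytes of any PNG file


@dataclass
class ResultWithImages:
    """Hypothetical stand-in: a returned object plus PNG image data."""
    dat: Any
    img: List[bytes] = field(default_factory=list)

    def __repr__(self):
        return f"ResultWithImages(dat={self.dat!r}, images={len(self.img)})"


def is_png(data):
    """Check whether a byte string starts with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


# Placeholder bytes standing in for one captured figure.
placeholder_png = PNG_SIGNATURE + b"placeholder image bytes"
res = ResultWithImages(dat={"rows": 3}, img=[placeholder_png])

returned_object = res.dat          # what the user-defined function returned
png_count = len(res.img)           # one entry per captured figure
```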
About the Script Repository
Embedded Python Execution includes the ability to create and store user-defined Python functions in the OML4Py script repository, grant or revoke the read privilege to a user-defined Python function, list the available user-defined Python functions, load user-defined Python functions into the Python environment, or drop a user-defined Python function from the script repository.
Along with whatever other actions a user-defined Python function performs, it can also create, retrieve, and modify Python objects that are stored in OML4Py datastores.
In Embedded Python Execution, a user-defined Python function runs in one or more Python engines that the database environment dynamically starts and manages. The same user-defined Python function can return both structured data and PNG images.
You can make the user-defined Python function either private or global. A global function is available to any user. A private function is available only to the owner or to users to whom the owner of the function has granted the read privilege.
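The visibility rules for private and global functions can be modeled with a small in-memory class. This is a sketch of the access rules described above, not the OML4Py script repository: the ScriptRepository class, its method names, and the user names are all hypothetical.

```python
class ScriptRepository:
    """Hypothetical in-memory model of private and global stored functions."""

    def __init__(self):
        # (owner, name) -> {"func": callable, "is_global": bool, "readers": set}
        self._scripts = {}

    def create(self, owner, name, func, is_global=False):
        self._scripts[(owner, name)] = {
            "func": func, "is_global": is_global, "readers": set(),
        }

    def grant_read(self, owner, name, user):
        self._scripts[(owner, name)]["readers"].add(user)

    def revoke_read(self, owner, name, user):
        self._scripts[(owner, name)]["readers"].discard(user)

    def _can_read(self, entry, owner, user):
        # Global: anyone. Private: the owner or a user granted read.
        return entry["is_global"] or user == owner or user in entry["readers"]

    def list_available(self, user):
        return sorted(
            name
            for (owner, name), entry in self._scripts.items()
            if self._can_read(entry, owner, user)
        )

    def load(self, owner, name, user):
        entry = self._scripts[(owner, name)]
        if not self._can_read(entry, owner, user):
            raise PermissionError(f"{user} cannot read {owner}.{name}")
        return entry["func"]

    def drop(self, owner, name):
        del self._scripts[(owner, name)]


repo = ScriptRepository()
repo.create("alice", "double_it", lambda x: 2 * x)                  # private
repo.create("alice", "add_one", lambda x: x + 1, is_global=True)    # global

before_grant = repo.list_available("bob")
repo.grant_read("alice", "double_it", "bob")
after_grant = repo.list_available("bob")
doubled = repo.load("alice", "double_it", "bob")(21)
repo.revoke_read("alice", "double_it", "bob")
after_revoke = repo.list_available("bob")
```

In this model a global function needs no grant, while a private function becomes visible to another user only between the grant and the revoke.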