13.5.7.2 Create and Store a User-Defined Python Function
Use the oml.script.create function to add a user-defined Python function to the script repository.
With the oml.script.create function, you can store a single user-defined Python function in the OML4Py script repository. You can then specify the user-defined Python function as the func argument to the Embedded Python Execution functions oml.do_eval, oml.group_apply, oml.index_apply, oml.row_apply, and oml.table_apply.
You can make the user-defined Python function either private or global. A private user-defined Python function is available only to the owner, unless the owner grants the read privilege to other users. A global user-defined Python function is available to any user.
The syntax of oml.script.create is the following:
oml.script.create(name, func, is_global=False, overwrite=False)
The name argument is a string that specifies a name for the user-defined Python function in the Python script repository.
The func argument is the Python function to run. The argument can be a Python function or a string that contains the definition of a Python function. You must specify a string in an interactive session if readline cannot get the command history.
The is_global argument is a boolean that specifies whether to create a global user-defined Python function. The default value is False, which indicates that the user-defined Python function is a private function available only to the current session user. When is_global is True, the function is global and every user has the read and execute privileges on it.
The overwrite argument is a boolean that specifies whether to overwrite the user-defined Python function if it already exists. The default value is False.
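Because func can be given as a string, it can help to check the string locally before storing it. The following is a minimal sketch, not part of OML4Py: the helper check_function_source and the sample function add_one are hypothetical names introduced here for illustration. It assumes you want the string to contain exactly one top-level function definition, and it uses only the Python standard library.

```python
import ast

# Source text of a user-defined function, written as a string.
# add_one is a hypothetical example, not a function from this guide.
src = '''def add_one(x):
    return x + 1'''

def check_function_source(source):
    """Return the function name if source holds exactly one top-level def."""
    tree = ast.parse(source)  # raises SyntaxError if the text is not valid Python
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.FunctionDef):
        raise ValueError("expected a single top-level function definition")
    return tree.body[0].name

# Validate the string, then run the function locally as a quick smoke test.
name = check_function_source(src)
namespace = {}
exec(src, namespace)
result = namespace[name](41)
print(name, result)
```

A string that passes this check could then be passed as the func argument of oml.script.create. The check only confirms that the text parses and defines one function; it does not confirm that the function's imports are available in the database Python environment.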
Example 13-11 Using the oml.script.create Function
This example stores two user-defined Python functions in the script repository. It then lists the contents of the script repository using different arguments to the oml.script.dir function.
Load the iris data set from the scikit-learn library and build a pandas DataFrame from it. Use the oml.create function to create the IRIS database table and the proxy object for the table.
%python
from sklearn import datasets
import pandas as pd
import oml
# Load the iris data set and create a pandas.DataFrame for it.
iris = datasets.load_iris()
# Create objects containing data for the user-defined functions to use.
x = pd.DataFrame(iris.data,
                 columns = ['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width'])
# Map the integer target codes to species names.
y = pd.DataFrame(list(map(lambda code:
                          {0: 'setosa', 1: 'versicolor', 2:'virginica'}[code], iris.target)),
                 columns = ['Species'])
# Create the IRIS database table and the proxy object for the table.
# Drop any existing IRIS table first; ignore the error if it does not exist.
try:
    oml.drop(table="IRIS")
except Exception:
    pass
oml_iris = oml.create(pd.concat([x, y], axis=1), table = 'IRIS')
Create a user-defined function build_lm1 and use the oml.script.create function to store it in the OML4Py script repository. The first argument, "build_lm1", is a string that specifies the name of the user-defined function. The argument func=build_lm1 is the Python function to run, supplied here as a string that contains the function definition. Run the user-defined Python function in embedded Python execution.
%python
# Define a function.
build_lm1 = '''def build_lm1(dat):
    from sklearn import linear_model
    regr = linear_model.LinearRegression()
    import pandas as pd
    dat = pd.get_dummies(dat, drop_first=True)
    X = dat[["Sepal_Width", "Petal_Length", "Petal_Width", "Species_versicolor", "Species_virginica"]]
    y = dat[["Sepal_Length"]]
    regr.fit(X, y)
    return regr'''
# Create a private user-defined Python function.
oml.script.create("build_lm1", func=build_lm1, overwrite=True)
# Run the user-defined Python function in embedded Python execution
res = oml.table_apply(oml_iris, func="build_lm1", oml_input_type="pandas.DataFrame")
res
res.coef_
The output is the following:
array([[ 0.49588894, 0.82924391, -0.31515517, -0.72356196, -1.02349781]])
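The five coefficients correspond to the five predictor columns selected in build_lm1, two of which are created by pd.get_dummies. As a small local sketch of that encoding step, the following uses a three-row stand-in for the IRIS table (the rows are illustrative only) to show which indicator columns drop_first=True produces.

```python
import pandas as pd

# A tiny stand-in for the IRIS table: one numeric column and the categorical
# Species column. The rows are illustrative, not the full data set.
dat = pd.DataFrame({
    "Sepal_Length": [5.1, 7.0, 6.3],
    "Species": ["setosa", "versicolor", "virginica"],
})

# drop_first=True drops the first category level (setosa), which becomes the
# baseline; each remaining level gets its own indicator column.
encoded = pd.get_dummies(dat, drop_first=True)
columns = list(encoded.columns)
print(columns)
```

The setosa level has no column of its own, which is why build_lm1 selects only Species_versicolor and Species_virginica alongside the three numeric predictors.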
Define another user-defined function, build_lm2, and store the function as a global script in the OML4Py script repository. Run the user-defined Python function in embedded Python execution.
%python
# Define another function
build_lm2 = '''def build_lm2(dat):
    from sklearn import linear_model
    regr = linear_model.LinearRegression()
    X = dat[["Petal_Width"]]
    y = dat[["Petal_Length"]]
    regr.fit(X, y)
    return regr'''
# Save the function as a global script to the script repository, overwriting any existing function with the same name.
oml.script.create("build_lm2", func=build_lm2, is_global=True, overwrite=True)
res = oml.table_apply(oml_iris, func="build_lm2", oml_input_type="pandas.DataFrame")
res
The output is the following:
LinearRegression()
List the user-defined Python functions in the script repository available to the current user only.
%python
oml.script.dir()
The output is similar to the following:
name ... date
0 build_lm1 ... 2022-12-15 19:02:44
1 build_mod ... 2022-12-12 23:02:31
2 myFitMultiple ... 2022-12-14 22:30:43
3 sample_iris_table ... 2022-12-14 22:21:24
[4 rows x 4 columns]
List all of the user-defined Python functions available to the current user.
%python
oml.script.dir(sctype='all')
The output is similar to the following:
owner ... date
0 PYQSYS ... 2022-02-11 06:06:44
1 PYQSYS ... 2022-10-19 16:59:50
2 PYQSYS ... 2022-10-19 16:59:52
3 PYQSYS ... 2022-10-19 16:59:53
List the user-defined Python functions available to all users.
%python
oml.script.dir(sctype='global')
The output is similar to the following:
name ... date
0 GLBLM ... 2022-02-11 06:06:44
1 RandomRedDots ... 2022-10-19 16:59:50
2 RandomRedDots2 ... 2022-10-19 16:59:52
3 RandomRedDots3 ... 2022-10-19 16:59:53
4 TEST ... 2021-08-13 17:37:02
5 TEST4 ... 2021-08-13 17:42:49
6 TEST_FUN ... 2021-08-13 22:38:54