2.1 What Is Oracle Machine Learning for Python

Oracle Machine Learning for Python (OML4Py) provides a Python API to Oracle’s in-database machine learning algorithms, enabling scalable, high-performance model building and scoring directly where the data resides. In addition, OML4Py supports data transformations and statistical, machine learning, and graphical analysis on data stored in or accessible through Oracle AI Database. You can run user-defined Python functions through database-spawned and controlled Python engines, with optional built-in data-parallelism and task-parallelism. This embedded execution functionality enables calling user-defined functions from SQL, and on Autonomous Database, from REST. OML4Py also supports Automated Machine Learning (AutoML) for algorithm and feature selection, and model tuning and selection. You can further augment the built-in functionality with third-party packages from the Python ecosystem.

OML4Py is a Python module that enables Python users to manipulate data in database tables and views using Python syntax. OML4Py functions and methods transparently translate a select set of Python functions into SQL for in-database execution.
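
The idea of translating Python syntax into SQL can be illustrated with a toy sketch. The example below overloads Python operators on a proxy object, so that filtering and column selection build a SQL string instead of touching any data. The `TableProxy`, `Column`, and `Condition` classes and the `to_sql` method are hypothetical names introduced for illustration only. They are not part of the OML4Py API, and OML4Py's own translation is internal and far more general.

```python
class Condition:
    """A SQL predicate produced by a Python comparison (hypothetical helper)."""

    def __init__(self, sql):
        self.sql = sql


class Column:
    """Stands in for a database column; comparisons yield SQL, not booleans."""

    def __init__(self, name):
        self.name = name

    def _compare(self, op, value):
        if isinstance(value, str):
            # Quote string literals and escape embedded single quotes.
            literal = "'" + value.replace("'", "''") + "'"
        else:
            literal = repr(value)
        return Condition(f"{self.name} {op} {literal}")

    def __gt__(self, value):
        return self._compare(">", value)

    def __lt__(self, value):
        return self._compare("<", value)

    def __eq__(self, value):
        return self._compare("=", value)


class TableProxy:
    """Toy proxy for a table: indexing builds a query, nothing is fetched."""

    def __init__(self, table, columns, where=None):
        self.table = table
        self.columns = list(columns)
        self.where = where

    def __getitem__(self, key):
        if isinstance(key, Condition):
            # Row filter: combine with any existing filter using AND.
            where = key.sql if self.where is None else f"({self.where}) AND ({key.sql})"
            return TableProxy(self.table, self.columns, where)
        if isinstance(key, str):
            # Single column reference, used to build comparisons.
            if key not in self.columns:
                raise KeyError(key)
            return Column(key)
        # List of column names: projection.
        missing = [c for c in key if c not in self.columns]
        if missing:
            raise KeyError(missing)
        return TableProxy(self.table, key, self.where)

    def to_sql(self):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.where:
            sql += f" WHERE {self.where}"
        return sql


# Familiar DataFrame-style syntax on the Python side...
iris = TableProxy("IRIS", ["SEPAL_LENGTH", "PETAL_LENGTH", "SPECIES"])
query = iris[iris["PETAL_LENGTH"] > 1.5][["SEPAL_LENGTH", "SPECIES"]]

# ...becomes a single SQL statement that the database could run.
print(query.to_sql())
```

In this sketch, no rows are moved to the Python session when the filter and projection are written. Work is deferred until a statement is sent to the database, which is the general pattern that lets in-database execution scale.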

OML4Py is available in the following Oracle AI Database environments:

  • OML4Py is available in the Python interpreter in Oracle Machine Learning Notebooks in your Oracle Autonomous Database. For more information, see Use the Python Interpreter in a Notebook Paragraph in Using Oracle Machine Learning Notebooks.

  • OML4Py is available through an OML4Py client connection to an on-premises Oracle AI Database instance.

    For this environment, you must install Python, the required Python libraries, and the OML4Py server components in the database, and you must install the OML4Py client. See Install OML4Py for On-Premises Databases.

Designed for problems involving both large and small volumes of data, OML4Py integrates Python with the database. With OML4Py, you can do the following:

  • Run overloaded Python functions and use native Python syntax to manipulate in-database data, without having to learn SQL.

  • Use Automated Machine Learning (AutoML) to enhance user productivity and machine learning results through automated algorithm and feature selection, as well as model tuning and selection.

  • Use Embedded Python Execution to run user-defined Python functions in Python engines spawned and managed by the database environment. The user-defined functions and data are automatically loaded to the engines as required, including when data-parallel and task-parallel execution is enabled. Develop, refine, and deploy user-defined Python functions and machine learning models that leverage the parallelism and scalability of the database to automate data preparation and machine learning.

  • Use a natural Python interface to build in-database machine learning models.
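
The data-parallel behavior of Embedded Python Execution described in the list above can be approximated locally to show the shape of a user-defined function: the function receives one partition of the data and returns a result, and partitions are handled by separate workers. The sketch below uses only the Python standard library. The names `group_apply_local` and `mean_petal_length` are hypothetical, the sample rows are made up for illustration, and the thread pool merely stands in for the Python engines that the database would spawn and manage. The code does not call OML4Py.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def mean_petal_length(rows):
    """User-defined function: summarize one partition of rows."""
    return {
        "count": len(rows),
        "mean_petal_length": mean(r["PETAL_LENGTH"] for r in rows),
    }


def group_apply_local(rows, index, func, max_workers=2):
    """Local stand-in for data-parallel execution (hypothetical helper).

    Partitions rows by the value of the `index` column, then applies
    `func` to each partition in a separate worker thread.
    """
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[index]].append(row)

    keys = sorted(partitions)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with keys.
        results = list(pool.map(func, (partitions[k] for k in keys)))
    return dict(zip(keys, results))


# Illustrative data only; in the database, the partitions would come from a table.
rows = [
    {"SPECIES": "setosa", "PETAL_LENGTH": 1.0},
    {"SPECIES": "setosa", "PETAL_LENGTH": 2.0},
    {"SPECIES": "virginica", "PETAL_LENGTH": 6.0},
]

summary = group_apply_local(rows, "SPECIES", mean_petal_length)
print(summary)
```

Keeping the user-defined function free of any dependence on how partitions are created or scheduled is the design point here: the same function body can then be handed to an execution environment that supplies the data and the parallelism.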