Import Pretrained Models in ONNX Format

This topic describes about ONNX Pipeline Models for converting pretrained models to ONNX format.

Open Neural Network Exchange or ONNX is an open standard format of machine learning models. Consider the following use cases:
  • Using complex media input such as text or image for similarity search
  • Perform text classification
  • Perform reranking

You need vector embedding of the media input to perform all the tasks mentioned above. ONNX pipeline models allows you to convert text and/or image models into ONNX format if they are not already in ONNX format, import the ONNX format models into Oracle Database, and generate embeddings from your data within the database. The ONNX format models imported to the database could be used for tasks such as search and classification.

Pretrained models are models that are already trained on a media data (text, image, etc.) and saved to a storage format for future use. Hugging Face is the most popular platform that hosts pretrained models typically created with PyTorch.

The Python package oml.utils contains three classes: ONNXPipeline, ONNXPipelineConfig, and MiningFunction. The package handles ONNX pipeline generation and export, while also incorporating the necessary pre- and post-processing steps.
  • ONNXPipeline : Allows you to import a pretrained model (Your own pretrained model or one of the supported models from Hugging Face)
  • ONNXPipelineConfig : Allows you to configure the attributes of pretrained model (Your own pretrained model or one of the supported models from Hugging Face)
  • MiningFunction: Allows you to choose from one of the following mining options:
    • EMBEDDING : Corresponds to text and image embedding
    • CLASSIFICATION : Corresponds to text classification
    • REGRESSION : Corresponds to re-ranking

WARNING:

EmbeddingModel and EmbeddingModelConfig are deprecated. Instead, please use ONNXPipeline and ONNXPipelineConfig respectively. The details of the deprecated classes can be found in Python Classes to Convert Pretrained Models to ONNX Models (Deprecated). If a you choose to use a deprecated class, a warning message will be shown indicating that the classes will be removed in the future and advising the user to switch to the new class.

The ONNX pipeline models are available for text embedding, image embedding, multi-modal embedding, text classification and re-ranking.