Generate Multi-modal Embeddings Using CLIP

This section provides end-to-end instructions from installing the OML4Py client to generating multi-modal embeddings using CLIP.

These instructions assume you have configured your Oracle Linux 8 repo in /etc/yum.repos.d, configured a Wallet if using an Autonomous Database, and set up a proxy if needed.
  1. Install Python:
    sudo yum install libffi-devel openssl openssl-devel tk-devel xz-devel zlib-devel bzip2-devel readline-devel libuuid-devel ncurses-devel libaio
    mkdir -p $HOME/python
    wget https://www.python.org/ftp/python/3.12.3/Python-3.12.3.tgz
    tar -xvzf Python-3.12.3.tgz --strip-components=1 -C $HOME/python
    cd $HOME/python
    ./configure --prefix=$HOME/python
    make clean; make
    make altinstall
  2. Set variables PYTHONHOME, PATH, and LD_LIBRARY_PATH:
    export PYTHONHOME=$HOME/python
    export PATH=$PYTHONHOME/bin:$PATH
    export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
  3. Create symlinks for python3 and pip3:
    cd $HOME/python/bin
    ln -s python3.12 python3
    ln -s pip3.12 pip3
  4. Install the Oracle Instant Client if you will export embedded models to the database from Python. If you will export models to a file instead, skip steps 4 and 5 and see the note under the environment variables in step 6:
    cd $HOME
    wget https://download.oracle.com/otn_software/linux/instantclient/2340000/instantclient-basic-linux.x64-23.4.0.24.05.zip
    unzip instantclient-basic-linux.x64-23.4.0.24.05.zip
  5. Set variable LD_LIBRARY_PATH:
    export LD_LIBRARY_PATH=$HOME/instantclient_23_4:$LD_LIBRARY_PATH
  6. Create an environment file, for example env.sh, that defines the Python and Oracle Instant Client environment variables, and source this file before each OML4Py client session. Alternatively, add the environment variable definitions to .bashrc so they are set whenever the user logs in to the Linux machine.
    # Environment variables for Python
    export PYTHONHOME=$HOME/python
    export PATH=$PYTHONHOME/bin:$PATH
    export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH

    Note:

    Set the following Oracle Instant Client environment variable only if you installed the Oracle Instant Client for exporting models to the database.

    export LD_LIBRARY_PATH=$HOME/instantclient_23_4:$LD_LIBRARY_PATH
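    For reference, a complete env.sh covering both cases might look like the following sketch (the Instant Client line applies only if you completed steps 4 and 5):

    ```shell
    # env.sh -- source this file before each OML4Py client session
    export PYTHONHOME=$HOME/python
    export PATH=$PYTHONHOME/bin:$PATH
    export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
    # Only if the Oracle Instant Client is installed (steps 4 and 5):
    export LD_LIBRARY_PATH=$HOME/instantclient_23_4:$LD_LIBRARY_PATH
    ```

    Source it with "source $HOME/env.sh" before starting Python.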
  7. Create a file named requirements.txt that contains the required third-party packages listed below.
    --extra-index-url https://download.pytorch.org/whl/cpu
    pandas==2.1.1
    setuptools==68.0.0
    scipy==1.12.0
    matplotlib==3.8.4
    oracledb==2.2.0
    scikit-learn==1.4.1.post1
    numpy==1.26.4
    onnxruntime==1.17.0
    onnxruntime-extensions==0.10.1
    onnx==1.16.0
    torch==2.2.0+cpu
    transformers==4.38.1
    sentencepiece==0.2.0
  8. Upgrade pip3 and install the packages listed in requirements.txt.
    pip3 install --upgrade pip
    pip3 install -r requirements.txt
  9. Install the OML4Py client. Download the OML4Py 2.0.1 client from the OML4Py download page and upload it to the Linux machine.
    unzip oml4py-client-linux-x86_64-2.0.1.zip
    pip3 install client/oml-2.0-cp312-cp312-linux_x86_64.whl
  10. Get a list of all preconfigured models. Start Python and import ONNXPipelineConfig from oml.utils.
    python3
    
    from oml.utils import ONNXPipelineConfig
    
    ONNXPipelineConfig.show_preconfigured()
    ['sentence-transformers/all-mpnet-base-v2',
    'sentence-transformers/all-MiniLM-L6-v2',
    'sentence-transformers/multi-qa-MiniLM-L6-cos-v1',
    'sentence-transformers/distiluse-base-multilingual-cased-v2',
    'sentence-transformers/all-MiniLM-L12-v2',
    'BAAI/bge-small-en-v1.5',
    'BAAI/bge-base-en-v1.5',
    'taylorAI/bge-micro-v2',
    'intfloat/e5-small-v2',
    'intfloat/e5-base-v2',
    'thenlper/gte-base',
    'thenlper/gte-small',
    'TaylorAI/gte-tiny',
    'sentence-transformers/paraphrase-multilingual-mpnet-base-v2',
    'intfloat/multilingual-e5-base',
    'intfloat/multilingual-e5-small',
    'sentence-transformers/stsb-xlm-r-multilingual',
    'Snowflake/snowflake-arctic-embed-xs',
    'Snowflake/snowflake-arctic-embed-s',
    'Snowflake/snowflake-arctic-embed-m',
    'mixedbread-ai/mxbai-embed-large-v1',
    'openai/clip-vit-large-patch14',
    'google/vit-base-patch16-224',
    'microsoft/resnet-18',
    'microsoft/resnet-50',
    'WinKawaks/vit-tiny-patch16-224',
    'Falconsai/nsfw_image_detection',
    'WinKawaks/vit-small-patch16-224',
    'nateraw/vit-age-classifier',
    'rizvandwiki/gender-classification',
    'AdamCodd/vit-base-nsfw-detector',
    'trpakov/vit-face-expression',
    'BAAI/bge-reranker-base']
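    The list mixes text-only and image-capable models. As a quick illustration, plain Python can filter the returned list for the CLIP model used in the remaining steps (the short list below is a subset of the output above):

    ```python
    # Subset of the list returned by ONNXPipelineConfig.show_preconfigured()
    preconfigured = [
        'sentence-transformers/all-MiniLM-L6-v2',
        'openai/clip-vit-large-patch14',
        'microsoft/resnet-50',
    ]

    # Keep only CLIP-style multi-modal models
    clip_models = [m for m in preconfigured if 'clip' in m.lower()]
    print(clip_models)  # → ['openai/clip-vit-large-patch14']
    ```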
  11. Use OML4Py to load a multi-modal model into the database.
    • To use a method other than OML4Py, skip this step and proceed to step 12.
    Export a preconfigured embedding model to the database. Import the oml library, and import ONNXPipeline and ONNXPipelineConfig from oml.utils. In the following steps, replace the placeholders with your own credentials.
    import oml
    from oml.utils import ONNXPipeline, ONNXPipelineConfig

    If your Oracle Database is on premises, set embedded mode to false. This step is neither supported nor required for Oracle Autonomous Database.

    oml.core.methods.__embed__ = False

    Create a database connection.

    • Using Oracle Database on premises:
      oml.connect("<user>", "<password>", port=<port number>, host="<hostname>",
      service_name="<service name>")
      
      pipeline = ONNXPipeline(model_name="openai/clip-vit-large-patch14")
      pipeline.export2db("CLIP")
    • Using Oracle Autonomous Database:
      oml.connect(user="<user>", password="<password>", dsn="myadb_low")
      
      pipeline = ONNXPipeline(model_name="openai/clip-vit-large-patch14")
      pipeline.export2db("CLIP")
      

    Once this step completes, two models named "CLIP_TXT" and "CLIP_IMG" are loaded in the database.

  12. Export a preconfigured embedding model to a local file.

    This exports the ONNX-format models to your local file system as clip_txt.onnx and clip_img.onnx:

    # Export to file
    pipeline = ONNXPipeline(model_name="openai/clip-vit-large-patch14")
    pipeline.export2file("clip", output_dir="/tmp/models")

    Move the ONNX files to a directory on the database server, then create a directory on the file system and in the database for the import.

    mkdir -p /tmp/models
    sqlplus / as sysdba
    alter session set container=<name of pluggable database>;

    Apply the necessary permissions and grants.

    -- directory to store ONNX files for import
    CREATE DIRECTORY ONNX_IMPORT AS '/tmp/models';
    -- grant your OML user read and write permissions on the directory
    GRANT READ, WRITE ON DIRECTORY ONNX_IMPORT to OMLUSER;
    -- grant to allow user to import the model
    GRANT CREATE MINING MODEL TO OMLUSER;

    Use the DBMS_VECTOR.LOAD_ONNX_MODEL procedure to load the models in your OML user schema. In this example, the procedure loads the ONNX model files named clip_txt.onnx and clip_img.onnx from the ONNX_IMPORT directory into the database as models named CLIP_TXT and CLIP_IMG, respectively.

    BEGIN
        DBMS_VECTOR.LOAD_ONNX_MODEL(
        directory => 'ONNX_IMPORT',
        file_name => 'clip_txt.onnx',
        model_name => 'CLIP_TXT',
        metadata => JSON('{"function" : "embedding", "embeddingOutput" : "embedding", "input": {"input": ["DATA"]}}'));
    END;
    /
    BEGIN
        DBMS_VECTOR.LOAD_ONNX_MODEL(
        directory => 'ONNX_IMPORT',
        file_name => 'clip_img.onnx',
        model_name => 'CLIP_IMG',
        metadata => JSON('{"function" : "embedding", "embeddingOutput" : "embedding", "input": {"input": ["DATA"]}}'));
    END;
    /
  13. Verify the models exist using SQL.
    sqlplus $USER/pass@PDBNAME
    SELECT model_name, algorithm, mining_function
    FROM user_mining_models
    WHERE model_name='CLIP_TXT' OR model_name='CLIP_IMG';
    --------------------------------------------------------
    MODEL_NAME          ALGORITHM            MINING_FUNCTION
    --------------------------------------------------------
    CLIP_TXT            ONNX                 EMBEDDING
    CLIP_IMG            ONNX                 EMBEDDING
  14. Generate embeddings with the exported models using Python.
    from oracledb import DB_TYPE_BLOB

    # Read the image and bind it as a BLOB
    with open('cat.jpg', 'rb') as f:
        img = f.read()
    cr = oml.cursor()
    blob = cr.var(DB_TYPE_BLOB)
    blob.setvalue(0, img)

    # Text embedding from the CLIP_TXT model
    data = cr.execute("select vector_embedding(CLIP_TXT using 'RES' as DATA) from dual")
    txt_embed = data.fetchall()

    # Image embedding from the CLIP_IMG model
    data = cr.execute("select vector_embedding(CLIP_IMG using to_blob(:1) as DATA) from dual", [blob])
    img_embed = data.fetchall()
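    The Python type of each fetched embedding depends on the oracledb driver and database version. If an embedding comes back as a bracketed string in scientific notation (the format shown in the SQL excerpts later in this step), the standard json module can parse it; the two-component literal below is a hypothetical stand-in for a full CLIP vector:

    ```python
    import json

    # Hypothetical two-component stand-in for a full CLIP embedding string
    raw = '[2.86132172E-002,-5.59654366E-003]'
    vec = json.loads(raw)  # scientific-notation numbers are valid JSON
    print(len(vec), vec[0])  # → 2 0.0286132172
    ```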

    Calculate similarity between an image and text using Python:

    from oracledb import DB_TYPE_BLOB
    with open('cat.jpg', 'rb') as f:
        img = f.read()
    cr = oml.cursor()
    blob = cr.var(DB_TYPE_BLOB)
    blob.setvalue(0, img)
    data = cr.execute("""select 1-vector_distance(vector_embedding(CLIP_TXT using 'RES' as DATA), 
                        vector_embedding(CLIP_IMG using to_blob(:1) as DATA)) from dual""", [blob])
    data.fetchall()

    Result:

    [(0.1637756726800217,)]
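    The figure above is 1 minus the vector distance. Assuming the default COSINE metric of VECTOR_DISTANCE, this corresponds to the ordinary cosine similarity, which can be sketched in plain Python (the three-component vectors are toy stand-ins, not real CLIP embeddings):

    ```python
    import math

    def cosine_similarity(a, b):
        # dot(a, b) / (|a| * |b|)
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    # Toy stand-ins for the CLIP text and image embeddings
    txt = [0.1, 0.3, -0.2]
    img = [0.05, 0.25, -0.1]
    print(round(cosine_similarity(txt, img), 4))  # → 0.9759
    ```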

    Generate embeddings with the exported models using SQL:

    • SELECT VECTOR_EMBEDDING(CLIP_TXT USING 'RES' as DATA) AS embedding;

      An example of the results is shown in the following excerpt:

      EMBEDDING
      --------------------------------------------------------------------------------
      [2.86132172E-002,-5.59654366E-003,8.66401661E-003,-1.4299524E-002,1.02012949E-00
      2,6.00034464E-003,1.86244473E-002,-7.81036681E-003, ...
    • SELECT VECTOR_EMBEDDING(CLIP_IMG USING TO_BLOB(BFILENAME('ONNX_IMPORT', 'cat.jpg')) as DATA) AS embedding;

      An example of the results is shown in the following excerpt:

      EMBEDDING
      --------------------------------------------------------------------------------
      [-2.2028232E-002,1.29058748E-003,-2.0222881E-004,4.58140159E-003,1.98919605E-002
      ,-9.51210782E-003,8.22519697E-003,1.06151737E-002, ...

    Calculate the similarity between an image and text using SQL:

    SELECT 1-VECTOR_DISTANCE(VECTOR_EMBEDDING(
                CLIP_IMG USING TO_BLOB(BFILENAME('ONNX_IMPORT', 'cat.jpg')) as DATA),
                VECTOR_EMBEDDING(CLIP_TXT USING 'RES' as DATA)) AS similarity;

    Example result:

    SIMILARITY
    ----------
    1.638E-001