Generate Multi-modal Embeddings Using CLIP

This section provides end-to-end instructions from installing the OML4Py client to generating multi-modal embeddings using CLIP.

These instructions assume you have configured your Oracle Linux 8 repo in /etc/yum.repos.d, configured a Wallet if using an Autonomous Database, and set up a proxy if needed.
  1. Install Python:
    sudo yum install libffi-devel openssl openssl-devel tk-devel xz-devel zlib-devel bzip2-devel readline-devel libuuid-devel ncurses-devel libaio
    mkdir -p $HOME/python
    wget https://www.python.org/ftp/python/3.12.3/Python-3.12.3.tgz
    tar -xvzf Python-3.12.3.tgz --strip-components=1 -C $HOME/python
    cd $HOME/python
    ./configure --prefix=$HOME/python
    make clean; make
    make altinstall
  2. Set variables PYTHONHOME, PATH, and LD_LIBRARY_PATH:
    export PYTHONHOME=$HOME/python
    export PATH=$PYTHONHOME/bin:$PATH
    export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
  3. Create symlinks for python3 and pip3:
    cd $HOME/python/bin
    ln -s python3.12 python3
    ln -s pip3.12 pip3
  4. Install the Oracle Instant Client if you will export embedded models to the database from Python. If you will export models to a file instead, skip steps 4 and 5 and see the note under the environment variables in step 6:
    cd $HOME
    wget https://download.oracle.com/otn_software/linux/instantclient/2340000/instantclient-basic-linux.x64-23.4.0.24.05.zip
    unzip instantclient-basic-linux.x64-23.4.0.24.05.zip
  5. Set variable LD_LIBRARY_PATH:
    export LD_LIBRARY_PATH=$HOME/instantclient_23_4:$LD_LIBRARY_PATH
  6. Create an environment file, for example env.sh, that defines the Python and Oracle Instant Client environment variables, and source this file before each OML4Py client session. Alternatively, add the environment variable definitions to .bashrc so they are set whenever the user logs in to the Linux machine.
    # Environment variables for Python
    export PYTHONHOME=$HOME/python
    export PATH=$PYTHONHOME/bin:$PATH
    export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH

    Note:

    Set the following Oracle Instant Client environment variable only if you installed the Oracle Instant Client for exporting models to the database.

    export LD_LIBRARY_PATH=$HOME/instantclient_23_4:$LD_LIBRARY_PATH
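    For reference, a complete env.sh covering both cases might look like the following sketch (the Instant Client line applies only if you completed steps 4 and 5):

    ```shell
    # env.sh -- source this file before each OML4Py client session
    export PYTHONHOME=$HOME/python
    export PATH=$PYTHONHOME/bin:$PATH
    export LD_LIBRARY_PATH=$PYTHONHOME/lib:$LD_LIBRARY_PATH
    # Only if the Oracle Instant Client is installed (steps 4 and 5):
    export LD_LIBRARY_PATH=$HOME/instantclient_23_4:$LD_LIBRARY_PATH
    ```

    Source it with "source $HOME/env.sh" before starting Python.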
  7. Create a file named requirements.txt that contains the required third-party packages listed below.
    --extra-index-url https://download.pytorch.org/whl/cpu
    pandas==2.1.1
    setuptools==68.0.0
    scipy==1.12.0
    matplotlib==3.8.4
    oracledb==2.2.0
    scikit-learn==1.4.1.post1
    numpy==1.26.4
    onnxruntime==1.17.0
    onnxruntime-extensions==0.10.1
    onnx==1.16.0
    torch==2.2.0+cpu
    transformers==4.38.1
    sentencepiece==0.2.0
  8. Upgrade pip3 and install the packages listed in requirements.txt.
    pip3 install --upgrade pip
    pip3 install -r requirements.txt
  9. Install the OML4Py client. Download the OML4Py 2.0.1 client from the OML4Py download page and upload it to the Linux machine.
    unzip oml4py-client-linux-x86_64-2.0.1.zip
    pip3 install client/oml-2.0-cp312-cp312-linux_x86_64.whl
  10. Get a list of all preconfigured models. Start Python and import ONNXPipelineConfig from oml.utils.
    python3
    
    from oml.utils import ONNXPipelineConfig
    
    ONNXPipelineConfig.show_preconfigured()
    ['sentence-transformers/all-mpnet-base-v2',
    'sentence-transformers/all-MiniLM-L6-v2',
    'sentence-transformers/multi-qa-MiniLM-L6-cos-v1',
    'sentence-transformers/distiluse-base-multilingual-cased-v2',
    'sentence-transformers/all-MiniLM-L12-v2',
    'BAAI/bge-small-en-v1.5',
    'BAAI/bge-base-en-v1.5',
    'taylorAI/bge-micro-v2',
    'intfloat/e5-small-v2',
    'intfloat/e5-base-v2',
    'thenlper/gte-base',
    'thenlper/gte-small',
    'TaylorAI/gte-tiny',
    'sentence-transformers/paraphrase-multilingual-mpnet-base-v2',
    'intfloat/multilingual-e5-base',
    'intfloat/multilingual-e5-small',
    'sentence-transformers/stsb-xlm-r-multilingual',
    'Snowflake/snowflake-arctic-embed-xs',
    'Snowflake/snowflake-arctic-embed-s',
    'Snowflake/snowflake-arctic-embed-m',
    'mixedbread-ai/mxbai-embed-large-v1',
    'openai/clip-vit-large-patch14',
    'google/vit-base-patch16-224',
    'microsoft/resnet-18',
    'microsoft/resnet-50',
    'WinKawaks/vit-tiny-patch16-224',
    'Falconsai/nsfw_image_detection',
    'WinKawaks/vit-small-patch16-224',
    'nateraw/vit-age-classifier',
    'rizvandwiki/gender-classification',
    'AdamCodd/vit-base-nsfw-detector',
    'trpakov/vit-face-expression',
    'BAAI/bge-reranker-base']
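    The list mixes text-only and image-capable models. As a quick illustration, plain Python can filter the returned list for the CLIP model used in the remaining steps (the short list below is a subset of the output above):

    ```python
    # Subset of the list returned by ONNXPipelineConfig.show_preconfigured()
    preconfigured = [
        'sentence-transformers/all-MiniLM-L6-v2',
        'openai/clip-vit-large-patch14',
        'microsoft/resnet-50',
    ]

    # Keep only CLIP-style multi-modal models
    clip_models = [m for m in preconfigured if 'clip' in m.lower()]
    print(clip_models)  # → ['openai/clip-vit-large-patch14']
    ```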
  11. Use OML4Py to load a multi-modal model into the database.
    • To use a method other than OML4Py, skip this step and proceed to step 12.
    Export a preconfigured embedding model to the database. Import the oml library, and import ONNXPipeline and ONNXPipelineConfig from oml.utils. In the following steps, replace the placeholders with your own credentials.
    import oml
    from oml.utils import ONNXPipeline, ONNXPipelineConfig

    If your Oracle Database is on premises, set embedded mode to false. This step is neither supported nor required for Oracle Autonomous Database.

    oml.core.methods.__embed__ = False

    Create a database connection.

    • Using Oracle Database on premises:
      oml.connect("<user>", "<password>", port=<port number>, host="<hostname>",
      service_name="<service name>")
      
      pipeline = ONNXPipeline(model_name="openai/clip-vit-large-patch14")
      pipeline.export2db("CLIP")
    • Using Oracle Autonomous Database:
      oml.connect(user="<user>", password="<password>", dsn="myadb_low")
      
      pipeline = ONNXPipeline(model_name="openai/clip-vit-large-patch14")
      pipeline.export2db("CLIP")
      

    Once this step completes, two models named "CLIP_TXT" and "CLIP_IMG" are loaded in the database.

  12. Export a preconfigured embedding model to a local file.

    This exports the ONNX-format models to your local file system as clip_txt.onnx and clip_img.onnx:

    # Export to file
    pipeline = ONNXPipeline(model_name="openai/clip-vit-large-patch14")
    pipeline.export2file("clip", output_dir="/tmp/models")

    Move the ONNX files to a directory on the database server, then create a directory on the file system and in the database for the import.

    mkdir -p /tmp/models
    sqlplus / as sysdba
    alter session set container=<name of pluggable database>;

    Apply the necessary permissions and grants.

    -- directory to store ONNX files for import
    CREATE DIRECTORY ONNX_IMPORT AS '/tmp/models';
    -- grant your OML user read and write permissions on the directory
    GRANT READ, WRITE ON DIRECTORY ONNX_IMPORT to OMLUSER;
    -- grant to allow user to import the model
    GRANT CREATE MINING MODEL TO OMLUSER;

    Use the DBMS_VECTOR.LOAD_ONNX_MODEL procedure to load the models in your OML user schema. In this example, the procedure loads the ONNX model files named clip_txt.onnx and clip_img.onnx from the ONNX_IMPORT directory into the database as models named CLIP_TXT and CLIP_IMG, respectively.

    BEGIN
        DBMS_VECTOR.LOAD_ONNX_MODEL(
        directory => 'ONNX_IMPORT',
        file_name => 'clip_txt.onnx',
        model_name => 'CLIP_TXT',
        metadata => JSON('{"function" : "embedding", "embeddingOutput" : "embedding", "input": {"input": ["DATA"]}}'));
    END;
    /
    BEGIN
        DBMS_VECTOR.LOAD_ONNX_MODEL(
        directory => 'ONNX_IMPORT',
        file_name => 'clip_img.onnx',
        model_name => 'CLIP_IMG',
        metadata => JSON('{"function" : "embedding", "embeddingOutput" : "embedding", "input": {"input": ["DATA"]}}'));
    END;
    /
  13. Verify the models exist using SQL.
    sqlplus $USER/pass@PDBNAME
    SELECT model_name, algorithm, mining_function
    FROM user_mining_models
    WHERE model_name='CLIP_TXT' OR model_name='CLIP_IMG';
    --------------------------------------------------------
    MODEL_NAME          ALGORITHM            MINING_FUNCTION
    --------------------------------------------------------
    CLIP_TXT            ONNX                 EMBEDDING
    CLIP_IMG            ONNX                 EMBEDDING
  14. Generate embeddings with the exported models using Python.
    from oracledb import DB_TYPE_BLOB

    # Read the image and bind it as a BLOB
    with open('cat.jpg', 'rb') as f:
        img = f.read()
    cr = oml.cursor()
    blob = cr.var(DB_TYPE_BLOB)
    blob.setvalue(0, img)

    # Text embedding from the CLIP_TXT model
    data = cr.execute("select vector_embedding(CLIP_TXT using 'RES' as DATA) from dual")
    txt_embed = data.fetchall()

    # Image embedding from the CLIP_IMG model
    data = cr.execute("select vector_embedding(CLIP_IMG using to_blob(:1) as DATA) from dual", [blob])
    img_embed = data.fetchall()
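    The Python type of each fetched embedding depends on the oracledb driver and database version. If an embedding comes back as a bracketed string in scientific notation (the format shown in the SQL excerpts later in this step), the standard json module can parse it; the two-component literal below is a hypothetical stand-in for a full CLIP vector:

    ```python
    import json

    # Hypothetical two-component stand-in for a full CLIP embedding string
    raw = '[2.86132172E-002,-5.59654366E-003]'
    vec = json.loads(raw)  # scientific-notation numbers are valid JSON
    print(len(vec), vec[0])  # → 2 0.0286132172
    ```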

    Calculate similarity between an image and text using Python:

    from oracledb import DB_TYPE_BLOB
    with open('cat.jpg', 'rb') as f:
        img = f.read()
    cr = oml.cursor()
    blob = cr.var(DB_TYPE_BLOB)
    blob.setvalue(0, img)
    data = cr.execute("""select 1-vector_distance(vector_embedding(CLIP_TXT using 'RES' as DATA), 
                        vector_embedding(CLIP_IMG using to_blob(:1) as DATA)) from dual""", [blob])
    data.fetchall()

    Result:

    [(0.1637756726800217,)]
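    The figure above is 1 minus the vector distance. Assuming the default COSINE metric of VECTOR_DISTANCE, this corresponds to the ordinary cosine similarity, which can be sketched in plain Python (the three-component vectors are toy stand-ins, not real CLIP embeddings):

    ```python
    import math

    def cosine_similarity(a, b):
        # dot(a, b) / (|a| * |b|)
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    # Toy stand-ins for the CLIP text and image embeddings
    txt = [0.1, 0.3, -0.2]
    img = [0.05, 0.25, -0.1]
    print(round(cosine_similarity(txt, img), 4))  # → 0.9759
    ```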

    Generate embeddings with the exported models using SQL:

    • SELECT VECTOR_EMBEDDING(CLIP_TXT USING 'RES' as DATA) AS embedding;

      An example of the results is shown in the following excerpt:

      EMBEDDING
      --------------------------------------------------------------------------------
      [2.86132172E-002,-5.59654366E-003,8.66401661E-003,-1.4299524E-002,1.02012949E-00
      2,6.00034464E-003,1.86244473E-002,-7.81036681E-003, ...
    • SELECT VECTOR_EMBEDDING(CLIP_IMG USING TO_BLOB(BFILENAME('ONNX_IMPORT', 'cat.jpg')) as DATA) AS embedding;

      An example of the results is shown in the following excerpt:

      EMBEDDING
      --------------------------------------------------------------------------------
      [-2.2028232E-002,1.29058748E-003,-2.0222881E-004,4.58140159E-003,1.98919605E-002
      ,-9.51210782E-003,8.22519697E-003,1.06151737E-002, ...

    Calculate the similarity between an image and text using SQL:

    SELECT 1-VECTOR_DISTANCE(VECTOR_EMBEDDING(
                CLIP_IMG USING TO_BLOB(BFILENAME('ONNX_IMPORT', 'cat.jpg')) as DATA),
                VECTOR_EMBEDDING(CLIP_TXT USING 'RES' as DATA)) AS similarity;

    Example result:

    SIMILARITY
    ----------
    1.638E-001