Artificial Intelligence Vectors and Semantic Search

Oracle Database 23ai introduces semantic search capabilities using Artificial Intelligence (AI) vector search. These capabilities include a new vector data type, vector indexes, and vector search SQL operators that allow the database to store semantic document content, images, and other unstructured data as vectors and then run fast performing similarity queries. The key innovation is that the database better understands user intent and the search context to find similar matches, rather than only find exact matches.

Building upon this capability, the Oracle database can use Generative AI constructs, such as Retrieval Augmented Generation (RAG) to combine large language models (LLMs) and private business data to respond to natural language questions. RAG provides higher accuracy and avoids exposing any of the private data to the LLM training data.

Oracle Database Vector Data Types

There are two Oracle database vector types: dense and sparse. Dense vectors were the first AI vector type Oracle introduced in Oracle Database 23ai. They are ideal for data sets in which nearly every dimension stores a non-zero value, meaning they hold much rich and complex information, patterns, and relationships. A great use case for dense vectors is semantic search when word meaning nuances better identify matches.

On the other hand, sparse vectors typically have a large number of dimensions and few non-zero dimension values. To improve efficiency, they only store non-zero values. That makes sparse vectors more storage and memory efficient. Moreover, ODP.NET can process them faster than dense vectors assuming the data profile has few non-zero dimension values. They are suited for other AI application types, such as text retrieval systems in which a search phrase identifies relevant matching documents.

ODP.NET Vectors

Starting with version 23.3.2, managed ODP.NET and ODP.NET Core support Oracle database vector data access as existing .NET or ODP.NET data types: arrays, strings, CLOBs, and VARCHAR2s. In managed ODP.NET and ODP.NET Core 23.8, new ODP.NET vector-specific types for dense vectors (OracleVector) and sparse vectors (OracleSparseVector) are introduced for richer user interfaces that more naturally and easily operate on vectors.

ODP.NET vector data types are flexible to use. They have one or more dimensions, up to 65,536 (64K) dimensions. Developers can define vectors with a fixed or variable number of dimensions. The numeric format of each dimension in a vector can be of the same numeric format or a different numeric format for each dimension:

Table 3-28 Dense Vector Numeric Formats

Vector Numeric Format Description Mapped .NET Type Mapped ODP.NET Type Size Examples

BINARY

packed UINT8 bytes where each dimension is a single bit

Byte

OracleVector

1 byte

Vector(1024, BINARY), Vector(1024, BINARY, DENSE)

INT8

8-bit integers

Int16

OracleVector

2 bytes

Vector(4, INT8), Vector(4, INT8, DENSE)

FLOAT32

32-bit floating point

Float

OracleVector

4 bytes

Vector(768, FLOAT32), Vector(768, FLOAT32, DENSE)

FLOAT64

64-bit floating point

Double

OracleVector

8 bytes

Vector(10000, FLOAT64), Vector(10000, FLOAT64, DENSE)

For example, a one dimension INT8 vector would be Vector(1, INT8). A four dimension FLOAT32 vector would be Vector(4, FLOAT32). A variable dimension FLOAT64 vector would be Vector(*, FLOAT64).

To specify that each vector dimension can be either an INT8, FLOAT32, or FLOAT64 data type, use the * notation, such as Vector(4, *) to represent a four dimensional vector. You may specify a vector as Vector(*, *) to have a variable number of dimensions and variable data type vector.

Table 3-29 Sparse Vector Numeric Formats

Vector Numeric Format Description Mapped .NET Type Mapped ODP.NET Type Size Example

INT8

8-bit integers

Int16

OracleSparseVector

2 bytes

Vector(4, INT8, SPARSE)

FLOAT32

32-bit integers

Float

OracleSparseVector

4 bytes

Vector(768, FLOAT32, SPARSE)

FLOAT64

64-bit integers

Double

OracleSparseVector

8 bytes

Vector(10000, FLOAT64, SPARSE)

ODP.NET OracleVector (dense) can use all four numeric formats. ODP.NET OracleSparseVector (sparse) can use all except BINARY.

ODP.NET can retrieve the database vector data type in .NET as any of the following:

Data Type ODP.NET or .NET Data Type Description

.NET numeric array

.NET

This array type can be Int16[], float[], or double[] depending on the vector's numeric format.

.NET byte array

.NET

Only Oracle dense vectors can be retrieved as .NET byte arrays. Oracle BINARY numeric format must be retrieved as a .NET byte array.

Note:

InvalidCastException is thrown for any Get*Array() methods, except GetByteArray() method, for BINARY format.

.NET string

.NET

This contains the vector representation in JSON format.

OracleString

ODP.NET

This contains the vector representation in JSON format.

OracleClob

ODP.NET

This contains the vector representation in JSON format.

OracleVector

ODP.NET

ODP.NET native dense vector data type. This is the base type for all ODP.NET native vector data types.

OracleSparseVector

ODP.NET

ODP.NET native sparse vector data type.

ODP.NET can bind vector data as the following .NET types for SQL or stored procedure execution:

Data Type .NET or ODP.NET Data Type Bind Type Description

OracleVector, OracleSparseVector

ODP.NET

OracleDbType.Vector

ODP.NET native vector data types. ODP.NET will bind the Vector data type as the OracleVector’s ProviderType.

OracleVector

ODP.NET

OracleDbType.Vector_Binary,

OracleDbType.Vector_Int8,

OracleDbType.Vector_Float32,

OracleDbType.Vector_Float64

ODP.NET native dense vector data type. ODP.NET will bind the Vector data type based on the binding type accordingly.

OracleSparseVector

ODP.NET

OracleDbType.Vector_Int8,

OracleDbType.Vector_Float32,

OracleDbType.Vector_Float64

ODP.NET native sparse vector data type. ODP.NET will bind the Vector data type based on the binding type accordingly.

Any numeric array

.NET

OracleDbType.Vector

This array type can be any numeric array type. ODP.NET will infer the numeric array type and bind the Vector data type as OracleDbType.Vector_Int8, OracleDbType.Vector_Float32, or OracleDbType.Vector_Float64 accordingly.

Any numeric array

.NET

OracleDbType.Vector_Int8,

OracleDbType.Vector_Float32,

OracleDbType.Vector_Float64

This array type can be any numeric array type. ODP.NET will bind the Vector data type based on the binding type accordingly.

.NET byte array

.NET

OracleDbType.Vector,

OracleDbType.Vector_Binary,

OracleDbType.Vector_Int8,

OracleDbType.Vector_Float32,

OracleDbType.Vector_Float64

All vector numeric formats can be bound as .NET byte arrays. BINARY vectors can only be bound as .NET byte arrays.

.NET string, OracleString

.NET or ODP.NET

OracleDbType.Vector

This contains the vector representation in JSON format. ODP.NET will bind the Vector data type as OracleDbType.Vector_Float32

.NET string, OracleString

.NET or ODP.NET

OracleDbType.Vector_Int8,

OracleDbType.Vector_Float32,

OracleDbType.Vector_Float64

This contains the vector representation in JSON format. ODP.NET will bind the Vector data type based on the binding type accordingly.

OracleClob

ODP.NET

OracleDbType.Clob

OracleDbType.Varchar2

This contains the vector representation in JSON format.

Note:

Sparse vectors do not support binding as OracleDbType.Vector_Binary nor .NET byte array types.

When using vectors with OracleDataAdapter ReturnProviderSpecificTypes set to false, for dense VECTORs, a .NET numeric array will populate the DataTable or DataSet after the Fill method is called; and for sparse VECTORs, a .NET string representing the sparse vector will be used. When true, OracleVector or OracleSparseVector will be used.

For dense VECTOR columns of '*' numeric format and ReturnProviderSpecificTypes set to false, a mix of short[], float[], double[], and/or byte[] will be populated based on the VECTOR’s numeric format upon an OracleDataAdapter Fill(). For dense VECTOR columns of all other formats, short[], float[], double[], or byte[] will be populated based on the VECTOR column's numeric format.

Since vectors cannot be compared directly with each other, they cannot be used as JOIN keys, ORDER BY keys, GROUP BY keys, or other related scenarios. Thus, for OracleCommandBuilder, vector comparison is not included in the SQL WHERE clause of the generated commands.

Vectors as CLOB or VARCHAR2 Data Type

Unmanaged ODP.NET, as well as managed ODP.NET and ODP.NET Core prior to version 23.3.2, support vectors using existing CLOB or VARCHAR2 data types. In these cases, vectors are stored with a JSON format. Data retrieval and manipulation use existing CLOB or VARCHAR2 or .NET APIs.

In later managed ODP.NET and ODP.NET Core versions, vectors can remain stored as CLOBs or VARCHAR2s for backwards compatibility purposes by setting the OracleConfiguration MapVectorColumnAsClob property to true.

When sparse vectors are sent to ODP.NET versions that do not support sparse vectors natively, the vector data becomes CLOB data in JSON dense vector format.

ODP.NET Vector Sample Code


//This ODP.NET artificial intelligence (AI) vector sample demonstrates how to insert, 
retrieve, update, and delete multi-dimensional FLOAT64, FLOAT32, and INT8 vector data 
types using the Oracle database.

//Requires ODP.NET 23ai (23.3.2) or higher and Oracle Database 23ai (23.4) or higher.
//Add User Id, Password, and Data Source, such as Easy Connect Plus or TNS, to the connection 
string to connect to the DB.

using Oracle.ManagedDataAccess.Client;

namespace VectorDemo
{
    public class VectorDemo
    {
        //Provide User Id, Password, and Data Source values for your database.
        public const string conStr = "User Id=<USER>;Password=<PASSWORD>;Data Source=<DATA SOURCE>;";
        public const int id = 1;

        public static void Main(string[] args)
        {
            using (OracleConnection con = new OracleConnection(conStr))
            {
                using (OracleCommand cmd = con.CreateCommand())
                {
                    try
                    {
                        con.Open();
                        cmd.CommandText = "begin " +
                            "execute immediate 'drop table VectorTable';" +
                            "exception when others then if sqlcode <> -942 then raise;" +
                            "end if;" +
                            "end;";
                        cmd.ExecuteNonQuery();

                        //Create table with 2 dimensional FLOAT64, 3 dimensional FLOAT32, and 4 dimensional INT8 vector columns
                        cmd.CommandText = "create table VectorTable (id number, float64s vector(2, float64), 
float32s vector(3, float32), int8s vector(4, INT8), constraint pk primary key (id))";
                        cmd.ExecuteNonQuery();

                        //Insert vector row into DB
                        InsertVectors();

                        //Update vector values in DB
                        UpdateVectors();

                        //Delete vector values in DB
                        DeleteVectors();
                    }
                    catch (Exception ex)
                    {
                        Console.WriteLine(ex.Message);
                    }
                }
            }
        }
 
        public static void InsertVectors()
        {
            string sql = "insert into VectorTable values (:id, :float64, :float32, :int8)";
            try
            {
                double[] doubles = new double[] { 1.234, 2.345 };
                float[] floats = new float[] { 1.23f, 2.34f, 3.45f };
                short[] int16s = new short[] { 1, 2, 3, 4 };

                OracleConnection con = new OracleConnection(conStr);
                con.Open();

                OracleCommand cmd = new OracleCommand(sql, con);
                cmd.Parameters.Add("id", OracleDbType.Int16, 0, id, System.Data.ParameterDirection.Input);
                
                //Binding FLOAT64 variable as vector - doubles
                cmd.Parameters.Add("float64", OracleDbType.Vector, 0, doubles, System.Data.ParameterDirection.Input);
                //Binding FLOAT32 variable as vector - floats
                cmd.Parameters.Add("float32", OracleDbType.Vector, 0, floats, System.Data.ParameterDirection.Input);
                //Binding INT8 variable as vector - int16s
                cmd.Parameters.Add("int8", OracleDbType.Vector, 0, int16s, System.Data.ParameterDirection.Input);

                // Insert vectors into VectorTable
                cmd.ExecuteNonQuery();

                //Retrieve vector values from DB
                RetrieveVectors();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }

        public static void UpdateVectors()
        {
            string sql = "update VectorTable set float64s=:float64, float32s=:float32, int8s=:int8 where id=:id";
            try
            {
                double[] doubles = new double[] { 9.876, 8.765 };
                float[] floats = new float[] { 9.87f, 8.76f, 7.65f };
                short[] int16s = new short[] { 9, 8, 7, 6 };

                OracleConnection con = new OracleConnection(conStr);
                con.Open();

                OracleCommand cmd = new OracleCommand(sql, con);

                //Binding FLOAT64 variable as vector - doubles
                cmd.Parameters.Add("float64", OracleDbType.Vector_Float64, 0, doubles, System.Data.ParameterDirection.Input);
                //Binding FLOAT32 variable as vector - floats
                cmd.Parameters.Add("float32", OracleDbType.Vector_Float32, 0, floats, System.Data.ParameterDirection.Input);
                //Binding INT8 variable as vector - int16s
                cmd.Parameters.Add("int8", OracleDbType.Vector_Int8, 0, int16s, System.Data.ParameterDirection.Input);

                cmd.Parameters.Add("id", OracleDbType.Int16, 0, id, System.Data.ParameterDirection.Input);

                // Update vectors in VectorTable
                cmd.ExecuteNonQuery();

                Console.WriteLine("Database vector values updated!");
                //Retrieve vector values from DB
                RetrieveVectors();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }

        public static void DeleteVectors()
        {
            string sql = "delete from VectorTable where id=:id";
            try
            {
                OracleConnection con = new OracleConnection(conStr);
                con.Open();

                OracleCommand cmd = new OracleCommand(sql, con);
                cmd.Parameters.Add("id", OracleDbType.Int16, 0, id, System.Data.ParameterDirection.Input);

                // Delete row with vector values from VectorTable
                cmd.ExecuteNonQuery();

                Console.WriteLine("Database vector values deleted!");
                //Confirm vectors removed
                RetrieveVectors();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }
        public static void RetrieveVectors()
        {
            string sql = "select * from VectorTable";
            try
            {
                OracleConnection con = new OracleConnection(conStr);
                con.Open();

                OracleCommand cmd = new OracleCommand(sql, con);
                OracleDataReader reader = cmd.ExecuteReader();

                //Use ODP.NET vector accessors to retrieve data
                if (reader.Read())
                {
                    Console.WriteLine("Retrieve FLOAT64S Vector value:");
                    double[] vecD = reader.GetDoubleArray(1);
                    PrintDoubles(vecD);

                    Console.WriteLine("Retrieve FLOAT32S Vector value:");
                    float[] vecF = reader.GetFloatArray("FLOAT32S");
                    PrintFloats(vecF);

                    Console.WriteLine("Retrieve INT8S vector value:");
                    short[] vecInt8s = reader.GetInt16Array(3);
                    PrintInt16s(vecInt8s);

                    Console.WriteLine();
                }
                //Return no results if no vector row is found
                else 
                    Console.WriteLine("No vector row found.");
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }

        static public void PrintDoubles(double[] doubles)
        {
            Console.Write("{");
            for (int i = 0; i < doubles.Length - 1; i++)
                Console.Write(doubles[i].ToString() + ", ");
            Console.WriteLine(doubles[doubles.Length - 1].ToString() + "}");
        }

        static public void PrintFloats(float[] floats)
        {
            Console.Write("{");
            for (int i = 0; i < floats.Length - 1; i++)
                Console.Write(floats[i].ToString() + ", ");
            Console.WriteLine(floats[floats.Length - 1].ToString() + "}");
        }

        static public void PrintInt16s(Int16[] int16s)
        {
            Console.Write("{");
            for (int i = 0; i < int16s.Length - 1; i++)
                Console.Write(int16s[i].ToString() + ", ");
            Console.WriteLine(int16s[int16s.Length - 1].ToString() + "}");
        }
    }
}