Custom Distance Function
JavaScript user-defined functions can be used to define a custom vector distance. This provides greater flexibility in the types of distance equations that can be employed, extending vector search functionality to a broader range of use cases.
A custom distance function is created by a user-defined JavaScript function defined in a Multilingual Engine (MLE) inline call specification. The signature of the function must match the signature of the existing built-in distance functions: it must accept exactly two arguments of type VECTOR and return a BINARY_DOUBLE. The function signature must also include the DETERMINISTIC keyword. The following function definition provides an example of a custom distance function, in this case implementing the Euclidean Squared distance:
CREATE OR REPLACE FUNCTION euclidean_sq_vector_distance("a" VECTOR, "b" VECTOR)
RETURN BINARY_DOUBLE
DETERMINISTIC PARALLEL_ENABLE
AS MLE LANGUAGE JAVASCRIPT PURE
{{
  let len = a.length;
  let sum = 0;
  for (let i = 0; i < len; i++) {
    const tmp = a[i] - b[i];
    sum += tmp * tmp;
  }
  return sum;
}};
/
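The body of the function is ordinary JavaScript, so its arithmetic can be checked outside the database before the call specification is created. The following is a minimal sketch rather than part of any Oracle API: the name euclideanSq and the dimension check are illustrative additions, and Float32Array is used on the assumption that FLOAT32 vectors reach JavaScript as typed arrays.

```javascript
// Hypothetical standalone helper mirroring the body of
// euclidean_sq_vector_distance; runs in any JavaScript runtime.
function euclideanSq(a, b) {
  // Illustrative guard (not in the original function): reject mismatched dimensions.
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same number of dimensions");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff; // accumulate squared differences; no square root is taken
  }
  return sum;
}

// Assumption: FLOAT32 vectors are represented here as Float32Array.
const d = euclideanSq(new Float32Array([3, 4]), new Float32Array([0, 0]));
console.log(d); // 25
```

Because the function depends only on its two inputs, the same pair of vectors always yields the same result, which is the property the DETERMINISTIC keyword declares.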
Custom distance functions can be used with HNSW vector indexes. If the degree of parallelism in the vector index is greater than 1, the custom distance function must include the PARALLEL_ENABLE clause. Upon index creation, a custom distance can be specified by name in the DISTANCE clause. In queries, the custom distance can be used in the ORDER BY clause and in the SELECT list. The distance function tied to a vector index can be viewed by querying the VECSYS.VECTOR$INDEX view.
When used in the creation of an HNSW index, the PURE keyword must be specified in the MLE call specification. The PURE clause indicates that the JavaScript program should be run in a restricted execution context, which guarantees that the code will not modify stateful objects, such as database tables or PL/SQL packages, regardless of database privileges currently in effect. A user-defined function used to create a custom distance metric only handles computations on function inputs, which do not require access to database state. Restricted contexts provide an extra layer of security by prohibiting unwanted database modifications. For more information on restricted execution contexts and the PURE keyword, see Oracle Database JavaScript Developer's Guide.
To use a vector index that depends on a custom distance function, you must have EXECUTE privileges on the function specified during index creation. You must also have EXECUTE privileges on JAVASCRIPT. For vector indexes, only definer's rights are supported.
If the distance function is modified, the associated vector index will be in an UNUSABLE state.
Note:
The use of custom distance functions with IVF indexes is not currently supported.
Use the previously created custom distance function, euclidean_sq_vector_distance, to first create a vector index:
CREATE TABLE custom_dist_tab( id NUMBER, data_vector VECTOR(2, FLOAT32));
INSERT INTO custom_dist_tab VALUES (1, vector('[1.1,2.2]', 2, float32));
INSERT INTO custom_dist_tab VALUES (2, vector('[2.2,3.3]', 2, float32));
INSERT INTO custom_dist_tab VALUES (3, vector('[3.3,4.4]', 2, float32));
INSERT INTO custom_dist_tab VALUES (4, vector('[4.4,5.5]', 2, float32));
INSERT INTO custom_dist_tab VALUES (5, vector('[5.5,6.6]', 2, float32));
CREATE VECTOR INDEX cust_dist_idx_hnsw ON custom_dist_tab (data_vector)
ORGANIZATION INMEMORY
NEIGHBOR GRAPH WITH TARGET ACCURACY 95
DISTANCE CUSTOM EUCLIDEAN_SQ_VECTOR_DISTANCE
PARALLEL 3;
The custom distance function can be referenced in similarity search queries in the ORDER BY clause or in the SELECT list:
SELECT data_vector
FROM custom_dist_tab
ORDER BY euclidean_sq_vector_distance(data_vector, VECTOR('[1, 2]'))
FETCH FIRST 5 ROWS ONLY;
SELECT
data_vector,
euclidean_sq_vector_distance(data_vector, VECTOR('[1, 2]')) edist
FROM custom_dist_tab
ORDER BY edist
FETCH FIRST 5 ROWS ONLY;
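To see what ordering these queries should produce, the same distance can be evaluated by hand for the five sample rows. The sketch below is plain JavaScript run outside the database; the rows array and the helper name euclideanSq are illustrative, and the arithmetic uses ordinary double-precision numbers, so the values may differ slightly from the FLOAT32 results that the database returns.

```javascript
// Hypothetical standalone copy of the distance logic, for checking results by hand.
function euclideanSq(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return sum;
}

// The five rows inserted into custom_dist_tab, and the query vector [1, 2].
const rows = [
  { id: 1, v: [1.1, 2.2] },
  { id: 2, v: [2.2, 3.3] },
  { id: 3, v: [3.3, 4.4] },
  { id: 4, v: [4.4, 5.5] },
  { id: 5, v: [5.5, 6.6] },
];
const query = [1, 2];

// Equivalent of ORDER BY edist: smallest squared distance first.
const ranked = rows
  .map((r) => ({ id: r.id, edist: euclideanSq(r.v, query) }))
  .sort((x, y) => x.edist - y.edist);

for (const r of ranked) {
  console.log(r.id, r.edist.toFixed(2));
}
```

With this data the rows come back in id order 1 through 5, with squared distances of roughly 0.05, 3.13, 11.05, 23.81, and 41.41. Note that an HNSW index search is approximate, so on larger data sets an index-backed query is not guaranteed to match an exact calculation like this one.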
See Also:
- Oracle Database JavaScript Developer's Guide for information about using inline call specifications to publish JavaScript functions with the Multilingual Engine (MLE)