Using an OpenSearch Pretrained Model
OCI Search with OpenSearch provides built-in support for OpenSearch pretrained models.
This topic describes how to register and deploy any of the pretrained Hugging Face sentence transformer models to a cluster by specifying only the model name. See Pretrained Models for the list of approved models. For an end-to-end procedure on how to use an OpenSearch pretrained model for semantic search in OCI Search with OpenSearch, see Semantic Search Walkthrough.
Prerequisites
Before you start, you need to do the following:
- Select one of the pretrained models supported by OCI Search with OpenSearch.
- Confirm that the OpenSearch cluster is version 2.11 or newer.
- Update the cluster settings to perform semantic search. The following example includes the recommended settings:
PUT _cluster/settings
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99",
        "rag_pipeline_feature_enabled": "true",
        "memory_feature_enabled": "true",
        "allow_registering_model_via_local_file": "true",
        "allow_registering_model_via_url": "true",
        "model_auto_redeploy.enable": "true",
        "model_auto_redeploy.lifetime_retry_times": 10
      }
    }
  }
}
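If you apply these settings from a script rather than from the Dev Tools console, the request body can be kept as a Python dict and serialized on demand. This is a minimal sketch of only the body; the HTTP layer, cluster endpoint, and authentication are assumptions left out of the example:

```python
import json

# The recommended ML Commons settings from the PUT _cluster/settings
# example above, kept as a dict so a script can serialize and send them.
ML_COMMONS_SETTINGS = {
    "persistent": {
        "plugins": {
            "ml_commons": {
                "only_run_on_ml_node": "false",
                "model_access_control_enabled": "true",
                "native_memory_threshold": "99",
                "rag_pipeline_feature_enabled": "true",
                "memory_feature_enabled": "true",
                "allow_registering_model_via_local_file": "true",
                "allow_registering_model_via_url": "true",
                "model_auto_redeploy.enable": "true",
                "model_auto_redeploy.lifetime_retry_times": 10,
            }
        }
    }
}

def settings_request_body() -> str:
    """Return the JSON body for a PUT _cluster/settings request."""
    return json.dumps(ML_COMMONS_SETTINGS)
```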
Step 1: Register the Model Group
Model groups enable you to manage access to specific models. Registering a model group is optional; if you don't register one, ML Commons registers a new model group for you. However, we recommend that you register the model group yourself.
Register a model group using the register operation in the Model Group APIs, as shown in the following example:
POST /_plugins/_ml/model_groups/_register
{
"name": "new_model_group",
"description": "A model group for local models"
}
Make note of the model_group_id returned in the response:
{
"model_group_id": "<modelgroupID>",
"status": "CREATED"
}
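In a script, the register call and the model_group_id extraction can be wrapped in a small helper. This is a sketch only: the post callable (anything that sends the request to the cluster and returns the parsed JSON response) is an assumption, so HTTP and authentication details stay out of the example:

```python
import json
from typing import Callable

def register_model_group(post: Callable[[str, str], dict],
                         name: str, description: str) -> str:
    """POST to /_plugins/_ml/model_groups/_register and return the
    model_group_id from the response.

    `post` is an injected callable (path, body) -> parsed JSON dict,
    an assumption standing in for your HTTP client of choice.
    """
    body = json.dumps({"name": name, "description": description})
    resp = post("/_plugins/_ml/model_groups/_register", body)
    if resp.get("status") != "CREATED":
        raise RuntimeError(f"unexpected registration response: {resp}")
    return resp["model_group_id"]
```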
Step 2: Register the Model
Registering the model requires you to specify the following values:
- model_group_id: If you completed Step 1, this is the value of model_group_id for the _register request.
- name: The model name for the pretrained model you want to use.
- version: The version number for the pretrained model you want to use.
- model_format: The format for the model, either TORCH_SCRIPT or ONNX.
Register the model using the register operation from the Model APIs, as shown in the following example:
POST /_plugins/_ml/models/_register
{
"name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
"version": "1.0.2",
"model_group_id": "TOD4Zo0Bb0cPUYbzwcpD",
"model_format": "TORCH_SCRIPT"
}
Make note of the task_id returned in the response; you can use the task_id to check the status of the operation. For example, consider the following response:
{
"task_id": "TuD6Zo0Bb0cPUYbz3Moz",
"status": "CREATED"
}
To check the status of the register operation, use the task_id with the Get operation of the Tasks APIs, as shown in the following example:
GET /_plugins/_ml/tasks/TuD6Zo0Bb0cPUYbz3Moz
When the register operation is complete, the state value in the response to the Get operation is COMPLETED, as shown in the following example:
{
"model_id": "iDf6Zo0BkIugivXi3E7z",
"task_type": "REGISTER_MODEL",
"function_name": "TEXT_EMBEDDING",
"state": "COMPLETED",
"worker_node": [
"3qSqVfK2RvGJv1URKfS1bw"
],
"create_time": 1706829732915,
"last_update_time": 1706829780094,
"is_async": true
}
Make note of the model_id
value returned in the response to use when you deploy the model.
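Because the register and deploy operations are asynchronous, a script typically polls the Tasks APIs until the task reaches the COMPLETED state before reading model_id. A minimal sketch of that loop, assuming an injected get callable that fetches a path and returns the parsed JSON task document:

```python
import time
from typing import Callable

def wait_for_task(get: Callable[[str], dict], task_id: str,
                  poll_seconds: float = 2.0, max_polls: int = 30) -> dict:
    """Poll GET /_plugins/_ml/tasks/<task_id> until the task completes,
    then return the full task document (which carries model_id).

    `get` is an injected callable (path) -> parsed JSON dict, an
    assumption standing in for your HTTP client of choice.
    """
    for attempt in range(max_polls):
        task = get(f"/_plugins/_ml/tasks/{task_id}")
        state = task.get("state")
        if state == "COMPLETED":
            return task
        if state == "FAILED":
            raise RuntimeError(f"task {task_id} failed: {task}")
        if attempt < max_polls - 1:
            time.sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} did not complete in time")
```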
Step 3: Deploy the Model
After the register operation completes, you can deploy the model to the cluster using the deploy operation of the Model APIs, passing the model_id from the Get operation response in the previous step, as shown in the following example:
POST /_plugins/_ml/models/iDf6Zo0BkIugivXi3E7z/_deploy
Make note of the task_id returned in the response; you can use the task_id to check the status of the operation. For example, consider the following response:
{
"task_id": "T-D7Zo0Bb0cPUYbz-cod",
"task_type": "DEPLOY_MODEL",
"status": "CREATED"
}
To check the status of the deploy operation, use the task_id with the Get operation of the Tasks APIs, as shown in the following example:
GET /_plugins/_ml/tasks/T-D7Zo0Bb0cPUYbz-cod
When the deploy operation is complete, the state value in the response to the Get operation is COMPLETED, as shown in the following example:
{
"model_id": "iDf6Zo0BkIugivXi3E7z",
"task_type": "DEPLOY_MODEL",
"function_name": "TEXT_EMBEDDING",
"state": "COMPLETED",
"worker_node": [
"3qSqVfK2RvGJv1URKfS1bw"
],
"create_time": 1706829732915,
"last_update_time": 1706829780094,
"is_async": true
}
Step 4: Test the Model
Use the Predict API to test the model, as shown in the following example for a text embedding model:
POST /_plugins/_ml/_predict/text_embedding/<your_embedding_model_ID>
{
"text_docs":[ "today is sunny"],
"return_number": true,
"target_response": ["sentence_embedding"]
}
The response contains text embeddings for the provided sentence, as shown in the following response example:
"inference_results" : [
{
"output" : [
{
"name" : "sentence_embedding",
"data_type" : "FLOAT32",
"shape" : [
768
],
"data" : [
0.25517133,
-0.28009856,
0.48519906,
...
]
}
]
}
]
}
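The 768-dimensional sentence_embedding vectors returned by the Predict API are what semantic search compares: two texts are semantically close when the cosine similarity of their embeddings is high. OpenSearch performs this comparison for you inside a k-NN index, so the following is an illustration only, a minimal sketch of the underlying computation:

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two embedding vectors of equal length.

    Returns a value in [-1, 1]; values near 1 indicate semantically
    similar texts when applied to sentence embeddings.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```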