Using an OpenSearch Pretrained Model
OCI Search with OpenSearch provides built-in support for OpenSearch pretrained models.
This topic describes how to register and deploy any of the pretrained Hugging Face sentence transformer models to a cluster by specifying only the model name. See Pretrained Models for the list of approved models. For an end-to-end procedure on how to use an OpenSearch pretrained model for semantic search in OCI Search with OpenSearch, see Semantic Search Walkthrough.
Prerequisites
Before you start, you need to do the following:
- Select one of the pretrained models supported by OCI Search with OpenSearch.
- Confirm that the OpenSearch cluster is version 2.11 or newer.
- Update the cluster settings to perform semantic search. The following example includes the recommended settings:
PUT _cluster/settings
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99",
        "rag_pipeline_feature_enabled": "true",
        "memory_feature_enabled": "true",
        "allow_registering_model_via_local_file": "true",
        "allow_registering_model_via_url": "true",
        "model_auto_redeploy.enable": "true",
        "model_auto_redeploy.lifetime_retry_times": 10
      }
    }
  }
}
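If you apply these settings from a script rather than from the Dev Tools console, the request body can be kept as a Python dict and serialized on demand. This is a minimal sketch of only the body; the HTTP layer, cluster endpoint, and authentication are assumptions left out of the example:

```python
import json

# The recommended ML Commons settings from the PUT _cluster/settings
# example above, kept as a dict so a script can serialize and send them.
ML_COMMONS_SETTINGS = {
    "persistent": {
        "plugins": {
            "ml_commons": {
                "only_run_on_ml_node": "false",
                "model_access_control_enabled": "true",
                "native_memory_threshold": "99",
                "rag_pipeline_feature_enabled": "true",
                "memory_feature_enabled": "true",
                "allow_registering_model_via_local_file": "true",
                "allow_registering_model_via_url": "true",
                "model_auto_redeploy.enable": "true",
                "model_auto_redeploy.lifetime_retry_times": 10,
            }
        }
    }
}

def settings_request_body() -> str:
    """Return the JSON body for a PUT _cluster/settings request."""
    return json.dumps(ML_COMMONS_SETTINGS)
```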
Step 1: Register the Model Group
Model groups enable you to manage access to specific models. Registering a model group is optional; if you don't register one, ML Commons registers a new model group for you. However, we recommend that you register the model group yourself.
Register a model group using the register operation in the Model Group APIs, as shown in the following example:
POST /_plugins/_ml/model_groups/_register
{
"name": "new_model_group",
"description": "A model group for local models"
}
Make note of the model_group_id returned in the response:
{
"model_group_id": "<modelgroupID>",
"status": "CREATED"
}
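In a script, the register call and the model_group_id extraction can be wrapped in a small helper. This is a sketch only: the post callable (anything that sends the request to the cluster and returns the parsed JSON response) is an assumption, so HTTP and authentication details stay out of the example:

```python
import json
from typing import Callable

def register_model_group(post: Callable[[str, str], dict],
                         name: str, description: str) -> str:
    """POST to /_plugins/_ml/model_groups/_register and return the
    model_group_id from the response.

    `post` is an injected callable (path, body) -> parsed JSON dict,
    an assumption standing in for your HTTP client of choice.
    """
    body = json.dumps({"name": name, "description": description})
    resp = post("/_plugins/_ml/model_groups/_register", body)
    if resp.get("status") != "CREATED":
        raise RuntimeError(f"unexpected registration response: {resp}")
    return resp["model_group_id"]
```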
Step 2: Register the Model
Registering the model requires you to specify the following values:
- model_group_id: If you completed Step 1, this is the value of model_group_id for the _register request.
- name: The model name for the pretrained model you want to use.
- version: The version number for the pretrained model you want to use.
- model_format: The format for the model, either TORCH_SCRIPT or ONNX.
Register the model using the register operation from the Model APIs, as shown in the following example:
POST /_plugins/_ml/models/_register
{
"name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
"version": "1.0.2",
"model_group_id": "TOD4Zo0Bb0cPUYbzwcpD",
"model_format": "TORCH_SCRIPT"
}
Make note of the task_id returned in the response; you can use the task_id to check the status of the operation. For example, consider the following response:
{
"task_id": "TuD6Zo0Bb0cPUYbz3Moz",
"status": "CREATED"
}
To check the status of the register operation, use the task_id with the Get operation of the Tasks APIs, as shown in the following example:
GET /_plugins/_ml/tasks/TuD6Zo0Bb0cPUYbz3Moz
When the register operation is complete, the state value in the response to the Get operation is COMPLETED, as shown in the following example:
{
"model_id": "iDf6Zo0BkIugivXi3E7z",
"task_type": "REGISTER_MODEL",
"function_name": "TEXT_EMBEDDING",
"state": "COMPLETED",
"worker_node": [
"3qSqVfK2RvGJv1URKfS1bw"
],
"create_time": 1706829732915,
"last_update_time": 1706829780094,
"is_async": true
}
Make note of the model_id
value returned in the response to use when you deploy the model.
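Because the register and deploy operations are asynchronous, a script typically polls the Tasks APIs until the task reaches the COMPLETED state before reading model_id. A minimal sketch of that loop, assuming an injected get callable that fetches a path and returns the parsed JSON task document:

```python
import time
from typing import Callable

def wait_for_task(get: Callable[[str], dict], task_id: str,
                  poll_seconds: float = 2.0, max_polls: int = 30) -> dict:
    """Poll GET /_plugins/_ml/tasks/<task_id> until the task completes,
    then return the full task document (which carries model_id).

    `get` is an injected callable (path) -> parsed JSON dict, an
    assumption standing in for your HTTP client of choice.
    """
    for attempt in range(max_polls):
        task = get(f"/_plugins/_ml/tasks/{task_id}")
        state = task.get("state")
        if state == "COMPLETED":
            return task
        if state == "FAILED":
            raise RuntimeError(f"task {task_id} failed: {task}")
        if attempt < max_polls - 1:
            time.sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} did not complete in time")
```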
Step 3: Deploy the Model
After the register operation completes, you can deploy the model to the cluster using the deploy operation of the Model APIs, passing the model_id from the Get operation response in the previous step, as shown in the following example:
POST /_plugins/_ml/models/iDf6Zo0BkIugivXi3E7z/_deploy
Make note of the task_id returned in the response; you can use the task_id to check the status of the operation. For example, consider the following response:
{
"task_id": "T-D7Zo0Bb0cPUYbz-cod",
"task_type": "DEPLOY_MODEL",
"status": "CREATED"
}
To check the status of the deploy operation, use the task_id with the Get operation of the Tasks APIs, as shown in the following example:
GET /_plugins/_ml/tasks/T-D7Zo0Bb0cPUYbz-cod
When the deploy operation is complete, the state value in the response to the Get operation is COMPLETED, as shown in the following example:
{
"model_id": "iDf6Zo0BkIugivXi3E7z",
"task_type": "DEPLOY_MODEL",
"function_name": "TEXT_EMBEDDING",
"state": "COMPLETED",
"worker_node": [
"3qSqVfK2RvGJv1URKfS1bw"
],
"create_time": 1706829732915,
"last_update_time": 1706829780094,
"is_async": true
}
Step 4: Test the Model
Use the Predict API to test the model, as shown in the following example for a text embedding model:
POST /_plugins/_ml/_predict/text_embedding/<your_embedding_model_ID>
{
"text_docs":[ "today is sunny"],
"return_number": true,
"target_response": ["sentence_embedding"]
}
The response contains text embeddings for the provided sentence, as shown in the following response example:
"inference_results" : [
{
"output" : [
{
"name" : "sentence_embedding",
"data_type" : "FLOAT32",
"shape" : [
768
],
"data" : [
0.25517133,
-0.28009856,
0.48519906,
...
]
}
]
}
]
}
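The 768-dimensional sentence_embedding vectors returned by the Predict API are what semantic search compares: two texts are semantically close when the cosine similarity of their embeddings is high. OpenSearch performs this comparison for you inside a k-NN index, so the following is an illustration only, a minimal sketch of the underlying computation:

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two embedding vectors of equal length.

    Returns a value in [-1, 1]; values near 1 indicate semantically
    similar texts when applied to sentence embeddings.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```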