Embedding Model

The embedding model converts documents into vector representations for efficient retrieval; every operation within a document store revolves around it. Users can choose between local and remote models depending on their performance, cost, and privacy requirements.

Any ONNX-compatible model available on Hugging Face can be specified; it will be automatically downloaded to each Coherence RAG cluster member and run locally using ONNX Runtime.

The syntax used to specify a model is `<provider>/<modelName>`. In the configuration example above, `sentence-transformers` is the provider and `all-mpnet-base-v2` is the model name. Not coincidentally, this is exactly what you get when you click the 'Copy model name to clipboard' link next to the model name on Hugging Face.
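To make the identifier format concrete, here is a minimal, illustrative sketch of splitting `<provider>/<modelName>` into its two parts. This is not the Coherence API, just a demonstration of the convention: the provider is the segment before the first `/`, and everything after it is the model name (which may itself contain dots or dashes).

```java
// Illustrative sketch only (not the Coherence RAG API): split a model
// identifier of the form <provider>/<modelName> into its two segments.
public class ModelId {
    final String provider;
    final String modelName;

    ModelId(String id) {
        int slash = id.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException(
                "Expected <provider>/<modelName>, got: " + id);
        }
        provider  = id.substring(0, slash);
        modelName = id.substring(slash + 1);
    }

    public static void main(String[] args) {
        ModelId local = new ModelId("sentence-transformers/all-mpnet-base-v2");
        System.out.println(local.provider);   // prints "sentence-transformers"
        System.out.println(local.modelName);  // prints "all-mpnet-base-v2"
    }
}
```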

Remote models use the same syntax, but in this case the provider is the name of the cloud provider hosting the model, such as `OCI` or `OpenAI`, while the model name is the actual name of the model as defined by the provider, such as `cohere.embed-multilingual-v3.0` for OCI or `text-embedding-3-small` for OpenAI.

Coherence RAG also supports Ollama, which provides something of a hybrid approach: a self-hosted remote model. To use an Ollama-hosted model, specify `ollama` as the provider name and the name of the model hosted in Ollama as the model name.
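Putting the three cases together, dispatching on the provider segment might look like the sketch below. The provider names are the ones described above; the dispatch logic itself is illustrative, not the Coherence implementation, and `nomic-embed-text` is just an example of a model commonly served by Ollama.

```java
// Illustrative only: classify a model identifier by its provider
// segment, mirroring the local / cloud-hosted / self-hosted split.
public class ProviderDispatch {
    static String classify(String id) {
        String provider = id.substring(0, id.indexOf('/'));
        switch (provider) {
            case "OCI":
            case "OpenAI":
                return "remote";      // cloud-hosted API
            case "ollama":
                return "self-hosted"; // Ollama server
            default:
                return "local";       // ONNX model from Hugging Face
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("OpenAI/text-embedding-3-small"));           // remote
        System.out.println(classify("ollama/nomic-embed-text"));                 // self-hosted
        System.out.println(classify("sentence-transformers/all-mpnet-base-v2")); // local
    }
}
```

One design note: treating "anything not a known remote provider" as a local Hugging Face model matches the convention described above, where the local provider name is the Hugging Face organization rather than a fixed keyword.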