About the Embedding Models in Generative AI

The OCI Generative AI embedding models transform each phrase, sentence, or paragraph that you input into an array of 384 numbers (light models) or 1,024 numbers, depending on the embedding model that you select.

You can use these embeddings to find phrases that are similar in context or category. Embeddings are typically stored in a vector database. Embeddings are mostly used for semantic search, where the search function focuses on the meaning of the text that it searches through rather than matching keywords.
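As a sketch of how such embeddings support semantic search, the snippet below ranks stored vectors by cosine similarity to a query vector. The toy 4-number vectors and the `cosine_similarity` helper are illustrative stand-ins, not part of the service:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-number vectors standing in for 384- or 1,024-number embeddings.
stored = {
    "refund policy": [0.9, 0.1, 0.0, 0.2],
    "shipping times": [0.1, 0.8, 0.3, 0.0],
}
query = [0.85, 0.15, 0.05, 0.1]

# Rank stored phrases by similarity to the query embedding.
ranked = sorted(stored, key=lambda k: cosine_similarity(query, stored[k]), reverse=True)
print(ranked[0])  # prints "refund policy", the closest phrase in meaning
```

In a real deployment, the stored vectors would come from the embedding model and live in a vector database, with the database performing the nearest-neighbor ranking.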

Choosing a Model
  • Use the Cohere Embed English models to generate text embeddings from English documents.
  • Use the Cohere Embed Multilingual models when your documents or search queries span multiple languages.
Create embeddings from an image
For the text and image embedding models, such as Cohere Embed English Image V3, you can add either text or one image per run. Image input is available only through the API, not in the Console. For the API, input a base64-encoded image in each run. For example, a 512 x 512 image converts to about 1,610 tokens.
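Because the API expects a base64-encoded image, a minimal preparation step might look like the following. The data-URL wrapping shown is an assumption for illustration; check the API reference for the exact payload shape your request body requires:

```python
import base64

def encode_image(image_bytes: bytes, media_type: str = "image/png") -> str:
    # Base64-encode raw image bytes. The "data:...;base64," prefix is a
    # common convention, assumed here rather than confirmed for this service.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{media_type};base64,{b64}"

# Placeholder bytes standing in for a real image file read from disk.
encoded = encode_image(b"\x89PNG...")
print(encoded[:22])  # prints "data:image/png;base64,"
```

In practice you would read the image with `open(path, "rb").read()` and keep an eye on the token cost, since a 512 x 512 image already accounts for roughly 1,610 tokens.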
Input sizes
  • You can add sentences, phrases, or paragraphs for embeddings either one phrase at a time, or by uploading a file.
  • Only files with a .txt extension are allowed.
  • If you use an input file, each input sentence, phrase, or paragraph in the file must be separated with a newline character.
  • A maximum of 96 inputs are allowed for each run.
  • In the Console, each input must be less than 512 tokens for the text-only models.
  • If an input is too long, select whether to cut off the start or the end of the text to fit within the token limit by setting the Truncate parameter to Start or End. If an input exceeds the 512 token limit and the Truncate parameter is set to None, you get an error message.
  • For the text and image models, the files and inputs can add up to 128,000 tokens in total.
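The file-input rules above (newline-separated entries, at most 96 inputs per run) can be sketched as a small batching helper. The helper itself is illustrative; only the newline-separated format and the 96-input limit come from the rules stated here:

```python
def batch_inputs(text: str, max_per_run: int = 96):
    # Split newline-separated sentences, phrases, or paragraphs, drop blank
    # lines, and group them into batches no larger than the per-run limit.
    inputs = [line.strip() for line in text.splitlines() if line.strip()]
    return [inputs[i:i + max_per_run] for i in range(0, len(inputs), max_per_run)]

# 200 phrases, as they might appear in a .txt input file, split into
# runs of at most 96 inputs each.
phrases = "\n".join(f"phrase {i}" for i in range(200))
batches = batch_inputs(phrases)
print([len(b) for b in batches])  # prints [96, 96, 8]
```

Each inner list could then be submitted as one embedding run.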
Visualizing the Embeddings

To visualize the embedding outputs, the output vectors are projected into two dimensions and plotted as points in the Oracle Cloud Console. Points that are close together correspond to phrases that the model considers similar. Select Export output to save a 1,024-number vector for each embedding in a JSON file.
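Conceptually, the Console's plot reduces each high-dimensional vector to a 2D point. One common way to do this is principal component analysis; the sketch below uses numpy's SVD for that, though the Console's actual projection method isn't specified here:

```python
import numpy as np

def project_2d(vectors: np.ndarray) -> np.ndarray:
    # Center the vectors, then project onto the top two principal
    # components so that nearby vectors stay nearby in the 2D plot.
    centered = vectors - vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Ten random 1,024-number "embeddings" reduced to ten 2D points.
rng = np.random.default_rng(0)
points = project_2d(rng.normal(size=(10, 1024)))
print(points.shape)  # prints (10, 2)
```

With real embeddings, plotting these 2D points reproduces the kind of similarity map the Console shows.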

Use Cases

The following use cases are ideal for text embeddings.

  • Semantic search: Search through call transcripts, internal knowledge sources, and so on.

  • Text classification: Classify intent in customer chat logs and support tickets.
  • Text clustering: Identify salient topics in customer reviews or news data.
  • Recommendation systems: Represent podcast descriptions, for example, as a numerical feature to use in a recommendation model.

Embedding Model Parameter

When using the embedding models, you can get a different output by changing the following parameter.

Truncate

Whether to truncate the start or end tokens of a sentence when that sentence exceeds the maximum number of allowed tokens. For example, if a sentence has 516 tokens but the maximum token size is 512 and you select End, the last 4 tokens of that sentence are cut off.
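The Start and End behavior described above can be sketched on a token list. Splitting on whitespace or indexing a prebuilt list is a simplification, since the real models use subword tokenizers, but the cut-off logic is the same:

```python
def truncate_tokens(tokens, limit=512, mode="End"):
    # Keep at most `limit` tokens, cutting from the start or the end,
    # mirroring the Truncate parameter's Start/End options.
    if len(tokens) <= limit:
        return tokens
    if mode == "End":
        return tokens[:limit]   # drop the trailing tokens
    if mode == "Start":
        return tokens[-limit:]  # drop the leading tokens
    # Truncate set to None: an over-long input is rejected with an error.
    raise ValueError("input exceeds the 512-token limit")

# A 516-token sentence truncated at the end loses its last 4 tokens.
tokens = [f"t{i}" for i in range(516)]
kept = truncate_tokens(tokens, mode="End")
print(len(kept), kept[-1])  # prints "512 t511"
```

With mode set to "Start", the same sentence would instead keep tokens `t4` through `t515`.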