Creating a Dedicated AI Cluster for Hosting Models
Create a dedicated AI cluster resource in OCI Generative AI to host endpoints for pretrained base models and custom models.
Important
- Not Available on-demand: All OCI Generative AI foundational pretrained models supported for the on-demand serving mode that use the text generation and summarization APIs (including the playground) are now retired. We recommend that you use the chat models instead.
- Can be hosted on clusters: If you host a summarization or a generation model such as cohere.command on a dedicated AI cluster (dedicated serving mode), you can continue to use that model until it's retired. These models, when hosted on a dedicated AI cluster, are only available in US Midwest (Chicago). See Retiring the Models for retirement dates and definitions.
Note
Clusters take a few minutes to create. After the cluster is in an active state, you can select that cluster to host a model when you create an endpoint for that model.
Use the dedicated-ai-cluster create command with the required parameters to create a dedicated AI cluster:
oci generative-ai dedicated-ai-cluster create --compartment-id <compartment-OCID> --type HOSTING --unit-count [integer] --unit-shape [text] [OPTIONS]
For a complete list of parameters and values for CLI commands, see the CLI Command Reference.
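As an illustration only, the following Python sketch assembles the command shown above as an argument list, so it can be reviewed before it is run. The helper name build_create_cluster_command, the extra_options parameter, and the sample unit shape LARGE_COHERE are assumptions made for this example and not part of the documented procedure; check the CLI Command Reference for the shapes and options valid in your tenancy.

```python
import shlex


def build_create_cluster_command(compartment_id, unit_shape, unit_count, extra_options=None):
    """Return the argv list for creating a HOSTING dedicated AI cluster.

    Hypothetical helper: it only mirrors the documented command line and
    does not call OCI itself.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(unit_count, bool) or not isinstance(unit_count, int) or unit_count < 1:
        raise ValueError("unit_count must be a positive integer")
    if not compartment_id or not unit_shape:
        raise ValueError("compartment_id and unit_shape are required")

    argv = [
        "oci", "generative-ai", "dedicated-ai-cluster", "create",
        "--compartment-id", compartment_id,
        "--type", "HOSTING",
        "--unit-count", str(unit_count),
        "--unit-shape", unit_shape,
    ]
    # Pass-through for the [OPTIONS] part of the command; not validated here.
    argv.extend(extra_options or [])
    return argv


if __name__ == "__main__":
    # Placeholder values (assumptions); replace them with your own.
    argv = build_create_cluster_command("<compartment-OCID>", "LARGE_COHERE", 1)
    # Dry run: print the shell-quoted command instead of executing it.
    print(shlex.join(argv))
    # To execute it, pass argv to subprocess.run(argv, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later passed to subprocess.run.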
Run the CreateDedicatedAiCluster operation to create a dedicated AI cluster.
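Because a cluster takes a few minutes to become active, a script that goes on to create an endpoint usually has to wait first. The following is a minimal polling sketch, not an official client. The function wait_until_active and its get_state callable are hypothetical; get_state stands for whatever you use to read the cluster's lifecycle state (CLI, SDK, or API). The state names ACTIVE, FAILED, and DELETED are assumptions, so confirm them against the API reference.

```python
import time


def wait_until_active(get_state, timeout_s=1800.0, interval_s=30.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns "ACTIVE".

    get_state is a caller-supplied callable returning the cluster's current
    lifecycle state as a string. The state names are assumptions.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == "ACTIVE":
            return state
        # Assumed terminal states: stop instead of waiting for the timeout.
        if state in ("FAILED", "DELETED"):
            raise RuntimeError(f"cluster entered terminal state {state}")
        if clock() >= deadline:
            raise TimeoutError(f"cluster still in state {state} after {timeout_s} seconds")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated state sequence, used here in place of a real lookup.
    states = iter(["CREATING", "CREATING", "ACTIVE"])
    final = wait_until_active(lambda: next(states), interval_s=0.0)
    print("Cluster state:", final)
```

Injecting sleep and clock keeps the loop easy to test without real delays. In practice, get_state would wrap a real lookup of the cluster by its OCID.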