Paying for Dedicated AI Clusters

Dedicated AI clusters in OCI Generative AI provide predictable pricing and dedicated capacity for fine-tuning and hosting models.

For OCI Generative AI pretrained models, the following minimum commitments apply:

Hosting clusters: Minimum commitment of 744 unit-hours per hosting cluster.
Fine-tuning clusters: Minimum commitment of 1 unit-hour per fine-tuning job. Some models require at least 2 units for fine-tuning.

Note

Imported models don't require the 744 unit-hour hosting commitment. If you create a dedicated AI cluster to host an imported model, you can host the model without committing to the minimum hosting commitment that applies to OCI Generative AI pretrained and fine-tuned models.
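The hosting commitment rule above can be sketched in a few lines of Python. This is a minimal illustration, not an official billing formula: the function name `billable_hosting_unit_hours` and the `imported_model` flag are hypothetical, and the sketch assumes that an imported model is billed for the unit hours actually used.

```python
HOSTING_MIN_UNIT_HOURS = 744  # minimum commitment per hosting cluster (pretrained and fine-tuned models)


def billable_hosting_unit_hours(days, units, imported_model=False):
    """Return the unit hours billed for one hosting cluster.

    Assumption: a cluster that hosts an imported model is billed only for
    the unit hours used, because the 744 unit-hour minimum doesn't apply.
    """
    used = days * 24 * units
    if imported_model:
        return used
    # The minimum applies to the whole cluster, regardless of the unit count.
    return max(used, HOSTING_MIN_UNIT_HOURS)


print(billable_hosting_unit_hours(days=40, units=1))  # 960
print(billable_hosting_unit_hours(days=5, units=3))   # 744
```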

The following examples show how to calculate dedicated AI cluster costs in OCI Generative AI. For on-demand inferencing costs, see Paying for On-Demand Inferencing.

Matching Models to Dedicated Cluster Unit Prices

If you're hosting foundational models or fine-tuning them on dedicated AI clusters, you're charged by the unit hour rather than by transaction.

Go to the pretrained models page and select the model that you want to work with. In the Dedicated AI Cluster for the Model section, find the unit size for the dedicated AI cluster that matches the model and the Pricing Page Information. Then, review the examples in this section to learn how to calculate the cost for using these models.

Important

Some OCI Generative AI foundational pretrained base models supported for the dedicated serving mode are now deprecated and will retire no sooner than 6 months after the release of the first replacement model. You can host a base model, or fine-tune a base model and host the fine-tuned model on a dedicated AI cluster (dedicated serving mode) until the base model is retired. For dedicated serving mode retirement dates, see Retiring the Models.

Hosting a Foundational Model Example 1

John wants to host an instance of the Command R+ 08-2024 (cohere.command-r-plus-08-2024) model on dedicated infrastructure. John deletes the cluster after 40 days and wants to know the cost of the cluster. To host the model, John first needs to identify the unit size that can host it. The unit size for the cohere.command-r-plus-08-2024 model is a Large Cohere V2_2 unit. See Dedicated AI Cluster for the Model.

John needs a minimum of one Large Cohere V2_2 unit to host the cohere.command-r-plus-08-2024 model. Here are the steps to calculate the cost of a hosting cluster with one Large Cohere V2_2 unit.

Calculate the unit hours for 40 days.

40 days x 24 hours per day x 1 unit = 960 unit hours.

Ensure that the unit hours exceed the minimum commitment for hosting the models.
```
960 unit hours > 744 minimum unit hours
```
Go to AI Pricing and under OCI Generative AI, for Oracle Cloud Infrastructure Generative AI - Large Cohere - Dedicated, find the <Large-Cohere-dedicated-unit-per-hour-price>.
From the Dedicated AI Cluster for the Model section, find the multiplier for the cohere.command-r-plus-08-2024 model:
```
For Hosting, Multiply the Unit Price: x 2
```

Calculate the price for 40 days.

price = (960 unit hours) x $<Large-Cohere-dedicated-unit-per-hour-price> x 2
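The calculation for John's cluster can be expressed as a short Python sketch. The function `hosting_price` and its parameters are hypothetical names chosen for illustration, and `PLACEHOLDER_UNIT_PRICE` is not a real price; substitute the value listed on the AI Pricing page. The sketch also assumes that the hosting multiplier applies to the billed unit hours.

```python
def hosting_price(days, units, unit_price, multiplier=1, min_unit_hours=744):
    """Estimate the price of a hosting cluster.

    unit_price is the dedicated unit-per-hour price from the pricing page.
    multiplier is the "For Hosting, Multiply the Unit Price" value for the model.
    """
    unit_hours = max(days * 24 * units, min_unit_hours)
    return unit_hours * unit_price * multiplier


# Placeholder only: look up <Large-Cohere-dedicated-unit-per-hour-price>.
PLACEHOLDER_UNIT_PRICE = 1.00

price = hosting_price(days=40, units=1,
                      unit_price=PLACEHOLDER_UNIT_PRICE, multiplier=2)
print(price)  # 1920.0 with the placeholder price, that is, 960 x 1.00 x 2
```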

Hosting a Foundational Model Example 2

Alice wants to host an instance of the Command R 08-2024 (cohere.command-r-08-2024) model on dedicated infrastructure. To host a cohere.command-r-08-2024 model, Alice first needs to identify the unit size that can host the Command R 08-2024 model. The unit size for Command R 08-2024 is a Small Cohere V2 unit. See Dedicated AI Cluster for the Model.

Alice decides to buy three units of Small Cohere V2 to handle a higher call volume to the model than a single unit would provide. Alice plans to delete the cluster after five days. Here are the steps to calculate the cost of a hosting cluster with three Small Cohere V2 units for five days.

Calculate the unit hours.

5 days x 24 hours per day x 3 units = 360 unit hours.

Compare the unit hours to the minimum commitment for hosting the models.

360 unit hours < 744 minimum unit hours
Alice is charged for 744 unit hours.

Go to AI Pricing and under OCI Generative AI, for Oracle Cloud Infrastructure Generative AI - Small Cohere - Dedicated, find the <Small-Cohere-dedicated-unit-per-hour-price>.
From the Dedicated AI Cluster for the Model section, find the multiplier for the cohere.command-r-08-2024 model.

You don't need to multiply the price for hosting the cohere.command-r-08-2024 model.

Calculate the cost for five days.

price = (744 unit hours) x $<Small-Cohere-dedicated-unit-per-hour-price>
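Alice's case shows the minimum commitment taking effect. The following Python sketch separates the unit hours used from the unit hours billed so that the gap is visible. The `HostingEstimate` class and the `estimate_hosting` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass

HOSTING_MIN_UNIT_HOURS = 744  # minimum commitment per hosting cluster


@dataclass
class HostingEstimate:
    used_unit_hours: int
    billed_unit_hours: int
    minimum_applied: bool


def estimate_hosting(days: int, units: int) -> HostingEstimate:
    """Compare the unit hours used with the unit hours billed for one cluster."""
    used = days * 24 * units
    billed = max(used, HOSTING_MIN_UNIT_HOURS)
    return HostingEstimate(used, billed, minimum_applied=billed > used)


alice = estimate_hosting(days=5, units=3)
print(alice.used_unit_hours)    # 360
print(alice.billed_unit_hours)  # 744
print(alice.minimum_applied)    # True
```

Multiplying `billed_unit_hours` by the Small Cohere dedicated unit-per-hour price gives the price from the last step.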

Fine-Tuning and Hosting a Model Example

Bob wants to fine-tune a Command R 08-2024 (cohere.command-r-08-2024) model. Bob creates a fine-tuning dedicated AI cluster with the preset value of eight Small Cohere V2 units. Bob creates a custom model on the fine-tuning dedicated AI cluster and fine-tunes the Command R 08-2024 foundational model with training data. The fine-tuning job takes 5 hours to complete. Bob creates a fine-tuning cluster every week.

To host a cohere.command-r-08-2024 model, Bob needs to identify the unit size that can host the cohere.command-r-08-2024 model. The unit size for cohere.command-r-08-2024 model is a Small Cohere V2 unit. See Dedicated AI Cluster for the Model. Bob can host up to 50 fine-tuned models on a single hosting cluster. Here are the steps to calculate the monthly cost for fine-tuning and hosting the models.

Calculate the unit hours for each fine-tuning.

Each fine-tuning cluster has 8 units and is active for 5 hours.
8 units x 5 hours = 40 unit hours per fine-tuning cluster

Compare the unit hours to the minimum commitment for fine-tuning the models.
```
40 unit hours > 1 unit hour
```

Calculate the unit hours for hosting.

31 days x 24 hours per day x 1 unit = 744 unit hours

Compare the unit hours to the minimum commitment for hosting the models.
```
744 unit hours = 744 minimum unit hours
```
Go to AI Pricing and under OCI Generative AI, for Oracle Cloud Infrastructure Generative AI - Small Cohere - Dedicated, find the <Small-Cohere-dedicated-unit-per-hour-price>.

Find the total monthly price.

fine-tuning price = (40 unit hours per week) x (4 weeks) x $<Small-Cohere-dedicated-unit-per-hour-price>

fine-tuning price = (160 unit hours) x $<Small-Cohere-dedicated-unit-per-hour-price>

hosting price = (744 unit hours) x $<Small-Cohere-dedicated-unit-per-hour-price>

total monthly price = (160 + 744) unit hours x $<Small-Cohere-dedicated-unit-per-hour-price> = (904 unit hours) x $<Small-Cohere-dedicated-unit-per-hour-price>
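Bob's monthly total combines both cluster types. The following Python sketch adds the fine-tuning and hosting unit hours, applying each minimum commitment. The function `monthly_unit_hours` and its parameter names are hypothetical, and the sketch follows the example in treating a month as four weekly fine-tuning jobs plus 31 days of hosting.

```python
FINE_TUNING_MIN_UNIT_HOURS = 1  # minimum commitment per fine-tuning job
HOSTING_MIN_UNIT_HOURS = 744    # minimum commitment per hosting cluster


def monthly_unit_hours(ft_units, ft_hours_per_job, ft_jobs_per_month,
                       hosting_units, hosting_days):
    """Return the billed unit hours for fine-tuning, hosting, and the total."""
    per_job = max(ft_units * ft_hours_per_job, FINE_TUNING_MIN_UNIT_HOURS)
    fine_tuning = per_job * ft_jobs_per_month
    hosting = max(hosting_days * 24 * hosting_units, HOSTING_MIN_UNIT_HOURS)
    return {"fine_tuning": fine_tuning,
            "hosting": hosting,
            "total": fine_tuning + hosting}


bob = monthly_unit_hours(ft_units=8, ft_hours_per_job=5, ft_jobs_per_month=4,
                         hosting_units=1, hosting_days=31)
print(bob)  # {'fine_tuning': 160, 'hosting': 744, 'total': 904}
```

Because both cluster types in this example use Small Cohere V2 units, multiplying the total by the Small Cohere dedicated unit-per-hour price gives the monthly price.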

Tip

In addition to calculating the price manually, you can estimate the cost by selecting the AI and Machine Learning category and loading the cost estimator for OCI Generative AI.
