Creating an Endpoint

Create an endpoint for a custom, pretrained, or imported model on a hosting dedicated AI cluster in OCI Generative AI.

Important

Disclaimer

Our Content Moderation (CM) and Prompt Injection (PI) guardrails have been evaluated on a range of multilingual benchmark datasets. However, actual performance might vary depending on the specific languages, domains, data distributions, and usage patterns present in customer-provided data, because the content is generated by AI and might contain errors or omissions. The results are therefore intended for informational purposes only and should not be considered professional advice, and OCI makes no guarantees that identical performance characteristics will be observed in all real-world deployments. The OCI Responsible AI team is continuously improving these models.

Our content moderation capabilities have been evaluated against RTPLX, one of the largest publicly available multilingual benchmarking datasets, covering more than 38 languages. However, these results should be interpreted with appropriate caution because the content is generated by AI and might contain errors or omissions. Multilingual evaluations are inherently bounded by the scope, representativeness, and annotation practices of public datasets, and performance observed on RTPLX might not fully generalize to all real-world contexts, domains, dialects, or usage patterns. The findings are therefore intended for informational purposes only and should not be considered professional advice.

Note

To add a model to a private endpoint, first create a private endpoint and then return to this page for steps to attach the model.

Private endpoints support pretrained and custom models only. Imported models aren't supported.

  • On the Endpoints list page, select Create endpoint. If you need help finding the list page, see Listing Endpoints.

    Endpoint Information

    1. Select a compartment to create the endpoint in. The default compartment is the one that's selected on the list page, but you can select any compartment that you have permission to work in.
      Tip

      We recommend that you create the endpoint in the same compartment as the model.
    2. (Optional) Enter a name for the endpoint. Start the name with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be 1 to 255 characters. If you don't enter a name, the system generates a name that you can change later.
      The generated name has the format generativeaiendpoint<timestamp>. Example: generativeaiendpoint20250531235319
    3. (Optional) Enter a description for the endpoint.
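
The naming rule in step 2 can be checked before you submit the form or call the service. The following is a minimal sketch, not an official validator. The function names are hypothetical, "letters" is assumed to mean ASCII letters, and the timestamp layout (YYYYMMDDHHMMSS) is inferred only from the example name shown above.

```python
import re
from datetime import datetime

# Rule from the text: the first character is a letter or underscore, the rest
# are letters, numbers, hyphens, or underscores, and the length is 1 to 255.
# Assumption: "letters" means ASCII letters only.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_endpoint_name(name: str) -> bool:
    """Return True if the name follows the documented naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


def generated_name(now: datetime) -> str:
    """Mimic the generated name format generativeaiendpoint<timestamp>.

    Assumption: the timestamp is YYYYMMDDHHMMSS, inferred from the example
    generativeaiendpoint20250531235319.
    """
    return "generativeaiendpoint" + now.strftime("%Y%m%d%H%M%S")


if __name__ == "__main__":
    for candidate in ["my_endpoint-1", "_staging", "1endpoint", ""]:
        print(repr(candidate), is_valid_endpoint_name(candidate))
    print(generated_name(datetime(2025, 5, 31, 23, 53, 19)))
```

A name that fails this check is likely to be rejected, but the service remains the authority on what it accepts.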

    Hosting configuration

    1. Select the compartment that hosts the model that you want to add an endpoint to.
    2. Select the model that you want to add an endpoint to. This model can be a custom model, imported model, or a ready-to-use pretrained foundational model available in the region that you're working in.
    3. If the model that you selected has several versions, select a model version.
      For the ready-to-use pretrained foundational models, this field populates when you select the model.
    4. Select a hosting dedicated AI cluster by performing one of the following actions:
      • Select a Dedicated AI cluster from the list. If you created a cluster a few minutes ago, wait for that cluster to become active.
      • Select Create new dedicated AI cluster and perform the following steps:
        1. (Optional) Enter a name and description.
        2. For Base model, select one of the following:
          • The pretrained foundational model that you're hosting.
          • If you're using a custom model that was fine-tuned from a foundational model, select the original foundational (base) model that it was trained on.
          • If using an imported model, select that imported model.
        3. If you selected an imported model, select a recommended Unit size based on this guide.
        4. For the model replica, specify at least one unit. An endpoint requires at least one unit.
        5. Read the commitment unit hours for the hosting dedicated AI cluster and select the checkbox to agree to the commitment.
        6. (Optional) Select Add tag and assign tags to this dedicated AI cluster. See Resource Tags.
        7. Select Create and wait for the cluster to become active.
        8. From the Dedicated AI cluster list, select the dedicated AI cluster that you created.
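
The Base model choice for a new hosting cluster depends on the kind of model that the endpoint serves. The following sketch restates that rule as a small function. The `Model` class, its fields, and the model names in the usage lines are hypothetical and exist only for illustration; they are not part of any OCI SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    """Hypothetical description of the model being hosted."""
    name: str
    kind: str  # "pretrained", "custom", or "imported"
    base_model: Optional[str] = None  # set only for custom (fine-tuned) models


def cluster_base_model(model: Model) -> str:
    """Return the name to select for Base model on the hosting cluster."""
    if model.kind == "pretrained":
        # Host the pretrained foundational model itself.
        return model.name
    if model.kind == "custom":
        # A fine-tuned model is hosted on a cluster for its original base model.
        if model.base_model is None:
            raise ValueError("a custom model must record its base model")
        return model.base_model
    if model.kind == "imported":
        # An imported model is selected directly.
        return model.name
    raise ValueError(f"unknown model kind: {model.kind!r}")


if __name__ == "__main__":
    tuned = Model("my-tuned-model", "custom", base_model="example-base-model")
    print(cluster_base_model(tuned))
```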

    Networking resources (for pretrained and custom models)

    Select one of the following options:
    • Public endpoint
    • Private endpoint: If you select this option, then select the compartment for the private endpoint, and then the private endpoint that you want to use. (Not available for imported models.)
    By default, imported models have public endpoints.

    Guardrails (for pretrained and custom models)

    Note

    Guardrails aren't available for imported models.
    1. Select a setting for each guardrail. For background information, see Learn about guardrails. Before you use guardrails, read the disclaimer on this page.
      • Content moderation
        • Off: No content moderation is applied.
        • Block: Helps detect content that requires moderation and aims to block the request or response based on your configuration.
        • Inform: Doesn't block content, but aims to return an indication when content that requires moderation is detected.
      • Prompt injection (PI) protection
        • Off: No prompt injection protection is applied.
        • Block: Helps detect prompt injection attempts and aims to block the request based on your configuration.
        • Inform: Doesn't block the request, but aims to return an indication when prompt injection risk is detected.
      • Personally identifiable information (PII) protection
        • Off: No PII protection is applied.
        • Block: Helps detect PII and aims to block the request or response based on your configuration.
        • Inform: Doesn't block content, but aims to return an indication when PII is detected.
    2. (Optional) Select Add tag and assign tags to this endpoint. See Resource Tags.
    3. Select Create.
      You're directed to the endpoint details page where you can track the state of the endpoint.
    4. After the endpoint is active, select View in playground and start using the model from this endpoint.
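
The Console steps place two restrictions on imported models: they can't use private endpoints, and guardrails aren't available for them. The following sketch shows one way to check a planned configuration against those restrictions before you create the endpoint. The `EndpointConfig` class, the `GuardrailMode` enum, and the `validate` function are hypothetical names used for illustration, not OCI SDK types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class GuardrailMode(Enum):
    """The three settings offered for each guardrail."""
    OFF = "Off"
    BLOCK = "Block"
    INFORM = "Inform"


@dataclass
class EndpointConfig:
    """Hypothetical summary of the choices made in the Console form."""
    model_kind: str  # "pretrained", "custom", or "imported"
    private_endpoint: bool = False
    content_moderation: GuardrailMode = GuardrailMode.OFF
    prompt_injection: GuardrailMode = GuardrailMode.OFF
    pii: GuardrailMode = GuardrailMode.OFF


def validate(config: EndpointConfig) -> List[str]:
    """Return a list of problems; an empty list means no rule is violated."""
    problems = []
    if config.model_kind == "imported":
        if config.private_endpoint:
            problems.append("imported models don't support private endpoints")
        guardrails = (config.content_moderation, config.prompt_injection, config.pii)
        if any(mode is not GuardrailMode.OFF for mode in guardrails):
            problems.append("guardrails aren't available for imported models")
    return problems


if __name__ == "__main__":
    plan = EndpointConfig("imported", private_endpoint=True, pii=GuardrailMode.INFORM)
    for problem in validate(plan):
        print(problem)
```

The check covers only the restrictions stated on this page. Region availability, cluster state, and permissions are still verified by the service.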
  • Use the endpoint create command and required parameters to create an endpoint:

    oci generative-ai endpoint create 
    --model-id <model-OCID>
    --compartment-id <compartment-OCID> 
    --dedicated-ai-cluster-id <hosting-dedicated-AI-cluster-OCID> 
    [OPTIONS]

    For a complete list of parameters and values for CLI commands, see the CLI Command Reference.

    Note

    For pretrained models, instead of an OCID, you can use the model name exactly as listed in the Console's playground. You can also find this OCI model name on the model's detail page in Offered Pretrained Foundational Models in Generative AI.
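
If you script the CLI call, building the argument list in code avoids quoting mistakes. The following sketch assembles the command from the three required parameters shown above and prints it without running it. The function name and the OCID values are placeholders, and any extra options that you pass through should be checked against the CLI Command Reference.

```python
import shlex
from typing import List, Optional


def build_create_endpoint_command(
    model_id: str,
    compartment_id: str,
    cluster_id: str,
    extra_options: Optional[List[str]] = None,
) -> List[str]:
    """Return the argument list for `oci generative-ai endpoint create`."""
    argv = [
        "oci", "generative-ai", "endpoint", "create",
        "--model-id", model_id,
        "--compartment-id", compartment_id,
        "--dedicated-ai-cluster-id", cluster_id,
    ]
    # Optional parameters are appended unchanged; this sketch doesn't validate them.
    argv.extend(extra_options or [])
    return argv


if __name__ == "__main__":
    # Placeholder values, not real OCIDs.
    command = build_create_endpoint_command(
        "<model-OCID>",
        "<compartment-OCID>",
        "<hosting-dedicated-AI-cluster-OCID>",
    )
    # shlex.join quotes each argument so the line can be pasted into a POSIX shell.
    print(shlex.join(command))
```

To run the command from Python instead of printing it, you can pass the same list to `subprocess.run`, which doesn't need shell quoting.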
  • Run the CreateEndpoint operation to create an endpoint.

    Note

    For pretrained models, instead of an OCID, you can use the model name exactly as listed in the Console's playground. You can also find this OCI model name on the model's detail page in Offered Pretrained Foundational Models in Generative AI.
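
A CreateEndpoint request carries the same three required values as the CLI command. The following sketch builds a JSON request body with the Python standard library. The field names (`compartmentId`, `modelId`, `dedicatedAiClusterId`, `displayName`, `description`) are assumptions based on the camelCase form of the CLI parameters, so confirm them against the CreateEndpoint API reference before use. Request signing and the HTTP call are out of scope here.

```python
import json
from typing import Dict, Optional


def create_endpoint_body(
    compartment_id: str,
    model_id: str,
    cluster_id: str,
    display_name: Optional[str] = None,
    description: Optional[str] = None,
) -> Dict[str, str]:
    """Build a request body for CreateEndpoint (field names are assumed)."""
    body = {
        "compartmentId": compartment_id,
        "modelId": model_id,
        "dedicatedAiClusterId": cluster_id,
    }
    # Leave optional fields out entirely when they aren't set.
    if display_name is not None:
        body["displayName"] = display_name
    if description is not None:
        body["description"] = description
    return body


if __name__ == "__main__":
    # Placeholder values, not real OCIDs.
    body = create_endpoint_body(
        "<compartment-OCID>",
        "<model-OCID>",
        "<hosting-dedicated-AI-cluster-OCID>",
        display_name="my_endpoint",
    )
    print(json.dumps(body, indent=2))
```

In practice, an OCI SDK handles request signing and serialization for you, so a hand-built body like this is mainly useful for reviewing what a request contains.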