Compute Cloud@Customer Known Issues

GPU Drivers Not Included in Oracle Linux Platform Images

The Oracle Linux 8 and Oracle Linux 9 platform images provided with Compute Cloud@Customer don't include GPU drivers. If you create a GPU instance, you must manually install the GPU drivers.

Details

If a Compute Cloud@Customer installation includes compute nodes with GPUs, you can access the GPUs by selecting a dedicated GPU shape. GPU shapes can be selected for compute instances based on an Oracle Linux 8 or Oracle Linux 9 platform image. The current image versions don't include GPU drivers. The instance OS detects the allocated GPUs, but to use them, you must install the required drivers from the CUDA Toolkit, which is available from the NVIDIA developer site.

Note

The large download and local repository installation need a large amount of disk space. The default 50 GB boot volume is insufficient on Oracle Linux 9 and only just large enough on Oracle Linux 8. We highly recommend increasing the boot volume size to at least 60 GB and extending the file system accordingly.
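
Before starting the download, it can help to confirm that the root file system has room to spare. The following is a minimal sketch, not part of the documented procedure: has_free_space is a hypothetical helper name, and the 20 GB threshold in the commented usage line is an assumed example value, not a documented requirement.

```shell
# Hypothetical helper: succeeds when the file system that holds PATH
# has at least MIN_GB gigabytes available.
has_free_space() {
    local path=$1 min_gb=$2 avail_kb
    # df -Pk prints one POSIX-format line per file system;
    # column 4 is the available space in kilobytes.
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example (the 20 GB threshold is an assumption; adjust as needed):
# has_free_space / 20 || echo "Increase the boot volume and extend the file system first." >&2
```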

Workaround

After creating the instance, log in to the instance and install the CUDA Toolkit. Follow the instructions for your version of Oracle Linux.
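
If you script the installation, the choice between the two procedures can be derived from the OS release. The following is a sketch under stated assumptions: cuda_repo_rpm and os_major_version are hypothetical helper names, and the rpm file names are the ones used in the steps below for CUDA 12.8.0.

```shell
# Hypothetical helper: prints the CUDA Toolkit repository rpm name used in
# the installation steps for the given Oracle Linux major version (8 or 9).
cuda_repo_rpm() {
    case $1 in
        8|9) echo "cuda-repo-rhel$1-12-8-local-12.8.0_570.86.10-1.x86_64.rpm" ;;
        *)   echo "Unsupported Oracle Linux major version: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: prints the major version of the running OS,
# read from VERSION_ID in /etc/os-release (for example, 9 for 9.4).
os_major_version() {
    ( . /etc/os-release && echo "${VERSION_ID%%.*}" )
}

# Example:
# cuda_repo_rpm "$(os_major_version)"
```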

Installing GPU Drivers in an Oracle Linux 9 Instance
  1. From the command line of the instance, download and install the CUDA Toolkit rpm for your OS.

    $ wget https://developer.download.nvidia.com/compute/cuda/12.8.0/local_installers/cuda-repo-rhel9-12-8-local-12.8.0_570.86.10-1.x86_64.rpm
    $ sudo rpm -i cuda-repo-rhel9-12-8-local-12.8.0_570.86.10-1.x86_64.rpm
    $ sudo dnf clean all
    $ sudo dnf install cuda-toolkit-12-8
  2. Enable the Oracle Linux 9 EPEL yum repository. Install the dkms package.

    $ sudo yum-config-manager --enable ol9_developer_EPEL
    $ sudo dnf install dkms
  3. Install the GPU drivers.

    $ sudo dnf install cuda-12-8
  4. Verify the installation with the NVIDIA System Management Interface.

    $ nvidia-smi
    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 570.86.10              Driver Version: 570.86.10      CUDA Version: 12.8     |
    |-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  NVIDIA L40S                    Off |   00000000:00:05.0 Off |                    0 |
    | N/A   26C    P8             23W /  350W |       1MiB /  46068MiB |      0%      Default |
    |                                         |                        |                  N/A |
    +-----------------------------------------+------------------------+----------------------+
    
    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |  No running processes found                                                             |
    +-----------------------------------------------------------------------------------------+
Installing GPU Drivers in an Oracle Linux 8 Instance
  1. From the command line of the instance, download and install the CUDA Toolkit rpm for your OS.

    $ wget https://developer.download.nvidia.com/compute/cuda/12.8.0/local_installers/cuda-repo-rhel8-12-8-local-12.8.0_570.86.10-1.x86_64.rpm
    $ sudo rpm -i cuda-repo-rhel8-12-8-local-12.8.0_570.86.10-1.x86_64.rpm
    $ sudo dnf clean all
    $ sudo dnf install cuda-toolkit-12-8
  2. Enable the Oracle Linux 8 EPEL yum repository. Install the dkms package.

    $ sudo yum-config-manager --enable ol8_developer_EPEL
    $ sudo dnf install dkms
  3. Install the GPU drivers.

    $ sudo dnf install cuda-12-8
  4. Install the NVIDIA kernel module.

    $ sudo scl enable gcc-toolset-13 bash
    # dkms install nvidia-open -v 570.86.10

    If this make error appears while the kernel module is built, you can safely ignore it.

    Cleaning build area...(bad exit status: 2)
    Failed command:
    make -C /lib/modules/5.15.0-206.153.7.el8uek.x86_64/build M=/var/lib/dkms/nvidia-open/570.86.10/build clean
  5. Verify the installation with the NVIDIA System Management Interface.

    # nvidia-smi
    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 570.86.10              Driver Version: 570.86.10      CUDA Version: 12.8     |
    |-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  NVIDIA L40S                    Off |   00000000:00:05.0 Off |                    0 |
    | N/A   26C    P8             23W /  350W |       1MiB /  46068MiB |      0%      Default |
    |                                         |                        |                  N/A |
    +-----------------------------------------+------------------------+----------------------+
    
    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |  No running processes found                                                             |
    +-----------------------------------------------------------------------------------------+
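
For automated checks on either Oracle Linux version, the driver version can be read from the nvidia-smi banner instead of inspecting the table by eye. The following is a minimal sketch: smi_driver_version is a hypothetical helper name, and it assumes the banner format shown in the sample output above.

```shell
# Hypothetical helper: reads nvidia-smi output on standard input and prints
# the driver version found in the banner line (empty if no banner is found).
smi_driver_version() {
    sed -n 's/.*Driver Version: \([0-9.]*\).*/\1/p' | head -n 1
}

# Example, comparing against the driver version installed in the steps above:
# [ "$(nvidia-smi | smi_driver_version)" = "570.86.10" ] && echo "driver OK"
```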

After an API key is created or changed, the initial CLI command might fail

Details

When an API key is added or changed for a user, the first CLI command with the new or changed key might fail.

Workaround

Wait a few minutes for the new key to synchronize on the Compute Cloud@Customer infrastructure, then retry the CLI command.
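
The wait-and-retry workaround can be scripted. The following is a minimal sketch under stated assumptions: retry is a hypothetical helper name, and the attempt count and 60-second delay in the usage comment are example values, not documented settings.

```shell
# Hypothetical helper: runs a command up to ATTEMPTS times, sleeping DELAY
# seconds between failed attempts. Succeeds as soon as the command succeeds.
retry() {
    local attempts=$1 delay=$2 n=1
    shift 2
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$n" -ge "$attempts" ]; then
            echo "Command still failing after $attempts attempts: $*" >&2
            return 1
        fi
        echo "Attempt $n failed; retrying in ${delay}s..." >&2
        sleep "$delay"
        n=$((n + 1))
    done
}

# Example (5 attempts, 60 seconds apart; both values are assumptions):
# retry 5 60 oci iam user get --user-id ocid1.user.oc1..uniqueID
```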

The oci ccc infrastructure get and oci ccc infrastructure update CLI commands return null for the provisioning_pin value

Details

When you create an infrastructure, a PIN is generated and displayed in the output.

However, if you use the oci ccc infrastructure get command right after creating or updating the infrastructure, the PIN might not be returned.

This happens because the PIN isn't available to the get command for up to 5 minutes after creation.

Example output:

{
  "compartment_id": "ocid1.compartment.oc1..uniqueID",
. . .
  },
  "display_name": "C3ResourcePrincipal_infra",
  "freeform_tags": {},
  "id": "ocid1.cccinfrastructure.uniqueID",
  "lifecycle_details": null,
  "lifecycle_state": "ACTIVE",
  "provisioning_fingerprint": null,
  "provisioning_pin": null,
  "rack_inventory": {
    "capacity_storage_tray_count": null,
    "compute_node_count": null,
    "management_node_count": null,
    "performance_storage_tray_count": null,
    "serial_number": null
. . .

}
Workaround

Obtain the PIN from the output of the create command, or wait 5 minutes and then retrieve the PIN using the get command.

For more information, see the ccc infrastructure CLI Reference page.
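
If you retrieve the PIN in a script, you can test the command output before using it. The following is a minimal sketch: pin_available is a hypothetical helper name, and it assumes JSON output with one key per line, as in the example output above. It accepts both the underscore and the hyphen spelling of the key.

```shell
# Hypothetical helper: reads command output (JSON) on standard input and
# succeeds only when the provisioning PIN key is present and not null.
pin_available() {
    grep -E '"provisioning[_-]pin"[[:space:]]*:' | grep -vq 'null'
}

# Example: save the output of the get command to a file, then test it.
# if pin_available < infrastructure.json; then
#     echo "PIN is available"
# else
#     echo "PIN not available yet; wait up to 5 minutes and run the get command again" >&2
# fi
```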

When using the ccc infrastructure list CLI command with the --compartment-id-in-subtree true option, no results are returned

Details

The command returns an empty list even though there are items in the subtree.

Example:

oci ccc infrastructure list --profile user1 --compartment-id-in-subtree true -c ocid1.tenancy.oc1..uniqueID
{
  "data": {
    "items": []
  }
}
Workaround

Instead of using the --compartment-id-in-subtree option, query each compartment directly using the --compartment-id (-c) option.

Example:

oci ccc infrastructure list --profile user1 --compartment-id ocid1.tenancy.oc1..uniqueID
{
  "data": {
    "items": [ list of compartment details ]
  }
}

For more information, see the ccc infrastructure CLI Reference page.
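
When several compartments must be queried, the per-compartment workaround can be wrapped in a loop. The following is a minimal sketch: list_infrastructures is a hypothetical helper name, and the OCIDs in the usage comment are placeholders that you supply yourself.

```shell
# Hypothetical helper: runs the list command once per compartment OCID
# passed as an argument, instead of relying on the subtree option.
list_infrastructures() {
    local compartment
    for compartment in "$@"; do
        echo "Compartment: $compartment"
        oci ccc infrastructure list --compartment-id "$compartment"
    done
}

# Example (placeholder OCIDs):
# list_infrastructures ocid1.tenancy.oc1..uniqueID ocid1.compartment.oc1..uniqueID
```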

Output from oci iam user get doesn't list user capabilities

Details

The output from oci iam user get differs between Oracle Cloud Infrastructure (OCI) and Compute Cloud@Customer. The Compute Cloud@Customer output shows null for capabilities and omits the list of capabilities, as shown in the following comparison.

OCI Output

oci iam user get --user-id ocid1.user.oc1..uniqueID
{
  "data": {
    "capabilities": {
      "can-use-api-keys": true,
      "can-use-auth-tokens": true,
      "can-use-console-password": true,
      "can-use-customer-secret-keys": true,
      "can-use-o-auth2-client-credentials": true,
      "can-use-smtp-credentials": true
    },
    "compartment-id": "ocid1.tenancy.oc1..uniqueID",
    "defined-tags": {
      "Oracle-Recommended-Tags": {
        "ResourceType": "group",
        "UtilExempt": "minrequired"
      }
    },
    "description": "user-1",
    "email": null,
    "email-verified": false,
    "external-identifier": null,
    "freeform-tags": {},
    "id": "ocid1.user.oc1..uniqueID",
    "identity-provider-id": null,
    "inactive-status": null,
    "is-mfa-activated": false,
    "last-successful-login-time": "2024-02-08T10:25:44.036000+00:00",
    "lifecycle-state": "ACTIVE",
    "name": "user-1",
    "previous-successful-login-time": null,
    "time-created": "2024-02-08T09:12:35.256000+00:00"
  },
  "etag": "60f0527b3bbd0f40f137d4149d131fbf77eb44ab"
}

Compute Cloud@Customer Output

oci iam user get --user-id ocid1.user.oc1..uniqueID
{
  "data": {
    "capabilities": null,
    "compartment-id": "ocid1.tenancy.oc1..uniqueID",
    "defined-tags": {
      "Oracle-Recommended-Tags": {
        "ResourceType": "group",
        "UtilExempt": "minrequired"
      }
    },
    "description": "user-1",
    "email": null,
    "email-verified": null,
    "external-identifier": null,
    "freeform-tags": {},
    "id": "ocid1.user.oc1..uniqueID",
    "identity-provider-id": null,
    "inactive-status": null,
    "is-mfa-activated": null,
    "last-successful-login-time": null,
    "lifecycle-state": "ACTIVE",
    "name": "user-1",
    "previous-successful-login-time": null,
    "time-created": "2023-02-08T09:12:35.256000+00:00"
  },
  "etag": "bee44237-6d70-4691-b7f9-a98fbb332b12"
}

Workaround
To see the list of capabilities, run the oci iam user get command in your OCI tenancy.
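
If you keep a separate CLI profile for your OCI tenancy, the capabilities can be printed on their own with the CLI's --query option. The following is a minimal sketch: show_user_capabilities is a hypothetical helper name, and OCI_TENANCY is an assumed profile name that must exist in your own CLI configuration.

```shell
# Hypothetical helper: prints only the capabilities of a user, using a
# CLI profile that points at the OCI tenancy.
show_user_capabilities() {
    local profile=$1 user_id=$2
    # --query takes a JMESPath expression applied to the JSON response.
    oci iam user get --profile "$profile" --user-id "$user_id" --query 'data.capabilities'
}

# Example (OCI_TENANCY is an assumed profile name):
# show_user_capabilities OCI_TENANCY ocid1.user.oc1..uniqueID
```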