Diagnosing Server Component Hardware Faults
This section contains maintenance-related information and procedures that you can use to troubleshoot and repair server hardware issues.
When a server hardware fault event occurs, the system lights the Fault-Service Required LED and captures the event in the Oracle ILOM event log. If you set up notifications through Oracle ILOM, you also receive an alert through the notification method you chose. For details, refer to Oracle ILOM Documentation. When you become aware of a hardware fault, address it immediately.
Use the following process to address a hardware fault.
-
Identify the server subsystem containing the fault.
See Server Status Indicator LEDs. You can also use Oracle ILOM to identify a failed component. See Accessing Oracle ILOM.
-
To collect system-level information or to verify the system health status from the CLI, type show /System. To access subsystem and component health details from the CLI, type show /System/subsystem-name, where subsystem-name is one of the following:
- PROCESSORS
- MEMORY
- POWER
- COOLING
- STORAGE
- NETWORKING
- PCI_DEVICES
- FIRMWARE
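The CLI commands described above can be collected into a simple checklist. The following Python sketch is illustrative only: the helper name health_commands is hypothetical, and the subsystem names are copied exactly as they appear in the list above. The exact target names and casing accepted by your Oracle ILOM version are an assumption here, so check them against the output of show /System before relying on them.

```python
# Subsystem names exactly as listed in this section (casing is an assumption;
# confirm the real target names with "show /System" on your server).
SUBSYSTEMS = [
    "PROCESSORS", "MEMORY", "POWER", "COOLING",
    "STORAGE", "NETWORKING", "PCI_DEVICES", "FIRMWARE",
]


def health_commands(subsystems=SUBSYSTEMS):
    """Return the system-level health command followed by one
    subsystem-level health command per subsystem name."""
    commands = ["show /System"]
    commands.extend(f"show /System/{name}" for name in subsystems)
    return commands


if __name__ == "__main__":
    # Print one command per line for use in an Oracle ILOM CLI session.
    for command in health_commands():
        print(command)
```

Running the script prints one command per line, starting with the system-level check. The script only builds command strings; it does not connect to Oracle ILOM.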
-
See Monitoring Component Health and Faults Using Oracle ILOM to identify a failed component.
-
For a step-by-step diagnostic procedure, see Troubleshoot Hardware Faults Using Oracle ILOM Web Interface.
-
The Oracle ILOM Fault Management Shell enables you to view and manage fault activity on managed servers and other types of devices. For more information about using the Oracle ILOM Fault Management Shell, refer to the Oracle x86 Servers Diagnostics and Troubleshooting Guide at Oracle ILOM Documentation.
-
Review Exadata Server X10M Product Information and Known Issues for late-breaking information about the server, including hardware-related known issues. Refer to Oracle AMD-Based Cloud Servers Product Notes.
-
Prepare the server for service using Oracle ILOM.
If you determine that the hardware fault requires service (physical access to the server), use Oracle ILOM to take the server offline, activate the Locate button/LED, and, if necessary, power off the server. See Accessing Oracle ILOM and Preparing for Service.
-
Prepare the service workspace.
Before servicing the server, prepare the workspace and ensure electrostatic discharge (ESD) protection for the server and its components. See Preparing for Service.
-
Service the components.
To service replaceable components, see the removal, installation, and replacement procedures in this document.
Note:
Server components must be replaced by Oracle Service personnel. Contact Oracle Service.
-
Clear the fault in Oracle ILOM.
Depending on the component, you might need to clear the fault in Oracle ILOM. Generally, faults on components that have a FRU ID are cleared automatically. For details, refer to the Oracle Integrated Lights Out Manager (ILOM) documentation at Oracle ILOM Documentation.