Hardware and Environmental Faults

This section describes hardware and environmental faults, including alarms for fan speed, voltage, temperature, and power supply on the system.

Note:

If you suspect you have a hardware fault, contact Oracle Support for assistance with running the diagnostics image loaded on the Oracle Communications Session Border Controller.

Hardware Temperature Alarm

The following table describes the hardware temperature alarms.

Alarm Name: TEMPERATURE HIGH
Alarm ID: 65538
Alarm Severity: CRITICAL: -100, MAJOR: -50, MINOR: -25
Cause(s): Fans are obstructed or stopped. The room is abnormally hot.
Example Log Message: Temperature: XX.XXC (where XX.XX is the temperature in degrees)
Actions:
  1. Check whether fans or air vents are obstructed.
  2. Check for dust accumulation on vents or fans.
  3. Check the Heating, Ventilation, and Air Conditioning (HVAC) of the room.
  4. Clean the filter.
Trap: apSysMgmtTempTrap

Alarm Name: SD5_TEMPERATURE_HIGH_PHY0
Alarm ID: NONE
Alarm Severity: CRITICAL: >100°C, MAJOR: >95°C, MINOR: >90°C
Cause(s): Fans are obstructed or stopped. The room is abnormally hot.
Example Log Message:
  Temperature: XX.XXC (where XX.XX is the temperature in degrees)
  Temperature X is at Y degrees C over minor/major/critical threshold of Z (where X is the sensor name, Y is the temperature, and Z is the threshold)
Actions:
  1. Check whether fans or air vents are obstructed.
  2. Check for dust accumulation on vents or fans.
  3. Check the Heating, Ventilation, and Air Conditioning (HVAC) of the room.
  4. Clean the filter.
Trap: ap-env-monitor

Note:

If this alarm occurs, the system turns the fan speed up to the fastest possible speed.
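
As a rough illustration of the SD5_TEMPERATURE_HIGH_PHY0 thresholds listed above, the following Python sketch maps a temperature reading to a severity. The function and constant names are hypothetical and are not part of the SBC software; the strict greater-than comparisons follow the ">" notation used in the table.

```python
from typing import Optional

# Thresholds in degrees Celsius, taken from the SD5_TEMPERATURE_HIGH_PHY0 entry.
# Ordered from most to least severe so that the first match wins.
PHY_TEMP_THRESHOLDS = [
    ("CRITICAL", 100.0),
    ("MAJOR", 95.0),
    ("MINOR", 90.0),
]


def classify_phy_temperature(temp_c: float) -> Optional[str]:
    """Return the severity for a reading, or None if no threshold is exceeded."""
    for severity, threshold in PHY_TEMP_THRESHOLDS:
        if temp_c > threshold:  # strict comparison, as in the table
            return severity
    return None
```

Under these assumptions, a reading of exactly 100.0 is classified as MAJOR rather than CRITICAL, because the critical threshold must be exceeded, not merely reached.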

Fan Speed Alarm

The following table describes the fan speed alarm.

Alarm Name: FAN STOPPED
Alarm ID: 65537
Alarm Severity:
  CRITICAL (-100): the speed of any fan is <50%, or the speed of two or more fans is >50% and <75%.
  MAJOR (-50): the speed of two or more fans is >75% and <90%, or the speed of one fan is >50% and <75% and the other two fans are at normal speed.
  MINOR (-25): the speed of one fan is >75% and <90% and the other two fans are at normal speed.
Cause(s): Fan speed failure.
Example Log Message: Fan speed: XXXX XXXX XXXX (where XXXX XXXX XXXX is the Revolutions per Minute (RPM) of each fan on the fan module)
Actions:
  1. Check whether fans or air vents are obstructed.
  2. Check for dust accumulation on vents or fans.
  3. Clean the filter.
Trap Generated: apSysMgmtFanTrap

Note:

If this alarm occurs, the system turns the fan speed up to the fastest possible speed.

If the fan speed drops directly from functional RPM to 0 RPM, the SBC will generate a MINOR alarm and will decrement the health score by 25.

If the fan speed drops gradually to 0 RPM, that is, from a functional RPM to some lower RPM and then to 0 RPM, the SBC will generate a CRITICAL alarm and will decrement the health score by 100.
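
The severity bands in the fan speed table can be sketched in Python as follows. This is an illustrative reading of the table only, not the SBC's actual logic: the function name is hypothetical, the handling of exact boundary values (50%, 75%, 90%) and of combinations the table does not list is an assumption, and the direct versus gradual drop behavior described in the notes above is not modeled.

```python
from typing import Optional, Sequence

# Health score decrements per severity, as listed in the table.
FAN_HEALTH_DECREMENT = {"CRITICAL": 100, "MAJOR": 50, "MINOR": 25}


def classify_fan_alarm(speeds_pct: Sequence[float]) -> Optional[str]:
    """Classify per-fan speed percentages into an alarm severity.

    Assumption: boundary values fall into the higher band
    (50 counts as 50-75, 75 as 75-90, 90 as normal).
    """
    low = sum(1 for s in speeds_pct if s < 50)          # below 50%
    mid = sum(1 for s in speeds_pct if 50 <= s < 75)    # 50% to 75%
    high = sum(1 for s in speeds_pct if 75 <= s < 90)   # 75% to 90%

    if low >= 1 or mid >= 2:
        return "CRITICAL"
    # Assumption: one fan at 50-75% is MAJOR even if another fan is at
    # 75-90%; the table only covers the case where the others are normal.
    if high >= 2 or mid == 1:
        return "MAJOR"
    if high == 1:
        return "MINOR"
    return None
```

For example, classify_fan_alarm([80, 100, 100]) falls into the MINOR band, and FAN_HEALTH_DECREMENT["MINOR"] gives the corresponding health score decrement of 25.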

Environmental Sensor Alarm

The following table describes the environmental sensor alarm.

Alarm Name: ENVIRONMENTAL SENSOR FAILURE
Alarm ID: 65539
Alarm Severity: CRITICAL (-10)
Cause(s): The environmental sensor component cannot detect fan speed and temperature.
Example Log Message: Hardware monitor failure! Unable to monitor fan speed and temperature!
Actions:
  1. Power cycle the standby Oracle Communications Session Border Controller peer using the power supply on/off switches located on the rear panel of the chassis.
  2. Force a manual switchover by executing the ACLI notify berpd force command.
  3. Power cycle the active Oracle Communications Session Border Controller peer.
Trap: apEnvMonI2CFailNotification

Media Link Alarms

Media link alarms include the following:

  • Major

    If the Oracle Communications Session Border Controller’s media link goes from being up to being down, it is considered a major alarm. This alarm applies to both slots 1 and 2 on the Oracle Communications Session Border Controller. A message appears on the front panel of the Oracle Communications Session Border Controller’s chassis, similar to the following:

    MAJOR ALARM
    Gig Port 1 DOWN
  • Minor

    If the Oracle Communications Session Border Controller’s media link goes from being down to being up, it is considered a minor alarm. This alarm applies to both slots 1 and 2 on the Oracle Communications Session Border Controller.
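
The two media link transitions above amount to a small state-change rule. A minimal Python sketch, assuming link state is tracked as a simple up/down boolean (the function name is hypothetical):

```python
from typing import Optional


def media_link_alarm(was_up: bool, is_up: bool) -> Optional[str]:
    """Return the alarm severity for a media link state transition."""
    if was_up and not is_up:
        return "MAJOR"   # link went from up to down
    if not was_up and is_up:
        return "MINOR"   # link went from down to up
    return None          # no transition, no new alarm
```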

Power Supply Alarms

The following table describes the power supply alarms.

Alarm: PLD POWER A FAILURE
Alarm ID: 65540
Alarm Severity: MINOR (-10)
Cause(s): Power supply A has failed.
Log Message: Back Power Supply A has failed!
Actions:
  1. Check if the power supply is powered down.
  2. Check whether the physical layer is functioning properly. This includes the server hardware, any attached peripherals, and the cabling.
Trap Generated: apSysMgmtPowerTrap

Alarm: PLD POWER A UP
Alarm ID: 65541
Alarm Severity: MINOR
Cause(s): Power supply A is now present and functioning.
Log Message: Back Power Supply A is present!
Actions: If there was no power fluctuation at the data center, run Hardware Diagnostics to rule out any issues.
Trap Generated: apEnvMonPowerSupplyStatusEntr

Alarm: PLD POWER B FAILURE
Alarm ID: 65542
Alarm Severity: MINOR (-10)
Cause(s): Power supply B has failed.
Log Message: Back Power Supply B has failed!
Actions:
  1. Check if the power supply is powered down.
  2. Check whether the physical layer is functioning properly. This includes the server hardware, any attached peripherals, and the cabling.
Trap Generated: apSysMgmtPowerTrap

Alarm: PLD POWER B UP
Alarm ID: 65543
Alarm Severity: MINOR
Cause(s): Power supply B is now present and functioning.
Log Message: Back Power Supply B is present!
Actions: If there was no power fluctuation at the data center, run Hardware Diagnostics to rule out any issues.
Trap Generated: apEnvMonVoltageStatusEntry

Note:

If the system boots up with one power supply, the health score will be 100, and no alarm will be generated. If another power supply is then added to that same system, this same alarm will be generated, but the health score will not be decremented.

Physical Interface Card Alarms

The following table describes the physical interface card alarms.

Alarm: PHY0 Removed
Alarm ID: 65550
Alarm Severity: MAJOR
Cause(s): Physical interface card 0 was removed.
Log Message: PHY card 0 has been removed.
Actions: Check for a loose connection and reseat the card. Run HW Diagnostics if this was not performed by an administrator.
Trap Name: apEnvPhyCardStatusEntry

Alarm: PHY0 Inserted
Alarm ID: 65552
Alarm Severity: MAJOR
Cause(s): Physical interface card 0 was inserted.
Log Message: None
Actions: N/A
Trap Name: apEnvPhyCardStatusEntry

Alarm: PHY1 Removed
Alarm ID: 65553
Alarm Severity: MAJOR
Cause(s): Physical interface card 1 was removed.
Log Message: PHY card 1 has been removed.
Actions: Check for a loose connection and reseat the card. Run HW Diagnostics if this was not performed by an administrator.
Trap Name: apEnvPhyCardStatusEntry

Alarm: PHY1 Inserted
Alarm ID: 65554
Alarm Severity: MAJOR
Cause(s): Physical interface card 1 was inserted.
Log Message: None
Actions: N/A
Trap Name: apEnvPhyCardStatusEntry

Transcoding Alarms

The transcoding feature employs several hardware and software alarms to alert the user when the system is not functioning properly or when overload conditions are reached.

Alarm Name: No DSPs Present with Transcoding Feature Card (DSP_NONE_PRESENT)
Alarm ID: NONE
Alarm Severity: Minor/0
Cause(s): A transcoding feature card is installed but no DSP modules are discovered.
Example Log Message: NONE
Action to diagnose the fault: Check that the transcoding modules are plugged in properly. Check for a loose connection and reseat the TCU card. Run HW Diagnostics if this was not performed by an administrator.
Trap Name: apSysMgmtHardwareErrorTrap

Alarm Name: DSP Boot Failure (DSP_BOOT_FAILURE)
Alarm ID: NONE
Alarm Severity: Critical/0
Cause(s): A DSP device fails to boot properly at system initialization. This alarm is not health affecting for a single DSP boot failure. DSPs that fail to boot will remain uninitialized and will be avoided for transcoding.
Example Log Message: NONE
Action to diagnose the fault: Run HW Diagnostics and contact Oracle Support.
Trap Name: apSysMgmtHardwareErrorTrap

Alarm Name: DSP Communications Timeout (DSP_COMMS_TIMEOUT)
Alarm ID: NONE
Alarm Severity: Critical/100
Cause(s): A DSP fails to respond after 2 seconds with 3 retry messages. This alarm is critical and is health affecting.
Example Log Message: NONE
Action to diagnose the fault: Run HW Diagnostics and contact Oracle Support.
Trap Name: apSysMgmtHardwareErrorTrap

Alarm Name: DSP Alerts (DSP_CORE_HALT)
Alarm ID: NONE
Alarm Severity: Critical/100
Cause(s): A problem occurs with the health of the DSP, such as a halted DSP core. The software will attempt to reset the DSP and gather diagnostic information about the crash. This information will be saved in the /code directory to be retrieved by the user.
Example Log Message: NONE
Action to diagnose the fault: Run HW Diagnostics and contact Oracle Support.
Trap Name: apSysMgmtHardwareErrorTrap

Alarm Name: DSP Temperature (DSP_TEMPERATURE_HIGH)
Alarm ID: NONE
Alarm Severity:
  Clear: 85°C
  Warning: 86°C / 5
  Minor: 90°C / 25
  Major: 95°C / 50
  Critical: 100°C / 100
Cause(s): A DSP device exceeds the temperature threshold. If the temperature exceeds 90°C, a minor alarm will be set. If it exceeds 95°C, a major alarm will be set. If it exceeds 100°C, a critical alarm will be set. The alarm is cleared if the temperature falls below 85°C. The alarm is health affecting.
Example Log Message: NONE
Action to diagnose the fault: Check for a defective DSP, and check HVAC and environmental conditions.
Trap Name: apSysMgmtHardwareErrorTrap

Alarm Name: Transcoding Capacity Threshold Alarm (XCODE_UTIL_OVER_THRESHOLD) / 131329
Alarm ID: NONE
Alarm Severity: Clear 80%, Warning 95%
Cause(s): A warning alarm will be raised when the transcoding capacity exceeds a high threshold of 95%. The alarm will be cleared after the capacity falls below a low threshold of 80%. This alarm warns the user that transcoding resources are nearly depleted. This alarm is not health affecting.
Example Log Message: NONE
Action to diagnose the fault: Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales.
Trap Name: apSysMgmtGroupTrap

Alarm Name: Licensed AMR Transcoding Capacity Threshold Alarm / 131330
Alarm ID: NONE
Alarm Severity: Clear 80%, Warning 95%
Cause(s): A warning alarm is triggered if the AMR transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting.
Example Log Message: NONE
Action to diagnose the fault: Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales.
Trap Name: apSysMgmtGroupTrap

Alarm Name: Licensed AMR-WB Transcoding Capacity Threshold Alarm / 131331
Alarm ID: NONE
Alarm Severity: Clear 80%, Warning 95%
Cause(s): A warning alarm is triggered if the AMR-WB transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting.
Example Log Message: NONE
Action to diagnose the fault: Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales.
Trap Name: apSysMgmtGroupTrap

Alarm Name: Licensed EVRC Transcoding Capacity Threshold Alarm / 131332
Alarm ID: NONE
Alarm Severity: Clear 80%, Warning 95%
Cause(s): A warning alarm is triggered if the EVRC transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting.
Example Log Message: NONE
Action to diagnose the fault: Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales.
Trap Name: apSysMgmtGroupTrap

Alarm Name: Licensed EVRCB Transcoding Capacity Threshold Alarm / 131333
Alarm ID: NONE
Alarm Severity: Clear 80%, Warning 95%
Cause(s): A warning alarm is triggered if the EVRCB transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting.
Example Log Message: NONE
Action to diagnose the fault: Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales.
Trap Name: apSysMgmtGroupTrap

Alarm Name: Licensed Opus Transcoding Capacity Threshold Alarm / 131159
Alarm ID: NONE
Alarm Severity: Clear 80%, Warning 95%
Cause(s): A warning alarm is triggered if the Opus transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting.
Example Log Message: NONE
Action to diagnose the fault: Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales.
Trap Name: apSysMgmtGroupTrap

Alarm Name: Licensed SILK Transcoding Capacity Threshold Alarm / 131159
Alarm ID: NONE
Alarm Severity: Clear 80%, Warning 95%
Cause(s): A warning alarm is triggered if the SILK transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting.
Example Log Message: NONE
Action to diagnose the fault: Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales.
Trap Name: apSysMgmtGroupTrap
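
The capacity threshold alarms above share the same raise and clear behavior: the warning is raised when utilization exceeds 95% and is cleared only after utilization falls below 80%. The gap between the two thresholds keeps the alarm from toggling when utilization hovers near a single limit. A minimal Python sketch of that hysteresis follows; the class and method names are hypothetical and do not reflect how the SBC implements it internally.

```python
class CapacityThresholdAlarm:
    """Warning alarm with hysteresis: raise above 95%, clear below 80%."""

    def __init__(self, raise_pct: float = 95.0, clear_pct: float = 80.0) -> None:
        self.raise_pct = raise_pct
        self.clear_pct = clear_pct
        self.active = False

    def update(self, utilization_pct: float) -> bool:
        """Feed a new utilization sample and return whether the alarm is active."""
        if not self.active and utilization_pct > self.raise_pct:
            self.active = True    # exceeded the high threshold
        elif self.active and utilization_pct < self.clear_pct:
            self.active = False   # fell below the low threshold
        return self.active
```

With samples of 70, 96, 90, 80, 79, and 94 percent, the alarm in this sketch is raised at 96, stays active at 90 and 80, clears at 79, and is not raised again at 94.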

Viewing PROM Information

Use the show prom-info command to display PROM statistics for Oracle Communications Session Border Controller components, such as the main board and the host CPU.

For example:

ORACLE# show prom-info mainboard
Contents of Main Board IDPROM
        Assy, NetNet4600
        Part Number:                   002-0610-50
        Serial Number:                 091132009670
        FunctionalRev:                 5.06
        BoardRev:                      05.00
        PCB Family Type:               Main Board
        ID:                            NetNet 4600 Main Board
        Options:                       0
        Manufacturer:                  Benchmark Electronics
        Week/Year:                     32/2017
        Sequence Number:               009670
        Number of MAC Addresses:       16
        Starting MAC Address:          00 08 25 a2 56 20

The following example shows the host CPU PROM contents.

ORACLE# show prom-info cpu
Contents of CPU IDPROM
        Part Number:                   MOD-0026-62
        Manufacturer:                  RadiSys
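
The "Name: value" layout of the output shown above lends itself to simple parsing, for example when collecting inventory data from captured command output. The following Python sketch assumes the output has already been captured as text; parse_prom_info is a hypothetical helper name, not an Oracle-provided tool.

```python
from typing import Dict


def parse_prom_info(output: str) -> Dict[str, str]:
    """Parse 'Name: value' lines from captured show prom-info output.

    Lines without a colon or without a value (the command prompt, the
    'Contents of ...' header, and the 'Assy, ...' line) are skipped.
    """
    fields: Dict[str, str] = {}
    for line in output.splitlines():
        # Split on the first colon only, so values keep any later colons.
        name, sep, value = line.partition(":")
        if sep and value.strip():
            fields[name.strip()] = value.strip()
    return fields
```

Applied to the main board example, the returned dictionary would contain entries such as "Part Number" mapped to "002-0610-50" and "Starting MAC Address" mapped to "00 08 25 a2 56 20".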

Graphic Window Display

The Environment display lets you scroll through information about the operational status of the hardware displayed in the Oracle Communications Session Border Controller chassis’s graphic window. For example, you can view hardware- and link-related alarm information, highest monitored temperature reading, and fan speed.

The graphic display window presents the following Environment information in the order listed:

  • alarm state: HW ALARM: X (where X is the number of hardware alarms, excluding ENVIRONMENTAL SENSOR FAILURE) and LINK ALARM: X (where X is the number of link down alarms)
  • temperature: format is XX.XX C, where XX.XX is the temperature in degrees
  • fan speed: XXXX, where XXXX is the RPM of the failing fan on the fan module

For example:

HW ALARM: 1
LINK ALARM: 2
TEMPERATURE: 38.00 C
FAN SPEED: 5800

From this display, pressing Enter for the Return selection refreshes the information and returns you to the main Environment menu heading.

Note:

Environmental sensor failure alarms are not displayed in the graphic display window on the front panel.

Fan Stopped Alarm

The fan stopped alarm presents the following in the graphic display window:

X HW ALARM(S) (where X indicates the number of HW alarms that exist on the system)

Temperature High Alarm

The temperature high alarm presents the following in the graphic display window:

X HW ALARM(S) (where X indicates the number of HW alarms that exist on the system)