Hardware and Environmental Faults
This section describes hardware and environmental faults, including information about fan speed, voltage, temperature, and power supply for the system.
Note:
If you suspect you have a hardware fault, contact Oracle Support for assistance with running the diagnostics image loaded on the Oracle Communications Session Border Controller.
Hardware Temperature Alarm
The following table describes the hardware temperature alarms.
Alarm Name | Alarm ID | Alarm Severity | Cause(s) | Example Log Message | Actions | Trap |
---|---|---|---|---|---|---|
TEMPERATURE HIGH | 65538 | CRITICAL: -100 MAJOR: -50 MINOR: -25 | Fans are obstructed or stopped. The room is abnormally hot. | Temperature: XX.XXC (where XX.XX is the temperature in degrees) | 1. Check whether fans or air vents are obstructed. 2. Check for dust accumulation on vents or fans. 3. Check the Heating, Ventilation, and Air Conditioning (HVAC) of the room. 4. Clean the filter. | apSysMgmtTempTrap |
SD5_TEMPERATURE_HIGH_PHY0 | NONE | CRITICAL: >100°C MAJOR: >95°C MINOR: >90°C | Fans are obstructed or stopped. The room is abnormally hot. | Temperature: XX.XXC (where XX.XX is the temperature in degrees) | Temperature X is at Y degrees C over minor/major/critical threshold of Z (where X is the sensor name, Y is the temperature, and Z is the threshold). 1. Check whether fans or air vents are obstructed. 2. Check for dust accumulation on vents or fans. 3. Check the Heating, Ventilation, and Air Conditioning (HVAC) of the room. 4. Clean the filter. | ap-env-monitor |
Note:
If this alarm occurs, the system turns the fan speed up to the fastest possible speed.
Fan Speed Alarm
The following table describes the fan speed alarm.
Alarm Name | Alarm ID | Alarm Severity | Cause(s) | Example Log Message | Actions | Trap Generated |
---|---|---|---|---|---|---|
FAN STOPPED | 65537 | CRITICAL (-100): the speed of any fan is <50%, or the speed of two or more fans is >50% and <75%. MAJOR (-50): the speed of two or more fans is >75% and <90%, or the speed of one fan is >50% and <75% and the other two fans are at normal speed. MINOR (-25): the speed of one fan is >75% and <90% and the other two fans are at normal speed. | Fan speed failure. | Fan speed: XXXX XXXX XXXX (where XXXX XXXX XXXX is the revolutions per minute (RPM) of each fan on the fan module) | 1. Check whether fans or air vents are obstructed. 2. Check for dust accumulation on vents or fans. 3. Clean the filter. | apSysMgmtFanTrap |
Note:
If this alarm occurs, the system turns the fan speed up to the fastest possible speed.
If the fan speed drops directly from a functional RPM to 0 RPM, the SBC generates a MINOR alarm and decrements the health score by 25.
If the fan speed drops gradually to 0 RPM through intermediate RPMs, that is, from a functional RPM to some lower RPM and then to 0 RPM, the SBC generates a CRITICAL alarm and decrements the health score by 100.
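The severity rules in the fan speed table can be expressed as a small decision function. The following Python sketch is illustrative only and is not the SBC's implementation. The function name fan_alarm_severity is hypothetical, speeds are assumed to be percentages of full speed for a three-fan module, and "normal speed" is assumed to mean 90% or higher. The handling of exact boundary values (50%, 75%, 90%) and of fan combinations that the table does not list is also an assumption. The sketch does not model the drop-to-0-RPM behavior described in the note above.

```python
from typing import Optional, Sequence


def fan_alarm_severity(speeds_pct: Sequence[float]) -> Optional[str]:
    """Classify fan speeds (percent of full speed) per the fan speed table.

    Assumptions (not stated in the source): boundary values fall into the
    higher band, and unlisted combinations resolve to the more severe level.
    """
    below_50 = sum(1 for s in speeds_pct if s < 50)
    from_50_to_75 = sum(1 for s in speeds_pct if 50 <= s < 75)
    from_75_to_90 = sum(1 for s in speeds_pct if 75 <= s < 90)

    # CRITICAL: any fan below 50%, or two or more fans between 50% and 75%.
    if below_50 >= 1 or from_50_to_75 >= 2:
        return "CRITICAL"
    # MAJOR: two or more fans between 75% and 90%, or one fan between 50% and 75%.
    if from_75_to_90 >= 2 or from_50_to_75 == 1:
        return "MAJOR"
    # MINOR: exactly one fan between 75% and 90%, the others normal.
    if from_75_to_90 == 1:
        return "MINOR"
    return None  # all fans at normal speed, no alarm


if __name__ == "__main__":
    for sample in ([95, 95, 95], [80, 95, 95], [80, 80, 95], [40, 95, 95]):
        print(sample, fan_alarm_severity(sample))
```

Counting fans per band first keeps each severity rule a one-line comparison that can be checked against the table row by row.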
Environmental Sensor Alarm
The following table describes the environmental sensor alarm.
Alarm Name | Alarm ID | Alarm Severity | Cause(s) | Example Log Message | Actions | Trap Generated |
---|---|---|---|---|---|---|
ENVIRONMENTAL SENSOR FAILURE | 65539 | CRITICAL (-10) | The environmental sensor component cannot detect fan speed and temperature. | Hardware monitor failure! Unable to monitor fan speed and temperature! | 1. Power cycle the standby Oracle Communications Session Border Controller peer using the power supply on/off switches located on the rear panel of the chassis. 2. Force a manual switchover by executing the ACLI notify berpd force command. 3. Power cycle the active Oracle Communications Session Border Controller peer. | apEnvMonI2CFailNotification |
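The severity columns in the tables above list how much each alarm decrements the health score. As a rough illustration of how those values combine, the following Python sketch totals the decrements for a set of active alarms. It rests on assumptions that this section does not confirm: that the score starts at 100, that decrements from simultaneous alarms add together, and that the score does not go below 0. The names HEALTH_DECREMENT and health_score are hypothetical.

```python
# Decrement values copied from the alarm tables in this section.
HEALTH_DECREMENT = {
    ("TEMPERATURE HIGH", "CRITICAL"): 100,
    ("TEMPERATURE HIGH", "MAJOR"): 50,
    ("TEMPERATURE HIGH", "MINOR"): 25,
    ("FAN STOPPED", "CRITICAL"): 100,
    ("FAN STOPPED", "MAJOR"): 50,
    ("FAN STOPPED", "MINOR"): 25,
    ("ENVIRONMENTAL SENSOR FAILURE", "CRITICAL"): 10,
}


def health_score(active_alarms, start=100):
    """Return the health score after applying each active alarm's decrement.

    Assumption: decrements are additive and the result is floored at 0.
    Alarms not present in HEALTH_DECREMENT contribute nothing.
    """
    score = start
    for alarm in active_alarms:
        score -= HEALTH_DECREMENT.get(alarm, 0)
    return max(score, 0)


if __name__ == "__main__":
    alarms = [
        ("TEMPERATURE HIGH", "MINOR"),
        ("ENVIRONMENTAL SENSOR FAILURE", "CRITICAL"),
    ]
    print(health_score(alarms))
```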
Media Link Alarms
Media link alarms include the following:
- Major
If the Oracle Communications Session Border Controller’s media link goes from being up to being down, it is considered a major alarm. This alarm applies to both slots 1 and 2 on the Oracle Communications Session Border Controller. A message similar to the following appears on the front panel of the Oracle Communications Session Border Controller’s chassis:
MAJOR ALARM Gig Port 1 DOWN
- Minor
If the Oracle Communications Session Border Controller’s media link goes from being down to being up, it is considered a minor alarm. This alarm applies to both slots 1 and 2 on the Oracle Communications Session Border Controller.
Power Supply Alarms
The following table describes the power supply alarms.
Alarm | Alarm ID | Alarm Severity | Cause(s) | Log Message | Actions | Trap Generated |
---|---|---|---|---|---|---|
PLD POWER A FAILURE | 65540 | MINOR (-10) | Power supply A has failed. | Back Power Supply A has failed! | 1. Check if the power supply is powered down. 2. Check whether the physical layer is functioning properly. This includes the server hardware, any attached peripherals, and the cabling. | apSysMgmtPowerTrap |
PLD POWER A UP | 65541 | MINOR | Power supply A is now present and functioning. | Back Power Supply A is present! | If there was no power fluctuation at the data center, run Hardware Diagnostics to rule out any issues. | apEnvMonPowerSupplyStatusEntr |
PLD POWER B FAILURE | 65542 | MINOR (-10) | Power supply B has failed. | Back Power Supply B has failed! | 1. Check if the power supply is powered down. 2. Check whether the physical layer is functioning properly. This includes the server hardware, any attached peripherals, and the cabling. | apSysMgmtPowerTrap |
PLD POWER B UP | 65543 | MINOR | Power supply B is now present and functioning. | Back Power Supply B is present! | If there was no power fluctuation at the data center, run Hardware Diagnostics to rule out any issues. | apEnvMonVoltageStatusEntry |
Note:
If the system boots up with one power supply, the health score will be 100, and no alarm will be generated. If another power supply is then added to that same system, this same alarm will be generated, but the health score will not be decremented.
Physical Interface Card Alarms
The following table describes the physical interface card alarms.
Alarm | Alarm ID | Alarm Severity | Cause(s) | Log Message | Actions | Trap Name |
---|---|---|---|---|---|---|
PHY0 Removed | 65550 | MAJOR | Physical interface card 0 was removed. | PHY card 0 has been removed. | Check for loose connection and reseat the card. Run HW Diagnostics if this was not performed by administrator. | apEnvPhyCardStatusEntry |
PHY0 Inserted | 65552 | MAJOR | Physical interface card 0 was inserted. | None | N/A | apEnvPhyCardStatusEntry |
PHY1 Removed | 65553 | MAJOR | Physical interface card 1 was removed. | PHY card 1 has been removed. | Check for loose connection and reseat the card. Run HW Diagnostics if this was not performed by administrator. | apEnvPhyCardStatusEntry |
PHY1 Inserted | 65554 | MAJOR | Physical interface card 1 was inserted. | None | N/A | apEnvPhyCardStatusEntry |
Transcoding Alarms
Alarm Name | Alarm ID | Alarm Severity | Cause(s) | Example Log Message | Action to diagnose the fault | Trap Name |
---|---|---|---|---|---|---|
No DSPs Present with Transcoding Feature Card (DSP_NONE_PRESENT) | NONE | Minor/0 | A transcoding feature card is installed but no DSP modules are discovered. | NONE | Check transcoding modules are plugged in properly. Check for loose connection and reseat the TCU card. Run HW Diagnostics if this was not performed by administrator. | apSysMgmtHardwareErrorTrap |
DSP Boot Failure (DSP_BOOT_FAILURE) | NONE | Critical/0 | A DSP device fails to boot properly at system initialization. This alarm is not health affecting for a single DSP boot failure. DSPs that fail to boot will remain uninitialized and will be avoided for transcoding. | NONE | Run HW Diagnostics and contact Oracle Support. | apSysMgmtHardwareErrorTrap |
DSP Communications Timeout (DSP_COMMS_TIMEOUT) | NONE | Critical/100 | A DSP fails to respond after 2 seconds with 3 retry messages. This alarm is critical and is health affecting. | NONE | Run HW Diagnostics and contact Oracle Support. | apSysMgmtHardwareErrorTrap |
DSP Alerts (DSP_CORE_HALT) | NONE | Critical/100 | A problem with the health of the DSP such as a halted DSP core. The software will attempt to reset the DSP and gather diagnostic information about the crash. This information will be saved in the /code directory to be retrieved by the user. | NONE | Run HW Diagnostics and contact Oracle Support. | apSysMgmtHardwareErrorTrap |
DSP Temperature (DSP_TEMPERATURE_HIGH) | NONE | Clear 85°C Warning 86°C / 5 Minor 90°C / 25 Major 95°C / 50 Critical 100°C / 100 | A DSP device exceeds the temperature threshold. If the temperature exceeds 90°C, a minor alarm will be set. If it exceeds 95°C, a major alarm will be set. If it exceeds 100°C, a critical alarm will be set. The alarm is cleared if the temperature falls below 85°C. The alarm is health affecting. | Check for a defective DSP, and check HVAC and environmental conditions. | apSysMgmtHardwareErrorTrap |
Transcoding Capacity Threshold Alarm (XCODE_UTIL_OVER_THRESHOLD) / 131329 | NONE | Clear 80% Warning 95% | A warning alarm will be raised when the transcoding capacity exceeds a high threshold of 95%. The alarm will be cleared after the capacity falls below a low threshold of 80%. This alarm warns the user that transcoding resources are nearly depleted. This alarm is not health affecting. | NONE | Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales. | apSysMgmtGroupTrap |
Licensed AMR Transcoding Capacity Threshold Alarm / 131330 | NONE | Clear 80% Warning 95% | A warning alarm is triggered if the AMR transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting. | NONE | Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales. | apSysMgmtGroupTrap |
Licensed AMR-WB Transcoding Capacity Threshold Alarm / 131331 | NONE | Clear 80% Warning 95% | A warning alarm is triggered if the AMR-WB transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting. | NONE | Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales. | apSysMgmtGroupTrap |
Licensed EVRC Transcoding Capacity Threshold Alarm / 131332 | NONE | Clear 80% Warning 95% | A warning alarm is triggered if the EVRC transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting. | NONE | Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales. | apSysMgmtGroupTrap |
Licensed EVRCB Transcoding Capacity Threshold Alarm / 131333 | NONE | Clear 80% Warning 95% | A warning alarm is triggered if the EVRCB transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting. | NONE | Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales. | apSysMgmtGroupTrap |
Licensed Opus Transcoding Capacity Threshold Alarm / 131159 | NONE | Clear 80% Warning 95% | A warning alarm is triggered if the Opus transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting. | NONE | Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales. | apSysMgmtGroupTrap |
Licensed SILK Transcoding Capacity Threshold Alarm / 131159 | NONE | Clear 80% Warning 95% | A warning alarm is triggered if the SILK transcoding capacity exceeds a high threshold of 95% of licensed sessions in use. The alarm clears after the capacity falls below a low threshold of 80%. This alarm is not health affecting. | NONE | Evaluate capacity planning and check your consumption to see if more capacity is needed. If so, contact Oracle Sales. | apSysMgmtGroupTrap |
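The capacity threshold alarms in the table above use two thresholds: the alarm is raised above 95% and cleared only after utilization falls below 80%. The gap between the two values keeps the alarm from toggling when utilization hovers near a single limit. The following Python sketch illustrates that raise and clear behavior under stated assumptions. The class name CapacityAlarm and its update method are hypothetical and do not represent the SBC's internal logic.

```python
class CapacityAlarm:
    """Two-threshold (hysteresis) alarm for transcoding capacity utilization."""

    RAISE_ABOVE = 95.0  # high threshold from the table, in percent
    CLEAR_BELOW = 80.0  # low threshold from the table, in percent

    def __init__(self) -> None:
        self.active = False

    def update(self, utilization_pct: float) -> bool:
        """Apply one utilization sample and return whether the alarm is active."""
        if not self.active and utilization_pct > self.RAISE_ABOVE:
            self.active = True
        elif self.active and utilization_pct < self.CLEAR_BELOW:
            self.active = False
        # Between the two thresholds the previous state is kept.
        return self.active


if __name__ == "__main__":
    alarm = CapacityAlarm()
    for sample in (70, 96, 90, 80, 79, 90):
        print(sample, alarm.update(sample))
```

In this sketch a sample of 90% leaves the alarm active if it was already raised, and leaves it inactive otherwise.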
Viewing PROM Information
Use the show prom-info command to display PROM statistics for Oracle Communications Session Border Controller components.
For example:
ORACLE# show prom-info mainboard
Contents of Main Board IDPROM
Assy, NetNet4600
Part Number: 002-0610-50
Serial Number: 091132009670
FunctionalRev: 5.06
BoardRev: 05.00
PCB Family Type: Main Board
ID: NetNet 4600 Main Board
Options: 0
Manufacturer: Benchmark Electronics
Week/Year: 32/2017
Sequence Number: 009670
Number of MAC Addresses: 16
Starting MAC Address: 00 08 25 a2 56 20
The following example shows the host CPU PROM contents.
ORACLE# show prom-info cpu
Contents of CPU IDPROM
Part Number: MOD-0026-62
Manufacturer: RadiSys
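If you capture show prom-info output in a text file, for example to record serial numbers before opening a support request, the label and value pairs can be pulled out with a few lines of Python. This is a minimal sketch that assumes the output follows the "Label: value" layout shown in the examples above. The function name parse_prom_info is hypothetical and is not part of any Oracle tool.

```python
def parse_prom_info(output: str) -> dict:
    """Collect 'Label: value' lines from captured show prom-info output.

    Lines without a colon, or without a value after it (such as the
    command prompt and the 'Contents of ...' banner), are skipped.
    """
    fields = {}
    for line in output.splitlines():
        label, sep, value = line.partition(":")
        if sep and value.strip():
            fields[label.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    sample = """ORACLE# show prom-info cpu
Contents of CPU IDPROM
Part Number: MOD-0026-62
Manufacturer: RadiSys
"""
    print(parse_prom_info(sample))
```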
Graphic Window Display
The Environment display lets you scroll through information about the operational status of the hardware displayed in the Oracle Communications Session Border Controller chassis’s graphic window. For example, you can view hardware- and link-related alarm information, highest monitored temperature reading, and fan speed.
The graphic display window presents the following Environment information in the order listed:
- alarm state
- temperature
- fan speed
- alarm state: HW ALARM: X (where X is the number of hardware alarms, excluding ENVIRONMENTAL SENSOR FAILURE) and LINK ALARM: X (where X is the number of link down alarms)
- temperature: format is XX.XX C, where XX.XX is the temperature in degrees
- fan speed: XXXX, where XXXX is the RPM of the failing fan on the fan module
For example:
HW ALARM: 1
LINK ALARM: 2
TEMPERATURE: 38.00 C
FAN SPEED: 5800
From this display, pressing Enter for the Return selection refreshes the information and returns you to the main Environment menu heading.
Note:
Environmental sensor failure alarms are not displayed in the graphic display window on the front panel.