Changes in this Release

This preface lists changes in the Oracle Autonomous Health Framework Checks and Diagnostics User's Guide 25.3.

Oracle Automatic Storage Management (ASM) Disk Group Status Now Displayed Per Node

The status of each ASM disk group is now displayed for each node.

Previously, AHF Insights reported the status of an ASM disk group without distinguishing between nodes. This could lead to confusion if a disk group was online on one node but offline on another, as the report displayed only a single-node status.

With this enhancement, AHF Insights now provides the ASM disk group status for each node individually, offering greater clarity and accuracy.
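
To illustrate why a single status can mislead, here is a minimal sketch that summarizes one disk group's state across nodes. The function name, the node names, and the ONLINE/OFFLINE status strings are assumptions made for illustration; they do not reflect how AHF Insights stores or labels this data.

```python
def summarize_disk_group(status_by_node: dict[str, str]) -> str:
    """Describe one disk group's state across nodes (hypothetical data layout)."""
    states = set(status_by_node.values())
    if len(states) == 1:
        # Every node agrees, so a single status is unambiguous.
        return f"{states.pop()} on all nodes"
    # Nodes disagree: report each node separately instead of picking one.
    return ", ".join(
        f"{node}: {status}" for node, status in sorted(status_by_node.items())
    )
```

With `{"node1": "ONLINE", "node2": "OFFLINE"}` the sketch returns a per-node listing rather than one value, which is the ambiguity the per-node display removes.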

The Group Details in the ASM Details section now includes a Disk Group Status button.

Figure -1 Insights ASM disk group status



Clicking this button shows the disk group status for each node.

Figure -2 Insights ASM disk group status details



How to Access the Per-Node ASM Disk Group Status
  1. Run the following command to collect diagnostic data: tfactl diagcollect.
  2. Extract the diagnostic collection and open the Insights report.
  3. Navigate to Cluster → ASM Details and review the Group Details section for per-node ASM disk group status.

Insights Now Identifies Events That Triggered an Auto Collection

AHF Insights now displays the specific events that triggered an auto collection in the Event Timeline, providing better context for diagnosing database issues.

When troubleshooting database problems, users rely on the Event Timeline in AHF Insights to understand the sequence of events. If an Insights report is generated within a diagnostic collection, AHF captures additional context about the issue the collection was created for.

With this enhancement, AHF Insights highlights the Triggering Event within the Timeline section:
  • The Triggering Event is visually emphasized in the Timeline view.
  • At the bottom of the timeline, the Triggering Event is also highlighted, with a button to jump directly to it.

How to View the Triggering Event Details
  • Extract the insights.zip file from an AHF diagnostic collection.
  • Open index.html and navigate to the Timeline section.
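
The two steps above can also be scripted. The following is a minimal sketch using only the Python standard library; the function name and the destination directory are hypothetical, and the search for index.html is an assumption because the internal layout of insights.zip is not described here.

```python
import zipfile
from pathlib import Path


def extract_insights_report(archive: Path, dest: Path) -> Path:
    """Extract an Insights archive and return the path to its index.html."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    # index.html may sit at the top level or inside a subdirectory, so
    # search for it and prefer the copy closest to the extraction root.
    matches = sorted(dest.rglob("index.html"), key=lambda p: len(p.parts))
    if not matches:
        raise FileNotFoundError(f"no index.html found in {archive}")
    return matches[0]
```

Calling `extract_insights_report(Path("insights.zip"), Path("insights_report"))` returns the report path, which can then be opened in a browser, for example with `webbrowser.open(path.resolve().as_uri())`.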

Figure -3 Insights triggering event



Deprecated AHF CLI Commands and Options Now Emit Deprecation Warnings

Older AHF CLI commands now display a deprecation message and guide users to the corresponding AHF commands that replace their functionality.

Previously, when new commands were introduced in the AHF CLI, existing commands in older tools remained available without any deprecation notice. As older functionality is now being integrated into AHF, deprecated commands and options will emit a warning message.

The deprecation message:
  • Notifies users that the command still works but will be removed in the future.
  • Directs users to the new AHF command they should use instead.
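
The behavior described above, warn but still run, can be sketched as follows. This is an illustration only: the command names in the mapping are placeholders rather than real AHF commands, and the message wording is an assumption, not the text AHF prints.

```python
import sys
from typing import Optional

# Hypothetical mapping from a deprecated command to its replacement.
# Both names are placeholders, not real AHF commands.
REPLACEMENTS = {
    "oldtool status": "ahf example status",
}


def deprecation_notice(command: str) -> Optional[str]:
    """Return a warning for a deprecated command, or None if it is current."""
    replacement = REPLACEMENTS.get(command)
    if replacement is None:
        return None
    return (
        f"WARNING: '{command}' is deprecated and will be removed in a future "
        f"release. Use '{replacement}' instead."
    )


def run(command: str) -> str:
    """Emit any deprecation notice on stderr, then run the command anyway."""
    notice = deprecation_notice(command)
    if notice:
        print(notice, file=sys.stderr)
    return f"ran {command}"
```

Writing the notice to standard error keeps it out of the command's normal output, so scripts that parse that output keep working during the deprecation period.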

This feature is enabled by default for all deprecated commands; no action is required to activate it.

Minimum Collection Period Requirement for Diagnostics and Insights

Starting in AHF 25.3, a minimum collection or analysis period of 15 minutes is required for collecting diagnostics or insights.

If the specified collection time is less than 15 minutes, TFACTL returns an error message prompting the user to extend the collection period.
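
As a rough illustration of the rule, the sketch below rejects any period shorter than 15 minutes and accepts a period of exactly 15 minutes. The function name and the error text are assumptions; the actual TFACTL message is not reproduced here.

```python
from datetime import datetime, timedelta

MIN_PERIOD = timedelta(minutes=15)  # minimum stated for AHF 25.3


def check_collection_period(start: datetime, end: datetime) -> timedelta:
    """Return the period length, or raise ValueError if it is under the minimum."""
    period = end - start
    if period < MIN_PERIOD:
        raise ValueError(
            f"collection period of {period} is shorter than the required "
            f"{MIN_PERIOD}; extend the period to at least 15 minutes"
        )
    return period
```

A range from 10:00 to 10:14 raises the error, while 10:00 to 10:15 passes.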


Reliable Datagram Sockets (RDS) Signatures in AHF

Reliable Datagram Sockets (RDS) is an open-source protocol designed for high-performance, low-latency communication over InfiniBand. It is a connectionless protocol that minimizes CPU utilization, which makes it the preferred choice for InfiniBand communication.

ExaWatcher, which runs on Exadata compute and storage nodes, collects various system metrics at fixed intervals, including process details, top CPU consumers, and memory usage. It also gathers RDS-related metrics approximately every minute and stores them in hourly log files. These logs contain information on RDS IB Connections, RDS Connections, Counters, RDS-Pings, and other relevant details.

AHF analyzes these hourly log files using chm-ostool-parsers, which convert the data into human-readable JSON format. The parsed data is then processed by chm-analyzer and chm-reportgen, which evaluate RDS signatures against predefined thresholds. If a threshold is exceeded, the corresponding signature is recorded along with its relevant details and displayed in the report.

The following RDS signatures have been introduced in AHF 25.3:

Table -1 RDS Signatures

| Signature Name | Description | Threshold |
|---|---|---|
| RDSLatency | List of IPs and Lanes with high (>{LatencyThreshold} usec) RDS Ping latency | 20 usec |
| RDSCountersConnReset | RDS Counter with diff value >= {ErrorThreshold} since previous sample | 1 |
| RDSCountersCongUpdateQueued | Detected increase in RDS Counter – cong_update_queued | 1 |
| RDSCountersCongUpdateReceived | Detected increase in RDS Counter – cong_update_received | 1 |
| RDSCountersCongSendError | Detected increase in RDS Counter – cong_send_error | 1 |
| RDSCountersIBTxRingFull | Detected increase in RDS Counter – ib_tx_ring_full | 1 |
| RDSCountersIBTxStalled | Detected increase in RDS Counter – ib_tx_stalled | 1 |
| RDSCountersIBRxTotalFrags | Detected increase in RDS Counter – ib_rx_total_frags | 1 |
| RDSCountersIBRDMAMr8kPoolDepleted | Detected increase in RDS Counter – ib_rdma_mr_8k_pool_depleted | 1 |
| RDSCountersIBRDMAMr1mPoolDepleted | Detected increase in RDS Counter – ib_rdma_mr_1m_pool_depleted | 1 |
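
The threshold evaluation in the table can be pictured with a minimal sketch. It is not the chm-analyzer implementation: the data layout (one dictionary of counter values per sample, one dictionary of ping latencies per IP) and the function names are assumptions, and only three of the counters are included for brevity.

```python
# Thresholds taken from the table; everything else here is illustrative.
LATENCY_THRESHOLD_USEC = 20
COUNTER_THRESHOLD = 1

# A few of the counter-to-signature pairs from the table.
COUNTER_SIGNATURES = {
    "cong_send_error": "RDSCountersCongSendError",
    "ib_tx_ring_full": "RDSCountersIBTxRingFull",
    "ib_tx_stalled": "RDSCountersIBTxStalled",
}


def counter_signatures(previous: dict[str, int], current: dict[str, int]) -> list[str]:
    """Return signatures for counters that grew by at least the threshold."""
    fired = []
    for counter, signature in COUNTER_SIGNATURES.items():
        diff = current.get(counter, 0) - previous.get(counter, 0)
        if diff >= COUNTER_THRESHOLD:
            fired.append(signature)
    return fired


def high_latency_peers(ping_usec: dict[str, float]) -> list[str]:
    """Return the IPs whose RDS ping latency exceeds the latency threshold."""
    return sorted(
        ip for ip, usec in ping_usec.items() if usec > LATENCY_THRESHOLD_USEC
    )
```

Counters are compared as differences between consecutive samples because they are cumulative; a large absolute value alone does not indicate a new problem.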

Figure -4 AHF Report with RDS Signatures



Diff Analysis for Low Memory Signature in Orachk CHM Analysis Section

The diff analysis feature compares and analyzes operating system metrics at two different points in time.

In the Orachk CHM analysis section, the AvailableMemoryLow signature now provides a detailed diff analysis of the system when memory pressure was at its highest versus when the issue was not present.

The analysis is divided into the following sections:
  • Changes in system-level available memory and swap usage.
  • Changes in the number of processes and maximum RSS usage, categorized by database foreground/background processes, ASM, Clusterware, and other processes.
  • Changes in per-process RSS/VIRT memory consumption, covering existing processes, newly spawned processes, and processes that have exited.
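
The per-process part of such a comparison can be sketched as a set difference between two snapshots. This is an illustration under stated assumptions, not the Orachk implementation: each snapshot is modeled as a dictionary of RSS values in KB, keyed by a process identifier, and the function name is hypothetical.

```python
def diff_process_memory(
    baseline: dict[str, int], pressure: dict[str, int]
) -> dict[str, dict[str, int]]:
    """Compare per-process RSS (KB) between a healthy and a low-memory snapshot."""
    # Processes present in both snapshots: report the change in RSS.
    existing = {
        name: pressure[name] - baseline[name]
        for name in baseline.keys() & pressure.keys()
    }
    # Processes seen only under memory pressure: newly spawned.
    spawned = {name: pressure[name] for name in pressure.keys() - baseline.keys()}
    # Processes seen only in the baseline: they have since exited.
    exited = {name: baseline[name] for name in baseline.keys() - pressure.keys()}
    return {"existing": existing, "new": spawned, "exited": exited}
```

Sorting the `existing` entries by their delta would surface the processes whose memory grew the most between the two points in time.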

Figure -5 CHM diff analysis



New Oracle Orachk and Oracle Exachk Best Practice Checks

Release 25.3 includes the following new Oracle Orachk and Oracle Exachk best practice checks.

Oracle Orachk-Specific Best Practice Checks

  • Verify voting disk integrity

All checks can be explored in more detail via the Health Check Catalogs: