7.3 Resolving Database and Database Instance Delays
Blocker Resolver preserves the database performance by resolving delays and keeping the resources available.
- Blocker Resolver Architecture
Blocker Resolver autonomously runs as aDIA0
task within the database. - Optional Configuration for Blocker Resolver
You can adjust the sensitivity, and control the size and number of the log files used by Blocker Resolver. - Blocker Resolver Diagnostics and Logging
Blocker Resolver autonomously resolves delays and continuously logs the resolutions in the database alert logs and the diagnostics in the trace files. - Using the Cluster Resource Activity Log to Monitor Cluster Resource Failures
The cluster resource activity log provides precise and specific information about a resource failure, separate from diagnostic logs.
Parent topic: Resolve Database Issues
7.3.1 Blocker Resolver Architecture
Blocker Resolver autonomously runs as a DIA0
task within
the database.
Blocker Resolver works in the following three phases:
-
Detect: In this phase, Blocker Resolver collects the data on all the nodes and detects the sessions that are waiting for the resources held by another session.
-
Analyze: In this phase, Blocker Resolver analyzes the sessions detected in the Detect phase to determine if the sessions are part of a potential delay. If the sessions are suspected as delayed, Blocker Resolver then waits for a certain threshold time period to ensure that the sessions are delayed.
-
Verify: In this phase, after the threshold time period is up, Blocker Resolver verifies that the sessions are delayed and selects a session that's causing the delay.
After selecting the session that's causing the delay, Blocker Resolver applies resolution methods on that session. If the chain of sessions or the delay resolves automatically, then Blocker Resolver does not apply delay resolution methods. However, if the delay does not resolve by itself, then Blocker Resolver resolves the delay by terminating the session that's causing the delay. If terminating the session fails, then Blocker Resolver terminates the process of the session. This entire process is autonomous and does not block resources for a long period and does not affect the performance.
For example, if a high rank session is included in the chain of delayed sessions, then Blocker Resolver expedites the termination of the session that's causing the delay. Termination of the session that's causing the delay prevents the high rank session from waiting too long and helps to maintain performance objective of the high rank session.
Parent topic: Resolving Database and Database Instance Delays
7.3.2 Optional Configuration for Blocker Resolver
You can adjust the sensitivity, and control the size and number of the log files used by Blocker Resolver.
Note:
TheDBMS_HANG_MANAGER
package is deprecated in Oracle Database 23ai. Use DBMS_BLOCKER_RESOLVER
instead. The DBMS_HANG_MANAGER
package provides a method of changing some configuration parameters and constraints to address session issues. This package is being replaced with DBMS_BLOCKER_RESOLVER
. DBMS_HANG_MANAGER
can be removed in a future release.
Sensitivity
If Blocker Resolver detects a delay, then Blocker Resolver waits for a certain
threshold time period to ensure that the sessions are delayed. Change threshold time
period by using DBMS_BLOCKER_RESOLVER
to set the
sensitivity
parameter to either Normal
or
High
. If the sensitivity
parameter is set to
Normal
, then Blocker Resolver waits for the default time
period. However, if the sensitivity is set to High
, then the time
period is reduced by 50%.
By default, the sensitivity
parameter is set to
Normal
. To set Blocker Resolver sensitivity, run the following
commands in SQL*Plus as SYS
user:
-
To set the
sensitivity
parameter toNormal
:exec dbms_blocker_resolver.set(dbms_blocker_resolver.sensitivity, dbms_blocker_resolver.sensitivity_normal);
-
To set the
sensitivity
parameter toHigh
:exec dbms_blocker_resolver.set(dbms_blocker_resolver.sensitivity, dbms_blocker_resolver.sensitivity_high);
Size of the Trace Log File
_base_
in the file name. Change the size of the trace files in
bytes with the base_file_size_limit
parameter. Run the following
command in SQL*Plus, for example, to set the trace file size limit to 100
MB:exec dbms_blocker_resolver.set(dbms_blocker_resolver.base_file_size_limit, 104857600);
Number of Trace Log Files
base_file_set_count
parameter. Run the following command in
SQL*Plus, for example, to set the number of trace files in trace file set to
6:exec dbms_blocker_resolver.set(dbms_blocker_resolver.base_file_set_count,6);
By default, base_file_set_count
parameter is set to 5.
Parent topic: Resolving Database and Database Instance Delays
7.3.3 Blocker Resolver Diagnostics and Logging
Blocker Resolver autonomously resolves delays and continuously logs the resolutions in the database alert logs and the diagnostics in the trace files.
Blocker Resolver logs the resolutions in the database alert logs as Automatic
Diagnostic Repository (ADR) incidents with incident code ORA–32701
.
You also get detailed diagnostics about the delay detection in the trace files. Trace
files and alert logs have file names starting with database
instance_dia0_
.
- The trace files are stored in the
$ ADR_BASE/diag/rdbms/database name/database instance/incident/incdir_xxxxxx
directory - The alert logs are stored in the
$ ADR_BASE/diag/rdbms/database name/database instance/trace
directory
Example 7-1 Blocker Resolver Trace File for a Local Instance
This example shows an example of the output you see for Blocker Resolver for the local database instance
Trace Log File .../oracle/log/diag/rdbms/hm1/hm11/incident/incdir_111/hm11_dia0_11111_i111.trc
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
...
*** 2016-07-16T12:39:02.715475-07:00
HM: Hang Statistics - only statistics with non-zero values are listed
current number of active sessions 3
current number of hung sessions 1
instance health (in terms of hung sessions) 66.67%
number of cluster-wide active sessions 9
number of cluster-wide hung sessions 5
cluster health (in terms of hung sessions) 44.45%
*** 2016-07-16T12:39:02.715681-07:00
Resolvable Hangs in the System
Root Chain Total Hang
Hang Hang Inst Root #hung #hung Hang Hang Resolution
ID Type Status Num Sess Sess Sess Conf Span Action
----- ---- -------- ---- ----- ----- ----- ------ ------ -------------------
1 HANG RSLNPEND 3 44 3 5 HIGH GLOBAL Terminate Process
Hang Resolution Reason: Although hangs of this root type are typically
self-resolving, the previously ignored hang was automatically resolved.
Example 7-2 Error Message in the Alert Log Indicating a Delayed Session
This example shows an example of a Blocker Resolver alert log on the primary instance
2016-07-16T12:39:02.616573-07:00
Errors in file .../oracle/log/diag/rdbms/hm1/hm1/trace/hm1_dia0_i1111.trc (incident=1111):
ORA-32701: Possible hangs up to hang ID=1 detected
Incident details in: .../oracle/log/diag/rdbms/hm1/hm1/incident/incdir_1111/hm1_dia0_11111_i1111.trc
2016-07-16T12:39:02.674061-07:00
DIA0 requesting termination of session sid:44 with serial # 23456 (ospid:34569) on instance 3
due to a GLOBAL, HIGH confidence hang with ID=1.
Hang Resolution Reason: Although hangs of this root type are typically
self-resolving, the previously ignored hang was automatically resolved.
DIA0: Examine the alert log on instance 3 for session termination status of hang with ID=1.
Example 7-3 Error Message in the Alert Log Showing a Session Delay Resolved by Blocker Resolver
This example shows an example of a Blocker Resolver alert log on the local instance for resolved delays
2016-07-16T12:39:02.707822-07:00
Errors in file .../oracle/log/diag/rdbms/hm1/hm11/trace/hm11_dia0_11111.trc (incident=169):
ORA-32701: Possible hangs up to hang ID=1 detected
Incident details in: .../oracle/log/diag/rdbms/hm1/hm11/incident/incdir_169/hm11_dia0_30676_i169.trc
2016-07-16T12:39:05.086593-07:00
DIA0 terminating blocker (ospid: 30872 sid: 44 ser#: 23456) of hang with ID = 1
requested by master DIA0 process on instance 1
Hang Resolution Reason: Although hangs of this root type are typically
self-resolving, the previously ignored hang was automatically resolved.
by terminating session sid:44 with serial # 23456 (ospid:34569)
...
DIA0 successfully terminated session sid:44 with serial # 23456 (ospid:34569) with status 0.
Parent topic: Resolving Database and Database Instance Delays
7.3.4 Using the Cluster Resource Activity Log to Monitor Cluster Resource Failures
The cluster resource activity log provides precise and specific information about a resource failure, separate from diagnostic logs.
If an Oracle Clusterware-managed resource fails, then Oracle Clusterware logs messages about the failure in the cluster resource activity log. Failures can occur as a result of a problem with a resource, a hosting node, or the network. The cluster resource activity log provides a unified view of the cause of resource failure.
Writes to the cluster resource activity log are tagged with an activity ID and any related data gets the same parent activity ID, and is nested under the parent data. For example, if Oracle Clusterware is running and you run the crsctl stop clusterware -all
command, then all activities get activity IDs, and related activities are tagged with the same parent activity ID. On each node, the command creates sub-IDs under the parent IDs, and tags each of the respective activities with their corresponding activity ID. Further, each resource on the individual nodes creates sub-IDs based on the parent ID, creating a hierarchy of activity IDs. The hierarchy of activity IDs enables you to analyze the data to find specific activities.
For example, you may have many resources with complicated dependencies among each other, and with a database service. On Friday, you see that all of the resources are running on one node but when you return on Monday, every resource is on a different node, and you want to know why. Using the crsctl query calog
command, you can query the cluster resource activity log for all activities involving those resources and the database service. The output provides a complete flow and you can query each sub-ID within the parent service failover ID, and see, specifically, what happened and why.
You can query any number of fields in the cluster resource activity log using filters. For example, you can query all the activities written by specific operating system users such as root
. The output produced by the crsctl query calog
command can be displayed in either a tabular format or in XML format.
The cluster resource activity log is an adjunct to current Oracle Clusterware logging and alert log messages.
Note:
Oracle Clusterware does not write messages that contain security-related information, such as log-in credentials, to the cluster activity log.
Parent topic: Resolving Database and Database Instance Delays