What You May Need to Know About Fault Management Behavior When the Number of Instance Retries Is Exceeded

When you configure a fault policy to recover instances with the ora-retry action and the number of specified instance retries is exceeded, the instance is marked as open.faulted (in-flight state). The instance remains active.

Marking instances as open.faulted ensures that no instances are lost. You can then configure another fault handling action following the ora-retry action in the fault policy file, such as the following:

  • Configure an ora-human-intervention action to manually perform instance recovery from Oracle Enterprise Manager Fusion Middleware Control.

  • Configure an ora-terminate action to close the instance (mark it as closed.faulted) and never retry again.
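
As a minimal sketch of the first option, the retry action can name a follow-up action through the retryFailureAction element. The retry values are illustrative, and the fragment is assumed to sit inside the Actions section of a fault policy whose other contents are omitted here.

<Action id="ora-retry">
   <retry>
      <retryCount>2</retryCount>
      <retryInterval>2</retryInterval>
      <exponentialBackoff/>
      <!-- Action to run once the retry count is exceeded -->
      <retryFailureAction ref="ora-human-intervention"/>
   </retry>
</Action>
<!-- Leaves the instance for manual recovery -->
<Action id="ora-human-intervention">
   <humanIntervention/>
</Action>
<!-- Alternative target: closes the instance -->
<Action id="ora-terminate">
   <abort/>
</Action>

To close the instance instead of waiting for manual recovery, point the ref attribute of retryFailureAction at ora-terminate.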

However, if you do not set an action to be performed after the ora-retry action in the fault policy file and the number of instance retries is exceeded, the instance remains marked as open.faulted. Recovery then picks up the instance and attempts to process it again.

For example, assume that the fault policy file defines no action to follow ora-retry, as in the following code:

<Action id="ora-retry">
   <retry>
      <retryCount>2</retryCount>
      <retryInterval>2</retryInterval>
      <exponentialBackoff/>
   </retry>
</Action>

The following actions are performed:

  • The invoke activity is attempted. If it faults, the fault policy shown above handles the fault.

  • Two retries are attempted at increasing intervals (after two seconds, then after four seconds).

  • If all retry attempts fail, the following actions are performed:

    • A detailed fault error message is logged in the audit trail.

    • The instance is marked as open.faulted (in-flight state).

    • Recovery picks up the instance and re-attempts the invoke activity.

  • Recovery may also fail. In that case, the invoke activity is re-executed and additional audit messages are logged.