2 Recovery Appliance

This chapter provides information about the Recovery Appliance metrics.

For each metric, it provides the following information:

  • Description

  • Metric table

    The metric table can include some or all of the following: target version, default collection frequency, default warning threshold, default critical threshold, and alert text.

Data Sent/Received

These metrics collect information about the data that is backed up, copied to tape, and replicated for all protected databases.

Target Version: All versions

Collection Frequency: Every 5 Minutes

Data Source: RA_DATABASE view in the Recovery Appliance database

User Action: Not applicable

Metric | Description
Backup Data Rate (GB/s) | This metric provides the rate (GB/s) at which backup data is being ingested by the Recovery Appliance for all protected databases.
Cumulative Backup Data Delta (GB) | This metric provides the change in the cumulative amount of backup data received for all protected databases since the last collection of this metric.
Cumulative Backup Data Received (GB) | This metric provides the cumulative amount of backup data received by the Recovery Appliance for all protected databases.
Cumulative Copy-to-Tape Data Delta (GB) | This metric provides the change in the cumulative amount of data copied to tape for all protected databases since the last collection of this metric.
Cumulative Copy-to-Tape Data Sent (GB) | This metric provides the cumulative amount of data copied to tape for all protected databases.
Cumulative Replication Data Delta (GB) | This metric provides the change in the cumulative amount of data replicated for all protected databases since the last collection of this metric.
Cumulative Replication Data Sent (GB) | This metric provides the cumulative amount of data replicated for all protected databases.
Replication Data Rate (GB/s) | This metric provides the rate (GB/s) at which data is replicated for all protected databases.
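
The relationship between a cumulative metric and its delta can be sketched in Python. This is an illustration only: the sample values are invented, the names Sample, delta_gb, and average_rate_gb_per_sec are hypothetical, and deriving a rate by dividing the delta by the collection interval is an assumption, not the documented calculation used by the Recovery Appliance.

```python
from dataclasses import dataclass

# Collection frequency stated for this metric group: every 5 minutes.
COLLECTION_INTERVAL_SEC = 5 * 60


@dataclass
class Sample:
    """One collection of a cumulative metric (hypothetical structure)."""
    cumulative_gb: float


def delta_gb(previous: Sample, current: Sample) -> float:
    """Change in the cumulative amount since the last collection."""
    return current.cumulative_gb - previous.cumulative_gb


def average_rate_gb_per_sec(previous: Sample, current: Sample,
                            interval_sec: int = COLLECTION_INTERVAL_SEC) -> float:
    """ASSUMPTION: average rate over one interval = delta / interval."""
    return delta_gb(previous, current) / interval_sec


# Invented sample values, for illustration only.
prev = Sample(cumulative_gb=1200.0)
curr = Sample(cumulative_gb=1230.0)
print(delta_gb(prev, curr))                 # 30.0
print(average_rate_gb_per_sec(prev, curr))  # 0.1
```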

Health

These metrics collect incident data for the Recovery Appliance.

The Health metric for a Zero Data Loss Recovery Appliance (ZDLRA) target lists all the incidents generated by the Recovery Appliance while monitoring the internal operations and status of the databases protected by the appliance. This information is sourced from the RA_INCIDENT_LOG view in the Recovery Appliance.

For all Recovery Appliance incidents that have a severity of WARNING, ERROR, or INTERNAL, out-of-the-box alerts are raised in Oracle Enterprise Manager. This results in a relatively high volume of alerts. To be notified of only some of these alerts, you can set the following metric collection properties.

  • ignorePattern: You can use this property to specify a regular expression pattern that filters out incidents. The pattern is applied to the Error Text column of the Health metric, and any rows matching the pattern are automatically filtered out and not collected as part of the Health metric in Oracle Enterprise Manager. This can greatly reduce the number of alerts generated for the Recovery Appliance. For example, to filter out ORA-45175 errors, set the pattern to .*ORA-*(45175)\D.*. To ignore additional ORA errors, add them to the same pattern, for example: .*ORA-*(-45160|45175|45168)\D.*.

  • ignoreProtectionPolicies: You can use this property on a Health metric to ignore the incidents generated against databases associated with a specific protection policy. If backups for certain databases are stopped deliberately, for example, with the intention of retiring the database, incidents are generated for these databases in the Recovery Appliance. You can create a new protection policy called "DECOMMISSIONED", add the databases for which you want to ignore alerts to this protection policy, and specify "DECOMMISSIONED" as the value for the ignoreProtectionPolicies property.

To set these properties:
  1. Click Monitoring and then Metric and Collection Settings for the Recovery Appliance.
  2. Click the pencil icon in the Edit column of any Health metric row to access Edit Advanced Settings.
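
The combined effect of the two properties can be sketched in Python. This is a minimal illustration, not how Enterprise Manager implements the filtering: the incident rows, database names, the ORA-99999 placeholder code, and the is_collected helper are all hypothetical, and applying the pattern as a full match with Python's regular expression dialect is an assumption.

```python
import re

# Pattern copied from the ignorePattern description above. How Enterprise
# Manager applies it (full match vs. search, regex dialect) is an ASSUMPTION.
IGNORE_PATTERN = re.compile(r".*ORA-*(-45160|45175|45168)\D.*")

# Value of ignoreProtectionPolicies used in the example above.
IGNORE_PROTECTION_POLICIES = {"DECOMMISSIONED"}

# Hypothetical incident rows; names, policies, and texts are invented.
incidents = [
    {"db": "SALES", "policy": "GOLD", "error_text": "ORA-45175: sample text"},
    {"db": "HR", "policy": "GOLD", "error_text": "ORA-45160: sample text"},
    {"db": "OLDAPP", "policy": "DECOMMISSIONED", "error_text": "ORA-99999: sample text"},
    {"db": "CRM", "policy": "SILVER", "error_text": "ORA-99999: sample text"},
]


def is_collected(incident: dict) -> bool:
    """Return True if the row survives both filters."""
    if incident["policy"] in IGNORE_PROTECTION_POLICIES:
        return False
    return IGNORE_PATTERN.fullmatch(incident["error_text"]) is None


collected = [row["db"] for row in incidents if is_collected(row)]
print(collected)  # ['CRM']
```

The trailing \D in the pattern requires a non-digit character after the error number, so a longer number that merely starts with the same digits (for example, 451750) is not filtered.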

Metrics Details

Target Version: All versions

Collection Frequency: Every 5 Minutes

Data Source: RA_INCIDENT_LOG view in the Recovery Appliance database.

Metric | Description
Component | This metric provides the component of the Recovery Appliance detecting this incident.

Collection Frequency: Every 15 Minutes

Database Key | This metric provides the primary key of the protected database (if any) involved in this incident.
Database Unique Name | This metric provides the db_unique_name of the protected database (if any) involved in this incident.
Error Code | This metric provides the Oracle error code for the message describing the incident.
Error Text | This metric provides the text of the message describing the incident.
First Incident Time | This metric provides the timestamp when the Recovery Appliance first detected the incident.
Incident ID | This metric provides the unique ID for the incident.
Incident Status | This metric provides the status of this incident: ACTIVE, FIXED, or RESET.
Last Incident Time | This metric provides the timestamp when the Recovery Appliance most recently detected the incident.
Number of Incidents | This metric provides the number of times the Recovery Appliance detected the incident.
Parameter | This metric provides the parameter qualifying the scope of the error code.
Severity | This metric provides the relative severity of the incident in the context of the operation of the Recovery Appliance.

Default Warning Threshold: WARNING

Default Critical Threshold: ERROR, INTERNAL

Alert Text: %error_text%.

Storage Location Key | This metric provides the primary key of the storage location (if any) involved in this incident.
Storage Location Name | This metric provides the name of the storage location (if any) involved in this incident.
Task ID | This metric provides the ID of the task, if any, in which the incident was detected.
Task State | This metric provides the processing state of the task: EXECUTABLE, RUNNING, COMPLETED, TASK_WAIT, FAILED, and so on.
Task Type | This metric provides the type of processing performed by the task.
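
The default thresholds on the Severity metric map incident severities to alert levels. The following Python sketch restates that mapping; the alert_level function and the returned labels are hypothetical names chosen for illustration, and treating any other severity value as raising no alert is an assumption.

```python
from typing import Optional

# Default thresholds listed for the Severity metric.
WARNING_SEVERITIES = {"WARNING"}
CRITICAL_SEVERITIES = {"ERROR", "INTERNAL"}


def alert_level(severity: str) -> Optional[str]:
    """Map an incident severity to an alert level (hypothetical labels)."""
    if severity in CRITICAL_SEVERITIES:
        return "critical"
    if severity in WARNING_SEVERITIES:
        return "warning"
    # ASSUMPTION: any other severity value raises no alert by default.
    return None


for sev in ("WARNING", "ERROR", "INTERNAL"):
    print(sev, "->", alert_level(sev))
```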

Protected Databases

These metrics collect information about the databases protected by this Recovery Appliance.

For protected databases, you can choose to set up thresholds and obtain alerts based on metrics such as recovery window goal or unprotected data window. If you stop sending backups to the Recovery Appliance for a protected database and you have previously set up thresholds and configured alerts, you will continue to receive the alerts even though your action was deliberate. In such a scenario, you have the option of setting the following metric collection property:

  • ignoreProtectionPolicies: You can use this property on a Protected Databases metric to ignore the incidents generated against databases associated with a specific protection policy. If backups for certain databases are stopped deliberately, for example, with the intention of retiring the database, incidents are generated for these databases in the Recovery Appliance. You can create a new protection policy called "DECOMMISSIONED", add the databases for which you want to ignore alerts to this protection policy, and specify "DECOMMISSIONED" as the value for the ignoreProtectionPolicies property.

To set this property:
  1. Click Monitoring and then Metric and Collection Settings for the Recovery Appliance.
  2. Click the pencil icon in the Edit column of any Protected Databases metric row to access Edit Advanced Settings.

Metrics Details

Target Version: All versions

Collection Frequency: Every 15 Minutes

Data Source: RA_DATABASE, RA_DISK_RESTORE_RANGE and other views in the Recovery Appliance database.

Metric | Description
Backup Data Rate (GB/s) | This metric provides the rate (GB/s) at which backup data is being ingested by the Recovery Appliance for this protected database.
Copy-to-Tape Data Rate (GB/s) | This metric provides the rate (GB/s) at which the data has been copied to tape for this protected database.
Copy-to-Tape Queued Data (GB) | This metric provides the amount of data (GB) that is in the queue to be copied to tape for this protected database.
Copy-to-Tape Queued Data Age (hours) | This metric provides information about how long the data has been in the queue to be copied to tape for this protected database.
Copy-to-Tape Total Data on Tape (GB) | This metric provides the total amount of data that has been copied to tape for this protected database.
Cumulative Backup Data (GB) | This metric provides the cumulative amount of backup data ingested by the Recovery Appliance for this protected database.
Cumulative Backup Data Delta (GB) | This metric provides the change in the cumulative amount of backup data ingested for this protected database since the last collection of this metric.
Cumulative Copy-to-Tape Data (GB) | This metric provides the cumulative amount of data copied to tape for this protected database.
Cumulative Copy-to-Tape Data Delta (GB) | This metric provides the change in the cumulative amount of data copied to tape for this protected database since the last collection of this metric.
Cumulative Replication Data Delta (GB) | This metric provides the change in the cumulative amount of data replicated for this protected database since the last collection of this metric.
Cumulative Replication Data (GB) | This metric provides the cumulative amount of data replicated for this protected database.
Current Recovery Window (interval) | This metric provides the current recovery window of this protected database (as an interval).
Current Recovery Window (sec) | This metric provides the current recovery window of this protected database (in seconds).
Database Key | This metric provides the primary key for this protected database in the Recovery Appliance metadata.
Database Unique Name | This metric provides the db_unique_name of this protected database.
Date Added as Protected Database | This metric provides the time when this protected database was enrolled with the Recovery Appliance.
Deduplication Ratio | This metric provides the ratio of the total size of received backups for this protected database to the space consumed for this database in Recovery Appliance storage.
Keep Backup Space (GB) | This metric provides the total amount of space used by backups that have a KEEP retention setting that overrides the retention policy used for this protected database.
Last Complete Backup | This metric provides the latest point in time for which a complete backup is available for all data files in this protected database.
Last Copy to Tape | This metric provides the last time that data was copied to tape for this protected database.
Last Replication | This metric provides the last time that data was replicated for this protected database.
Most Recent Recovery Point | This metric provides the latest time to which the protected database can be recovered.
Near-Zero Data Loss Enabled | This metric indicates whether this protected database is shipping redo data to the Recovery Appliance.

User Action: Check the Near-Zero Data Loss setting in Backup Settings for this protected database.

Number of Protected Databases | This metric provides the total number of protected databases enrolled with the Recovery Appliance.
Oldest Recovery Point | This metric provides the earliest time to which the protected database can be recovered.
Protection Policy | This metric provides the name of the protection policy used by this protected database.
Recovery Window Goal (interval) | This metric provides the recovery window goal (as an interval) for disk backups, as specified in the protection policy used by this protected database.
Recovery Window Goal (sec) | This metric provides the recovery window goal in seconds for disk backups, as specified in the protection policy used by this protected database.
Recovery Window Ratio (%) | This metric provides the ratio between the current recovery window and the recovery window goal for this protected database.
Recovery Window Space (GB) | This metric provides an estimation of the required space in Recovery Appliance storage to meet the recovery window goal specified in the protection policy used by this protected database.
Recovery Window Space as a Percentage of Reserved Space | This metric provides the ratio between the recovery window space and the reserved space for this protected database.

Default Warning Threshold: Not Defined

Default Critical Threshold: Not Defined

Alert Text: The space required to meet the recovery window for database %db_unique_name% is %value%% of the reserved space for the database.

Replication Data Rate (GB/s) | This metric provides the rate (GB/s) at which data is being replicated for this protected database.
Replication Queued Data (GB) | This metric provides the amount of data (GB) that is in the queue to be replicated for this protected database.
Reserved Space (GB) | This metric provides the minimum amount of disk space (GB) that will be reserved on the Recovery Appliance for this protected database.
Storage Location | This metric provides the name of the Recovery Appliance storage location used by this protected database.
Unprotected Data Window (sec) | This metric provides the current actual amount of potential data loss for this protected database.
Unprotected Data Window Threshold (sec) | This metric provides the maximum amount of acceptable potential data loss exposure specified in the protection policy used by this protected database.
Used Space (GB) | This metric provides the amount of disk space currently used for this protected database in the Recovery Appliance.
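
The ratio metrics in this table can be illustrated with a short Python sketch. The function names and sample values are hypothetical, and expressing the two percentage metrics as "numerator / denominator x 100" is an assumption based on the metric names and descriptions; the Recovery Appliance may compute or round these values differently.

```python
def recovery_window_ratio_pct(current_window_sec: float, goal_sec: float) -> float:
    """Current recovery window as a percentage of the recovery window goal."""
    return 100.0 * current_window_sec / goal_sec


def recovery_window_space_pct_of_reserved(recovery_window_space_gb: float,
                                          reserved_space_gb: float) -> float:
    """Recovery window space as a percentage of reserved space."""
    return 100.0 * recovery_window_space_gb / reserved_space_gb


def deduplication_ratio(received_backups_gb: float, space_consumed_gb: float) -> float:
    """Total size of received backups divided by the space consumed."""
    return received_backups_gb / space_consumed_gb


SECONDS_PER_DAY = 86400

# Invented sample values, for illustration only.
print(recovery_window_ratio_pct(3 * SECONDS_PER_DAY, 4 * SECONDS_PER_DAY))  # 75.0
print(recovery_window_space_pct_of_reserved(480.0, 600.0))                  # 80.0
print(deduplication_ratio(5000.0, 500.0))                                   # 10.0
```

Under these assumptions, a Recovery Window Ratio below 100 would mean the database cannot currently be recovered as far back as its protection policy intends.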

Queued Data

These metrics provide an overview of the amount of data and number of tasks queued on the Recovery Appliance for backup, copy-to-tape, and replication operations.

Target Version: All versions

Collection Frequency: Every Hour

Data Source: RA_SBT_TASK and RA_TASK views in the Recovery Appliance database

User Action: Not applicable

Metric | Description
Backup Tasks Queued Since Last Collection | This metric provides the number of backup tasks queued on the Recovery Appliance since the last collection of this metric.
Copy-to-Tape Tasks Queued Since Last Collection | This metric provides the number of copy-to-tape tasks queued on the Recovery Appliance since the last collection of this metric.
Replication Tasks Queued Since Last Collection | This metric provides the number of replication tasks queued on the Recovery Appliance since the last collection of this metric.
Total Backup Tasks Queued | This metric provides the total number of backup tasks queued on the Recovery Appliance.
Total Copy-to-Tape Data Queued (bytes) | This metric provides the cumulative amount of data in the queued copy-to-tape tasks.
Total Copy-to-Tape Tasks Queued | This metric provides the total number of copy-to-tape tasks queued on the Recovery Appliance.
Total Replication Data Queued (bytes) | This metric provides the cumulative amount of data in the queued replication tasks.
Total Replication Tasks Queued | This metric provides the total number of replication tasks queued on the Recovery Appliance.

Replication Status

These metrics collect information about the replication servers configured on the Recovery Appliance.

Target Version: All versions

Collection Frequency: Every 15 Minutes

Data Source: RA_REPLICATION_SERVER and RA_SBT_LIBRARY views in the Recovery Appliance database

User Action: Not applicable

Metric | Description
Replication Server Name | This metric provides the name of the replication server, as specified when the replication server was created.
Replication Status | This metric provides the tape library status (READY, PAUSE, ERROR, or null).
SBT Library Name | This metric provides the name of the tape library that the replication server is associated with.

Response

The metrics in this category show the status of the Recovery Appliance instance.

Target Version: All versions

Collection Frequency: Every 5 Minutes

Data Source: RA_SERVER view in the Recovery Appliance database

User Action: Not applicable

Metric | Description
Status | This metric shows the status of the Recovery Appliance processes. Valid values:
  • 1: Recovery Appliance processes are running
  • 0: Recovery Appliance processes are not running

Default Warning Threshold: Not Defined

Default Critical Threshold: 0

Alert Text: Recovery Appliance is down. %oraerr%

Storage Locations

These metrics collect information about the storage locations configured for this Recovery Appliance.

Target Version: All versions

Collection Frequency: Every 15 Minutes

Data Source: RA_DATABASE and RA_STORAGE_LOCATION views in the Recovery Appliance database.

User Action: Not applicable

Metric | Description
Incoming Backup Data Rate (GB/s) | This metric provides the rate at which backup data is being ingested, aggregated across all databases using this storage location.
Key | This metric provides the primary key for this storage location in the Recovery Appliance metadata.
Name | This metric provides the Recovery Appliance storage location name.
Number of Storage Locations | This metric provides the total number of storage locations for this Recovery Appliance.
Recovery Window Space (GB) | This metric provides the estimated space that is needed to meet the recovery window goal for all databases using this storage location.
Recovery Window Space as a Percentage of Reserved Space | This metric provides the ratio between the total space required to meet the recovery window for all databases using this storage location and the total reserved space for all databases using the storage location.

Evaluation and Collection Frequency: Every 15 Minutes

Default Warning Threshold: Not Defined

Default Critical Threshold: Not Defined

Alert Text: The total space required to meet the recovery window for all databases using storage location %sl_name% is %value%% of the total reserved space for all databases using the storage location.

Recovery Window Space as a Percentage of Storage Location Size | This metric provides the ratio between the total space required to meet the recovery window for all databases using this storage location and the size of the storage location.

Evaluation and Collection Frequency: Every 15 Minutes

Default Warning Threshold: 85

Default Critical Threshold: 97

Alert Text: The total space required to meet the recovery window for all databases using storage location %sl_name% is %value%% of the size of the storage location.

Reserved Space (GB) | This metric provides the amount of disk space reserved for all databases using this storage location.
Size (GB) | This metric provides the maximum amount of storage (in GB) that the Recovery Appliance storage location can use for the backup data.
Unreserved Space (GB) | This metric provides the difference between the maximum amount of storage that the storage location can use for backup data and the amount of disk space reserved for all databases using this storage location.
Unused Space (GB) | This metric provides the amount of unused space in this storage location.
Used Space (GB) | This metric provides the total amount of disk space used in this storage location.
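
As an illustration of how the storage location metrics relate to each other and to the default thresholds (85 warning, 97 critical) on Recovery Window Space as a Percentage of Storage Location Size, consider the following Python sketch. The function names, returned labels, and sample values are hypothetical, and the comparison operator (greater than or equal to) is an assumption; check the operator configured in Metric and Collection Settings for the actual behavior.

```python
# Default thresholds listed for
# "Recovery Window Space as a Percentage of Storage Location Size".
WARNING_PCT = 85
CRITICAL_PCT = 97


def unreserved_space_gb(size_gb: float, reserved_gb: float) -> float:
    """Storage location size minus the space reserved for all databases."""
    return size_gb - reserved_gb


def pct_of_storage_location_size(recovery_window_space_gb: float,
                                 size_gb: float) -> float:
    """Recovery window space as a percentage of the storage location size."""
    return 100.0 * recovery_window_space_gb / size_gb


def alert_level(pct: float) -> str:
    """ASSUMPTION: thresholds are evaluated with '>='."""
    if pct >= CRITICAL_PCT:
        return "critical"
    if pct >= WARNING_PCT:
        return "warning"
    return "clear"


# Invented sample values, for illustration only.
size_gb, reserved_gb, rws_gb = 1000.0, 800.0, 900.0
pct = pct_of_storage_location_size(rws_gb, size_gb)
print(unreserved_space_gb(size_gb, reserved_gb))  # 200.0
print(pct, alert_level(pct))                      # 90.0 warning
```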