19 Running Recovery Appliance Checks

Recovery Appliance checks validate that the appliance's components are in a stable and healthy state.

The checks are available through the RACLI utility and can be run together or individually. The Recovery Appliance component checks include:

  • ZDLRA Services - verifies whether the Recovery Appliance Services (RA Server, DB, CRS) are online.
  • Compute Server Alerts - checks the dbmcli alert history on the compute nodes for alerts with a severity higher than warning.
  • Storage Server Alerts - checks the dbmcli alert history on the storage cells for alerts with a severity higher than warning.
  • Active Incidents in the Database - checks the Recovery Appliance database for active incidents. This check can often be bypassed with '--ignore_incidents' during the patch appliance steps.
  • Invalid Objects in the Database - checks the Recovery Appliance database for invalid objects that need to be recompiled.
  • Consistency between the Deployed vs Installed RA Automation RPM - checks that the RPM deployed on the Recovery Appliance is consistent with the installed RPM.
  • Exadata Image Version Consistency Across All the Hosts - checks the compute nodes and storage cells to ensure that only one (1) image version is in use across all hosts.
  • Init Parameter Validation - checks the Recovery Appliance database to confirm that set init parameters are consistent for a Recovery Appliance configuration.
  • Export Bundle Availability - checks Recovery Appliance to ensure an export bundle has been successfully taken. In the event of a disaster/crash, the export bundle is used to rebuild the Recovery Appliance to a known working state. The export bundle must be copied to a safe system or location before the Recovery Appliance is rebuilt.
  • Oracle Password Status - checks to ensure the oracle password has not expired.
  • RASYS User Wallet Status - checks the validity of the rasys wallet. A valid wallet is required for operations including patching, expansion, and upgrade.

racli list check

Use the racli list check command to see the exact names of the available checks and whether each one is enabled.

  1. On the compute server, as a member of the raadmin group, run the command.

    [adminra1@zdlra05 ~]# racli list check 
    active_incidents
    agent_list
    appliance_status
    cell_alerts
    compute_alerts
    filesystem_space
    image_versions
    init_parameter
    invalid_objects
    oracle_password
    osb_pieces
    ra_catalog
    ra_compliance
    ra_export
    ra_fips
    ra_partner
    ra_prechecks
    ra_quota
    ra_selinux
    ra_version
    ra_vip
    rasys_wallets
    service_health
    session_count
    tls_health 
    [adminra1@zdlra05 ~]#
  2. List which checks are enabled.

    [adminra1@zdlra05 ~]# racli list check --status=enabled
    active_incidents
    appliance_status
    cell_alerts
    compute_alerts
    image_versions
    init_parameter
    invalid_objects
    ra_export
    ra_version
    [adminra1@zdlra05 ~]#
  3. List which checks are disabled. Add --verbose to show the details of each listed check.

    [adminra1@zdlra05 ~]# racli list check --status=disabled
    ra_prechecks
    [adminra1@zdlra05 ~]# racli list check --status=disabled --verbose
    ra_prechecks
    VERSION=1.0.0.0
    GROUP_NAME=DEV
    SCRIPT=/opt/oracle.RecoveryAppliance/bin/ra_prechecks.pl
    TYPE=system
    OPTS=''
    ORDER=15
    ENABLED=NO
    DB_USER=''
    [adminra1@zdlra05 ~]#
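
The verbose listing above can be turned into structured data for scripting or reporting. The following Python sketch is illustrative only and is not part of RACLI. It assumes the output has the same shape as the sample in step 3: a check name on its own line followed by KEY=VALUE attribute lines. The function name parse_verbose_listing is hypothetical.

```python
def parse_verbose_listing(text):
    """Parse verbose `racli list check` style output into {check_name: {KEY: value}}.

    Assumption: each check name sits on its own line and is followed by
    KEY=VALUE lines, as in the sample output shown in this section.
    """
    checks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and shell prompts such as "[adminra1@zdlra05 ~]# ..."
        if not line or line.startswith("["):
            continue
        if "=" in line:
            if current is None:
                continue  # attribute line with no preceding check name
            key, _, value = line.partition("=")
            # Strip surrounding single quotes so OPTS='' becomes an empty string
            checks[current][key] = value.strip("'")
        else:
            current = line
            checks[current] = {}
    return checks


# Sample text copied from the verbose listing in step 3
sample = """\
ra_prechecks
VERSION=1.0.0.0
GROUP_NAME=DEV
SCRIPT=/opt/oracle.RecoveryAppliance/bin/ra_prechecks.pl
TYPE=system
OPTS=''
ORDER=15
ENABLED=NO
DB_USER=''
"""

info = parse_verbose_listing(sample)
print(info["ra_prechecks"]["SCRIPT"])
print(info["ra_prechecks"]["ENABLED"])
```

All values are kept as strings. If the real output differs from the sample (for example, values that themselves contain line breaks), the parser would need to be adjusted.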

racli run check

Recovery Appliance checks can be run individually, several at a time, or all enabled checks at once.

  1. On the compute server, as a member of the raadmin group, run the command.

    [adminra1@zdlra05 ~]# racli run check --check_name=active_incidents,invalid_objects
    Wed Oct 10 13:53:07 2018: Start: racli run check --check_name=active_incidents,invalid_objects
    HOST: [nnnnnn01.oracle.com]
    
    Created log file scas10adm01.us.oracle.com:/opt/oracle.RecoveryAppliance/log/racli_run_check_20181010.1353.log
    Wed Oct 10 13:53:07 2018: CHECK: Active Incidents - PASS
    Wed Oct 10 13:53:09 2018: CHECK: Invalid Objects - PASS
    Wed Oct 10 13:53:09 2018: End: racli run check --check_name=active_incidents,invalid_objects
    HOST: [nnnnnn01.oracle.com]
    [adminra1@zdlra05 ~]#
  2. Run all checks that are enabled.

    [adminra1@zdlra05 ~]# racli run check --all
    Wed Oct 10 13:50:28 2018: Start: racli run check --all
    HOST: [nnnnnn01.oracle.com]
    
    Created log file scas10adm01.us.oracle.com:/opt/oracle.RecoveryAppliance/log/racli_run_check_20181010.1350.log
    
    Wed Oct 10 13:50:29 2018: CHECK: RA Services - PASS
    Wed Oct 10 13:50:32 2018: CHECK: Compute Node AlertHistory
    Wed Oct 10 13:50:32 2018: HOST: [nnnnnn01] - PASS
    Wed Oct 10 13:50:32 2018: HOST: [nnnnnn01] - PASS
    Wed Oct 10 13:50:43 2018: CHECK: Storage Cell AlertHistory
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm09] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm05] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm03] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm07] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm01] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm04] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm02] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm06] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm08] - PASS
    Wed Oct 10 13:50:44 2018: CHECK: ZDLRA Version
    Wed Oct 10 13:50:44 2018: HOST: [scyyyyyyyadm02] - FAIL
    Wed Oct 10 13:50:44 2018:
    Wed Oct 10 13:50:44 2018: CAUSE:
    Wed Oct 10 13:50:44 2018: Unexpected ZDLRA version found.
    Wed Oct 10 13:50:44 2018: For more details, see log file:
    Wed Oct 10 13:50:44 2018: - /opt/oracle.RecoveryAppliance/log/racli_check_ra_versions_20181010.1350.log
    Wed Oct 10 13:50:44 2018:
    Wed Oct 10 13:50:44 2018: HOST: [scyyyyyyyadm01] - FAIL
    Wed Oct 10 13:50:44 2018:
    Wed Oct 10 13:50:44 2018: CAUSE:
    Wed Oct 10 13:50:44 2018: Unexpected ZDLRA version found.
    Wed Oct 10 13:50:44 2018: For more details, see log file:
    Wed Oct 10 13:50:44 2018: - /opt/oracle.RecoveryAppliance/log/racli_check_ra_versions_20181010.1350.log
    Wed Oct 10 13:50:44 2018:
    Wed Oct 10 13:50:53 2018: CHECK: Exadata Image Version - PASS
    Wed Oct 10 13:50:53 2018: CHECK: Active Incidents - PASS
    Wed Oct 10 13:50:56 2018: CHECK: Init Parameters - FAIL
    Wed Oct 10 13:50:56 2018:
    Wed Oct 10 13:50:56 2018: CAUSE:
    Wed Oct 10 13:50:56 2018: Init Parameter Error found
    Wed Oct 10 13:50:56 2018: ZDLRA DB Init Parameter Errors:
    Wed Oct 10 13:50:56 2018: For more details, see log file:
    Wed Oct 10 13:50:56 2018: - /opt/oracle.RecoveryAppliance/log/racli_check_init_params_20181010.1350.log
    Wed Oct 10 13:50:56 2018:
    Wed Oct 10 13:50:56 2018: Parameter: _report_capture_cycle_time
    Wed Oct 10 13:50:56 2018:
    Wed Oct 10 13:50:56 2018: Instance ID: 1
    Wed Oct 10 13:50:56 2018: Recomended Value: N/A
    Wed Oct 10 13:50:56 2018: Actual Value: 0
    Wed Oct 10 13:50:56 2018: Error Text: Init Parameters have non default value
    Wed Oct 10 13:50:56 2018:
    Wed Oct 10 13:50:56 2018: Instance ID: 2
    Wed Oct 10 13:50:56 2018: Recomended Value: N/A
    Wed Oct 10 13:50:56 2018: Actual Value: 0
    Wed Oct 10 13:50:56 2018: Error Text: Init Parameters have non default value
    Wed Oct 10 13:50:56 2018:
    Wed Oct 10 13:50:56 2018: Please run dbms_ra_adm.update_init_param
    Wed Oct 10 13:50:56 2018: in SQL env and bounce database to make them
    Wed Oct 10 13:50:56 2018: validate.
    Wed Oct 10 13:50:57 2018: CHECK: Invalid Objects - PASS
    Wed Oct 10 13:50:58 2018: CHECK: Export Backup - PASS
    Wed Oct 10 13:50:58 2018: End: racli run check --all
    HOST: [nnnnnn01.oracle.com]
    [adminra1@zdlra05 ~]#
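
When many checks run at once, a short summary of failures can be easier to read than the full output. The following Python sketch is illustrative only and is not part of RACLI. It assumes the output follows the pattern in the samples above: a "CHECK: <name>" line that either carries its own "- PASS" or "- FAIL" result, or is followed by per-host "HOST: [<host>] - PASS" or "- FAIL" lines. The function name summarize_checks is hypothetical.

```python
import re

# A check line may carry its own result, or be followed by per-host results.
CHECK_RE = re.compile(r"CHECK: (?P<name>.+?)(?: - (?P<result>PASS|FAIL))?$")
HOST_RE = re.compile(r"HOST: \[(?P<host>[^\]]+)\] - (?P<result>PASS|FAIL)$")


def summarize_checks(output):
    """Return {check_name: {"result": "PASS" or "FAIL" or None, "failed_hosts": [...]}}."""
    summary = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        match = CHECK_RE.search(line)
        if match:
            current = match.group("name")
            summary[current] = {"result": match.group("result"), "failed_hosts": []}
            continue
        match = HOST_RE.search(line)
        if match and current is not None:
            entry = summary[current]
            if match.group("result") == "FAIL":
                # Any failing host marks the whole check as failed
                entry["result"] = "FAIL"
                entry["failed_hosts"].append(match.group("host"))
            elif entry["result"] is None:
                entry["result"] = "PASS"
    return summary


# Excerpt of the sample output shown in step 2
sample = """\
Wed Oct 10 13:50:29 2018: CHECK: RA Services - PASS
Wed Oct 10 13:50:32 2018: CHECK: Compute Node AlertHistory
Wed Oct 10 13:50:32 2018: HOST: [nnnnnn01] - PASS
Wed Oct 10 13:50:44 2018: CHECK: ZDLRA Version
Wed Oct 10 13:50:44 2018: HOST: [scyyyyyyyadm02] - FAIL
Wed Oct 10 13:50:44 2018: CAUSE:
Wed Oct 10 13:50:44 2018: HOST: [scyyyyyyyadm01] - FAIL
Wed Oct 10 13:50:56 2018: CHECK: Init Parameters - FAIL
Wed Oct 10 13:50:58 2018: End: racli run check --all
HOST: [nnnnnn01.oracle.com]
"""

summary = summarize_checks(sample)
failed = [name for name, entry in summary.items() if entry["result"] == "FAIL"]
print(failed)
```

The summary only records PASS or FAIL per check and the hosts that failed. The CAUSE text and the referenced log files still need to be read to diagnose a failure.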