Database Impact Advisor

The Database Impact Advisor can be run against an individual Exadata system to perform a system-wide noisy-neighbor analysis of database CPU usage. The analysis identifies databases whose performance is potentially impacted by other databases or other operating system processes. It applies algorithms established by the Autonomous Health Framework (AHF) Balance feature to Oracle Enterprise Manager historical metric data from the past 30 days. Further, the Database Impact Advisor integrates directly with AHF Balance to generate recommendations for optimizing Database Resource Manager (DBRM) settings across all databases to minimize any CPU-based performance impacts that were found.

Database Impact Advisor is available for Database Machine targets for which the Exadata Management Pack is enabled. You can launch it by clicking Database Machine and selecting Database Impact Advisor from the menu.

Prerequisites

  • Self Update should be configured in Oracle Enterprise Manager. See Setting Up Self Update in Cloud Control Administrator's Guide.
  • Database Impact Advisor uses a specific AHF installation configured and managed from Oracle Enterprise Manager. This AHF installation must be configured by an Oracle Enterprise Manager administrator by following instructions in AHF Configuration for Database Impact Advisor.
  • In order to run AHF Balance reports from Database Impact Advisor, Oracle Enterprise Manager users must be granted one of the roles listed in AHF Configuration for Database Impact Advisor.

Key Concepts

The following are some of the core concepts underlying the Database Impact Advisor analysis framework. For additional details about these and other concepts, see Resolve Noisy Neighbor Issues in Oracle Autonomous Health Framework User's Guide.

  • Limit: The maximum number of vCPUs a database instance may use simultaneously. The DBRM parameter CPU_COUNT implements a limit for the instance.

  • Guarantee: The number of vCPUs a database instance is guaranteed to be able to use at any time. When a cluster is dedicated to running databases, the DBRM and the operating system cooperate to provide a guarantee. If the over-provisioning ratio is R = sum(CPU_COUNT) / physical vCPUs, where the sum is taken over all database instances running on the machine, then the guarantee for a database instance is its CPU_COUNT / R.

    For example, on a 64 vCPU machine running 8 database instances, each with CPU_COUNT set to 16, the over-provisioning ratio R is 2 (that is, 8 * 16 / 64), and each database instance has a guarantee of 8 vCPUs (that is, 16 / 2).

  • Not Exposed Hour: An hour when no database instance's CPU use exceeds its CPU guarantee. When an instance is not exposed, it cannot experience CPU-based noisy neighbor problems regardless of the CPU consumption of the other instances running on the machine.

  • Exposed Hour: An hour when the CPU use of one or more database instances exceeds the instance's CPU guarantee. When an instance is exposed, it may experience noisy neighbor problems depending on the CPU consumption of the other instances running on the machine.

  • Impacted Hour: An exposed hour during which the host's CPU utilization exceeds 70%. When an instance is impacted, it is likely to be experiencing noisy neighbor problems because the total CPU consumption of the machine is high.

  • Partitioned: When a cluster is partitioned, each database instance has dedicated CPU capacity. CPU consumption by neighbors cannot interfere with a database instance. CPU resources (up to the configured limit, CPU_COUNT) are guaranteed to be available at all times. However, because CPU resources are dedicated to specific database instances, instances cannot borrow CPU cycles that other instances are not using. Typically, when a cluster is partitioned, the degree of database consolidation is limited by the number of physical CPUs on each machine in the cluster and by the peak CPU consumption of each database hosted on the cluster.

    A cluster is partitioned when the sum of the CPU_COUNT DBRM parameter values for all the database instances running on each machine in the cluster is less than or equal to the number of physical CPUs on the machine. For example, if the machines in a cluster each have 64 CPUs, and each machine is hosting 4 database instances, each with CPU_COUNT set to 16, the cluster is partitioned.

    If the goal is to partition a cluster, then appropriate CPU_COUNT settings can be determined by analyzing historical CPU consumption data. AHF Balance supports this analysis.

  • Impacted Status: The overall impact status of the database. If the database has any impacted hours within the collection, its status is FAIL. Otherwise, if it has any exposed hours, its status is WARNING. If it has neither, its status is INFO.

    If the cluster is not over-provisioned, then by definition there can be no impacted or exposed hours, and the status is indicated as PASS.
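
The definitions above can be sketched in a few lines of Python. This is only an illustration of the arithmetic and the classification rules as described in this section, not the AHF Balance implementation. All function and constant names are hypothetical. The sketch also assumes that hourly CPU use is expressed in vCPUs, that host utilization is a fraction between 0 and 1, and that "over-provisioned" means the sum of CPU_COUNT values exceeds the physical vCPUs (that is, the machine is not partitioned).

```python
# Illustrative sketch only. The names below are hypothetical and are not
# part of AHF Balance or Oracle Enterprise Manager.

# Host CPU utilization (70%) above which an exposed hour counts as impacted.
IMPACT_UTILIZATION_THRESHOLD = 0.70


def over_provisioning_ratio(cpu_counts, physical_vcpus):
    """R = sum(CPU_COUNT) / physical vCPUs for one machine."""
    return sum(cpu_counts) / physical_vcpus


def guarantee(cpu_count, cpu_counts, physical_vcpus):
    """Guarantee for one instance: its CPU_COUNT / R."""
    return cpu_count / over_provisioning_ratio(cpu_counts, physical_vcpus)


def is_partitioned(cpu_counts, physical_vcpus):
    """Partitioned: sum of CPU_COUNT values <= physical CPUs on the machine."""
    return sum(cpu_counts) <= physical_vcpus


def classify_hour(cpu_used, instance_guarantee, host_utilization):
    """Label one hour for one instance."""
    if cpu_used <= instance_guarantee:
        return "NOT_EXPOSED"
    if host_utilization > IMPACT_UTILIZATION_THRESHOLD:
        return "IMPACTED"
    return "EXPOSED"


def impact_status(hour_labels, over_provisioned):
    """Roll hourly labels up into PASS, FAIL, WARNING, or INFO."""
    if not over_provisioned:
        return "PASS"
    if "IMPACTED" in hour_labels:
        return "FAIL"
    if "EXPOSED" in hour_labels:
        return "WARNING"
    return "INFO"


# The 64 vCPU machine with 8 instances at CPU_COUNT=16 from the example above.
cpu_counts = [16] * 8
g = guarantee(16, cpu_counts, 64)

# Made-up sample hours for one instance: (vCPUs used, host utilization).
hours = [(6.0, 0.90), (10.0, 0.50), (12.0, 0.85)]
labels = [classify_hour(used, g, util) for used, util in hours]
status = impact_status(labels, over_provisioned=not is_partitioned(cpu_counts, 64))

print(g, labels, status)
```

With these sample hours, the first hour stays within the guarantee of 8 vCPUs, the second exceeds it on a lightly loaded host, and the third exceeds it while host utilization is above 70%, so the instance ends up with a FAIL status.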

Using Database Impact Advisor

The Database Impact Advisor CPU Impact tab has charts that provide a top-level summary of how many clusters, hosts, databases, and database instances on the Exadata system are in the Exposed (warning) and Impacted (fail) categories. The table below the charts enumerates the specific impact status details for each cluster, database, and database instance. Selecting a specific database or instance in the table provides in-depth historical visualization of the exposed and impacted hours for the database or instance.