15 Sizing Your Enterprise Manager Deployment
Oracle Enterprise Manager has the ability to scale for hundreds of users and thousands of systems and services on a single Enterprise Manager implementation.
By maintaining routine housekeeping and monitoring performance regularly, you ensure that you will have the data required to make accurate forecasts of future sizing requirements. Establishing good baseline values for the Enterprise Manager vital signs and setting reasonable warning and critical thresholds on those baselines allows Enterprise Manager to monitor itself for you.
Sizing is a critical factor in Enterprise Manager performance. Inadequately sized Enterprise Manager deployments may result in the overall benefits of Enterprise Manager being compromised. The resources required for the Oracle Management Service (OMS) and Management Repository tiers will vary significantly based on the number of monitored targets. While there are many additional aspects to be considered when sizing Enterprise Manager infrastructure, these guidelines provide a simple methodology that can be followed to determine the minimum required hardware resources and initial configuration settings for the OMS and Management Repository tiers.
For an overview of Enterprise Manager components, see Architecture of Enterprise Manager in Oracle Enterprise Manager Introduction documentation.
Enterprise Manager Sizing
Oracle Enterprise Manager provides a highly available and scalable deployment topology. This chapter lays out the basic minimum sizing and tuning recommendations for initial capacity planning for your Oracle Enterprise Manager deployment. It assumes a basic understanding of Oracle Enterprise Manager components and systems. A complete description of Oracle Enterprise Manager can be obtained from Enterprise Manager Introduction. This information is a starting point for site sizing. Every site has its own characteristics and should be monitored and tuned as needed.
Sizing is a critical factor for Enterprise Manager performance. Inadequately sized Enterprise Manager deployments will result in frustrated users and the overall benefits of Enterprise Manager may be compromised. The resources required for Enterprise Manager OMS and Repository tiers will vary significantly based on the number of monitored targets. While there are many additional aspects to be considered when sizing Enterprise Manager infrastructure, the following guidelines provide a simple methodology that can be followed to determine the minimum required hardware resources and initial configuration settings for the OMS and Repository tiers.
Overview of Sizing Guidelines
The following sections provide an overview of the sizing guidelines.
Hardware Information
The sizing guidelines outlined in this chapter were obtained by running a virtual environment on the following hardware and operating system combination.
-
Hardware -- OCI E5 compute and E5 Oracle Base Database Service deployments.
-
Operating System -- Oracle Linux 8
This information is based on an x64 Oracle Linux 8 environment. If you are running on other platforms, you will need to convert the sizing information based on similar hardware performance. This conversion should be based on single-thread performance. Running on a machine with 24 slow cores is not equivalent to running on a machine with 12 fast cores, even though the total machine performance might be the same on a throughput benchmark. Single-thread performance is critical for good Enterprise Manager user interface response times.
Sizing Specifications
The sizing guidelines for Oracle Enterprise Manager are divided into four sizes: small, medium, large and extra large. The definitions of each size are shown in Table 15-1.
Table 15-1 Oracle Enterprise Manager Site Sizes
Size | Agent Count | Target Count | Concurrent User Sessions |
---|---|---|---|
Small | <200 | <5000 | <10 |
Medium | >=200, <1000 | >=5000, <15000 | >=10, <25 |
Large | >=1000, <2500 | >=15000, <50000 | >=25, <50 |
Extra Large | >=2500 | >=50000 | >=50 |
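The thresholds in Table 15-1 can be expressed as a small lookup. The following is a minimal sketch, not an Oracle-supplied tool: the function name `classify_site` is hypothetical, and it assumes that when the three counts fall into different rows, the largest matching size is used (the table itself does not state a tie-breaking rule).

```python
# Upper bounds transcribed from Table 15-1:
# (size, agent limit, target limit, concurrent session limit).
SIZE_LIMITS = [
    ("Small", 200, 5000, 10),
    ("Medium", 1000, 15000, 25),
    ("Large", 2500, 50000, 50),
]


def classify_site(agents: int, targets: int, sessions: int) -> str:
    """Return the smallest size whose upper bounds exceed all three counts.

    Assumption: a site that exceeds any single bound moves up to the next size.
    """
    for size, max_agents, max_targets, max_sessions in SIZE_LIMITS:
        if agents < max_agents and targets < max_targets and sessions < max_sessions:
            return size
    return "Extra Large"


print(classify_site(150, 4000, 5))     # all counts in the Small row
print(classify_site(150, 6000, 5))     # target count pushes the site to Medium
print(classify_site(3000, 60000, 80))  # beyond every Large bound
```

Treat the result as a starting point only; a site whose session count alone is high may be better served by a smaller size plus additional UI load tuning.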
Sizing for Upgraded Installs
If you are upgrading from a previous release of Enterprise Manager to Enterprise Manager 24ai, run the following queries as the sysman user to obtain the Management Agent and target counts for use in Table 15-1.
-
Agent count - select count(*) from mgmt_targets where target_type = 'oracle_emd'
-
Target count - select count(*) from mgmt_targets
Minimum Hardware Requirements
Table 15-2 lists the minimum hardware requirements for the four configurations.
Table 15-2 Oracle Enterprise Manager Minimum Hardware Requirements
Size | OMS Machine Count* | Cores per OMS | Memory per OMS (GB) | Storage per OMS (GB) | Database Machine Count* | Cores per Database Machine | Memory per Database Machine (GB) |
---|---|---|---|---|---|---|---|
Small | 1 | 4 | 12 | 60 | 1 | 4 | 10 |
Medium | 2 | 6 | 14 | 60 | 2 (Oracle RAC) | 6 | 12 |
Large | 2 | 12 | 24 | 60 | 2 (Oracle RAC) | 12 | 20 |
Large | 4 | 6 | 12 | 60 | 2 (Oracle RAC) | 12 | 20 |
Extra Large | 4 | 24 | 32 | 60 | 2 (Oracle RAC) | 36 | 48 |
Table 15-3 Oracle Enterprise Manager Minimum Storage Requirements
Size | MGMT_TABLESPACE (GB) | MGMT_AD4J_TS (GB) | MGMT_ECM_DEPOT_TS (GB) | TEMP | ARCHIVE LOG AREA (GB) |
---|---|---|---|---|---|
Small | 200 | 10 | 1 | 12 | 50 |
Medium | 400 | 30 | 4 | 20 | 150 |
Large | 500 | 50 | 8 | 40 | 250 |
Extra Large | 800 | 80 | 16 | 80 | 350 |
Network Topology Considerations
A critical consideration when deploying Enterprise Manager is network performance between tiers. Enterprise Manager ensures tolerance of network glitches, failures, and outages between application tiers through error tolerance and recovery. The Management Agent in particular is able to handle a less performant or reliable network link to the Management Service without severe impact to the performance of Enterprise Manager as a whole. The scope of the impact, as far as a single Management Agent's data being delayed due to network issues, is not likely to be noticed at the Enterprise Manager system wide level.
The impact of slightly higher network latencies between the Management Service and Management Repository will be substantial, however. Implementations of Enterprise Manager have experienced significant performance issues when the network link between the Management Service and Management Repository is not of sufficient quality.
The Management Service host and Repository host should be located in close proximity to each other. Ideally, the round trip network latency between the two should be less than 1 millisecond.
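One rough way to check the latency guideline from the Management Service host is to time TCP connections to the repository database listener. The sketch below is illustrative only: the function name `tcp_round_trip_ms` is hypothetical, and a TCP connect time only approximates the network round trip because it includes some connection setup overhead.

```python
import socket
import time


def tcp_round_trip_ms(host: str, port: int, samples: int = 5, timeout: float = 2.0) -> float:
    """Return the middle value of several TCP connect times, in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # Opening and immediately closing a connection costs about one round trip.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append((time.perf_counter() - start) * 1000.0)
    timings.sort()
    return timings[len(timings) // 2]


# Example call (host name is a placeholder; 1521 is the usual listener port):
# latency = tcp_round_trip_ms("repo-db.example.com", 1521)
# print(f"{latency:.2f} ms", "OK" if latency < 1.0 else "above the 1 ms guideline")
```

Run the check from each OMS host at several times of day, since a single quiet-period measurement can hide congestion.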
Software Configurations
The following sections provide information about the different deployment configurations.
Note:
The job_queue_processes parameter is set to 50 by default for all deployment types: small, medium, large, and extra large.
Small Configuration
The Small configuration is based on the minimum requirements of the Oracle Enterprise Manager installer.
Minimum OMS Settings
WebLogic container: 2 GB
API Gateway: 384 MB
Zero Downtime WebLogic container: 768 MB
Minimum Database Settings
Table 15-4 lists the minimum recommended database settings.
Table 15-4 Small Site Minimum Database Settings
Parameter | Minimum Value |
---|---|
processes | 300 |
pga_aggregate_target | 2 GB |
sga_target or sga_min_size | 3 GB |
redo log file size | 400 MB |
Medium Configuration
The Medium configuration modifies several out-of-box Oracle Enterprise Manager settings.
Minimum OMS Settings
WebLogic container: 3 GB
API Gateway: 384 MB
Zero Downtime WebLogic container: 768 MB
Minimum Repository Database Settings
Table 15-5 lists the minimum repository database settings that are recommended for a Medium configuration.
Table 15-5 Medium Site Minimum Database Settings
Parameter | Minimum Value |
---|---|
processes | 600 |
pga_aggregate_target | 3 GB |
sga_target or sga_min_size | 6 GB |
redo log file size | 600 MB |
Large Configuration
The Large configuration modifies several out-of-box Oracle Enterprise Manager settings.
Minimum OMS Settings
Table 15-6 lists the minimum OMS settings that are recommended for Large configurations.
Table 15-6 Large Site Minimum OMS Settings
OMS Count: 2 |
---|
|
OMS Count: 4 |
---|
|
Minimum Repository Database Settings
Table 15-7 lists the minimum repository database settings that are recommended for a Large configuration.
Table 15-7 Large Site Minimum Database Settings
Parameter | Minimum Value |
---|---|
processes | 1000 |
pga_aggregate_target | 4000 MB |
sga_target or sga_min_size | 10 GB |
redo log file size | 1500 MB |
Extra Large Configuration
The Extra Large configuration requires the following settings:
Minimum OMS Settings
Table 15-8 lists the minimum OMS settings that are recommended for Extra Large configurations.
Table 15-8 Extra Large Site Minimum OMS Settings
OMS Count:4 |
---|
|
Minimum Repository Database Settings
Table 15-9 lists the minimum repository database settings that are recommended for an Extra Large configuration.
Table 15-9 Extra Large Site Minimum Database Settings
Parameter | Minimum Value |
---|---|
processes | 2000 |
pga_aggregate_target | 8 GB |
sga_target or sga_min_size | 20 GB |
redo log file size | 2 GB |
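The per-size database minimums in Tables 15-4, 15-5, 15-7, and 15-9 lend themselves to a simple comparison against the values currently in effect. This is a sketch under stated assumptions: the dictionary layout and the function name `below_minimum` are hypothetical, sizes are normalised to MB using 1 GB = 1024 MB, and the current values are supplied by hand rather than read from a live database.

```python
# Minimum repository database settings transcribed from Tables 15-4, 15-5, 15-7, and 15-9.
# "processes" is a count; the other three values are in MB.
# "sga_target" stands for the "sga_target or sga_min_size" row.
MIN_DB_SETTINGS = {
    "Small": {"processes": 300, "pga_aggregate_target": 2 * 1024,
              "sga_target": 3 * 1024, "redo_log_file_size": 400},
    "Medium": {"processes": 600, "pga_aggregate_target": 3 * 1024,
               "sga_target": 6 * 1024, "redo_log_file_size": 600},
    "Large": {"processes": 1000, "pga_aggregate_target": 4000,
              "sga_target": 10 * 1024, "redo_log_file_size": 1500},
    "Extra Large": {"processes": 2000, "pga_aggregate_target": 8 * 1024,
                    "sga_target": 20 * 1024, "redo_log_file_size": 2 * 1024},
}


def below_minimum(size: str, current: dict) -> dict:
    """Return {parameter: (current value, required minimum)} for every shortfall."""
    shortfalls = {}
    for name, required in MIN_DB_SETTINGS[size].items():
        value = current.get(name, 0)  # a missing parameter counts as a shortfall
        if value < required:
            shortfalls[name] = (value, required)
    return shortfalls


current_values = {"processes": 600, "pga_aggregate_target": 2048,
                  "sga_target": 6144, "redo_log_file_size": 600}
print(below_minimum("Medium", current_values))
```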
Note:
It is recommended to set RAC services for ping alerts, jobs, rollup, events, and Config Metric Post Load Callbacks. For more information, see Step 4: Eliminating Bottlenecks Through Tuning.
Additional Configurations
Some Enterprise Manager installations may need additional tuning settings based on larger individual system loads.
Additional settings are listed below:
Large Concurrent UI Load
If more than 50 concurrent users are expected per OMS, the following settings should be altered as seen in Table 15-10.
Table 15-10 Large Concurrent UI Load Additional Settings
Process | Parameter | Value | Where To Set |
---|---|---|---|
WebLogic container | Heap Size | Additional 4 GB for every increment of 50 users | Per WebLogic container |
Database | sga_target | Additional 2 GB for every increment of 50 users | Per instance |
Higher user loads will require more hardware capacity. Add 2 cores to each database host and each OMS host for every additional 50 concurrent users.
Example: A site with 1500 agents and 15,000 targets with 150 concurrent users would require at a minimum the setting modifications listed in Table 15-11 (based on a LARGE 2 OMS configuration).
Table 15-11 Large Concurrent UI Load Additional Settings Example for 2 OMS Configurations
Process | Parameter | Value | Calculation |
---|---|---|---|
WebLogic container | Heap Size | 11 GB (set on each OMS) | 7 GB (standard large setting) + ((150 users - 50 default large user load) / 2 OMS) * (4 GB / 50 users) |
Database | sga_target | 12 GB | 10 GB (standard large setting) + (150 users - 50 default large user load) * (1 GB / 50 users) |
Minimum Additional Hardware required is listed in Table 15-12.
Table 15-12 Large Concurrent UI Load Minimum Additional Hardware Example For 2 OMS Configuration
Tier | Parameter | Value | Calculation |
---|---|---|---|
OMS | CPU cores | 32 (total between all OMS hosts) | 12 cores * 2 OMS (default large core count) + (150 users - 50 default large user load) * (2 cores * 2 OMS / 50 users) |
Database | CPU cores | 32 (total between all Database hosts) | 12 cores * 2 database hosts (default large core count) + (150 users - 50 default large user load) * (2 cores * 2 database hosts / 50 users) |
The physical memory of each machine would have to be increased to support running this configuration as well.
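The arithmetic behind Tables 15-11 and 15-12 can be reproduced for other user counts. The sketch below simply re-implements the calculations shown in those tables; the function names are hypothetical. The SGA rate is passed explicitly because the worked calculation in Table 15-11 uses 1 GB per 50 users while Table 15-10 lists 2 GB, so confirm which rate applies to your site.

```python
BASE_USERS = 50  # default large user load assumed by the example tables


def heap_per_oms_gb(users: int, oms_count: int, base_heap_gb: float = 7.0) -> float:
    """Heap per OMS: base + (extra users shared across OMS) * 4 GB per 50 users."""
    extra_users = max(0, users - BASE_USERS)
    return base_heap_gb + extra_users * 4.0 / (50 * oms_count)


def sga_target_gb(users: int, gb_per_50_users: float, base_sga_gb: float = 10.0) -> float:
    """SGA target: base + extra users * the chosen rate per 50 users."""
    extra_users = max(0, users - BASE_USERS)
    return base_sga_gb + extra_users * gb_per_50_users / 50


def total_cores(users: int, hosts: int, cores_per_host: int = 12) -> float:
    """Total cores for a tier: base cores + 2 cores per host per 50 extra users."""
    extra_users = max(0, users - BASE_USERS)
    return cores_per_host * hosts + extra_users * (2 * hosts) / 50


print(heap_per_oms_gb(150, oms_count=2))          # 11.0, as in Table 15-11
print(sga_target_gb(150, gb_per_50_users=1.0))    # 12.0, as in Table 15-11
print(total_cores(150, hosts=2))                  # 32.0, as in Table 15-12
```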
The memory parameter values can be changed using the following command:
emctl set property -name <property name> -value <property value>
For example: emctl set property -name OMS_HEAP_MAX -value 2000M
Large Job System Load
If the jobs system has a backlog for long periods of time or if you would like the backlog processed faster, set the following parameters with the emctl set property command.
Table 15-13 Large Job System Backlog Settings
Parameter | Value |
---|---|
oracle.sysman.core.jobs.shortPoolSize | 50 |
oracle.sysman.core.jobs.longPoolSize | 24 |
oracle.sysman.core.jobs.longSystemPoolSize | 20 |
oracle.sysman.core.jobs.systemPoolSize | 50 |
oracle.sysman.core.conn.maxConnForJobWorkers | 144* |
Note:
*This setting may require increasing the processes setting in the database by 144 * (number of OMS servers).
These settings assume that there are sufficient database resources available to support more load.
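To avoid typing each property by hand, the values in Table 15-13 can be turned into emctl command lines. This is a minimal sketch that only builds and prints the commands, using the `emctl set property -name ... -value ...` form shown in this chapter; the helper names are hypothetical, and the processes calculation reads the note above as 144 additional connections per OMS.

```python
# Property values transcribed from Table 15-13.
JOB_BACKLOG_SETTINGS = {
    "oracle.sysman.core.jobs.shortPoolSize": 50,
    "oracle.sysman.core.jobs.longPoolSize": 24,
    "oracle.sysman.core.jobs.longSystemPoolSize": 20,
    "oracle.sysman.core.jobs.systemPoolSize": 50,
    "oracle.sysman.core.conn.maxConnForJobWorkers": 144,
}


def emctl_commands(settings: dict) -> list:
    """Build one 'emctl set property' command line per property (nothing is executed)."""
    return [f"emctl set property -name {name} -value {value}"
            for name, value in settings.items()]


def extra_db_processes(oms_count: int, conns_per_oms: int = 144) -> int:
    """Estimate the extra database processes needed: 144 per OMS (assumed reading)."""
    return conns_per_oms * oms_count


for command in emctl_commands(JOB_BACKLOG_SETTINGS):
    print(command)
print("Additional database processes for 2 OMS:", extra_db_processes(2))
```

Review the printed commands before running them on each OMS.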
Changing OMS Properties
The following section provides examples of changing the OMS settings recommended in this chapter. You may need to change OMS property settings, for example, when increasing the Job Backlog. The values in the examples should be substituted with the appropriate value for your configuration. Use the following instructions to change OMS properties.
API Gateway:
- Set the Heap Size
emctl set webtier_property -name JVM_MEMORY_OPTIONS -value <new value, e.g. -Xmx1024M>
- Stop API Gateway
emctl stop oms -webtier_only
- Start API Gateway
emctl start oms -webtier_only
- Verify the Heap Size
emctl get webtier_property -name JVM_MEMORY_OPTIONS
WebLogic Container:
- OMS_HEAP_MIN
- OMS_HEAP_MAX
- OMS_PERMGEN_MIN
- OMS_PERMGEN_MAX
Zero Downtime WebLogic Container:
- EMEXT_ZDT_HEAP_MIN
- EMEXT_ZDT_HEAP_MAX
The following table describes the above parameters, with default values, recommendations for their use, and any notes, warnings, or issues of which to be aware.
Name | Description | Default | Recommendation | Notes, Warnings or Issues |
---|---|---|---|---|
OMS_HEAP_MIN (-Xms) | Change of -Xms is not really required. Should maintain post-installation default value. If a large setup becomes a 'very large setup' over a period of time, then user/sysadmin may choose to increase the value at the time of increasing the value of -Xmx. | Small: 256M; Medium: 256M; Large: 256M; Extra Large: 256M. For IBM JVM, irrespective of the app size, use the following settings: 1740M | Same as mentioned in the Default section. These are post-installation defaults, thus the recommended setup. | N/A |
OMS_HEAP_MAX (-Xmx) | As targets are added after the initial installation/setup of Enterprise Manager, increasing the HEAP size is recommended to avoid any unforeseen Out Of Memory Error of Tenured/Old Gen. | Small: 2G; Medium: 3G; Large: 3584M; Extra Large: 8G. For IBM JVM, irrespective of the app size, there are no limits on the heap size. | Same as mentioned in the Default section. These are post-installation defaults, thus the recommended setup. | All these parameters should be changed once users experience a lower throughput over a period of time, due to consistently high memory usage. The person (preferably sysadmin) manipulating the parameters must be aware of the limits/warnings. |
OMS_PERMGEN_MIN (-XX:PermSize) | Change of -XX:PermSize is not required. Should maintain post-installation default value. | Small: 128M; Medium: 128M; Large: 128M; Extra Large: 128M. For IBM JVM, irrespective of the app size, use the following settings: 128M | Same as mentioned in the Default section. These are post-installation defaults, thus the recommended setup. | N/A |
OMS_PERMGEN_MAX (-XX:MaxPermSize) | In Large configurations, where too many activities in the OMS container result in a large number of classloaders and 'Class' objects being created, the perm gen may become full, resulting in an Out Of Memory Error. | Small: 768M; Medium: 768M; Large: 768M; Extra Large: 768M. For IBM JVM, irrespective of the app size, use the following settings: 612M | Same as mentioned in the Default section. These are post-installation defaults, thus the recommended setup. | N/A |
EMEXT_ZDT_HEAP_MIN (-Xms) | Change of -Xms is not really required. Should maintain post-installation default value. If a large setup becomes a 'very large setup' over a period of time, then user/sysadmin may choose to increase the value at the time of increasing the value of -Xmx. | Small: 256M; Medium: 256M; Large: 256M; Extra Large: 256M. For IBM JVM, irrespective of the app size, use the following settings: 1740M | Same as mentioned in the Default section. These are post-installation defaults, thus the recommended setup. | N/A |
EMEXT_ZDT_HEAP_MAX (-Xmx) | As targets are added after the initial installation/setup of Enterprise Manager, increasing the HEAP size is recommended to avoid any unforeseen Out Of Memory Error of Tenured/Old Gen. | Small: 768M; Medium: 768M; Large: 1G; Extra Large: 2G. For IBM JVM, irrespective of the app size, there are no limits on the heap size. | Same as mentioned in the Default section. These are post-installation defaults, thus the recommended setup. | All these parameters should be changed once users experience a lower throughput over a period of time, due to consistently high memory usage. The person (preferably sysadmin) manipulating the parameters must be aware of the limits/warnings. |
You can use the following command to set the value for any of the above properties:
emctl set property -name <property_name> -value <number_followed_by_G_or_M>
For example:
emctl set property -name OMS_PERMGEN_MAX -value 1024M
Use the following command to get the current value of a property:
emctl get property -name <property_name>
An OMS restart using the following commands is required on each OMS after changing the property value:
emctl stop oms -all
emctl start oms
To change the OMS property, oracle.sysman.core.jobs.shortPoolSize, follow these recommendations:
To set the property, enter the following command:
$ emctl set property -name oracle.sysman.core.jobs.shortPoolSize -value 200
To get the property (after changing from the default), enter the following command:
$ emctl get property -name "oracle.sysman.core.jobs.shortPoolSize"
To delete the property (revert to original setting), enter the following command:
$ emctl delete property -name "oracle.sysman.core.jobs.shortPoolSize"
The default value is 25.
Note:
Starting from Enterprise Manager 13.5 Release Update 2, the shortPoolSize parameter update is hot deployable on the OMS and does not require an OMS restart.
To change the OMS property, oracle.sysman.core.jobs.longPoolSize, follow these recommendations:
To set the property, enter the following command:
$ emctl set property -name oracle.sysman.core.jobs.longPoolSize -value 200
To get the property (after changing from the default), enter the following command:
$ emctl get property -name "oracle.sysman.core.jobs.longPoolSize"
To delete the property (revert to original setting), enter the following command:
$ emctl delete property -name "oracle.sysman.core.jobs.longPoolSize"
The default value is 12.
Note:
Starting from Enterprise Manager 13.5 Release Update 2, the longPoolSize parameter update is hot deployable on the OMS and does not require an OMS restart.
To change the OMS property, oracle.sysman.core.jobs.longSystemPoolSize, follow these recommendations:
To set the property, enter the following command:
$ emctl set property -name oracle.sysman.core.jobs.longSystemPoolSize -value 200
To get the property (after changing from the default), enter the following command:
$ emctl get property -name "oracle.sysman.core.jobs.longSystemPoolSize"
To delete the property (revert to original setting), enter the following command:
$ emctl delete property -name "oracle.sysman.core.jobs.longSystemPoolSize"
The default value is 10.
Note:
Starting from Enterprise Manager 13.5 Release Update 2, the longSystemPoolSize parameter update is hot deployable on the OMS and does not require an OMS restart.
To change the OMS property, oracle.sysman.core.jobs.systemPoolSize, follow these recommendations:
To set the property, enter the following command:
$ emctl set property -name oracle.sysman.core.jobs.systemPoolSize -value 200
To get the property (after changing from the default), enter the following command:
$ emctl get property -name "oracle.sysman.core.jobs.systemPoolSize"
To delete the property (revert to original setting), enter the following command:
$ emctl delete property -name "oracle.sysman.core.jobs.systemPoolSize"
The default value is 25.
Note:
Starting from Enterprise Manager 13.5 Release Update 2, the systemPoolSize parameter update is hot deployable on the OMS and does not require an OMS restart.
To change the OMS property, oracle.sysman.core.conn.maxConnForJobWorkers, follow these recommendations:
To set the property, enter the following command:
$ emctl set property -name oracle.sysman.core.conn.maxConnForJobWorkers -value 200
To get the property (after changing from the default), enter the following command:
$ emctl get property -name "oracle.sysman.core.conn.maxConnForJobWorkers"
To delete the property (revert to original setting), enter the following command:
$ emctl delete property -name "oracle.sysman.core.conn.maxConnForJobWorkers"
An OMS restart using 'emctl stop oms; emctl start oms' is required on each OMS after changing the property value. The default value is 25.
Changing omsAgentComm.ping.heartbeatPingRecorderThreads
To change the OMS property, oracle.sysman.core.omsAgentComm.ping.heartbeatPingRecorderThreads, follow these recommendations:
To set the property, enter the following command:
emctl set property -name oracle.sysman.core.omsAgentComm.ping.heartbeatPingRecorderThreads -value 5
To get the property (after changing from the default), enter the following command:
emctl get property -name oracle.sysman.core.omsAgentComm.ping.heartbeatPingRecorderThreads
To delete the property (revert to original setting), enter the following command:
emctl delete property -name oracle.sysman.core.omsAgentComm.ping.heartbeatPingRecorderThreads
An OMS restart using 'emctl stop oms; emctl start oms' is required on each OMS after changing the property value.
Modifying Database Settings
If you have downloaded the Database Templates for a Preconfigured Repository, you can run the appropriate SQL script to adjust the database parameters to the recommended settings. The scripts that you should run are listed in the following table:
Table 15-14 Scripts for Deployment Sizes for DB 19c
Size | Script |
---|---|
Small |
|
Medium |
|
Large |
|
Extra Large |
|
Note:
The above scripts do not adjust SGA_TARGET or PGA_AGGREGATE_TARGET; therefore, these parameters must be modified manually.
Enterprise Manager Performance Methodology
An accurate predictor of capacity at scale is the actual metric trend information from each individual Enterprise Manager deployment. This information, combined with an established, rough, starting host system size and iterative tuning and maintenance, produces the most effective means of predicting capacity for your Enterprise Manager deployment. It also assists in keeping your deployment performing at an optimal level.
Here are the steps to follow to enact the Enterprise Manager sizing methodology:
1. If you have not already installed Enterprise Manager, choose a rough starting host configuration as listed in Table 15-1.
2. Periodically evaluate your site's vital signs (detailed later).
3. Eliminate bottlenecks using routine DBA/Enterprise Manager administration housekeeping.
4. Eliminate bottlenecks using tuning.
5. Extrapolate linearly into the future to plan for future sizing requirements.
Step one need only be done once for a given deployment. Steps two, three, and four must be done, regardless of whether you plan to grow your Enterprise Manager site, for the life of the deployment on a regular basis. These steps are essential to an efficient Enterprise Manager site regardless of its size or workload. You must complete steps two, three, and four before you continue on to step five. This is critical. Step five is only required if you intend to grow the deployment size in terms of monitored targets. However, evaluating these trends regularly can be helpful in evaluating any other changes to the deployment.
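The linear extrapolation in step five can be done with an ordinary least-squares line through past measurements of a vital sign. The sketch below is illustrative: the function names are hypothetical, and the weekly repository sizes are invented sample data rather than measurements from a real site.

```python
def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(xs, ys, future_x):
    """Project the fitted line forward to future_x."""
    slope, intercept = linear_fit(xs, ys)
    return slope * future_x + intercept


# Hypothetical data: repository size in GB sampled at weeks 0 to 3.
weeks = [0, 1, 2, 3]
repo_size_gb = [50, 52, 54, 56]
print(f"Projected size at week 10: {forecast(weeks, repo_size_gb, 10):.1f} GB")
```

A straight line is only meaningful once housekeeping and tuning are up to date, which is why steps two through four must come first.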
Step 1: Choosing a Starting Platform Enterprise Manager Deployment
For information about choosing a starting platform Enterprise Manager deployment, see Overview of Sizing Guidelines.
Step 2: Periodically Evaluating the Vital Signs of Your Site
This is the most important step of the five. Without some degree of monitoring and understanding of trends or dramatic changes in the vital signs of your Enterprise Manager site, you are placing site performance at serious risk. Every monitored target sends data to the Management Repository for loading and aggregation through its associated Management Agent. This adds up to a considerable volume of activity that requires the same level of management and maintenance as any other enterprise application.
Enterprise Manager has "vital signs" that reflect its health. These vital signs should be monitored for trends over time as well as against established baseline thresholds. You must establish realistic baselines for the vital signs when performance is acceptable. Once baselines are established, you can use built-in Oracle Enterprise Manager functionality to set baseline warning and critical thresholds. This allows you to be notified automatically when something significant changes on your Enterprise Manager site. The following table is a point-in-time snapshot of the Enterprise Manager vital signs for two sites:
Module | Metrics | EM Site 1 | EM Site 2 |
---|---|---|---|
Site | - | emsite1 | emsite2 |
Target Counts | Database Targets | 192 (45 not up) | 1218 (634 not up) |
- | Host Targets | 833 (12 not up) | 1042 (236 not up) |
- | Total Targets | 2580 (306 not up) | 12293 (6668 not up) |
Overall Status | Overall Backoff Requests in the Last 10 Mins | 0 | 500 |
Job Statistics | Estimated time for clearing current Job steps backlog | 0.1 | 7804 |
Event Statistics | Pending Events Count | 2 | 4000 |
Management Service Host Statistics | Average % CPU (Host 1) | 9 (emhost01) | 13 (emhost01) |
- | Average % CPU (Host 2) | 6 (emhost02) | 17 (emhost02) |
- | Average % CPU (Host 3) | N/A | 38 (em6003) |
- | Average % CPU (Host 4) | N/A | 12 (em6004) |
- | Number of cores per host | 2 X 2.8 (Xeon) | 4 X 2.4 (Xeon) |
- | Memory per Host (GB) | 8 | 8 |
Management Repository Host Statistics | Average % CPU (Host 1) | 12 (db01rac) | 64 (em6001rac) |
- | Average % CPU (Host 2) | 14 (db02rac) | 78 (em6002rac) |
- | Number of CPU cores per host | 4 | 8 |
- | Memory target (GB) | 5.25 | 7.5 |
- | Memory per Host (GB) | 8 | 16 |
- | Total Management Repository Size (GB) | 56 | 98 |
- | Oracle RAC Interconnect Traffic (MB/s) | 1 | 4 |
- | Management Server Traffic (MB/s) | 4 | 4 |
- | Total Management Repository I/O (MB/s) | 6 | 27 |
Enterprise Manager UI Page Response/Sec | Home Page | 3 | 6 |
- | All Host Page | 3 | 30+ |
- | All Database Page | 6 | 30+ |
- | Database Home Page | 2 | 2 |
- | Host Home Page | 2 | 2 |
The two Enterprise Manager sites are at the opposite ends of the scale for performance.
EM Site 1 is performing very well with very few backoff requests. It also has very low job and event backlogs. The CPU utilization on both the OMS and Management Repository Server hosts is low. Most importantly, the UI page response times are excellent. To summarize, Site 1 is doing substantial work with minimal effort. This is how a well-configured, tuned, and maintained Oracle Enterprise Manager site should look.
Conversely, EM Site 2 is having difficulty. The site has substantial amounts of backoffs and sizable job and event backlogs. Worst of all are the user interface page response times. There is clearly a bottleneck on Site 2, possibly more than one.
These vital signs are all available from within the Enterprise Manager interface. Most values can be found on the All Metrics page for each host, or the All Metrics page for the OMS. Keeping an eye on the trends over time for these vital signs, in addition to assigning thresholds for warning and critical alerts, allows you to maintain good performance and anticipate future resource needs. You should plan to monitor these vital signs as follows:
-
Take a baseline measurement of the vital sign values seen in the previous table when the Enterprise Manager site is running well.
-
Set reasonable thresholds and notifications based on these baseline values so you can be notified automatically if they deviate substantially. This may require some iteration to fine-tune the thresholds for your site. Receiving too many notifications is not useful.
-
On a daily (or weekly at a minimum) basis, watch for trends in the 7-day graphs for these values. This will not only help you spot impending trouble, but it will also allow you to plan for future resource needs.
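One simple way to turn baseline samples into warning and critical thresholds is to place them a fixed number of standard deviations above the baseline mean. This is only an illustrative rule of thumb, not the method Enterprise Manager's built-in baseline feature uses; the function name, the multipliers, and the sample values are all assumptions.

```python
import statistics


def baseline_thresholds(samples, warning_sigmas=2.0, critical_sigmas=3.0):
    """Return (warning, critical) thresholds derived from baseline samples.

    Assumption: thresholds sit a fixed number of sample standard deviations
    above the mean; tune the multipliers until notifications are useful.
    """
    mean = statistics.mean(samples)
    spread = statistics.stdev(samples)
    return mean + warning_sigmas * spread, mean + critical_sigmas * spread


# Hypothetical baseline: average % CPU on an OMS host while the site runs well.
cpu_baseline = [10, 12, 11, 13, 14]
warning, critical = baseline_thresholds(cpu_baseline)
print(f"warning above {warning:.1f}%, critical above {critical:.1f}%")
```

If the resulting thresholds fire too often, widen the multipliers or collect the baseline over a longer period.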
Another crucial set of vital signs to monitor in the Enterprise Manager console is found on the self-monitoring Managing the Manager Repository pages, which provide visibility into the inflow of metrics and events. Fine-tuning incoming metric and event data is crucial for maintaining overall Enterprise Manager health and performance.
The next step provides guidance on what to do when the vital sign values are not within established baseline thresholds even though the inflow trend of metric and event data in the self-monitoring pages does not show any abnormality. It also explains how to maintain your site's performance through routine housekeeping.
Step 3: Using DBA and Enterprise Manager Tasks To Eliminate Bottlenecks
It is critical to note that routine housekeeping helps keep your Enterprise Manager site running well. The following are lists of housekeeping tasks and the interval on which they should be done.
Offline Monthly Tasks
Enterprise Manager Administrators should monitor the database built-in Segment Advisor for recommendations on Enterprise Manager Repository segment health. The Segment Advisor advises administrators which segments need to be rebuilt/reorganized and provides the commands to do so.
For more information about Segment Advisor and issues related to system health, refer to notes 242736.1 and 314112.1 in the My Oracle Support Knowledge Base.
Step 4: Eliminating Bottlenecks Through Tuning
The most common causes of performance bottlenecks in the Enterprise Manager application are listed below (in order of most to least common):
-
Housekeeping that is not being done (far and away the biggest source of performance problems)
-
Hardware or software that is incorrectly configured
-
Hardware resource exhaustion
When the vital signs are routinely outside of an established threshold, or are trending that way over time, you must address two areas. First, you must ensure that all previously listed housekeeping is up to date. Secondly, you must address resource utilization of the Enterprise Manager application. The vital signs listed in the previous table reflect key points of resource utilization and throughput in Enterprise Manager. The following sections cover some of the key vital signs along with possible options for dealing with vital signs that have crossed thresholds established from baseline values.
High CPU Utilization
When you are asked to evaluate a site for performance and notice high CPU utilization, there are a few common steps you should follow to determine what resources are being used and where.
-
Use the Processes display on the Enterprise Manager Host home page to determine which processes are consuming the most CPU on any Management Service or Management Repository host that has crossed a CPU threshold.
-
Once you have established that Enterprise Manager is consuming the most CPU, use Enterprise Manager to identify what activity is the highest CPU consumer. Typically this manifests itself on a Management Repository host where most of the Management Service's work is performed. Here are a few typical spots to investigate when the Management Repository appears to be using too many resources.
-
Check out Top Wait Events metrics for the Enterprise Manager Repository.
-
Click the CPU Used database resource listed on the Management Repository's Database Performance page to examine the SQL that is using the most CPU at the Management Repository.
-
Check the Database Locks on the Management Repository's Database Performance page looking for any contention issues.
-
Check the SQL Monitoring on the Management Repository's Database for any resource intensive SQL.
-
High CPU utilization is probably the most common symptom of any performance bottleneck. Typically, the Management Repository is the biggest consumer of CPU, which is where you should focus. A properly configured and maintained Management Repository host system that is not otherwise hardware resource constrained should average roughly 40 percent or less total CPU utilization. An OMS host system should average roughly 20 percent or less total CPU utilization. These relatively low average values should allow sufficient headroom for spikes in activity. Allowing for activity spikes helps keep your page performance more consistent over time. If your Enterprise Manager site interface pages happen to be responding well (approximately 3 seconds) while there are no significant backlogs, and it is using more CPU than recommended, you may not have to address it unless you are concerned it is part of a larger upward trend.
The recommended path for tracking down the root cause of high Management Repository CPU utilization is captured in the checks listed above: Top Wait Events, CPU Used, Database Locks, and SQL Monitoring. CPU should always be the topmost wait event. Ideally, the Log File Sync wait event, which indicates slow I/O performance, should not appear in the top five waits. To identify the root cause, start at the Management Repository Performance page and work your way down to the SQL that is consuming the most CPU in its processing. Correlate your findings with the AWR report. This approach has been used very successfully on several real world sites.
If you are running Enterprise Manager on Intel based hosts, the Enterprise Manager Management Service and Management Repository will both benefit from Hyper-Threading (HT) being enabled on the host or hosts on which they are deployed. HT is a function of certain late models of Intel processors, which allows the execution of some amount of CPU instructions in parallel. This gives the appearance of double the number of CPUs physically available on the system. Testing has proven that HT provides approximately 1.5 times the CPU processing power as the same system without HT enabled. This can significantly improve system performance. The Management Service and Management Repository both frequently have more than one process executing simultaneously, so they can benefit greatly from HT.
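The CPU guidance in this section (roughly 40 percent average for a Management Repository host and roughly 20 percent for an OMS host) can be checked against collected samples. The following is a minimal sketch, not part of Enterprise Manager; the function name, the role labels, and the sample values are hypothetical.

```python
# Average-CPU guideline values taken from the text above (percent).
CPU_GUIDELINE_PCT = {"repository": 40.0, "oms": 20.0}


def cpu_headroom(role, samples):
    """Return (average CPU %, True if the average is within the guideline).

    role    -- "repository" or "oms" (hypothetical labels for this sketch)
    samples -- CPU utilization percentages collected over time
    """
    if not samples:
        raise ValueError("at least one CPU sample is required")
    average = sum(samples) / len(samples)
    return average, average <= CPU_GUIDELINE_PCT[role]


if __name__ == "__main__":
    avg, ok = cpu_headroom("repository", [35, 42, 38, 55])
    print(f"repository average CPU {avg:.1f}% within guideline: {ok}")
```

An average above the guideline is not automatically a problem. As noted above, it may be acceptable when pages respond well and there are no significant backlogs.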
Loader Vital Signs
The vital signs for the loader indicate exactly how much data is continuously coming into the system from all the Enterprise Manager Agents. The most important item here is the "Number of Agents Sent Back in the Last Hour" metric. The metric can be found in the All Metrics page of each management service. This is the number of agents instructed to defer loading of data in the last hour. Ideally no agent should be instructed to defer loading, but some level of deferred loading is normal. If this value is above 2 percent of your deployed agent count and it is growing continuously, then action should be taken.
Ensure that back-off requests are spread uniformly across OMS instances in a multi-OMS environment. If the back-off requests pertain to a specific OMS and do not show a uniform trend across OMS instances, verify that the load-balancing algorithm set at the Server Load Balancer is round-robin. Add loader threads only if there are consistently hundreds of back-offs an hour on important channels and there are sufficient free resources on the database.
The number of Loader Threads is set to 20 per OMS by default. Adding loader threads to an OMS increases the overall host CPU utilization. Customers can change this value as their site requires.
There are diminishing returns when adding loader threads if your repository does not have sufficient resources available. If you have available repository resources, as you add loader threads, you should see the "Number of Agents Sent Back in the Last Hour" metric decrease. If you are not seeing improvement, you should explore other tuning or housekeeping opportunities.
To add more loader threads, you can change the following configuration parameter:
oracle.sysman.core.gcloader.max_recv_thread
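The loader rule of thumb described above (act when the metric exceeds 2 percent of the deployed agent count and is growing continuously) can be expressed as a small check. This is a sketch only; the function name is hypothetical, and it interprets "growing continuously" as every hourly sample being higher than the previous one, which is an assumption.

```python
def loader_backoff_needs_action(sent_back_per_hour, deployed_agents,
                                threshold_pct=2.0):
    """Apply the 2 percent rule to hourly back-off samples.

    sent_back_per_hour -- hourly values of the "Number of Agents Sent Back
                          in the Last Hour" metric, oldest first
    deployed_agents    -- total number of deployed agents
    """
    if deployed_agents <= 0 or len(sent_back_per_hour) < 2:
        return False
    limit = deployed_agents * threshold_pct / 100.0
    latest = sent_back_per_hour[-1]
    # Assumption: "growing continuously" means strictly increasing samples.
    growing = all(later > earlier for earlier, later
                  in zip(sent_back_per_hour, sent_back_per_hour[1:]))
    return latest > limit and growing


if __name__ == "__main__":
    # 1,000 agents gives a limit of 20 agents sent back per hour.
    print(loader_backoff_needs_action([10, 25, 40], 1000))
```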
Rollup Vital Signs
The rollup process is the aggregation mechanism for Enterprise Manager. The two vital signs for the rollup are the rows/second and % of hour run. Due to the large volume of data rows processed by the rollup, it tends to be the largest consumer of Management Repository buffer cache space. Because of this, the rollup vital signs can be great indicators of the benefit of increasing buffer cache size.
Rollup rows/second shows exactly how many rows are being processed, or aggregated and stored, every second. This value is usually around 2,000 (+/- 500) rows per second on a site with a decent size buffer cache and reasonably speedy I/O. A downward trend over time for this value may indicate a future problem, but as long as % of hour run is under 100 your site is probably fine.
If rollup % of hour run is trending up (or is higher than your baseline), and you have not yet set the Management Repository buffer cache to its maximum, it may be advantageous to increase the buffer cache setting. Usually, if there is going to be a benefit from increasing buffer cache, you will see an overall improvement in resource utilization and throughput on the Management Repository host. The loader statistics will appear a little better. CPU utilization on the host will be reduced and I/O will decrease. The most telling improvement will be in the rollup statistics. There should be a noticeable improvement in both rollup rows/second and % of hour run. If you do not see any improvement in any of these vital signs, you can revert the buffer cache to its previous size. The old Buffer Cache Hit Ratio metric can be misleading. It has been observed in testing that Buffer Cache Hit Ratio will appear high when the buffer cache is significantly undersized and Enterprise Manager performance is struggling because of it. There will be times when increasing buffer cache will not help improve performance for Enterprise Manager. This is typically due to resource constraints or contention elsewhere in the application. Consider using the steps listed in the High CPU Utilization section to identify the point of contention. Enterprise Manager also provides advice on buffer cache sizing from the database itself. This is available on the database Memory Parameters page.
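The rollup guidance above can be summarized as a simple evaluation of the two vital signs. The sketch below is illustrative only; the function name and message wording are hypothetical, and the 1,500 rows/second floor is derived from the "2,000 (+/- 500)" figure quoted above.

```python
def assess_rollup(rows_per_second, pct_of_hour_run, baseline_pct_of_hour_run):
    """Return a list of findings for the two rollup vital signs."""
    findings = []
    if pct_of_hour_run >= 100:
        # The rollup no longer finishes within the hour.
        findings.append("% of hour run has reached 100")
    elif pct_of_hour_run > baseline_pct_of_hour_run:
        findings.append("% of hour run is above baseline; "
                        "consider increasing the buffer cache")
    if rows_per_second < 1500:
        # Lower bound of the usual 2,000 +/- 500 range.
        findings.append("rows/second is below the usual range")
    return findings


if __name__ == "__main__":
    for finding in assess_rollup(1200, 65, 50):
        print(finding)
```

An empty list means neither vital sign needs attention under these assumptions.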
Rollup Process
If rollup % of hour run is trending up (or is higher than your baseline) and the buffer cache is already set to its optimal size, but there are still many cluster wait events reported in the AWR report, configure the Rollup database service and set affinity to run the Rollup service only on a single-instance RAC node. Ensure that the single-instance RAC node is sized to handle a large I/O volume.
Use the following configuration steps for Rollup database service:
-
Create database service "rollup" and set one of the RAC instances as the primary instance in "-r".
-
srvctl add service -d <dbname> -s rollup -r <primary instance> -a <the other instances> -y automatic
-
srvctl start service -d <dbname> -s rollup
srvctl status service -d <dbname>
-
Execute the following DBMS_SCHEDULER jobs:
-
As sys user, execute DBMS_SCHEDULER.create_job_class( job_class_name => 'ROLLUP', service => 'rollup')
-
GRANT EXECUTE ON sys.ROLLUP TO sysman;
-
As sysman user, execute DBMS_SCHEDULER.SET_ATTRIBUTE ( name => 'EM_ROLLUP_SCHED_JOB', attribute => 'job_class', value => 'ROLLUP')
-
As sysman user, execute GC_SCHED_JOB_REGISTRAR.SET_JOB_CLASS('EM_ROLLUP_SCHED_JOB', 'ROLLUP')
In addition to configuring the Rollup database service, add Rollup worker threads if the database can handle the increased load from these threads. Configure additional rollup worker threads using the Configure option in the Metric Rollup Performance chart, available on the self-monitoring "Managing the Manager" Repository page.
Job, Notification, and Alert Vital Signs
Jobs, notifications, and alerts are indicators of the processing efficiency of the Management Service(s) on your Enterprise Manager site.
Jobs
A growing backlog in Jobs Steps Scheduled at the repository indicates there are not enough resources available at the repository. High Job Dispatcher processing time (%) indicates a repository bottleneck. Low throughput with High Job Dispatcher processing time (%) indicates a processing bottleneck. The Jobs subsystem uses locks to maintain sequence internally, so you can see application lock wait events and transaction locking wait events in the AWR report for the repository. It is normal to observe the Job system consuming 5-8% of waiting time, but if that value crosses 20-30%, it is abnormal and should be triaged. If there are significant amounts of cluster waits for Job SQLs in AWR, you could potentially optimize the Job system by introducing RAC services. Create a database service for Jobs and then set affinity to run on two RAC instances for optimal performance.
Use the following configuration steps to set up the Jobs database service:
-
Create the database service emjob and set two of the RAC instances as primary instances in "-r".
srvctl add service -d <dbname> -s emjob -r <primary instances> -a <the other instances> -y automatic
After creating the database service, you need to start the service using the srvctl start service command.
-
Execute the following DBMS_SCHEDULER jobs:
-
As a sys user, execute DBMS_SCHEDULER.create_job_class( job_class_name => 'EMJOB', service => 'emjob')
-
GRANT EXECUTE ON sys.EMJOB TO sysman;
-
As a sysman user, execute DBMS_SCHEDULER.SET_ATTRIBUTE (name => 'EM_JOBS_STEP_SCHED', attribute => 'job_class', value => 'EMJOB')
-
As a sysman user, execute DBMS_SCHEDULER.SET_ATTRIBUTE (name => 'EM_JOB_PURGE_POLICIES', attribute => 'job_class', value => 'EMJOB')
-
As a sysman user, execute GC_SCHED_JOB_REGISTRAR.SET_JOB_CLASS('EM_JOBS_STEP_SCHED', 'EMJOB')
-
As a sysman user, run GC_SCHED_JOB_REGISTRAR.SET_JOB_CLASS('EM_JOB_PURGE_POLICIES', 'EMJOB')
-
INSERT INTO MGMT_PARAMETERS(parameter_name, parameter_value) VALUES ('EM_jobs_step_sched_job_class', 'EMJOB')
-
-
Set the connect string with the emjob service name to the emctl property
oracle.sysman.core.jobs.conn.service
-
Sample:
emctl set property -name "oracle.sysman.core.jobs.conn.service" -value "\(DESCRIPTION=\(ADDRESS_LIST=\(ADDRESS=\(PROTOCOL=TCP\)\(HOST=xxx.example.com\)\(PORT=1521\)\)\)\(CONNECT_DATA=\(SERVICE_NAME=emjob\)\)\)"
Events and Notifications
If the vital sign has crossed the baseline threshold, look for vital signs in the self-monitoring Managing the Manager pages. Monitor the charts for a consistent, drastic increase in the Metric alerts backlog, Metric Collection errors backlog, and Notification backlog. The key metrics for checking event backlogs are Total Events Pending and Total Events Processed (Last Hour). If Total Events Pending remains high but Total Events Processed (Last Hour) is making good progress, it could be a temporary spike that can be ignored. If there is a consistent increase in both metrics, the Events subsystem will benefit from introducing a database service and setting affinity to run only on a single-instance RAC node.
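The decision described above can be sketched as a comparison of the two event metrics over time. This is an illustration, not Enterprise Manager logic; the function names and return labels are hypothetical, and a "consistent increase" is assumed to mean that each sample is higher than the one before it.

```python
def is_rising(samples):
    """True when there are at least two samples and each exceeds the last."""
    return len(samples) >= 2 and all(
        later > earlier for earlier, later in zip(samples, samples[1:]))


def classify_event_backlog(total_events_pending, total_events_processed):
    """Classify hourly samples (oldest first) of the two event metrics.

    Returns "service" when both metrics rise consistently, which the text
    above treats as a case for a dedicated Events database service, and
    "monitor" otherwise (for example, a temporary spike).
    """
    if is_rising(total_events_pending) and is_rising(total_events_processed):
        return "service"
    return "monitor"


if __name__ == "__main__":
    print(classify_event_backlog([100, 200, 300], [50, 60, 70]))
```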
Use these configuration steps for an Events database service:
-
Create a database service event and set one of the RAC instances as the primary instance in "-r"
srvctl add service -d <dbname> -s event -r <primary instance> -a <the other instances> -y automatic
-
Set the connect string with the event service name to the emctl property oracle.sysman.core.events.connectDescriptor
Sample
emctl set property -name "oracle.sysman.core.events.connectDescriptor" -value "\(DESCRIPTION=\(ADDRESS_LIST=\(ADDRESS=\(PROTOCOL=TCP\)\(HOST=xxx.example.com\)\(PORT=1521\)\)\)\(CONNECT_DATA=\(SERVICE_NAME=event\)\)\)"
Ping Alerts
Ping Alerts performance is crucial for determining target availability. If the vital signs have crossed a baseline threshold and there are many cluster waits in the AWR report, there is a measurable benefit from introducing a database service for Ping and setting affinity to run only on a single-instance RAC node.
Use these configuration steps for defining a Pings database service:
-
Create the database service ping and set one of the RAC instances as the primary instance in "-r"
srvctl add service -d <dbname> -s ping -r <primary instance> -a <the other instances> -y automatic
-
Execute the following DBMS_SCHEDULER jobs
-
As a sys user, execute DBMS_SCHEDULER.create_job_class( job_class_name => 'PING', service => 'ping')
-
GRANT EXECUTE ON sys.PING TO sysman;
-
As a sysman user, execute DBMS_SCHEDULER.SET_ATTRIBUTE ( name => 'EM_PING_MARK_NODE_STATUS', attribute => 'job_class', value => 'PING')
-
As a sysman user, execute DBMS_SCHEDULER.SET_ATTRIBUTE ( name => 'EM_REPOS_SEV_EVAL', attribute => 'job_class', value => 'PING')
-
As a sysman user, execute GC_SCHED_JOB_REGISTRAR.SET_JOB_CLASS('EM_REPOS_SEV_EVAL', 'PING')
-
As a sysman user, execute GC_SCHED_JOB_REGISTRAR.SET_JOB_CLASS('EM_PING_MARK_NODE_STATUS', 'PING')
-
-
Set the connect string with ping service name to emctl property oracle.sysman.core.omsAgentComm.ping.connectionService.connectDescriptor
Sample
emctl set property -name "oracle.sysman.core.omsAgentComm.ping.connectionService.connectDescriptor" -value "\(DESCRIPTION=\(ADDRESS_LIST=\(ADDRESS=\(PROTOCOL=TCP\)\(HOST=xxx.example.com\)\(PORT=1521\)\)\)\(CONNECT_DATA=\(SERVICE_NAME=ping\)\)\)"
Config Metric Post Load Callbacks
-
Agents upload the Config Metric collections to the OMS. The OMS registers the upload in the Enterprise Manager Repository, generates a snapshot, and then feeds the uploaded payload (or data) into a queue, called the Loader-Job Queue, for later processing. This allows the Loader module at the OMS to offload the additional processing required for a Config Metric upload and free up resources to process more uploads from Agents.
-
There is a separate module responsible for pulling the entries out of this queue, and then calling the callbacks responsible to work on the data and then assimilate the data in the repository in order of their insertion into the Loader-Job Queue. This module is referred to as the Config Metric Post Upload Callback Executor (or loader-job) module. This module allows the end user to configure the number of threads that will process the data, number of SQL connections to the Enterprise Manager Repository that these threads have access to, and whether to use a dedicated DB service pinned on one of the RAC nodes which hosts the Enterprise Manager Repository.
The default values of the settings work fine for Small and Medium Enterprise Manager site sizes. You might need to override the default values for Large and Extra Large configurations.
srvctl add service -d <dbname> -s loaderjob -r <primary instance> -a <the other instances> -y automatic
emctl set property -name "oracle.sysman.core.pbs.gcloader.connectDescriptor" -value "\(DESCRIPTION=\(ADDRESS_LIST=\(ADDRESS=\(PROTOCOL=TCP\)\(HOST=xxx.example.com\)\(PORT=1521\)\)\)\(CONNECT_DATA=\(SERVICE_NAME=loaderjob\)\)\)"
Large and Extra Large
emctl set property -name "oracle.sysman.core.pbs.gcloader.numThreads" -value 5
emctl set property -name "oracle.sysman.core.gcloader.loaderjob.maxConnections" -value 5
I/O Vital Signs
Monitoring the I/O throughput of the different channels in your Enterprise Manager deployment is essential to ensuring good performance. At minimum, there are three different I/O channels on which you should have a baseline and alert thresholds defined:
-
Disk I/O from the Management Repository instance to its data files
-
Network I/O between the OMS and Management Repository
-
Oracle RAC interconnect (network) I/O (on Oracle RAC systems only)
You should understand the potential peak and sustained throughput I/O capabilities for each of these channels. Based on these and the baseline values you establish, you can derive reasonable thresholds for warning and critical alerts on them in Enterprise Manager. You will then be notified automatically if you approach these thresholds on your site. Some site administrators can be unaware or mistaken about what these I/O channels can handle on their sites. This can lead to Enterprise Manager saturating these channels, which in turn cripples performance on the site. In such an unfortunate situation, you would see that many vital signs would be impacted negatively.
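One possible way to turn a baseline and a known sustained capacity into warning and critical thresholds is sketched below. The text does not prescribe a formula, so the approach (placing thresholds at fixed fractions of the headroom between baseline and capacity), the default fractions of 0.5 and 0.8, and the function name are all assumptions for illustration.

```python
def derive_io_thresholds(baseline, sustained_capacity,
                         warning_fraction=0.5, critical_fraction=0.8):
    """Return (warning, critical) thresholds for one I/O channel.

    baseline           -- normal observed throughput (for example, MB/s)
    sustained_capacity -- throughput the channel can sustain, same unit
    The fractions are hypothetical defaults, not Oracle recommendations.
    """
    if not 0 <= baseline < sustained_capacity:
        raise ValueError("baseline must be below sustained capacity")
    if not 0 < warning_fraction < critical_fraction <= 1:
        raise ValueError("fractions must satisfy 0 < warning < critical <= 1")
    headroom = sustained_capacity - baseline
    warning = baseline + warning_fraction * headroom
    critical = baseline + critical_fraction * headroom
    return warning, critical


if __name__ == "__main__":
    warn, crit = derive_io_thresholds(baseline=100, sustained_capacity=300)
    print(f"warning at {warn:.0f}, critical at {crit:.0f}")
```

The same calculation can be repeated for each of the three channels listed above, using the unit in which that channel is measured.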
To discover whether the Management Repository is involved, you can use Enterprise Manager to check the Database Performance page. On the Performance page for the Management Repository, click the wait graph showing the largest amount of time spent. From this you can continue to drill down into the actual SQL code or sessions that are waiting. This should help you to understand where the bottleneck is originating.
Another area to check is unexpected I/O load from non-Enterprise Manager sources like backups, another application, or a possible data-mining co-worker who engages in complex SQL queries, multiple Cartesian products, and so on.
Total Repository I/O trouble can be caused by two factors. The first is a lack of regular housekeeping. Some of the Enterprise Manager segments can be very badly fragmented causing a severe I/O drain. Second, there can be some poorly tuned SQL statements consuming much of the site I/O bandwidth. These two main contributors can cause most of the Enterprise Manager vital signs to plummet. In addition, the lax housekeeping can cause the Management Repository's allocated size to increase dramatically.
One important feature to take advantage of is asynchronous I/O. Enabling asynchronous I/O can dramatically improve overall performance of the Enterprise Manager application. The Sun Solaris™ and Linux operating systems have this capability, but it may be disabled by default. The Microsoft Windows™ operating system uses asynchronous I/O by default. Oracle strongly recommends enabling this operating system feature on the Management Repository hosts and on Management Service hosts as well.
Automatic Storage Management (ASM) is recommended for Enterprise Manager repository database storage.
About the Oracle Enterprise Manager Performance Page
There may be occasions when Enterprise Manager user interface pages are slow in the absence of any other performance degradation. The typical cause of these slowdowns is an area of Enterprise Manager housekeeping that has been overlooked. The first line of monitoring for Enterprise Manager page performance is the use of Enterprise Manager beacons. Beacons are also useful for web applications other than Enterprise Manager.
Beacons are designed to be lightweight page performance monitoring targets. After defining a beacon target on a Management Agent, you can then define UI performance transactions using the beacon. These transactions are a series of UI page hits that you will manually walk through once. Thereafter, the beacon will automatically repeat your UI transaction on a specified interval. Each time the beacon transaction is run, Enterprise Manager will calculate its performance and store it for historical purposes. In addition, alerts can be generated when page performance degrades below thresholds you specify.
When you configure the Enterprise Manager beacon, you begin with a single predefined transaction that monitors the home page you specify during this process. You can then add as many transactions as are appropriate. You can also set up additional beacons from different points on your network against the same web application to measure the impact of WAN latency on application performance. This same functionality is available for all Web applications monitored by Enterprise Manager.
After you are alerted to a UI page that is performing poorly, you can then use the second line of page performance monitoring in Enterprise Manager. This end-to-end (or E2E) monitoring functionality in Enterprise Manager is designed to allow you to break down processing time of a page into its basic parts. This will allow you to pinpoint when maintenance may be required to enhance page performance. E2E monitoring in Enterprise Manager lets you break down both the client side processing and the server side processing of a single page hit.
The next page down in the Middle Tier Performance section breaks out the processing time by tier for the page. By clicking the largest slice of the Processing Time Breakdown pie chart (JDBC time in this example), you can get the SQL details. By clicking the SQL statement, you break out the performance of its execution over time.
The JDBC page displays the SQL calls the system is spending most of its page time executing. This SQL call could be an individual DML statement or a PL/SQL procedure call. In the case of an individual SQL statement, you should examine the segments (tables and their indexes) accessed by the statement to determine their housekeeping (rebuild and reorganization) needs. The PL/SQL procedure case is slightly more involved because you must look at the procedure's source code in the Management Repository to identify the tables and associated indexes accessed by the call.
Once you have identified the segments, you can then run the necessary rebuild and reorganization statements for them with the OMS down. This should dramatically improve page performance. There are cases where page performance will not be helped by rebuild and reorganization alone, such as when excessive numbers of open alerts, system errors, and metric errors exist. The only way to improve these calls is to address (for example, clean up or remove) the numbers of these issues. After these numbers are reduced, then the segment rebuild and reorganization should be completed to optimize performance. These scenarios are covered in Step 3: Using DBA and Enterprise Manager Tasks To Eliminate Bottlenecks. If you stay current, you should not need to analyze UI page performance as often, if at all.
For more information about new features for monitoring the performance of SQL procedures from the Enterprise Manager console, see the chapter, "Maintaining Enterprise Manager" in the Enterprise Manager Administration guide.
Determining the Optimum Number of Middle Tier OMS Servers
Determining the optimum number of middle tier OMS servers is not a trivial task. A number of data points must be considered for an informed, justified and acceptable decision for introducing additional OMS instances. The number of monitored targets is one of the first considerations, but its weight in decision making is normally not substantial.
The following items should be considered and examined as part of this exercise:
-
The volume of job automation and scheduling used
-
The number of administrators working simultaneously in the console
-
Network bandwidth and data channel robustness from agents to the OMS servers
-
Number of triggered violations and notifications
-
Speed and stability of the I/O system the OMS servers use
Careful investigation of each category is essential to making an informed decision. In some cases, just adding an OMS server or providing more CPU or memory to the same host may not make any difference in performance enhancement. You can use the current running OMS instances to collect accurate statistics on current OMS performance to calculate the number of required OMS servers for current or future deployments. Enterprise Manager has vital signs that reflect its health. These vital signs should be monitored for trends over time as well as against established baseline thresholds.
Step 5: Extrapolating Linearly Into the Future for Sizing Requirements
Determining future storage requirements is an excellent example of effectively using vital sign trends. You can use two built-in Enterprise Manager charts to forecast this: the total number of targets over time and the Management Repository size over time.
Both of the graphs are available on the All Metrics page for the Management Service. It should be obvious that there is a correlation between the two graphs. A straight line applied to both curves would reveal a fairly similar growth rate. After a target is added to Enterprise Manager for monitoring, there is a 31-day period where Management Repository growth will be seen because most of the data that will consume Management Repository space for a target requires approximately 31 days to be fully represented in the Management Repository. A small amount of growth will continue for that target for the next year because that is the longest default data retention time at the highest level of data aggregation. This should be negligible compared with the growth over the first 31 days.
When you stop adding targets, the graphs will level off in about 31 days. When the graphs level off, you should see a correlation between the number of targets added and the amount of additional space used in the Management Repository. Tracking these values from early on in your Enterprise Manager deployment process helps you to manage your site's storage capacity pro-actively. This history is an invaluable tool.
The same type of correlation can be made between CPU utilization and total targets to determine those requirements. There is a more immediate leveling off of CPU utilization as targets are added. There should be no significant increase in CPU cost over time after adding the targets beyond the relatively immediate increase. Introducing new monitoring to existing targets, whether new metrics or increased collections, would most likely lead to increased CPU utilization.
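The linear extrapolation described in this step can be sketched with an ordinary least-squares line fitted to historical (total targets, repository size) pairs. The data values below are invented for illustration and the function names are hypothetical. The sketch also assumes each sample was taken after the roughly 31-day level-off period, so that repository size reflects the targets already added.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired samples")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("target counts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast_repository_gb(target_counts, repo_sizes_gb, planned_targets):
    """Extrapolate repository size (GB) for a planned number of targets."""
    slope, intercept = fit_line(target_counts, repo_sizes_gb)
    return slope * planned_targets + intercept


if __name__ == "__main__":
    # Hypothetical history: total targets and repository size in GB.
    targets = [1000, 2000, 3000]
    sizes_gb = [60, 110, 160]
    print(f"{forecast_repository_gb(targets, sizes_gb, 5000):.0f} GB")
```

The same fit can be applied to CPU utilization against total targets, keeping in mind that new metrics or more frequent collections on existing targets change the slope.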
Using Returning Query Safeguards to Improve Performance
On the All Targets page, Enterprise Manager uses a safeguard that prevents a flood of data from slowing performance and consuming excessive resources within the OMS by limiting the number of rows that can be returned from a query. By default, the limit is set to 2000, but an Enterprise Manager administrator can modify the limit with the following command:
emctl set property -name oracle.sysman.core.uifwk.maxRows -value 2000
Providing a value equal to 0 will turn off the safeguard and fetch all rows. The new value takes immediate effect; no OMS restart is required. If the value is less than 0, the default value (2000) will be used instead. The only way to indicate that no limiting should be performed is to set the value to exactly 0.
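The three cases described above (a positive limit, 0 for no limit, and a negative value falling back to the default) can be modeled as follows. This is a sketch of the documented semantics, not the OMS implementation; the function names are hypothetical.

```python
DEFAULT_MAX_ROWS = 2000  # default limit stated in the text above


def effective_row_limit(configured_value):
    """Map a configured maxRows value to the limit actually applied.

    Returns None when no limiting should be performed.
    """
    if configured_value == 0:
        return None              # exactly 0 turns the safeguard off
    if configured_value < 0:
        return DEFAULT_MAX_ROWS  # negative values fall back to the default
    return configured_value


def apply_limit(rows, configured_value):
    """Truncate a result list according to the effective limit."""
    limit = effective_row_limit(configured_value)
    return rows if limit is None else rows[:limit]


if __name__ == "__main__":
    results = list(range(3000))
    for value in (500, 0, -1):
        print(value, len(apply_limit(results, value)))
```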
When there are too many results returned from a query and this limit comes into effect, the following message appears under the results table:
"This table of search results is limited to 2000 targets. Narrow the results by using Refine Search or Search Target Name. See the tuning guide for how to modify this limit."
Similar behaviors (and messages) are applied to other large tables throughout Enterprise Manager. The same OMS property (oracle.sysman.core.uifwk.maxRows) controls the maximum limit for all of them together. This matches the behavior (and reuses the existing property) from previous Enterprise Manager releases.
Overview of Sizing Requirements for Fusion Middleware Monitoring
A Fusion Middleware target is like any other Enterprise Manager target. Therefore any repository or sizing guideline that is applicable for an Enterprise Manager target would be applicable on a Fusion Middleware target.
One major concern in the case of Fusion Middleware discovery is that too many targets may be discovered, created and monitored. This adds load on the OMS instance, repository and agent. When a very large number of targets is discovered, Oracle recommends that users review all the targets and their respective metrics after target discovery.
Based on requirements, users should finalize which targets and metrics should be monitored and the frequency at which those targets should be monitored.
After discovery, Oracle recommends you allow Fusion Middleware/ADP/JVMD monitoring to run for some duration (a few days to possibly a few weeks) and continuously monitor the database size and Operating System file system growth (in the case of ADP; ADP Manager requires a minimum of 10GB of disk space) until it becomes constant. You can then fine tune various parameters associated with these different features.
In Enterprise Manager version 24ai, both ADP and JVMD use the Enterprise Manager repository as their repository. Their data are stored in the MGMT_AD4J_TS tablespace.