4 Improving ASAP Performance

This chapter describes ways to improve Oracle Communications ASAP performance.

About Improving ASAP Performance

This chapter is intended to aid those who have prior knowledge of the ASAP configuration and UNIX operating systems. Before starting the tuning exercises described in this chapter, you should be familiar with the following items:

  • Location of ASAP diagnostic files and the UNIX utilities used to view and manipulate them, such as grep, tail, pg, top, vmstat, sar, prstat, and glance (on HP).

  • Location of the ASAP configuration files (ASAP.cfg, ASAP.properties, Environment_Profile, NEP.jinterpreter, config.xml, startWebLogic.sh), how to modify them with an editor such as vi, and how they are laid out (for example, server-specific versus global variables).

  • How to use UNIX utilities, such as top and sar, to monitor the resources being used by ASAP.

For more information, consult your system's online documentation about UNIX utilities or the ASAP documentation.

Recommended Configuration

This section provides you with tools and guidelines to properly select a default configuration.

About Pre-tuned ASAP Configurations

ASAP ships with pre-tuned configurations for small, medium and large installations. (Definitions of small, medium and large configurations in this chapter are consistent with those in the planning chapter of the ASAP Installation Guide.)

Details are given below for these stock configurations, with sample performance figures. Your results will vary depending on your hardware and other configuration choices you may have implemented.

Installing a Pre-tuned Configuration

The pre-tuned configuration files are generated by a script which is run as part of the installation process for new installs only. The files are placed in ASAP_Home/samples/sample_configs, where ASAP_Home is the directory in which ASAP is installed.

Table 4-1 lists and describes the pre-tuned configuration files.

Table 4-1 Pre-tuned Configuration Files

| Default Configuration File | Small | Medium | Large |
|---|---|---|---|
| ASAP.cfg file | ASAP.cfg.small | ASAP.cfg.medium | ASAP.cfg.large |
| ASAP.properties file | ASAP.properties.small | ASAP.properties.medium | ASAP.properties.large |
| Environment_Profile file | Environment_Profile.small | Environment_Profile.medium | Environment_Profile.large |
| Performance | 50,000 orders/day; 1 order/sec; 7 ASDL/sec | 500,000 orders/day; 11.5 orders/sec; 80.5 ASDL/sec | 20.95 orders/sec; 146.65 ASDL/sec |
| DB connections | 52 | 89 | 145 |
| Log level | SANE | SANE | SANE |

Using a Pre-tuned Configuration with a New ASAP Installation

After the installation of ASAP is complete, back up the following files:

  • ASAP_Home/config/ASAP.cfg

  • ASAP_Home/ASAP.properties

  • ASAP_Home/Environment_Profile

Replace these files with the appropriate three files corresponding to the desired pre-tuned configuration from ASAP_Home/samples/sample_configs.

Generating Pre-tuned Configuration Files

For upgrade installations, the pre-tuned configuration files are not generated. However, the generation script can be run manually:

$ASAP_BASE/scripts/generate_sample_configs.ksh $ASAP_BASE

This generates the pre-tuned configuration files in ASAP_Home/samples/sample_configs.

Merging Pre-tuned File Settings into an Existing Installation

If you have already made changes to the ASAP.cfg, ASAP.properties, or Environment_Profile files, you will have to manage the differences between your altered files and the pre-tuned files. For example, simply copying over the pre-tuned files will overwrite any changes you have made. To merge the pre-tuned file settings with your existing settings, compare the differences between your existing files and the pre-tuned files and manually add in the changes from the pre-tuned files.
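One way to review the differences is the standard diff utility. The following is a minimal sketch, not an ASAP tool: the function name compare_pretuned is made up for illustration, and the example paths assume that $ASAP_BASE points to the ASAP installation directory.

```shell
# compare_pretuned: hypothetical helper (not shipped with ASAP).
# Shows how an existing configuration file differs from a pre-tuned sample
# so that the changes can be merged by hand.
# Usage: compare_pretuned <existing_file> <pretuned_file>
compare_pretuned() {
    existing=$1
    pretuned=$2
    status=0
    # diff exits with 0 if the files match, 1 if they differ, >1 on error.
    diff "$existing" "$pretuned" || status=$?
    case $status in
        0) echo "No differences between $existing and $pretuned" ;;
        1) echo "Review the lines above and merge them into $existing by hand" ;;
        *) echo "diff failed with status $status" >&2 ;;
    esac
    return "$status"
}

# Example call (paths assume $ASAP_BASE is the ASAP installation directory):
# compare_pretuned "$ASAP_BASE/config/ASAP.cfg" \
#     "$ASAP_BASE/samples/sample_configs/ASAP.cfg.medium"
```

In the diff output, lines that start with "<" come from your existing file and lines that start with ">" come from the pre-tuned file. Back up the existing file before editing it.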

Example Pre-tuned Configuration Performance

Table 4-2, Table 4-3, and Table 4-4 provide example performance results for each pre-tuned configuration on sample hardware. Your actual results will vary.

Table 4-2 Small Installation Pre-tuned Configuration Performance

| Machine | Component | CPU, % of server | CPU, % of 1 proc | RAM, MB |
|---|---|---|---|---|
| V880 | Service Activation Request Manager (SARM) | 0.79 | 3.16 | 43 |
| V880 | Network Element Processor (NEP) | 1.39 | 5.56 | 37 |
| V880 | JENEP | 0.64 | 2.56 | 181 |
| sclust01 | WebLogic Server | 4.73 | 9.46 | 261 |
| Compaq | Oracle | 1.65 | 13.2 | N/A |
| N/A | Total ASAP (SARM, NEP, JENEP, etc.) memory | N/A | N/A | 350 |
| N/A | WebLogic Server memory | N/A | N/A | 261 |
| N/A | N/A | N/A | Total | 611 |

Table 4-3 Medium Installation Pre-tuned Configuration Performance

| Machine | Component | CPU, % of server | CPU, % of 1 proc | RAM, MB |
|---|---|---|---|---|
| V880 | SARM | 12.86 | 51.44 | 52 |
| V880 | NEP | 15.63 | 62.52 | 44 |
| V880 | JENEP | 7.49 | 29.96 | 376 |
| sclust01 | WebLogic Server | 24.23 | 48.46 | 750 |
| Compaq | Oracle | 10.14 | 81.12 | N/A |
| N/A | Total ASAP (SARM, NEP, JENEP, etc.) memory | N/A | N/A | 550 |
| N/A | WebLogic Server memory | N/A | N/A | 750 |
| N/A | N/A | N/A | Total | 1300 |

Table 4-4 Large Installation Pre-tuned Configuration Performance

| Machine | Component | CPU, % of server | CPU, % of 1 proc | RAM, MB |
|---|---|---|---|---|
| V880 | SARM | 28.71 | 114.84 | 64 |
| V880 | NEP1 | 13.04 | 52.16 | 41 |
| V880 | JENEP1 | 9.22 | 36.88 | 374 |
| N/A | NEP2 | 18.03 | 72.12 | 43 |
| N/A | JENEP2 | 13.73 | 54.92 | 375 |
| sclust01 | WebLogic Server | 51.54 | 103.08 | 750 |
| Compaq | Oracle | 18.74 | 147.76 | N/A |
| N/A | Total ASAP (SARM, NEP, JENEP, etc.) memory | N/A | N/A | 1050 |
| N/A | WebLogic Server memory | N/A | N/A | 750 |
| N/A | N/A | N/A | Total | 1800 |

Troubleshooting and Monitoring ASAP Performance

The WebLogic Server Administration Console can be used to monitor the Java Service Request Processor (JSRP).

If errors occur while the ASAP servers are running stored procedures, increase the value of the APPL_POOL_SIZE configuration variable to make more connections available.

If a stored procedure fails, the thread running the procedure sleeps for the time set by the RPC_RETRY_SLEEP configuration parameter. ASAP then retries the procedure up to the number of times set by the RPC_RETRY_COUNT configuration parameter. All of these attempts may fail. Because the thread cannot be released for the entire duration of this retry process, performance degrades: the number of connections in use increases and waiting times grow longer.
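To see why repeated failures tie up threads, it can help to estimate how long one thread may stay blocked. The following sketch multiplies the two retry parameters. It assumes one sleep of RPC_RETRY_SLEEP before each retry and ignores the run time of each attempt, so the result is only a lower bound. The function name and the sample values are made up for illustration and are not ASAP defaults.

```shell
# retry_hold_time: hypothetical helper, not an ASAP utility.
# Estimates the minimum time a thread stays blocked when every retry fails,
# assuming one sleep of RPC_RETRY_SLEEP before each of RPC_RETRY_COUNT retries.
# The result is in whatever time unit RPC_RETRY_SLEEP uses.
retry_hold_time() {
    rpc_retry_sleep=$1
    rpc_retry_count=$2
    echo $(( rpc_retry_sleep * rpc_retry_count ))
}

# Illustrative values only: a sleep of 5 and 10 retries.
retry_hold_time 5 10
```

If the estimate is long compared with your normal order processing time, a burst of failing procedures can hold many threads at once, which is when a larger APPL_POOL_SIZE helps.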

Manually Tuning ASAP Performance

The performance of an ASAP system is governed by the available hardware and by the installation and configuration decisions made during the initial installation phase. Because ASAP is multi-threaded, fine-tuning the system helps you obtain the maximum benefit from the allocated resources.

This section provides you with tools and guidelines to tune your ASAP system in a short period of time. It covers the following topics:

  • A recommended approach to tuning.

  • A list of system limits that must be monitored to ensure that they are not exceeded during tuning.

  • Guidelines to tune the JSRP, SARM, and NEP processes.

Tuning Guidelines

If you wish to go beyond the provided pre-tuned configurations, there are many ways to tune an ASAP system. However, the following technique has been verified by the internal Oracle Communications testing team. You can use it to optimize a simple ASAP configuration in less than half a day. A simple configuration consists of all ASAP components residing in the same system with small numbers of individual components, for example, fewer than five NEPs and one or two SRPs.

Carry out the tuning process in the following order:

1. Setting a Target

Select a performance target that is based on realistic throughput (work orders (WOs) per second) or resource consumption early in the process. Without a goal, an iterative process, such as tuning, could continue indefinitely.

2. Using Simple Work Orders

To achieve consistent results, use simple, familiar WOs during the tuning process. Use a repeatable test and pick a scenario such as batches or workflows. Once tuning is complete, verify the performance with realistic data.

3. Starting with Minimum Configuration Values

Start with minimum configuration values because it is easier to detect and correct bottlenecks than it is to determine where excess resources are being consumed.

4. Following Work Order Flow

The tuning process follows the same path that WOs take through the system. Tuning starts at the SRP (or JSRP, depending on your implementation), proceeds to the SARM, then to the NEP (or Java-enabled NEP (JNEP), depending on your implementation), and on to the NE. It then returns through the same components in reverse order.

5. Checking for Bottlenecks

Bottlenecks can develop as resources are shifted among the components that make up an ASAP system, and they may occur in areas that were previously optimized. When you move to a new area of the system, review the servers that have already been tuned to ensure that their configurations have remained optimal. For example, if you tune the NEP after the SRP and SARM have been optimized, review the SRP and SARM after you have finished tuning the NEP to ensure that they have remained optimized.

Setting System Limits

During the tuning process, you must change configuration variable settings to levels that are higher than their defaults. These increases have two direct effects which you must monitor during the tuning process:

  • Increased demands are placed on the hardware allocated to the ASAP system. You must use a utility, such as top, to continually monitor the ASAP system and ensure that it is not consuming more resources than planned.

  • If system limits are exceeded, increased ASAP resource consumption may cause errors to be reported in the system's diagnostic files. Monitor the diagnostic files closely during the tuning process so that these limits can be raised when required.

This section details errors which may appear in the diagnostic files and the configuration variables used to control system limits. Configuration variables are located in the ASAP.cfg file. The following are the configuration variables used for tuning.

  • APPL_POOL_SIZE

  • CONTROL_POOL_SIZE

  • MAX_CMD_DBPROCS

  • MAX_CONNECTIONS

  • MAX_CORE_DBPROCS

  • MAX_MSGPOOL

  • MAX_MSGQUEUES

  • MAX_SERVER_PROCS

  • MAX_THREADS

  • MAX_ORDERS_IN_PROGRESS

  • WO_AUDIT_LEVEL

For more information on configuration variables, refer to the chapter describing configuration parameters in the ASAP Server Configuration Guide.
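Before changing any limits, it can be useful to see where these variables are currently set. The following sketch lists every line of a configuration file that mentions one of the variables above. The function name show_tuning_vars is made up for illustration. The pattern matches on variable names only, so it makes no assumption about the exact syntax used in ASAP.cfg.

```shell
# show_tuning_vars: hypothetical helper, not an ASAP utility.
# Prints, with line numbers, each line of the given file that mentions
# one of the tuning variables listed in this section.
# Usage: show_tuning_vars <path_to_ASAP.cfg>
show_tuning_vars() {
    cfg_file=$1
    grep -n -E 'APPL_POOL_SIZE|CONTROL_POOL_SIZE|MAX_CMD_DBPROCS|MAX_CONNECTIONS|MAX_CORE_DBPROCS|MAX_MSGPOOL|MAX_MSGQUEUES|MAX_SERVER_PROCS|MAX_THREADS|MAX_ORDERS_IN_PROGRESS|WO_AUDIT_LEVEL' "$cfg_file"
}

# Example call (path assumes $ASAP_BASE is the ASAP installation directory):
# show_tuning_vars "$ASAP_BASE/config/ASAP.cfg"
```

Because the same variable can appear in both global and server-specific sections, check every match rather than only the first one.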

Tuning Message Queue Guidelines

The optimum balance between throughput and response time for ASAP operation is achieved when all tunable message queues in the system are stable, short, and non-zero. When message queues are short and stable, threads operate more efficiently and are able to keep up with the flow of incoming messages.

Queue lengths depend on the number of threads that add messages to the queue and the number of threads that remove messages from it. To decrease the length of a queue, either decrease the rate at which messages are added to the queue or increase the rate at which they are removed. To increase the length of the queue, reverse the process.
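This relationship can be written as a simple rate balance. The symbols are introduced here for illustration only and are not ASAP parameters: L(t) is the queue length, r_in is the rate at which messages are added, and r_out is the rate at which they are removed.

```latex
\frac{dL}{dt} = r_{\text{in}} - r_{\text{out}}
```

The queue is stable when the two rates are equal, grows when r_in exceeds r_out, and drains toward zero when r_out exceeds r_in. Tuning therefore aims for nearly equal rates while the queue is still short and non-zero.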

Workload balancing is an end-to-end process. Bottlenecks can occur anywhere along the message processing path, reducing the message flow over the remainder of the path. For example, to avoid bottlenecks, the SRP must have enough translation threads to handle the incoming WO rate and enough SARM drivers to send WOs to the SARM as fast as they are generated. The SARM must also have enough Work Order Handler threads to handle the incoming Common Service Description Layer (CSDL) commands. Finally, enough NEPs must be configured to service the network elements (NEs) efficiently. The following sections guide you in tuning an ASAP system to achieve the ideal operation outlined above.

Tuning ASAP Server Message Queues

The following ASAP servers can be tuned:

  • JSRP/SRT

  • SARM

  • NEP

  • JNEP

  • WebLogic Server domain

Tuning JSRP Message Queues

The purpose of tuning the JSRP is to provide WOs at a rate that creates an even flow to the downstream SARM process.

Figure 4-1 illustrates the schematic flow of the JSRP.

Table 4-5 lists and describes the WO manager queue.

Table 4-5 Work Order Manager Queue

| Item | Description |
|---|---|
| Parameter Controlling Message Addition Rate to Queue | Number of SRP Driver Threads in the SARM (MAX_SRP_DRIVERS) |
| Parameter Controlling Message Removal Rate from Queue | Number of WO Manager Threads (MAX_WO_MGRS) |

Table 4-6 lists the SARM driver message queues.

Table 4-6 SARM Driver Queue

| Item | Description |
|---|---|
| Parameter Controlling Message Addition Rate to Queue | Number of Translation Threads (implementation dependent) |
| Parameter Controlling Message Removal Rate from Queue | Number of SARM Driver Threads (MAX_SARM_DRIVER). Variable size: small: 5; medium: 10; large: 25 |

Tuning SARM Message Queues

The purpose of tuning the SARM is to:

  • Provide the Atomic Service Description Layer (ASDL) commands to the NEPs.

  • Send event notices back to the SRPs at an even rate to both the upstream and downstream processes.

Since only one SARM process exists in an ASAP system, the performance of the SARM cannot be enhanced by spreading the load across multiple Central Processing Units (CPUs) or systems. Therefore, the SARM must be well tuned to get high performance from ASAP. The WO_ACCEPT and WO_STARTUP events are turned off by default to improve SARM performance. You can turn the events on by setting wo_start_evt and wo_accept_evt to NULL in database table tbl_asap_srp. See "tbl_asap_srp" in the ASAP Developer's Guide for more details.

Figure 4-2 illustrates the flow of messages through the SARM.

Table 4-7 lists and describes the group Mgr message queue.

Table 4-7 Group Mgr Message Queue

| Item | Description |
|---|---|
| Parameter Controlling Message Addition Rate to Queue | The number of WO Handler threads and number of NEPs in the system are fixed for the purpose of Group Manager message queue tuning. Do not configure them at this time. |
| Parameter Controlling Message Removal Rate from Queue | Number of Group Manager Threads (MAX_GROUP_MGRS) |

Table 4-8 lists and describes the WO Mgr message queues.

Table 4-8 Work Order Mgr Message Queues

| Item | Description |
|---|---|
| Parameter Controlling Message Addition Rate to Queue | The number of WO Handler Threads and number of NEPs in the system are fixed for the purpose of WO Manager message queue tuning. Do not configure them. |
| Parameter Controlling Message Removal Rate from Queue | Number of WO Manager Threads (MAX_WO_MGRS) |

Table 4-9 lists and describes the WO provision queue.

Table 4-9 Work Order Provision Queue

| Item | Description |
|---|---|
| Parameter Controlling Message Addition Rate to Queue | Number of Group Manager threads (MAX_GROUP_MGRS) |
| Parameter Controlling Message Removal Rate from Queue | Number of WO Provision threads (MAX_WO_HANDLERS) |

Table 4-10 lists and describes the ASDL Provision Message Queues.

Table 4-10 ASDL Provision Message Queues

| Item | Description |
|---|---|
| Parameter Controlling Message Addition Rate to Queue | Number of WO Provision Threads (MAX_WO_HANDLERS) |
| Parameter Controlling Message Removal Rate from Queue | Number of ASDL Provision Threads (MAX_PROVISION_HANDLERS less MAX_WO_HANDLERS) |

Example

If you have five (5) WO Handlers and you want ten (10) ASDL Provision Threads, set MAX_PROVISION_HANDLERS to fifteen (15). The difference between MAX_PROVISION_HANDLERS and MAX_WO_HANDLERS (15 less 5) gives the ten (10) ASDL Provision Threads that you wanted.
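The same arithmetic can be sketched as a small shell function. The function name provision_handlers is made up for illustration; it only adds the two numbers and does not read or write ASAP.cfg.

```shell
# provision_handlers: hypothetical helper illustrating the arithmetic above.
# Prints the MAX_PROVISION_HANDLERS value needed to end up with the desired
# number of ASDL Provision Threads, given the MAX_WO_HANDLERS value.
# Usage: provision_handlers <MAX_WO_HANDLERS> <desired_ASDL_provision_threads>
provision_handlers() {
    max_wo_handlers=$1
    asdl_provision_threads=$2
    echo $(( max_wo_handlers + asdl_provision_threads ))
}

# The example from the text: 5 WO Handlers and 10 ASDL Provision Threads.
provision_handlers 5 10
```

Remember to recalculate MAX_PROVISION_HANDLERS whenever you change MAX_WO_HANDLERS, otherwise the number of ASDL Provision Threads changes as a side effect.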

Table 4-11 lists and describes the NEP driver message queues.

Table 4-11 NEP Driver Message Queues

| Item | Description |
|---|---|
| Parameter Controlling Message Addition Rate to Queue | Number of ASDL Provision Threads (MAX_PROVISION_HANDLERS less MAX_WO_HANDLERS) |
| Parameter Controlling Message Removal Rate from Queue | Number of NEPs in the system (dependent on throughput requirements and machine resources). There is one NEP Driver Queue for each NEP in the system. |

Table 4-12 lists and describes the SRP driver message queues.

Table 4-12 SRP Driver Message Queues

| Item | Description |
|---|---|
| Parameter Controlling Message Addition Rate to Queue | Nearly every thread in the SARM can add messages to this queue. Therefore, it is not possible to control the number of messages that are added. |
| Parameter Controlling Message Removal Rate from Queue | Number of SRP Driver Threads (MAX_SRP_DRIVERS). There is one SRP Driver Queue for each SRP in the system. |

Tips for Tuning the SARM

To tune the SARM, use the following:

  • The number of WO threads (MAX_WO_HANDLERS) in the SARM must be equal to the sum of all SARM Driver Threads (MAX_SARM_DRIVER) in all of the SRPs.

  • The total number of configured handler threads (MAX_WO_MGRS) in all SRPs must be equal to the driver threads (MAX_SRP_DRIVERS) in the SARM.

  • All event notifications not used by any customized SRP implementation must be turned off.

  • Use sanity-level diagnostics (SANITY_LEVEL) in production.

  • The configured number of provision handlers (ASDL and WO) must be no less than the sum of the number of handler threads (MAX_WO_HANDLERS) and the number of operating NEP servers. Start tuning with a ratio of 1:3 of WO Manager Threads to provision handlers (ASDL and WO).
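Two of the rules above are simple numeric checks, which can be sketched as follows. The function name check_sarm_rules is made up for illustration, and the sketch assumes that the configured number of provision handlers (ASDL and WO) corresponds to MAX_PROVISION_HANDLERS. You supply the values yourself; the function does not read any ASAP configuration.

```shell
# check_sarm_rules: hypothetical sanity check for two of the rules above.
# Arguments:
#   $1  MAX_WO_HANDLERS in the SARM
#   $2  sum of MAX_SARM_DRIVER across all SRPs
#   $3  MAX_PROVISION_HANDLERS in the SARM (assumed to be the configured
#       number of provision handlers, ASDL and WO)
#   $4  number of operating NEP servers
# Returns 0 if both rules hold, 1 otherwise.
check_sarm_rules() {
    wo_handlers=$1
    sarm_drivers_total=$2
    provision_handlers=$3
    nep_count=$4
    failed=0
    if [ "$wo_handlers" -ne "$sarm_drivers_total" ]; then
        echo "MAX_WO_HANDLERS ($wo_handlers) does not equal the sum of MAX_SARM_DRIVER ($sarm_drivers_total)"
        failed=1
    fi
    minimum=$(( wo_handlers + nep_count ))
    if [ "$provision_handlers" -lt "$minimum" ]; then
        echo "MAX_PROVISION_HANDLERS ($provision_handlers) is below MAX_WO_HANDLERS + NEP count ($minimum)"
        failed=1
    fi
    return "$failed"
}

# Example call with illustrative values: 5 WO handlers, 5 SARM drivers in
# total, 15 provision handlers, 3 NEPs.
# check_sarm_rules 5 5 15 3 && echo "SARM thread settings are consistent"
```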

Tuning NEP Message Queues

The purpose of tuning a NEP is to provide:

  • ASDLs to the NEs

  • Remote Procedure Call (RPC) responses back to the SARM at a rate that does not cause excessive buildup in any of the SARM Notify, Session Manager, or NE Command queues.

Figure 4-3 illustrates the schematic flow of the NEP.

Table 4-13 lists and describes the SARM notify message queue.

Table 4-13 SARM Notify Message Queues

| Item | Description |
|---|---|
| Tool | NEP Server |
| Parameter Controlling Message Addition Rate to Queue | Session Manager Thread for the NE. Non-configurable |
| Parameter Controlling Message Removal Rate from Queue | SARM Notify Thread for the NEP. Non-configurable |

You can manage this queue indirectly by decreasing the number of NEs supported by the NEP (usually by increasing the absolute number of NEPs, if machine resources permit) or by balancing the load between NEPs (by moving a busy NE from a busy NEP to a less busy NEP).

Table 4-14 lists and describes the NE command queue.

Table 4-14 NE Command Message Queues

| Item | Description |
|---|---|
| Tool | NEP Server |
| Parameter Controlling Message Addition Rate to Queue | Session Manager Thread for the NE. Non-configurable |
| Parameter Controlling Message Removal Rate from Queue | Command Processor Thread for the NE. Non-configurable |

You can manage this queue indirectly by increasing or decreasing NE response times (faster response times decrease the length of the queue).

Table 4-15 lists and describes the session manager queue for the NE.

Table 4-15 Session Manager Message Queues for the NE

| Item | Description |
|---|---|
| Tool | NEP Server |
| Parameter Controlling Message Addition Rate to Queue | Command Processor Thread of the NE - NEP Driver thread in the SARM for the NEP. Non-configurable |
| Parameter Controlling Message Removal Rate from Queue | Session Manager Thread for the NE. Non-configurable |

You can manage this queue indirectly using the NE response times (fast responses increase the length of the queue). However, depending on communication and switch technology, this may not be configurable.

Tips for Tuning NEP

To tune the NEP, use the following:

  • Unless machine resources are limited, target a ratio of 50 NEs per NEP.

  • If possible, group similar NE technologies and switch software loads on a single NEP. This reduces the number of ASDL programs held in memory because each NEP caches its own copy of every ASDL command that it has been asked to perform.

  • Balance the load on each NEP by distributing your busy NEs across several NEPs. You must determine the expected load based on both the number and complexity of the ASDL commands being serviced.
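As a starting point, the target ratio gives a rough NEP count for a given number of NEs. The following sketch rounds up so that no NEP exceeds the target. The function name neps_needed is made up for illustration, and the result is only a first estimate, because the expected load per NE also matters.

```shell
# neps_needed: hypothetical helper, not an ASAP utility.
# Prints the number of NEPs needed so that no NEP serves more than the
# target number of NEs (default 50, the ratio suggested above).
# Usage: neps_needed <number_of_NEs> [NEs_per_NEP]
neps_needed() {
    ne_count=$1
    nes_per_nep=${2:-50}
    # Integer ceiling division: round up instead of truncating.
    echo $(( (ne_count + nes_per_nep - 1) / nes_per_nep ))
}

# Illustrative value: 120 NEs at 50 NEs per NEP.
neps_needed 120
```

After choosing the count, distribute busy NEs across the NEPs as described above rather than filling each NEP to the target in turn.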

Other Performance Issues

You must take into consideration other factors that can affect performance. This section covers the diagnostic levels and query optimization that you need in the tuning process.

The topics in this section include:

  • Local versus NFS-Mounted File Systems for Diagnostic Files

  • Server Diagnostic Levels

  • Diagnostic Messages Output

  • Query Optimization

  • Table Partitioning

Local Versus NFS-mounted File Systems for Diagnostic Files

ASAP diagnostic files on NFS-mounted file systems increase network traffic and slow down disk I/O. For production systems, the log directories should be local, not NFS mounted.

Server Diagnostic Levels

Low-level server diagnostic levels increase disk I/O. Oracle recommends that you set the diagnostic level for the production system at either PROGRAM_LEVEL or SANITY_LEVEL.

In the Control server database table tbl_appl_proc, set the diagnostics level (column diag_level) to SANE.

Diagnostic levels set in the tbl_appl_proc are persistent through reboots. For more information on tbl_appl_proc, see ASAP Developer's Guide.

To set a diagnostic level temporarily, use asap_utils parameter 107. See the ASAP Server Configuration Guide for more information.

Table 4-16 lists and describes the server diagnostic levels.

Table 4-16 Server Diagnostic Levels

| Diagnostic Level | Java Code Value | Description |
|---|---|---|
| KERNEL_LEVEL | KERN | Used by the kernel to generate diagnostic messages. It is only to be used by the core libraries for very low-level debugging of core code. |
| LOW_LEVEL | LOW | Used by the application to generate low-level diagnostic messages from any of its functions. Such messages should enable the programmer to debug an application. Once debugged, the diagnostic level of the application should be elevated above LOW_LEVEL. |
| FUNCTION_LEVEL | FUNC | Used by the application at the beginning and end of each function to track the operation of the application. This is generally not used in the core application. |
| RPC_LEVEL | RPC | Used by the application to produce remote procedure call (RPC) diagnostic messages. |
| SANITY_LEVEL | SANE | Used by the application for high-level diagnostics. This level of diagnostic messages provides user information about the processing of the system. |
| PROGRAM_LEVEL | PROG | Only error messages will be logged. This is primarily used to generate error messages when the ASAP system is running in a high performance production environment. |
| FATAL_LEVEL | SHUT | Used for fatal error conditions if the process is terminated. You only use this level if an error condition occurs within the application so that if the application were to continue, more errors would occur and compound the problem. For instance, if a stored procedure is missing from the database, then the application terminates and manual intervention is required. |

The most commonly used diagnostic levels are LOW_LEVEL, SANITY_LEVEL, and PROGRAM_LEVEL.

Diagnostic Messages Output

Output of diagnostic messages can be written to disk line-by-line or buffered by the UNIX I/O subsystem. Buffered output results in diagnostics being written to disk in pages, which results in optimal performance.

Note:

Oracle highly recommends that you do not use diagnostic line flushing for production systems. Flushing diagnostic messages to disk in lines results in high disk I/O frequency.

To disable diagnostic line flushing, set the configuration parameter DIAG_LINE_FLUSH (in the common application programming interface (API) section of the ASAP.cfg file) to 0. The default value is 1.

Query Optimization

Oracle RDBMSs use a cost-based query optimizer to determine query access paths. The optimizer is primarily influenced by the table and index statistics (number of rows, data distribution, and so on) available to it in the system catalogue tables. These statistics can be updated manually (usually when the amount and distribution of data in a table changes significantly) using utilities provided by the RDBMS. Keeping these statistics current is extremely important because the optimizer makes default assumptions in their absence. The quality of the optimizer's decisions depends on the accuracy of these statistics and therefore directly affects the performance of the ASAP application.

Updating the statistics on large tables (over 1 million rows) can take a long time. This can affect your online response time or cause "rollback segment too small" error messages.

Note:

When the Oracle statistics are collected on the SARM schema, you should also collect the histograms on the TBL_WKR_ORD.WO_STAT column.