6 Troubleshooting and Tuning Oracle WebLogic Server Proxy Plug-Ins

This chapter describes problems that you might encounter when using the WLS proxy plug-ins and explains how to resolve them.

Tuning Oracle HTTP Server for High Throughput for WebSocket Upgrade Requests

Oracle WebLogic Server 14c (14.1.2.0.0) supports deploying WebSocket applications. The 14.1.2.0.0 WLS OHS Plug-In can handle such WebSocket connection upgrade requests and effectively proxy to WebSocket applications hosted within Oracle WebLogic Server 14c (14.1.1.0.0) and later.

As a result of adding this support, a new configuration parameter WLMaxWebSocketClients is introduced.

The WLMaxWebSocketClients parameter limits the number of WebSocket connections that can be active at any one time. The maximum value you can set for this parameter is 75 percent of ThreadsPerChild (Windows) or 75 percent of MaxRequestWorkers (non-Windows). To tune Oracle HTTP Server for the maximum number of WebSocket connection upgrade requests, set MaxRequestWorkers (or ThreadsPerChild on Windows) to a value that accommodates the WebSocket connections as well as other requests, and set WLMaxWebSocketClients to 75 percent of that value.
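
The 75 percent rule can be checked with a short calculation. The following Python sketch is illustrative only: the function name is hypothetical, and rounding down to a whole number is an assumption, because the rounding behavior is not stated here.

```python
def max_websocket_clients(worker_limit: int) -> int:
    """Return 75 percent of MaxRequestWorkers (or ThreadsPerChild on Windows).

    Rounding down is an assumption made for this sketch.
    """
    if worker_limit < 1:
        raise ValueError("worker_limit must be a positive integer")
    return (worker_limit * 75) // 100


# Example: with MaxRequestWorkers set to 400, the ceiling is 300.
print(max_websocket_clients(400))
```

A value computed this way is an upper bound for WLMaxWebSocketClients; the remaining workers stay available for non-WebSocket requests.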

Understanding Connection Errors and Clustering Failover

When the proxy plug-in attempts to connect to Oracle WebLogic Server, the proxy plug-in uses several configuration parameters to determine how long to wait for connections to the Oracle WebLogic Server host and, after a connection is established, how long the proxy plug-in waits for a response.

If the proxy plug-in cannot connect or does not receive a response, the proxy plug-in attempts to connect and send the request to the other Oracle WebLogic Server instances in the cluster. If the connection fails or there is no response from any Oracle WebLogic Server in the cluster, an error message is sent. For an illustration of how the proxy plug-in handles failover, see Figure 6-1.

Possible Causes of Connection Failures

Failure of the Oracle WebLogic Server host to respond to a connection request could indicate the following problems:

  • Physical problems with the host machine (such as power outages, hardware malfunction, operating system crash, and so on).
  • Network problems.
  • Other server failures.

Failure of an Oracle WebLogic Server instance to respond could indicate the following problems:

  • Oracle WebLogic Server is not running or is unavailable.
  • A hung server.
  • A database problem.
  • An application-specific failure.

Tips for Reducing CONNECTION_REFUSED Errors

Under load, a proxy plug-in may receive CONNECTION_REFUSED errors from a back-end Oracle WebLogic Server instance. For example, an error such as the following is written to the log file:

weblogic: Trying GET /uri at backend host 'xx.xx.xx.xx/port; got exception 'CONNECTION_REFUSED [os error=xxx, line xxxx of URL.cpp]: apr_socket_connect call failed with error=xxx, host=xx.xx.xx.xx, port=xxxx'

Oracle WebLogic Server might have reached the maximum allowed backlog connections. Follow these tuning tips to reduce CONNECTION_REFUSED errors:

  • Increase the AcceptBackLog setting in the configuration of your Oracle WebLogic Server domain.
  • Decrease the time wait interval. This setting varies according to the operating system you are using. For example, on Linux, set the net.ipv4.tcp_fin_timeout parameter to a lower value in the /etc/sysctl.conf file.
  • Increase the open file descriptor limit on your machine. This limit varies by operating system. You can write a script that raises the limit by using the limit (csh) or ulimit (sh) command.
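
Before raising the open file descriptor limit, it can help to see the current values. The following Python sketch reads the soft and hard limits through the standard resource module (UNIX only). The function name is hypothetical, and the values reported are those of the process that runs the script, which may differ from the limits of a server started by another user or launcher.

```python
import resource


def open_file_limits() -> dict:
    """Report the soft and hard open file descriptor limits of this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit can never exceed the hard limit, so any difference
    # means the soft limit can still be raised without extra privileges.
    return {"soft": soft, "hard": hard, "can_raise_soft": soft != hard}


print(open_file_limits())
```

If the soft limit is below the hard limit, a ulimit call in the startup script is usually enough; raising the hard limit requires administrator privileges.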

Failover with a Single, Non-Clustered Oracle WebLogic Server

If you run only a single Oracle WebLogic Server instance, the proxy plug-in attempts to connect only to the server defined with the WebLogicHost parameter. If an attempt fails, the proxy plug-in keeps trying to connect to that same Oracle WebLogic Server instance, up to the maximum number of retries given by the ratio of ConnectTimeoutSecs to ConnectRetrySecs. If all attempts fail, an HTTP 503 error message is returned.
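
As a worked example of the retry ratio, the following Python sketch divides one value by the other. The function name and the sample values (10 and 2 seconds) are illustrative, and integer division is an assumption about how a fractional ratio is handled.

```python
def max_connect_retries(connect_timeout_secs: int, connect_retry_secs: int) -> int:
    """Return ConnectTimeoutSecs / ConnectRetrySecs as a whole number of retries."""
    if connect_retry_secs <= 0:
        raise ValueError("connect_retry_secs must be greater than zero")
    # Integer division is an assumption made for this sketch.
    return connect_timeout_secs // connect_retry_secs


# Example: ConnectTimeoutSecs 10 and ConnectRetrySecs 2 allow 5 retries.
print(max_connect_retries(10, 2))
```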

The Dynamic Server List

The WebLogicCluster parameter is required to proxy to a list of back-end servers that are clustered, or to perform load balancing among non-clustered managed server instances.

In the case of proxying to clustered managed servers, when you use the WebLogicCluster parameter to specify a list of Oracle WebLogic Servers, the proxy plug-in uses that list as a starting point for load balancing among the members of the cluster. After the first request is routed to one of these servers, a dynamic server list is returned containing an updated list of servers in the cluster.

The updated list adds any new servers in the cluster and removes any servers that have been shut down, are being suspended, are no longer part of the cluster, or have failed to respond to requests. You control this feature with the DynamicServerList parameter. For example, to disable this feature, set DynamicServerList to OFF.

Setting DynamicServerList to ON is the preferred choice for performance tuning. It is useful, for example, when a member of a cluster is temporarily down for maintenance or when administrators add another member, because the web server does not need to be restarted.
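
The effect of a dynamic server list update can be pictured as a comparison between the configured list and the list returned by the cluster. The following Python sketch is a simplified model, not the plug-in's implementation; the function name and host names are hypothetical.

```python
from typing import List, Tuple


def diff_server_lists(configured: List[str], dynamic: List[str]) -> Tuple[List[str], List[str]]:
    """Return (added, removed) servers when the dynamic list replaces the configured one."""
    added = [s for s in dynamic if s not in configured]
    removed = [s for s in configured if s not in dynamic]
    return added, removed


# Hypothetical starting list, as it might appear in WebLogicCluster.
configured = ["wls1.example.com:7001", "wls2.example.com:7001"]
# Hypothetical list returned after the first request is routed.
dynamic = ["wls2.example.com:7001", "wls3.example.com:7001"]

added, removed = diff_server_lists(configured, dynamic)
print("added:", added)
print("removed:", removed)
```

In this model, wls3 joins the rotation and wls1 leaves it without any change to the web server configuration.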

Note:

If DynamicServerList is set to ON and the back-end Oracle WebLogic Servers specified in WebLogicCluster are not in a cluster, the behavior is undefined.

Failover, Cookies, and HTTP Sessions

When a request contains session information stored in a cookie or in the POST data, or encoded in a URL, the session ID contains a reference to the specific server instance in which the session was originally established (called the primary server). A request containing a cookie attempts to connect to the primary server. If that attempt fails, the proxy plug-in attempts to make a connection to the next available server in the list in a round-robin fashion. That server retrieves the session from the original secondary server and makes itself the new primary server for that same session. See Figure 6-1.
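
The order in which servers are tried for a request that carries a session can be sketched as follows. This Python example is a simplified model under stated assumptions: the function name, the start_index rotation counter, and the host names are hypothetical, and the real plug-in derives the primary server from the session ID rather than from an argument.

```python
from typing import List, Optional


def connection_order(servers: List[str], primary: Optional[str], start_index: int = 0) -> List[str]:
    """Return the primary server first, then the rest in round-robin order."""
    n = len(servers)
    # Rotate the list so that round-robin selection starts at start_index.
    rotated = [servers[(start_index + i) % n] for i in range(n)]
    order = [primary] if primary in servers else []
    order += [s for s in rotated if s != primary]
    return order


servers = ["wls1.example.com:7001", "wls2.example.com:7001", "wls3.example.com:7001"]
print(connection_order(servers, "wls2.example.com:7001", start_index=2))
```

A request without session information has no primary server, so in this model it simply follows the rotated list.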

Note:

If the POST data is larger than 64K, the proxy plug-in will not parse the POST data to obtain the session ID. Therefore, if you store the session ID in the POST data, the proxy plug-in cannot route the request to the correct primary or secondary server, resulting in possible loss of session data.

In Figure 6-1, the maximum number of retries allowed in the red loop is equal to ConnectTimeoutSecs/ConnectRetrySecs.

Failover Behavior When Using Firewalls and Load Directors

In some configurations that use combinations of firewalls and load-directors, any one of the servers (firewall or load-directors) can accept the request and return a successful connection while the primary instance of Oracle WebLogic Server is unavailable. After attempting to direct the request to the primary instance of Oracle WebLogic Server (which is unavailable), the request is returned to the proxy plug-in as "connection reset."

Requests running through combinations of firewalls (with or without load-directors) are handled by Oracle WebLogic Server. In other words, responses of connection reset fail over to a secondary instance of Oracle WebLogic Server. Because responses of connection reset fail over in these configurations, servlets must be idempotent. Otherwise, transactions may be processed more than once.

Oracle WebLogic Server Session Issues

The WLS proxy plug-in routes requests to a back-end Oracle WebLogic Server instance or cluster. Oracle WebLogic Server maintains sessions so that subsequent requests from the same client are routed to the same server. However, if the WLS proxy plug-in cannot communicate with Oracle WebLogic Server for any reason, the request is handled in one of the following ways:

  • If the request is routed to a single Oracle WebLogic Server instance, the WLS proxy plug-in continues trying to connect to that same Oracle WebLogic Server instance for the maximum number of retries as specified by the ratio of ConnectTimeoutSecs and ConnectRetrySecs. If all attempts fail, an HTTP 503 error message is returned to the client.
  • If the request is routed to the WebLogic cluster, the current Oracle WebLogic Server is marked as bad, and the request is routed to the next available Oracle WebLogic Server. If all attempts fail, an HTTP 503 error message is returned to the client.
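
The cluster case above can be sketched as a loop that marks each unreachable server as bad and moves to the next one. This Python example is a simplified model, not the plug-in's code; the function name, the try_connect callback, and the host names are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def route_with_failover(
    servers: List[str],
    try_connect: Callable[[str], bool],
) -> Tuple[Optional[str], List[str]]:
    """Try each server in order; return the first that accepts, plus those marked bad."""
    marked_bad: List[str] = []
    for server in servers:
        if try_connect(server):
            return server, marked_bad
        # The server did not accept the request, so mark it bad and move on.
        marked_bad.append(server)
    # No server accepted the request; the caller answers with HTTP 503.
    return None, marked_bad


servers = ["wls1.example.com:7001", "wls2.example.com:7001", "wls3.example.com:7001"]
reachable = {"wls3.example.com:7001"}  # hypothetical cluster state

chosen, bad = route_with_failover(servers, lambda s: s in reachable)
print(chosen if chosen is not None else "HTTP 503", bad)
```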

In addition to sending an HTTP 503 error message, the plug-in returns the following text in the response to the HTTP client:

Failure of Web Server bridge:
No backend server available for connection: timed out after xx seconds or idempotent set to OFF or method not idempotent.

NO_RESOURCES Errors

Occasionally, under stress conditions, a few requests might fail, and the following error is written to the error log file:

weblogic: *******Exception type [NO_RESOURCES] (apr_socket_connect call failed with error=70007, host=xx.xx.xx.xx, port=xxxx) raised at line xxxx of URL.cpp

This usually occurs when Oracle WebLogic Server is too busy to respond to the connect request from the WLS proxy plug-in. To resolve the problem, set WLSocketTimeoutSecs to a higher value so that the WLS proxy plug-in waits longer for Oracle WebLogic Server to respond to the connect request.
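
To judge how often these errors occur and which back-end servers are affected, the log lines can be counted per server. The following Python sketch assumes that each line follows the shape of the message shown above; the regular expression, the function names, and the sample host, port, and line number are illustrative, so adjust the pattern to match your own log files.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Pattern modeled on the NO_RESOURCES message shown above (an assumption).
EXCEPTION_LINE = re.compile(
    r"Exception type \[(?P<type>[A-Z_]+)\] "
    r"\(.*?error=(?P<error>\d+), host=(?P<host>[^,]+), port=(?P<port>\d+)\)"
)


def parse_exception(line: str) -> Optional[dict]:
    """Extract the exception type, error code, host, and port, or return None."""
    match = EXCEPTION_LINE.search(line)
    return match.groupdict() if match else None


def count_by_server(lines: Iterable[str]) -> Counter:
    """Count matching log lines per (exception type, host, port)."""
    counts: Counter = Counter()
    for line in lines:
        info = parse_exception(line)
        if info:
            counts[(info["type"], info["host"], info["port"])] += 1
    return counts


sample = (
    "weblogic: *******Exception type [NO_RESOURCES] (apr_socket_connect call "
    "failed with error=70007, host=192.0.2.10, port=7001) raised at line 1234 of URL.cpp"
)
print(count_by_server([sample, "unrelated line"]))
```

A count that is concentrated on one host suggests a problem with that server; a count that is spread evenly suggests general overload.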

POST Data Files Issues

The temporary POST file is located under /tmp/_wl_proxy on UNIX. On Windows, if WLTempDir is not specified, it is located as follows:

  • Environment variable TMP
  • Environment variable TEMP
  • C:\Temp

The /tmp/_wl_proxy directory is a fixed location and is owned by the HTTP Server user. When multiple HTTP Servers are installed by different users, some of them might not be able to write to this directory, which results in an error.

To correct this condition, use the WLTempDir parameter to specify a different location for the _wl_proxy directory for POST data files.
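
Before pointing WLTempDir at a new location, you can check that the HTTP Server user is able to write there. The following Python sketch is illustrative: the function name is hypothetical, and the script must be run as the same operating system user that runs the HTTP Server for the result to be meaningful.

```python
import os
import tempfile


def can_write_post_dir(parent_dir: str, name: str = "_wl_proxy") -> bool:
    """Return True if the current user can write POST data files under parent_dir."""
    target = os.path.join(parent_dir, name)
    if os.path.isdir(target):
        # The directory already exists, so it must be writable itself.
        return os.access(target, os.W_OK | os.X_OK)
    # Otherwise the parent must exist and allow the directory to be created.
    return os.path.isdir(parent_dir) and os.access(parent_dir, os.W_OK | os.X_OK)


# Example: check the system temporary directory for the current user.
print(can_write_post_dir(tempfile.gettempdir()))
```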