Understanding Server Configuration Parameters and Their Effect on Server Statistics

Technical Note

V2.0, 3 April 2012

Content

Introduction
“Time to Save” Process
“Time to Obtain” Process
The “Process Count” Server parameter
The “Process Idle Timeout” Server parameter
Additional Performance Statistics in ActiveVOS v9.0
Other important parameters influencing performance
What can be tweaked
Obtaining Troubleshooting Assistance
About Active Endpoints

Introduction

This tech note describes certain key monitoring properties shown in the ActiveVOS Admin console at Monitor > Server Monitoring > Server Statistics. These historical engine statistics are collected by ActiveVOS in memory and are aggregated in intervals. The default threshold period and evaluation frequency for collecting and aggregating statistics is five minutes. If these defaults need to be changed, they can be configured in the ActiveVOS Admin console at Admin > Configure Server > Monitoring Thresholds.
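To make the interval aggregation concrete, the following is a minimal sketch, not ActiveVOS code, of how timed samples could be grouped into five-minute intervals and reduced to a count, an average and a maximum. The function name aggregate_by_interval and the sample data are hypothetical; only the five-minute default comes from this note.

```python
from collections import defaultdict

def aggregate_by_interval(samples, interval_seconds=300):
    """Group (timestamp_seconds, value_ms) samples into fixed intervals.

    Returns {interval_start: {"count", "avg", "max"}}.
    The 300-second default mirrors the five-minute default described above.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Round the timestamp down to the start of its interval.
        start = int(timestamp // interval_seconds) * interval_seconds
        buckets[start].append(value)
    return {
        start: {
            "count": len(values),
            "avg": sum(values) / len(values),
            "max": max(values),
        }
        for start, values in sorted(buckets.items())
    }

# Hypothetical "time to save" samples: (seconds since start, milliseconds).
samples = [(10, 20), (120, 40), (299, 30), (300, 25), (450, 500)]
stats = aggregate_by_interval(samples)
```

In this made-up data the second interval contains one slow sample, which shows up clearly in its maximum while the average moves much less. That is why the maximum and average values are worth reading together.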

“Time to Save” Process

The “Time to Save Process” statistic is strictly the amount of time required to serialize and write all process state data (activity state, variables, etc.) to the database. If you are seeing outliers (times to save that are orders of magnitude above the average), these can only be attributed to the following two reasons:

1. The time required to obtain a database connection.

Suggested investigation: Are there enough connections in the pool?

Hint:

• The "Time to acquire database connection" would indicate this. If this parameter increases from time to time, then you can try increasing the size of the database connection pool, using your application server’s settings.

• You should also refer to the application server’s monitors to determine if there are any large pools of requests backing up. For Oracle WebLogic, refer to:

- http://download.oracle.com/docs/cd/E12840_01/wls/docs103/ConsoleHelp/taskhelp/jdbc/jdbc_datasources/MonitorDataSourceStatistics.html

- http://download.oracle.com/docs/cd/E12840_01/wls/docs103/ConsoleHelp/pagehelp/JDBCjdbcdatasour

“Time to Obtain” Process

The “Time to Obtain” process includes the time it takes to acquire a lock on a process as well as restore its state from storage if necessary.

The time it takes to obtain a process is useful for determining whether this operation trends significantly higher under load. The monitoring includes maximum and average values.

What’s involved:

1. Server threads first attempt to acquire a lock on a process. If another thread is holding the lock, the requesting thread needs to wait. A long wait indicates contention for the process lock. The following might be the reasons for contention:

a. The process is loaded on another node (i.e. the node on which the process was initially served is down for some reason, or the load balancer for some reason chose to load the process on a different node. In that case, the load balancer settings need to be verified).

b. Multiple requests are made to the same process.

c. Someone is loading state in the process viewer (Active Process Detail) while requests for the same process are active. This should not have much impact, but it is likely to add to the time to obtain the process, depending on factors such as the size of the process and the variables used.

2. If the process is already in memory on a given node, the thread simply acquires the lock and is done (i.e. the engine has “acquired the process”).

3. If the process is not in memory when requested and needs to be loaded into memory, the engine needs to read it (its definition, activity state, variables, etc.) from the database. In this case, the Process Count and Process Idle Timeout parameters should be reviewed. See the explanation below.
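The steps above can be sketched as a toy model. This is an illustration only and not the engine's implementation: the ProcessCache class, the fake_store loader and the returned elapsed time are all hypothetical names. The elapsed time covers both the lock wait (step 1) and any restore from storage (step 3).

```python
import threading
import time

class ProcessCache:
    """Toy model of 'obtain process': take the lock, then restore state if needed."""

    def __init__(self, load_from_store):
        self._load_from_store = load_from_store  # hypothetical loader callback
        self._in_memory = {}                     # process_id -> state
        self._locks = {}                         # process_id -> threading.Lock
        self._guard = threading.Lock()           # protects the two dicts above

    def _lock_for(self, process_id):
        with self._guard:
            return self._locks.setdefault(process_id, threading.Lock())

    def obtain(self, process_id):
        """Return (state, lock, elapsed_seconds). The caller must release the lock."""
        started = time.perf_counter()
        lock = self._lock_for(process_id)
        lock.acquire()                           # step 1: may wait on contention
        with self._guard:
            state = self._in_memory.get(process_id)
        if state is None:                        # step 3: not in memory, restore it
            state = self._load_from_store(process_id)
            with self._guard:
                self._in_memory[process_id] = state
        # step 2: if the state was already in memory, nothing more to do
        return state, lock, time.perf_counter() - started

loads = []

def fake_store(process_id):
    loads.append(process_id)                     # record each simulated database read
    return {"id": process_id, "variables": {}}

cache = ProcessCache(fake_store)
state, lock, first_elapsed = cache.obtain(101)   # restored from the fake store
lock.release()
state, lock, second_elapsed = cache.obtain(101)  # already in memory
lock.release()
```

Only the first call reaches the store. With a real database behind the loader, that first call is where the additional time to obtain would be spent.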

The “Process Count” Server parameter

The Process Count (a server configuration parameter, found at Admin > Configure Server > Server Properties) defines a threshold for how many processes should be loaded in memory at once. The “Process Count” is a soft limit: once it is reached, the server waits 5 seconds before allowing a further process to enter memory. Increasing Process Count avoids this 5-second delay by keeping more process instances and their state in memory, at the cost of higher memory requirements. Throughput is higher and response times are shorter when the process instance is already in memory and does not have to be brought in from disk, so increasing this value is desirable. To keep memory demands in check, monitor free heap space under a typical workload to see whether this value can be increased.

When there is “room” in the process manager, the process state is loaded from the database. The factors affecting performance are similar to those for “Time to Save Process” (number of connections available, size of process data, etc.).
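The effect of a soft limit can be illustrated with a small model. This sketch assumes one reading of the behaviour described above: a load requested while the limit is reached is delayed by the wait period and then admitted anyway. The class SoftLimitProcessManager and its fields are hypothetical, and the sleep function is injectable so the example runs without actually waiting.

```python
import time

class SoftLimitProcessManager:
    """Toy model of a soft 'Process Count' limit (assumed behaviour, see text)."""

    def __init__(self, process_count, wait_seconds=5.0, sleep=time.sleep):
        self.process_count = process_count
        self.wait_seconds = wait_seconds
        self._sleep = sleep          # injectable so the example runs instantly
        self.in_memory = set()
        self.delayed_loads = 0

    def load(self, process_id):
        """Bring a process into memory and return the seconds spent waiting."""
        if process_id in self.in_memory:
            return 0.0               # already resident: no wait, no reload
        waited = 0.0
        if len(self.in_memory) >= self.process_count:
            self._sleep(self.wait_seconds)   # soft limit: delay, then admit
            waited = self.wait_seconds
            self.delayed_loads += 1
        self.in_memory.add(process_id)
        return waited

# A limit of 2 with a no-op sleep; process ids are made up.
manager = SoftLimitProcessManager(process_count=2, sleep=lambda seconds: None)
waits = [manager.load(pid) for pid in (1, 2, 3, 1)]
```

In this model only the third load pays the delay. A larger process_count would remove it, at the price of holding more instances in memory.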

The “Process Idle Timeout” Server parameter

You can find the “Process Idle Timeout” server configuration parameter at Admin > Configure Server > Server Properties.

This setting specifies the time, in seconds, that a process instance must be idle before it becomes eligible to be purged from memory (thus freeing up one of the process instance slots, the number of which is set by Process Count) and subsequently persisted to the database. Too short a timeout can cause premature purging.

Hints:

• Generally, increasing the process idle time (i.e. lag time) will help ensure that processes remain in memory and avoid being reloaded from the database. The downside is that idle processes may take up room in memory for longer than they need to.

• In high-load situations, the idle timeout should ideally be the recommended default (10 seconds), so that a process can be swapped in when a request arrives but does not linger in memory needlessly for long periods.

• Another aspect of process idle time is that processes in memory have journal entries associated with them and are eligible for recovery. The more processes in memory, the longer your recovery time will be on server restart.

• If you notice a steady increase in heap usage, and the amount of free memory is going down steeply, it is advisable to reduce the idle timeout so that when the process idle timer goes off, the process state is saved in full and the process leaves memory.

• Only in-flight processes count towards the process count and process idle timeout. Any process that is in a final state (i.e. completed or faulted) is not kept in memory and is not counted.

• The only workload where a high setting (more than 50) of this value is appropriate is one where it is certain that *all* the process instances can be accommodated in memory simultaneously. An example would be an environment where the processes are very small and are bound to complete very quickly. In these cases, to achieve maximum responsiveness and performance, both the process count and the process idle timeout can be increased to a very large value (e.g. process count can start at 10000 and process idle timeout can be set to 60).
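The trade-off in these hints can be seen in a small sketch of idle-based eviction. It is a simplified model, not the server's algorithm. The function evict_idle and the timestamps are hypothetical; the 10-second default is the recommended value quoted above.

```python
def evict_idle(last_active, now, idle_timeout_seconds=10):
    """Split process ids into (kept, evicted) according to their idle time.

    last_active maps process_id -> time of last activity, in seconds.
    """
    kept, evicted = [], []
    for process_id, last_seen in sorted(last_active.items()):
        if now - last_seen >= idle_timeout_seconds:
            evicted.append(process_id)   # idle long enough: eligible to leave memory
        else:
            kept.append(process_id)
    return kept, evicted

# Made-up activity times; "now" is 112 seconds on the same clock.
last_active = {"p1": 100.0, "p2": 108.0, "p3": 95.0}
kept_default, evicted_default = evict_idle(last_active, now=112.0)
kept_long, evicted_long = evict_idle(last_active, now=112.0, idle_timeout_seconds=60)
```

With the 10-second timeout, two of the three processes leave memory. With a 60-second timeout all three stay resident, which avoids reloads but holds memory and lengthens recovery on restart.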

Additional Performance Statistics in ActiveVOS v9.0

In ActiveVOS version 9.0, we introduced many other performance and health monitoring capabilities. For instance, our 9.0 System Performance page includes the database connection pool statistics for application servers that provide a good level of monitoring for data sources.

Please refer to our documentation on this topic at:

http://infocenter.activevos.com/infocenter/ActiveVOS/v90/index.jsp?topic=/com.activee.rt.bpeladmin.enterprise.help/html/SvrUG3-3-4.html

Other important parameters influencing performance

There may be times when requests to the services are not dispatched as quickly as they should be, which may lead to a backlog of requests that in turn degrades the performance of the server. Below we describe things to look for that may help you identify the need to tune the server.

If the following values show an upward trend, the most likely cause is a lack of work manager threads available to process requests.

i) Under Server Monitoring > Server Statistics, does the "Work Manager Start Delay" increase over a period of time? (The default interval is 5 minutes. To change the interval, update it on the Admin > Monitoring Thresholds page.)

ii) Are there many backed-up requests, i.e. ‘Queued’, displayed at Monitor > Dispatch Service? Note that the Dispatch Service was introduced in v9.1 and helps throttle requests to the ActiveVOS engine.

iii) When there is a performance problem, use Monitor > System Performance and watch the metrics over a period of 5-10 minutes. The main areas to look at are under Monitor > System Performance > Node Monitoring: Work Manager, In-Memory Processes, and Unmatched Receives.

Work manager:

- Are there many idle threads, or is the idle thread count 0 (which indicates that no threads are available for new work at the moment)?

- A large Queued Request Count?

Unmatched Receives section:

- Many Timed-Out Messages?

- Does the Average Message Waiting Time (ms) increase over time?

at org.activebpel.rt.bpel.impl.AeBpelException: Timeout waiting for reply from process ID 0 (xxxx).
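Whether a statistic such as "Work Manager Start Delay" shows an upward trend can be judged by eye, or roughly with a least-squares slope over successive interval readings. The sketch below is one possible approach. The helper names, the readings and the min_slope threshold are illustrative assumptions and not values recommended by ActiveVOS.

```python
def slope_per_interval(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator

def is_trending_up(values, min_slope=1.0):
    """True when the readings rise by more than min_slope per interval on average."""
    return slope_per_interval(values) > min_slope

# Hypothetical readings in milliseconds, one per five-minute interval.
start_delay_ms = [5, 7, 12, 20, 35, 60]
steady_ms = [5, 6, 5, 6, 5, 6]
rising = is_trending_up(start_delay_ms)
flat = is_trending_up(steady_ms)
```

A slope is less sensitive to a single noisy reading than comparing only the first and last values, which is why it is used here.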

What can be tweaked

1. Increase the Work Manager Thread Pool Max Count:

You can try increasing the number of threads in the work manager. To do this, increase the max thread count from the default of 100 to a level high enough (say 300-400 to start with) to accommodate any foreseeable load, and then adjust the dispatch manager settings to throttle the number of executing threads (Max Concurrent) to a point well below the max thread count, so that you do not run right up against the limit.

2. You can create additional dispatch service configurations for your services or for the system services, so that requests to the services are stored and processed at a rate that can actually be handled by the server. The dispatch service also provides a way to limit the number of process instances that are calling the back-end service simultaneously.

Refer to

http://infocenter.activevos.com/infocenter/ActiveVOS/v91/index.jsp?topic=/com.activee.rt.bpeladmin.enterprise.help/html/SvrUG3-1-18.html

3. Look for evidence that a process spawns many other instances. If so, it is recommended to throttle requests for that particular process/service via dispatch configuration to a level at which the server can handle the generated requests.

Note that subprocess invokes bypass the dispatch manager entirely and are dispatched directly to the target process, so any throttling needs to be performed on the parent process.
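The relationship between the max thread count and Max Concurrent can be illustrated with a generic thread pool and a semaphore. This is a simplified analogy rather than the dispatch service itself: here a throttled request still occupies a pool thread while it waits, whereas the dispatch service queues requests. MAX_THREADS, MAX_CONCURRENT and handle_request are hypothetical names with deliberately small values.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 8        # stand-in for the work manager max thread count
MAX_CONCURRENT = 3     # stand-in for the dispatch "Max Concurrent" setting

throttle = threading.Semaphore(MAX_CONCURRENT)
counter_lock = threading.Lock()
state = {"running": 0, "peak": 0}

def handle_request(request_id):
    """Run one request while holding a throttle slot."""
    with throttle:                       # blocks while MAX_CONCURRENT are active
        with counter_lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        result = request_id * 2          # placeholder for real work
        with counter_lock:
            state["running"] -= 1
    return result

with ThreadPoolExecutor(max_workers=MAX_THREADS) as pool:
    results = list(pool.map(handle_request, range(20)))
```

However many pool threads exist, the number of requests doing work at once never exceeds MAX_CONCURRENT. That is the reason for setting the throttle well below the thread limit.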

Obtaining Troubleshooting Assistance

If you still need assistance, please submit a support request in the ActiveVOS support forums with the following information:

1. Server configuration: type and version of:

• ActiveVOS

• Application Server

• Database

• Operating System containing the application server JVM

• Any clusters involved?

2. Is the performance problem consistently reproducible, and does it occur even after a restart?

3. What is the approximate number of total processes in the system when the problem occurs? If it cannot be determined from the ActiveVOS console, you can query the AeProcess table to find the total count and the number of currently running processes (ProcessState=1).

4. Are there non-persistent processes involved in the application design?

5. Provide screen prints of the following from the ActiveVOS console:

• Export of the configuration from ActiveVOS Home > Server Status

• Monitor > Dispatch Service (main screen)

• Monitor > Dispatch Service > individual configurations, including the system default

• Monitor > Server Statistics, 3 screen prints captured every 5 minutes

• Monitor > System Performance, 3 screen prints captured every 5 minutes
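For item 3 above, the counts can be obtained with two simple queries. The sketch below runs them against an in-memory SQLite stand-in so that it is self-contained. Only the table name AeProcess and the condition ProcessState=1 come from this note. The ProcessId column, the sample rows and the other state value are made up for illustration, and the real table has more columns. Against a real installation you would run the same SELECT statements with your own database client.

```python
import sqlite3

# In-memory stand-in for the ActiveVOS database (illustrative schema only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AeProcess (ProcessId INTEGER PRIMARY KEY, ProcessState INTEGER)"
)
conn.executemany(
    "INSERT INTO AeProcess (ProcessId, ProcessState) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 3), (4, 1), (5, 3)],  # made-up rows
)

# Total number of processes in the system.
total = conn.execute("SELECT COUNT(*) FROM AeProcess").fetchone()[0]

# Currently running processes (ProcessState=1, as stated above).
running = conn.execute(
    "SELECT COUNT(*) FROM AeProcess WHERE ProcessState = 1"
).fetchone()[0]
conn.close()
```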



About Active Endpoints

Active Endpoints (www.activevos.com) ActiveVOS is the leader in service-oriented BPM software for process automation. ActiveVOS empowers project teams to create business process management (BPM) applications using services, making their businesses more agile and effective. ActiveVOS promotes mass adoption of SOA-enabled BPM applications by focusing on accelerating project delivery time with a complete, affordable and easy-to-use system. Active Endpoints is headquartered in Waltham, MA with development facilities in Shelton, CT.

To find out how Active Endpoints can help your business, visit http://www.activevos.com, call +1 781 547 2900 and press 1 for Sales, or email us at [email protected].