CAPACITY PLANNING FOR PEOPLESOFT


Yefim Somin and Leonid Gross

BMC Software, Inc.

Abstract

PeopleSoft is a complex, multi-functional, multi-tier application. Important questions about its configuration, load balancing and capacity planning need to be answered during its deployment and production use. This paper describes an approach to answering such questions for the most commercially important configurations of PeopleSoft (3-tier, 2-tier and mixed). It covers architectural features and available measurements relevant to performance analysis and capacity planning. Comprehensive treatment of all the information sources is necessary for the solution.

1. Performance analysis of an application

Before dealing with PeopleSoft, let us consider what it means to analyze performance and do capacity planning for an application. These activities can be subdivided into several related but distinct areas.

1.1 Workload characterization

This activity provides information for day-to-day performance management and monitoring, as well as the foundation for modeling what-if performance questions. It can also be used to derive data for sizing, as described in the next section. To characterize workloads (in real production environments there are usually multiple workloads to be distinguished), the following needs to be done:

• Resource consumption for different types of work, e.g., business functions, should be determined

• Relationships among different steps of transaction execution, i.e., different layers and servers should be established

• Counts of business units of work need to be obtained, or at least ratios of those counts in different situations

Under distributed systems, e.g., UNIX or NT, the user-visible representation of application work is a collection of system processes. Therefore, the first approach to the above tasks lies through identification of the role of relevant application processes and their relationships. As a first cut at workload characterization, an analyst should do the following:

• Identify the list of processes doing the work of the application under study

• Identify client-server relationships, if any, among the processes

• Determine if the user-relevant breakdown of work by business function could be specified at the process level


Additional measurements describing the workload composition may be provided by the database, if used, or by the instrumentation of the application itself. Most major databases provide some relevant information, whereas the level of application instrumentation varies widely from case to case.

1.2 Sizing using resource utilization profiles

This analysis requires resource utilization breakdown by specific types of business functions (a.k.a. units of work or transactions). Benchmarking is usually needed to obtain resource profiles for the appropriate types of work. The profiles include resource consumption at different servers, hardware and software, and are suitable inputs for analytic modeling. They allow the user to do sizing in the absence of a running system or to test-drive the effect of adding new loads to the loads already present.

1.3 Application architecture-specific modeling

Several areas of architecture can be listed:

• memory (buffers) use and the resulting saturation behavior

• internal application scheduling and prioritization

• number and type of configured service components

These architectural features usually require targeted benchmarking and additional algorithms to represent them.

Little has been published about capacity planning for PeopleSoft, and especially about the most common configurations moving into corporate IS. One recent paper, [CHU98], addresses derivation and modeling of service demand for low-level requests resulting from application activities such as update, insert, etc. However, the problem of characterizing workload on a working system, as described in Section 1.1 (Workload characterization), was not covered in that paper nor, to the authors’ knowledge, anywhere else to date. This problem is the main topic of the present paper.

2. PeopleSoft architecture

In order to approach workload characterization, relevant architectural attributes of PeopleSoft need to be presented.

Until version 7, PeopleSoft was typically used as a 2-tier system where clients would be connected directly to the database server. Occasionally, ad hoc middleware would be introduced between clients and the database as an application server. Starting with Version 7 (the current PeopleSoft version), PeopleSoft provided, in addition to the 2-tier client/server architecture, a 3-tier model with a PC client and a TUXEDO application server. This introduced a level of standardization into the 3-tier configuration and made its capacity planning important for an increasing number of users. The major advantages of the 3-tier configuration are significant performance gains (especially with PeopleSoft 7.5) and a reduction in network traffic.

At the same time, 2-tier and 3-tier versions can and do coexist in the same implementation at most sites. Hence one needs to consider both versions and design an approach valid in a mixed environment (see Fig. 1).

[Figure 1. 2-tier and 3-tier configurations. Diagram labels: PC Clients, Web Client and Web Server; Application Server A (TUXEDO Domain X, Payroll); Application Server B (TUXEDO Domain Y, Manufact.); DB Server with RDBMS Instance; PC Clients running PS Tools in the 2-tier configuration.]

2.1 3-tier architecture

The 3 tiers are:

• Users (client workstations, typically PCs, web users, etc.)

• Application Server (TUXEDO)

• Database Server

User requests are initiated at the user tier and directed to the Application Server where they undergo appropriate pre-processing. Any resulting database requests are packaged and shipped to the Database Server under the direction of the Application Server. Responses are post-processed at the Application Server and delivered to the User.

2.1.1 User Tier

This tier usually runs on a desktop PC or across the web, therefore no workload breakdown characterization is necessary at that level. Potential delays at the client or the network affecting performance as seen by this tier could be included in the overall service modeling but they are currently beyond the scope of this paper.

2.1.2 Application Server Tier

This tier is TUXEDO, a middleware product from BEA, configured for use by PeopleSoft. A TUXEDO system has a hierarchical structure with several layers listed, top down, below:

• Domain

• Machine

• Group

• Server

There may be several units of the lower layer for each one at the higher layer. PeopleSoft does not use more than one application server Machine per Domain. All of the above layers are logical layers within TUXEDO. The physical configuration may be mapped to the logical one in different ways, e.g., several physical systems per Domain or several Domains per physical system, with the database tier on a separate or the same node.

PeopleSoft has a number of different modules, e.g. Human Resources, Manufacturing, etc., for which typically a separate domain is allocated. If this scheme is followed, separation by Domain is equivalent to separation by the type of work. Each unit of the lowest layer, Server, is represented by one specific system process (see Fig. 2).

Each Domain has a number of OS processes with distinct names doing the work:

• BBL (Bulletin Board Liaison Server), 1 per Domain

• WSL, WSH (Workstation Listeners and Handlers), several could be configured

• PSAPPSRV, PSAUTH, PSQCKSRV, PSSAMSRV (PeopleSoft Application Servers), a configurable number

• JOBMAN, JRAD, JRLY, JREPSVR, JSH, JSL (web connection handlers)

If more than one domain exists on the same physical host, processes belonging to different domains can be distinguished because the full command names of these processes include their domain names.
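The grouping of server processes by Domain can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper says that full command names include the domain name but does not give their layout, so the command strings, the domain names PAYROLL and MANUF, and the function name cpu_by_domain below are all hypothetical.

```python
from collections import defaultdict

# Server process names listed in the text for a TUXEDO domain.
TUXEDO_SERVERS = {
    "BBL", "WSL", "WSH", "PSAPPSRV", "PSAUTH", "PSQCKSRV", "PSSAMSRV",
    "JOBMAN", "JRAD", "JRLY", "JREPSVR", "JSH", "JSL",
}


def cpu_by_domain(samples, domains):
    """Sum CPU seconds of TUXEDO server processes per domain.

    samples: iterable of (full_command, cpu_seconds) pairs from any OS collector.
    domains: known domain names to look for inside the full command string.
    """
    totals = defaultdict(float)
    for command, cpu in samples:
        tokens = command.split()
        if not tokens:
            continue
        # Strip any directory prefix from the executable name.
        name = tokens[0].rsplit("/", 1)[-1]
        if name not in TUXEDO_SERVERS:
            continue  # not part of the application server tier
        domain = next((d for d in domains if d in command), "unassigned")
        totals[domain] += cpu
    return dict(totals)


# Hypothetical samples: the command-line layout is invented for illustration.
samples = [
    ("PSAPPSRV -d PAYROLL", 120.0),
    ("PSAUTH -d PAYROLL", 5.0),
    ("WSH -d MANUF", 40.0),
    ("PSAPPSRV -d MANUF", 60.0),
    ("oracleHRPROD (LOCAL=NO)", 300.0),  # database process, ignored here
]
domain_cpu = cpu_by_domain(samples, ["PAYROLL", "MANUF"])
print(domain_cpu)
```

If each domain is dedicated to one PeopleSoft module, the resulting totals are also a breakdown by business function.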

[Figure 2. TUXEDO Logical Hierarchy. Diagram labels: TUXEDO System; Domain X (Payroll), Domain Y (Manufact.), Domain Z (Supply Chain); Machine L; Group BASE, Group APPSW; Servers - Processes: WSL PID=54, PSAUTH PID=78, PSAPPSRV PID=97, PSQCKSRV PID=110.]

2.1.3 Database Server Tier

This is a commercial RDBMS. The processes that actually perform the database work are usually well known and could be easily identified at the OS level. For Oracle, for instance, servers and background processes should be included in the list.

2.2 2-tier architecture

2-tier users access the database directly bypassing any application servers. There are two types of 2-tier users, UNIX-based and PC-based.

2.2.1 UNIX-based 2-tier users

These users usually do batch and power query processing. In the Oracle environment, they are typically connected to the database through a dedicated server. This means that database processes associated with this work run under the user names of the owners of batch jobs and are in fact children of the front-end processes. Therefore, UNIX processes permit identification of both the front-end and the back-end resource use for this work. Processes participating in this work are described below.

PSRUNxxxx – PeopleSoft process (or batch) scheduler, where xxxx is the Oracle instance name. The process scheduler is implemented as a UNIX daemon for each database instance and usually runs under the UNIX user who owns the PeopleSoft installation. It is always running in the system, waiting for jobs to submit, and always uses an Oracle dedicated server connection (two-task). Therefore, batch jobs show the following structure: PSRUNxxxx processes, owned by the UNIX user in charge of batch jobs, and their descendants, most importantly processes called oracle owned by the same user. All these processes are part of the batch workload.

sqr – a tool widely used in the PeopleSoft environment for report generation. It can be, and usually is, run under different UNIX user names, though the list of UNIX users who actually use the tool is limited; these are typically power users or PeopleSoft developers.
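Because the database processes for this work are descendants of the front-end processes, the batch workload can be collected by walking the process tree. The Python sketch below illustrates the idea on (pid, ppid, user, command) tuples such as any OS collector could supply; the function name batch_workload, the user names and the PSRUNHR instance suffix are hypothetical.

```python
from collections import defaultdict


def batch_workload(processes, batch_users):
    """Return the PIDs of PSRUN*/sqr front-ends owned by batch users
    together with all of their descendants.

    processes: iterable of (pid, ppid, user, command_name) tuples.
    batch_users: set of UNIX user names that own batch jobs.
    """
    children = defaultdict(list)
    info = {}
    for pid, ppid, user, name in processes:
        children[ppid].append(pid)
        info[pid] = (user, name)

    # Front-end processes: the scheduler daemon and the report tool.
    roots = [
        pid for pid, (user, name) in info.items()
        if user in batch_users and (name.startswith("PSRUN") or name == "sqr")
    ]

    # Depth-first walk to pick up every descendant, e.g. oracle servers.
    selected = set()
    stack = list(roots)
    while stack:
        pid = stack.pop()
        if pid in selected:
            continue
        selected.add(pid)
        stack.extend(children.get(pid, []))
    return selected


# Hypothetical process table.
processes = [
    (100, 1, "hrprod", "PSRUNHR"),
    (101, 100, "hrprod", "oracle"),
    (200, 1, "jsmith", "sqr"),
    (201, 200, "jsmith", "oracle"),
    (300, 1, "oracle", "oracle"),  # not a child of a batch front-end
]
batch_pids = batch_workload(processes, {"hrprod", "jsmith"})
print(sorted(batch_pids))
```

Summing the CPU time of the selected PIDs then gives both the front-end and the back-end resource use of the batch workload.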

2.2.2 PC-based 2-tier users

PC-based 2-tier users run PeopleSoft-supplied clients and connect directly to the database. Their work is reflected in the measurements obtained from the RDBMS through the well-known v$session tables. In particular, a resource consumption breakdown by application can be obtained for these users’ work. The names of these applications are the same as the executables for their PC front-ends, e.g., PSTOOLS.EXE, PSIDE.EXE, etc. Since the names of PeopleSoft executables on the PC client are well known (they usually end with .EXE), they are easily identifiable. The following application (executable) names are typical for PeopleSoft V 7:

PSTOOLS.EXE - PeopleSoft front end; all interactive functions are performed through it

CRW32.EXE - Crystal Reports; a report generating tool

PSNVS.EXE - NVision; a reporting tool with an interface to Microsoft Excel

PSPM, PSPMD.EXE - PeopleSoft Process Monitor

PSQED.EXE - a PeopleSoft query tool

PSIDE.EXE - PeopleSoft application designer tool

All application processes mentioned above access Oracle as user SYSADM, so the user name cannot be used directly to form workloads. The reason is the PeopleSoft security mechanism: each user signs on to the PeopleSoft database through the front end under his or her own name and is then switched by PeopleSoft to SYSADM with the proper clearance (there are many clearance classes in the PeopleSoft database: Administrator, Payroll Clerk, Benefit Administrator, etc.).

nhttp.EXE - these processes run for the Web interface; several pre-spawned connections can be created, through which all Intranet users get their data.


3. Modeling PeopleSoft

3.1 Modeling a pure 3-tier environment

The approach to modeling a 3-tier environment is to treat it as a client-server configuration with, potentially, several Application Servers as clients of one Database Server. The PeopleSoft Application Server (TUXEDO) currently does not provide measurements of resource consumption; therefore we need to look at other sources of information.

3.1.1 Measurements available for Application Server

Since TUXEDO application services are performed by processes we can identify (see the list above), all the process-based resource measurements from the Operating System are available. In particular, we can identify resources consumed by each Domain. If a PeopleSoft installation is configured such that the work is broken into domains based on functional areas, e.g., HR, Materials Management, etc., we obtain resource consumption by business function at the same time.

3.1.2 Measurements available for Database Server

The Database Server is measured at 2 levels, Operating System and RDBMS. At the OS level, one can determine all of the processes that do the work and consequently consume resources for the database. For Oracle, for instance, these are usually processes whose command name begins with ora.

In RDBMS measurements, there is an additional resource breakdown of database work. This breakdown is described below. It is not utilized for the pure 3-tier case: different types of 3-tier work are simply not reflected in this breakdown. It is useful, however, for the 2-tier load separation and breakdown described further on in this paper.

The available RDBMS measurements and their utility in PeopleSoft environment are as follows:

USER

All PeopleSoft users are represented in Oracle as one user SYSADM (PeopleSoft schema owner). For this reason the use of the USER field is not helpful for creating workloads.

APPLICATION

PC-based 2-tier applications (non-TUXEDO) identify themselves to Oracle if the Oracle PC configuration file (ORACLE.INI) contains their ID fields; therefore their resource use is represented in Oracle measurements. 3-tier requests (coming from TUXEDO) do not identify their application (as observed so far) and so do not allow a resource breakdown of the work.

3.1.3 What questions can be answered for a 3-tier environment

Once a workload characterization description is created according to the above guidelines, collected data can be processed, using appropriate tools, and resource use for different services and types of activities can be distributed into corresponding buckets. Using analytic queueing methods, such as the ones described in [LAZO84], the following types of questions can be answered:

• what domains/functions consume what resources, day-to-day or for special situations

• how will performance parameters (resource utilization, relative response time) change if the load on any of the domains changes

• same if load/functionality is moved around, consolidated or split

• same if the underlying hardware is


3.2 Separating relevant RDBMS instances

There may be more than one instance of the database running on the Database Server host. Not all of them may be involved in doing PeopleSoft 3-tier work. Only the work of the relevant instances should be included in PeopleSoft workloads. This separation is possible because the full command name of a database process usually contains a string with the instance name.

3.3 Separating the 2-tier work

If both 3-tier users and 2-tier users send requests to PeopleSoft, one may want to separate the work done on behalf of the 3-tier environment. This would be needed, for instance, when what-if questions about load changes assume different growth rates for 2-tier and 3-tier loads. As mentioned above, there are 2 types of 2-tier users, UNIX-based and PC-based.

3.3.1 Separating UNIX 2-tier users

As mentioned in the architecture section, database processes associated with this work run under the user names of the owners of batch jobs. Therefore, RDBMS work for batch can be separated by identifying the database processes owned by batch users. The remainder of the database work is attributable to the 3-tier work and non-UNIX 2-tier users.

3.3.2 Separating PC 2-tier users

While the measurements from the RDBMS are not very helpful for the 3-tier load, the APPLICATION type measurements are useful in reflecting the work of PC-based 2-tier users. As mentioned in the architecture section, each type of PC client has its RDBMS resource consumption summarized by PC-side executable. Thus, both the breakdown and the total amount of database work on behalf of such clients are provided. Since all resource utilization by database applications with the well-known names listed above is caused by PC-based 2-tier users, it should be subtracted from the total database resources. The remainder, once the UNIX batch work of Section 3.3.1 has also been removed, represents 3-tier related database work.

3.4 Using additional information from TUXEDO

TUXEDO is instrumented to count requests to specific servers (PSAPPSRV, PSAUTH, etc. for PeopleSoft) and place them in a MIB [BEA97]. In the case of PSAUTH, for instance, this count is the number of requests for authorization. In the case of PSAPPSRV it may be requests from a lengthy list of different types. These request counts (Requests Done) can be rolled up to the higher levels of the hierarchy (Groups, Machines, Domains). Note that there is only one Machine per Domain in PeopleSoft, so Machine level counts can be used to define the number of transactions per Domain. Alternatively, since each service can be linked to a specific system process, a finer granularity of breakdown can be achieved by calculating TUXEDO resource utilization per unit of work (Requests Done) for different services.
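The per-unit-of-work calculation amounts to dividing the CPU time of each server process by its Requests Done count over the same interval. A minimal Python sketch follows; the interval totals and the function name cpu_per_request are hypothetical, and the result is an average that treats all requests handled by one server as equal in weight.

```python
def cpu_per_request(cpu_seconds, requests_done):
    """Average CPU seconds per request for each TUXEDO server.

    cpu_seconds: server name -> CPU seconds from OS process measurements.
    requests_done: server name -> Requests Done count for the same interval.
    Servers with no completed requests or no CPU data are skipped.
    """
    return {
        server: cpu_seconds[server] / count
        for server, count in requests_done.items()
        if count > 0 and server in cpu_seconds
    }


# Hypothetical totals for one measurement interval.
cpu_seconds = {"PSAPPSRV": 90.0, "PSAUTH": 2.0, "PSQCKSRV": 6.0}
requests_done = {"PSAPPSRV": 600, "PSAUTH": 100, "PSQCKSRV": 0}

profile = cpu_per_request(cpu_seconds, requests_done)
for server, seconds in profile.items():
    print(f"{server}: {seconds * 1000:.0f} msec per request")
```

Both inputs must cover the same measurement interval, otherwise the ratio is meaningless.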

While counts of requests are provided, relative resource use weights of transactions from different domains are not. This is true for TUXEDO-level resources, with the exception of the possibility of breaking down by Domains/Services. It is also true at the database level. It appears at the time of this writing that both BEA (the owner of TUXEDO) and PeopleSoft are planning to improve their performance instrumentation, possibly using ARM (Application Response Measurement); see, e.g., [DING97]. This would allow them to catch up with SAP R/3, where workload statistics instrumentation is part of the system (CCMS). In the absence of this level of instrumentation, the alternatives are

• assume that all requests have equal resource weight (within the chosen granularity entity)

• conduct experiments for specific types of requests similar to the ones described in [CHU98], except at a different hierarchy level, e.g., type of request as opposed to SQL statement


3.5 Modeling a 2-tier PeopleSoft environment

In the previous sections techniques to separate 2-tier work from 3-tier work were described. This section summarizes analysis of 2-tier work.

3.5.1 Modeling UNIX 2-tier work

Looking back at the corresponding architecture section, one simply needs to describe a workload which combines the work done by batch tools (PSRUN, sqr) and their child processes, including the ones doing database work (e.g. ora…) and owned by batch users. Batch front-ends are clients here, while database processes constitute the server. This assignment naturally requires knowledge of batch user names.

3.5.2 Modeling PC 2-tier work

For the PC-based 2-tier work, only the database part is considered here. The breakdown of that work available with the current instrumentation is based on the client-application (PSTOOLS, PSQED, etc.). As mentioned in the architecture section, this resource breakdown is provided by the RDBMS where each type of client software is treated as a separate application. If an effort is made to include the name of every PC user into the oracle.ini file on the PC, user names could also be employed for workload characterization.

4. Examples

4.1 Sample resource breakdown

Figure 3 demonstrates a sample resource breakdown for a mixed 2-tier and 3-tier configuration. The environment consists of a UNIX system hosting both TUXEDO and the database. The system has 6 processors, which is why CPU utilizations are computed out of 600%. There is some TUXEDO work, some batch jobs, and some requests by PC clients connected directly to the database. To obtain the database resource breakdown, the following steps have been taken.

• The total CPU utilization of all database processes happens to be 309%. The batch jobs present (PSRUNxxx) are all owned by the UNIX user hrprod; therefore all the database processes owned by that user need to be separated and included in the batch workload (55% CPU). The rest of the database work is broken down based on database measurements.

• As described in the paper, resource consumption of applications is measured by the database. In this case, we have several PS clients running PSTOOLS and other utilities for a total database CPU consumption of 153% (we have chosen to separate PSTOOLS and combine all the others).

• The rest of the work, as reported by the database, is done by database user SYSADM and belongs to an undefined application. This is the work attributed to TUXEDO requests.

As mentioned in the paper, a finer breakdown of TUXEDO work by servers is available. In addition, a way to match the server work breakdown with the corresponding database work is currently being investigated.
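The subtraction steps above can be checked with a few lines of Python. The percentages are the ones quoted in this example and in Figure 3; the variable names are illustrative only.

```python
N_CPUS = 6            # processors in the sample system (utilization out of 600%)
total_db = 309.0      # CPU % of all database processes, from OS measurements
batch = 55.0          # database processes owned by UNIX user hrprod
pc_apps = {           # PC 2-tier work, from database APPLICATION measurements
    "PSTOOLS.EXE": 102.0,
    "Other PS*.EXE": 51.0,
}

pc_total = sum(pc_apps.values())
# Whatever is neither batch nor PC 2-tier work is attributed to 3-tier requests.
three_tier = total_db - batch - pc_total

breakdown = {"batch (UID=hrprod)": batch, **pc_apps,
             "3-tier (SYSADM, app undefined)": three_tier}
for name, pct in breakdown.items():
    print(f"{name:32s} {pct:6.1f}%  ({pct / N_CPUS:5.1f}% of the machine)")
```

The remainder comes to 101%, which matches the figure reported for SYSADM work with an undefined application.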

[Figure 3. 2 – 3-tier sample resource breakdown. Diagram labels: PC Clients (PS Tools), TUXEDO processes and PSRUN* UNIX processes (UID=hrprod) on one UNIX system with the database. UNIX measurements: ora* UNIX processes with UID=hrprod, 55%. Database measurements: PSTOOLS.EXE 102%; other PS*.EXE 51%; SYSADM (APP undefined) = 3-tier work, 101%.]

4.2 Sample workload resource utilization profiles

To obtain resource utilization profiles for specific types of activities, to enable sizing and capacity planning, one needs to conduct controlled experiments and measure resources as described earlier in this paper. The results of 2 such experiments are presented in Table 1. They include CPU consumption for 2 queries: disability benefits and payroll hours. The queries were run in a 3-tier environment (with TUXEDO). No extraneous activity was present. Resource consumption was obtained from OS and RDBMS data, whereas the query counts were derived from TUXEDO statistics.

Query                 TUXEDO CPU   Oracle CPU

Payroll hours         143 msec     101 msec

Disability benefits   53 msec      1672 msec

Table 1. Transaction CPU utilization profiles

One can see that the benefits query is much heavier on the database than the payroll one, a useful piece of knowledge for configuring and balancing loads between application and database servers. Further analysis, not reflected in the table, shows that most of the app server work for the payroll query was done by a PSAPPSRV process, the main server engine of TUXEDO. On the other hand, about half of the app server work for the benefits query was due to WSH, the workstation handler. The database used in the tests was the small default sample that comes with PeopleSoft, therefore the amount of I/O was negligible.
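Profiles of this kind feed directly into sizing through the utilization law, U = X * D, where X is the query rate and D the CPU demand per query. The Python sketch below applies it to the Table 1 figures. The query rates and the 6-processor host are hypothetical, and the calculation assumes that per-query demand stays constant as load grows, which the small sample database used here may not guarantee.

```python
# CPU demand per query in seconds, taken from Table 1.
DEMAND = {
    "payroll_hours": {"tuxedo": 0.143, "oracle": 0.101},
    "disability_benefits": {"tuxedo": 0.053, "oracle": 1.672},
}


def cpu_busy(rates_per_sec):
    """Busy processors per tier by the utilization law: sum of X_i * D_i.

    rates_per_sec: query name -> completed queries per second.
    Returns tier name -> average number of processors kept busy.
    """
    busy = {"tuxedo": 0.0, "oracle": 0.0}
    for query, rate in rates_per_sec.items():
        for tier, demand in DEMAND[query].items():
            busy[tier] += rate * demand
    return busy


# Hypothetical what-if load: 4 payroll and 1 benefits query per second.
rates = {"payroll_hours": 4.0, "disability_benefits": 1.0}
busy = cpu_busy(rates)

n_cpus = 6  # hypothetical host size
for tier, processors in busy.items():
    print(f"{tier}: {processors:.3f} processors busy, "
          f"{100 * processors / n_cpus:.1f}% of a {n_cpus}-way host")
```

With these rates the database tier needs roughly two processors while the application server tier needs well under one, which illustrates how a database-heavy query mix shifts the balance between the tiers.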

The following SQL statements describe the 2 queries:

Disability benefits

SELECT TO_CHAR(A.DEDUCTION_BEGIN_DT,'YYYY-MM-DD'), TO_CHAR(A.COVERAGE_BEGIN_DT,'YYYY-MM-DD')
FROM PS_DISABILITY_VW A, PS_EMPLMT_SRCH_GBL A1
WHERE A.EMPLID = A1.EMPLID
AND A.EMPL_RCD# = A1.EMPL_RCD#
AND A1.OPRCLASS = 'ALLPANLS'

Payroll hours

SELECT A.BALNC_LINES, A.BALNC_REG_PAY_HRS, A.BALNC_OTH_HRS, A.BALNC_REG_PAY
FROM PS_PAY_PAGE A

5. Conclusions

• The PeopleSoft application has a complex multi-tiered structure that needs to be clearly understood and reflected when doing capacity analysis and planning. Structures and components for different types of access, and even different numbers and flavors of tiers, can coexist in the same environment and need to be considered in the analysis.

• There is a variety of relevant measurements at different levels, but currently no comprehensive internal instrumentation to provide resource utilization breakdown by function, a la SAP CCMS instrumentation. OS, RDBMS, TUXEDO and PeopleSoft itself are sources of disparate metrics that need to be pulled together for successful analysis.

• An approach to deriving workload information using the available measurements and suitable for capacity planning is introduced in this paper. Using this approach one can separate activities from different tiers and obtain further breakdown of work. Resource profiles for specific types of activities can be obtained using controlled tests.

6. References

[BEA97] BEA TUXEDO Administrator’s Guide, Rel. 6.3, 1997

[BILB99] Darrel Bilbrey, PeopleSoft Administrator's Guide. The Essential Resource for Implementation and Administration, Cybex, 1999

[CHU98] Yi-Chun Chu, Charles J. Antonelli, and Toby J. Teorey, Performance Measurement of the PeopleSoft Multi-tier Remote Computing Application, Proc. of CMG’98, 1998

[DING97] Yiping Ding, Performance Modeling with Application Response Measurement (ARM): Pros and Cons, Proc. of CMG’97, 1997

[LAZO84] Edward Lazowska, John Zahorjan, Scott Graham, and Kenneth Sevcik, Quantitative System Performance,