CAPACITY PLANNING FOR PEOPLESOFT
Yefim Somin and Leonid Gross
BMC Software, Inc.
Abstract
PeopleSoft is a complex multi-functional, multi-tier
application. Important questions about its
configuration, load balancing and capacity planning
need to be answered during its deployment and
production use. This paper describes an approach to
answering such questions for the most commercially
important configurations of PeopleSoft (3-tier, 2-tier
and mixed). It covers architectural features and
available measurements relevant to performance
analysis and capacity planning. Comprehensive
treatment of all the information sources is necessary
for the solution.
1. Performance analysis of an application
Before dealing with PeopleSoft let us consider what it means to analyze
performance and do capacity planning for an application. These activities can be
subdivided into several related but distinct areas.
1.1 Workload characterization
This activity provides information for day-to-day performance management and
monitoring, as well as the foundation for modeling what-if performance questions. It can also be used to derive data for sizing, as described in the next section. To
characterize workloads (and in real production environments there are usually multiple workloads to be distinguished) the following needs to be done:
• Resource consumption for different types of work, e.g., business functions, should be determined
• Relationships among different steps of transaction execution, i.e., different layers and servers should be established
• Counts of business units of work need to be obtained, or at least ratios of those counts in different situations
Under distributed systems, e.g., UNIX or NT, the user visible representation of application work is a collection of system processes. Therefore, the first approach to the above tasks lies through identification of the role of relevant application processes and their relationships. As the first cut of workload characterization an analyst should do the following:
• Identify the list of processes doing the work of the application under study
• Identify client-server relationships, if any, among the processes
• Determine if the user-relevant breakdown of work by business function could be specified at the process level
Additional measurements describing the workload composition may be provided by the database, if used, or by the
instrumentation of the application itself. Most major databases provide some relevant information, whereas the level of
applications instrumentation varies widely from case to case.
1.2 Sizing using resource utilization profiles
This analysis requires resource utilization breakdown by specific types of business functions (a.k.a., units of work or
transactions). Benchmarking is usually needed to obtain resource profiles for the appropriate types of work. The profiles include resource consumption at different servers, hardware and software, and are suitable inputs for analytic modeling. They allow the user to do sizing in the absence of a running system or to test-drive the effect of adding new loads to the loads already present.
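The sizing calculation that resource profiles make possible can be sketched as follows. This is a minimal illustration of the utilization law (utilization = throughput × service demand), not the authors' tool; the transaction names, profile values, CPU counts and the function name cpu_utilization are all hypothetical.

```python
# Hypothetical resource profiles: CPU seconds consumed per transaction
# on each server. Values are illustrative, not measurements.
PROFILES = {
    "hr_query":    {"app_server": 0.10, "db_server": 0.25},
    "payroll_run": {"app_server": 0.05, "db_server": 1.50},
}


def cpu_utilization(rates_per_sec, profiles, cpus_per_server):
    """Return CPU utilization (fraction of total capacity) per server.

    Utilization law: U = sum over transaction types of
    (arrival rate * CPU demand), divided by the number of CPUs.
    """
    busy = {}
    for txn, rate in rates_per_sec.items():
        for server, demand in profiles[txn].items():
            busy[server] = busy.get(server, 0.0) + rate * demand
    return {s: busy[s] / cpus_per_server[s] for s in busy}


# Planned load (assumed): 4 HR queries/sec and 0.5 payroll runs/sec.
util = cpu_utilization(
    {"hr_query": 4.0, "payroll_run": 0.5},
    PROFILES,
    {"app_server": 2, "db_server": 4},
)
```

Adding a new load to an existing one amounts to adding its rate to the first argument and recomputing.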
1.3 Application architecture-specific modeling
Several areas of architecture can be listed:
• memory (buffers) use and the resulting saturation behavior
• internal application scheduling and prioritization
• number and type of configured service components
These architectural features usually require targeted benchmarking and additional algorithms to represent them.
Little has been published about capacity planning for PeopleSoft, and especially about the most common configurations moving into corporate IS. One recent paper, [CHU98], addresses derivation and modeling of service demand for low-level requests resulting from application activities such as update, insert, etc. However, the problem of characterizing workload on a working system, as described in Section 1.1 above, was not covered in that paper or, to the authors’ knowledge, anywhere else to date. This problem is the main topic of the present paper.
2. PeopleSoft architecture
In order to approach workload
characterization, relevant architectural attributes of PeopleSoft need to be presented.
Until Version 7, PeopleSoft was typically used as a 2-tier system where clients would be connected directly to the database server. Occasionally, ad hoc middleware would be introduced between clients and the database as an application server. Starting with Version 7 (the current PeopleSoft version), in addition to the 2-tier client/server architecture, PeopleSoft provided a 3-tier model with a PC client and a TUXEDO application server. This introduced a level of standardization into the 3-tier configuration and made its capacity planning important for an increasing number of users. The major advantages of the 3-tier configuration are significant performance gains (especially with PeopleSoft 7.5) and a reduction in network traffic.
At the same time 2-tier and 3-tier versions can and do coexist in the same
implementation at most sites. Hence one needs to consider both versions and design an approach valid in a mixed environment (see Fig.1).
[Figure 1. 2-tier and 3-tier configurations (diagram not reproduced). Elements shown for the 3-tier configuration: 3-tier PC clients, a Web Client and Web Server, Application Server A (TUXEDO Domain X, Payroll), Application Server B (TUXEDO Domain Y, Manufact.). Elements shown for the 2-tier configuration: PC clients running PS Tools. Both share the RDBMS instance on the DB Server.]
2.1 3-tier architecture
The 3 tiers are:
• Users (client workstations, typically PCs, web users, etc.)
• Application Server (TUXEDO)
• Database Server
User requests are initiated at the user tier and directed to the Application Server where they undergo appropriate pre-processing. Any resulting database requests are packaged and shipped to the Database Server under the direction of the Application Server. Responses are post-processed at the Application Server and delivered to the User.
2.1.1 User Tier
This tier usually runs on a desktop PC or across the web, therefore no workload breakdown characterization is necessary at that level. Potential delays at the client or the network affecting performance as seen by this tier could be included in the overall service modeling but they are currently beyond the scope of this paper.
2.1.2 Application Server Tier
This tier is TUXEDO, a middleware product from BEA, configured for use by PeopleSoft. A Tuxedo system has a hierarchical
structure with several layers listed, top down, below:
• Domain
• Machine
• Group
• Server
There may be several units of the lower layer for each one at the higher layer. PeopleSoft does not use more than one application server Machine per Domain. All of the above layers are logical layers within TUXEDO. The physical configuration may be mapped to the logical one in different ways, e.g., several physical systems per Domain or several Domains per physical system, with the database tier on a separate or the same node.
PeopleSoft has a number of different modules, e.g. Human Resources, Manufacturing, etc., for which typically a separate domain is allocated. If this scheme is followed, separation by Domain is
equivalent to separation by the type of work. Each unit of the lowest layer, Server, is represented by one specific system process (see Fig.2).
Each Domain has a number of OS processes with distinct names doing the work:
• BBL (Bulletin Board Liaison Server), 1 per Domain
• WSL, WSH (Workstation Listeners and Handlers), several could be configured
• PSAPPSRV, PSAUTH, PSQCKSRV, PSSAMSRV (PeopleSoft Application Servers), a configurable number
• JOBMAN, JRAD, JRLY, JREPSVR, JSH, JSL (web connection handlers)
It is possible to distinguish processes belonging to different domains if more than one domain exists on the same physical host, because full command names for these processes include their domain names.
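The per-domain grouping of application server processes described above can be sketched as follows. The process names come from the list above; everything else is an assumption made for illustration: the input is taken to be a list of (name, full command, CPU seconds) tuples already extracted from OS data, the domain name is assumed to appear somewhere in the full command, and the command-line layout in the sample is invented.

```python
from collections import defaultdict

# TUXEDO/PeopleSoft server process names listed in the text.
TUXEDO_NAMES = {
    "BBL", "WSL", "WSH",
    "PSAPPSRV", "PSAUTH", "PSQCKSRV", "PSSAMSRV",
    "JOBMAN", "JRAD", "JRLY", "JREPSVR", "JSH", "JSL",
}


def cpu_by_domain(processes, domains):
    """Sum CPU seconds of TUXEDO processes per domain.

    processes: iterable of (name, full_command, cpu_seconds).
    domains:   domain names expected to appear in the full command.
    Processes whose command matches no known domain go to 'unknown'.
    """
    totals = defaultdict(float)
    for name, full_command, cpu in processes:
        if name not in TUXEDO_NAMES:
            continue  # not part of the application server tier
        domain = next((d for d in domains if d in full_command), "unknown")
        totals[domain] += cpu
    return dict(totals)


# Hypothetical process listing; the command-line layout is invented.
sample = [
    ("PSAPPSRV", "PSAPPSRV -C dom=PAYROLL -g 99", 12.0),
    ("WSL",      "WSL -C dom=PAYROLL -g 1",        1.5),
    ("PSAUTH",   "PSAUTH -C dom=MANUF -g 99",      0.5),
    ("oracle",   "oraclePROD (LOCAL=NO)",          30.0),
]
by_domain = cpu_by_domain(sample, ["PAYROLL", "MANUF"])
```

If domains are allocated by functional module, the resulting totals are also a breakdown by type of work.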
[Figure 2. TUXEDO Logical Hierarchy (diagram not reproduced). Levels shown: TUXEDO System; Domain X (Payroll), Domain Y (Manufact.), Domain Z (Supply Chain); Machine L; Group BASE and Group APPSW; Servers - Processes: WSL PID=54, PSAUTH PID=78, PSAPPSRV PID=97, PSQCKSRV PID=110.]
2.1.3 Database Server Tier
This is a commercial RDBMS. The processes that actually perform the database work are usually well known and can be easily identified at the OS level. For Oracle, for instance, servers and background processes should be included in the list.
2.2 2-tier architecture
2-tier users access the database directly bypassing any application servers. There are two types of 2-tier users, UNIX-based and PC-based.
2.2.1 UNIX-based 2-tier users
These users usually do batch and power query processing. In the Oracle
environment, they are typically connected to the database through a dedicated server. This means that database processes associated with this work run under the user names of the owners of batch jobs and are in fact children of the front-end processes. Therefore, UNIX processes permit
identification of both the front-end and the back-end resource use for this work. Processes participating in this work are described below.
PSRUNxxxx – PeopleSoft process (or batch) scheduler where xxxx is the Oracle instance name. Process scheduler is implemented as a UNIX daemon for each database instance, and is usually run under a UNIX user who owns the PeopleSoft installation. Process scheduler is always running in the system waiting for jobs to submit and always uses Oracle dedicated server connection (two-task). Therefore, you will see the following structure for batch jobs: PSRUNxxxx processes, owned by the UNIX user in charge of batch jobs, and their
descendants, most importantly processes called
oracle owned by the same user. All these processes are part of the batch workload.
sqr – a tool widely used in the PeopleSoft environment for report generation. It can be, and usually is, run under different UNIX user names, though the list of UNIX users who actually use the tool is limited; these are typically power users or PeopleSoft developers.
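The parent-child structure of batch work can be used directly to collect the batch workload. The sketch below assumes a snapshot of the process table is available as (pid, ppid, user, command) tuples; the function name batch_workload and the sample data are hypothetical.

```python
def batch_workload(processes):
    """Return the PIDs of scheduler processes and all their descendants.

    processes: iterable of (pid, ppid, user, command).
    Roots are processes whose command starts with 'PSRUN'.
    """
    children = {}
    roots = []
    for pid, ppid, _user, command in processes:
        children.setdefault(ppid, []).append(pid)
        if command.startswith("PSRUN"):
            roots.append(pid)

    # Walk the process tree downward from every scheduler process.
    selected, stack = set(), list(roots)
    while stack:
        pid = stack.pop()
        if pid in selected:
            continue
        selected.add(pid)
        stack.extend(children.get(pid, []))
    return selected


# Hypothetical snapshot of a process table.
table = [
    (100, 1,   "hrprod", "PSRUNPROD"),
    (101, 100, "hrprod", "sqr"),
    (102, 101, "hrprod", "oracle"),
    (200, 1,   "oracle", "ora_pmon_PROD"),
]
batch_pids = batch_workload(table)
```

sqr sessions started by power users outside the scheduler are not descendants of a PSRUN process; they would have to be selected separately, for example by user name.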
2.2.2 PC-based 2-tier users
PC-based 2-tier users run PeopleSoft-supplied clients and connect directly to the database. Their work is reflected in the measurements obtained from RDBMS from the well-known v$session tables. In
particular, resource consumption breakdown by application can be obtained for these users’ work. The names of these
applications are the same as the
executables for their PC front-ends, e.g., PSTOOLS.EXE, PSIDE.EXE, etc. Since the names of PeopleSoft executables on the PC client are well known (they usually end with .EXE), they are easily identifiable. The following application (executable) names are typical for PeopleSoft V 7:
PSTOOLS.EXE - PeopleSoft front end; all interactive functions are performed through it
CRW32.EXE - Crystal Reports; a report generating tool
PSNVS.EXE - NVision; a reporting tool with an interface to Microsoft Excel
PSPM, PSPMD.EXE - PeopleSoft Process Monitor
PSQED.EXE - a PeopleSoft query tool
PSIDE.EXE - PeopleSoft application designer tool
All application processes mentioned above access Oracle as user SYSADM, so user name cannot be used directly to form workloads. The reason is PeopleSoft security mechanism, where each user signs on to the PeopleSoft database through the front end as himself and then is switched by PeopleSoft to SYSADM with the proper clearance (there are many clearance classes in PeopleSoft database: Administrator, Payroll Clerk, Benefit Administrator etc).
nhttp.EXE - these processes run for the Web interface; several pre-spawned connections can be created, through which all Intranet users get their data.
3. Modeling PeopleSoft
3.1 Modeling a pure 3-tier environment
The approach to modeling a 3-tier environment is to treat it as a client-server configuration with, potentially, several Application Servers as clients of one Database Server. The PeopleSoft Application Server (TUXEDO) currently does not provide measurements of resource consumption; therefore we need to look at other sources of information.
3.1.1 Measurements available for Application Server
Since TUXEDO application services are performed by processes we can identify (see the list above), all the process-based resource measurements from the Operating System are available. In particular, we can identify resources consumed by each Domain. If a PeopleSoft installation is configured such that the work is broken into domains based on the functional areas, e.g., HR, Materials Management, etc., we obtain resource consumption by business function at the same time.
3.1.2 Measurements available for Database Server
The Database Server is measured at 2 levels, Operating System and RDBMS. At the OS level, one can determine all of the processes that do the work and
consequently consume resources for the database. For Oracle, for instance, these are usually processes whose command name begins with ora.
In RDBMS measurements, there is an additional resource breakdown of database work. This breakdown is described below. It is not utilized for the pure 3-tier case: different types of 3-tier work are simply not reflected in this breakdown. It is useful, however, for the 2-tier load separation and breakdown described further on in this paper.
The available RDBMS measurements and their utility in PeopleSoft environment are as follows:
USER
All PeopleSoft users are represented in Oracle as one user SYSADM (PeopleSoft schema owner). For this reason the use of the USER field is not helpful for creating workloads.
APPLICATION
PC-based 2-tier applications (non-TUXEDO) identify themselves to Oracle if the Oracle PC configuration file (ORACLE.INI) contains their ID fields, therefore their resource use is represented in Oracle measurements. 3-tier requests (coming from Tuxedo) do not identify their application (as observed so far) and don't allow for the resource breakdown of work.
3.1.3 What questions can be answered for a 3-tier environment
Once a workload characterization description is created according to the above guidelines, collected data can be processed, using appropriate tools, and resource use for different services and types of activities can be distributed into
corresponding buckets. Using analytic queueing methods, such as the ones described in [LAZO84], the following types of questions could be answered:
• what domains/functions consume what resources, day-to-day or for special situations
• how will performance parameters (resource utilization, relative response time) change if the load on any of the domains changes
• same if load/functionality is moved around, consolidated or split
• same if the underlying hardware is
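As an illustration of how such what-if questions are evaluated, the sketch below applies the standard single-class open queueing network formulas (utilization U = λD, residence time D / (1 - U) at each queueing center). It is a deliberate simplification: each server is treated as a single queueing center, multiprocessor effects are ignored, and the service demands and arrival rates are hypothetical.

```python
def open_model(arrival_rate, demands):
    """Single-class open queueing network with queueing centers.

    arrival_rate: transactions per second.
    demands: service demand per transaction (seconds) at each center.
    Returns (utilizations, response_time); raises if any center saturates.
    """
    utilizations = {k: arrival_rate * d for k, d in demands.items()}
    if any(u >= 1.0 for u in utilizations.values()):
        raise ValueError("a center is saturated at this load")
    response = sum(d / (1.0 - utilizations[k]) for k, d in demands.items())
    return utilizations, response


# Hypothetical demands (seconds per transaction) for one domain.
demands = {"app_server_cpu": 0.05, "db_server_cpu": 0.10}

_, r_now = open_model(4.0, demands)      # current load
_, r_grown = open_model(6.0, demands)    # what if load grows by 50%?
relative_change = r_grown / r_now
```

The ratio of the two response times is the "relative response time" change for the assumed growth; the same function can be rerun with demands scaled to represent different hardware.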
3.2 Separating relevant RDBMS instances
There may be more than one instance of the database running on the Database Server host. Not all of them may be involved in doing PeopleSoft 3-tier work. Only the work of the relevant instances should be included in PeopleSoft workloads. This separation is possible because the full command name of a database process usually contains a string with the instance name.
3.3 Separating the 2-tier work
If both 3-tier users and 2-tier users send requests to PeopleSoft, one may want to separate the work done on behalf of the 3-tier environment. This would be needed, for instance, if what-if questions about load changes assume different growth rates for the 2-tier and 3-tier loads. As mentioned above, there are 2 types of 2-tier users, UNIX-based and PC-based.
3.3.1 Separating UNIX 2-tier users
As mentioned in the architecture section, database processes associated with this work run under the user names of the owners of batch jobs. Therefore, identifying database processes owned by batch users can separate RDBMS work for batch. The remainder of database work is attributable to the 3-tier work and non-Unix 2-tier users.
3.3.2 Separating PC 2-tier users
While the measurements from the RDBMS are not very helpful for the 3-tier load, the APPLICATION type measurements are useful in reflecting the work of PC-based 2-tier users. As mentioned in the architecture section, each type of PC client has its RDBMS resource consumption summarized by PC-side executable. Thus, both the breakdown and the total amount of database work on behalf of such clients are provided. Since all resource utilization by database applications with the well-known names listed above is caused by PC-based 2-tier users, it should be subtracted from the total database resources. The remainder represents 3-tier related database work.
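The subtraction described above can be sketched as a grouping of per-session database measurements by application name. The input format, (application, CPU seconds) pairs already exported from the RDBMS, is an assumption, as are the sample values; "PSPM.EXE" is our reading of the name given as "PSPM" in the architecture section.

```python
# Executable names listed in the text for PC-based 2-tier clients.
# "PSPM.EXE" is an assumed spelling of the name the text gives as "PSPM".
PC_CLIENTS = {"PSTOOLS.EXE", "CRW32.EXE", "PSNVS.EXE",
              "PSPM.EXE", "PSPMD.EXE", "PSQED.EXE", "PSIDE.EXE"}


def split_database_cpu(sessions):
    """Split database CPU into PC 2-tier work and the remainder.

    sessions: iterable of (application, cpu_seconds), where application
    is the name reported by the RDBMS, or None if not identified.
    Returns (cpu_by_pc_application, remainder_cpu).
    """
    by_app, remainder = {}, 0.0
    for application, cpu in sessions:
        name = (application or "").upper()
        if name in PC_CLIENTS:
            by_app[name] = by_app.get(name, 0.0) + cpu
        else:
            remainder += cpu
    return by_app, remainder


# Hypothetical per-session CPU seconds exported from the RDBMS.
rows = [("PSTOOLS.EXE", 40.0), ("pstools.exe", 10.0),
        ("PSQED.EXE", 5.0), (None, 45.0)]
pc_cpu, three_tier_cpu = split_database_cpu(rows)
```

The remainder corresponds to 3-tier work only if database work for UNIX batch users has already been separated as described in Section 3.3.1.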
3.4 Using additional information from TUXEDO
TUXEDO is instrumented to count requests to specific servers (PSAPPSRV, PSAUTH, etc. for PeopleSoft) and place them in a MIB [BEA97]. In the case of PSAUTH, for instance, this count is the number of requests for authorization. In the case of PSAPPSRV it may be requests from a lengthy list of different types. These request counts (Requests Done) can be summed up to the higher levels of the hierarchy (Groups, Machines, Domains). Note that there is only one Machine per Domain in PeopleSoft, so that Machine level counts could be used to define the number of transactions per Domain. Alternatively, since each service could be linked to a specific system process, a finer granularity of breakdown could be achieved by calculating TUXEDO resource utilization per unit of work (Requests Done) for different services.
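The per-unit-of-work calculation mentioned above reduces to dividing process CPU time by the request count for the same server over the same interval. In the sketch below the two input dictionaries stand for data already collected from the OS and from TUXEDO statistics; the function name and all numbers are hypothetical.

```python
def cpu_per_request(cpu_seconds, requests_done):
    """CPU seconds per unit of work for each TUXEDO server type.

    cpu_seconds:   {server_name: CPU seconds from OS process data}
    requests_done: {server_name: request count from TUXEDO statistics}
    Servers with no completed requests are skipped.
    """
    return {
        name: cpu_seconds[name] / count
        for name, count in requests_done.items()
        if count > 0 and name in cpu_seconds
    }


# Hypothetical measurements taken over the same interval.
cost = cpu_per_request(
    {"PSAPPSRV": 120.0, "PSAUTH": 3.0, "PSQCKSRV": 8.0},
    {"PSAPPSRV": 800, "PSAUTH": 150, "PSQCKSRV": 0},
)
```

This treats all requests handled by one server type as equally expensive, which is only an approximation for servers such as PSAPPSRV that handle many request types.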
While counts of requests are provided, relative resource use weights of transactions from different domains are not. This is true for TUXEDO level resource, with the
exception of the possibility of breaking down by Domains/Services. This is also true at the database level. It appears at the time of this writing that both BEA (the owner of
TUXEDO) and PeopleSoft are planning to improve their performance instrumentation, possibly using ARM (Application Response Measurement), see, e.g., [DING97]. This would allow them to catch up with SAP R/3 where workload statistics instrumentation is part of the system (CCMS). In the absence of this level of instrumentation, the
alternatives are
• assume that all requests have equal resource weight (within the chosen granularity entity)
• conduct experiments for specific types of requests similar to the ones described in [CHU98], except at a different
hierarchy level, e.g., type of request as opposed to SQL statement
3.5 Modeling a 2-tier PeopleSoft environment
In the previous sections techniques to separate 2-tier work from 3-tier work were described. This section summarizes analysis of 2-tier work.
3.5.1 Modeling UNIX 2-tier work
Looking back at the corresponding architecture section, one simply needs to describe a workload which combines the work done by batch tools (PSRUN, sqr) and their child processes, including the ones doing database work (e.g. ora…) and owned by batch users. Batch front-ends are clients here, while database processes constitute the server. This assignment naturally requires knowledge of batch user names.
3.5.2 Modeling PC 2-tier work
For the PC-based 2-tier work, only the database part is considered here. The breakdown of that work available with the current instrumentation is based on the client-application (PSTOOLS, PSQED, etc.). As mentioned in the architecture section, this resource breakdown is provided by the RDBMS where each type of client software is treated as a separate application. If an effort is made to include the name of every PC user into the oracle.ini file on the PC, user names could also be employed for workload characterization.
4. Examples
4.1 Sample resource breakdown
Figure 3 shows a sample resource breakdown for a mixed 2-tier and 3-tier configuration. The environment consists of a UNIX system hosting both TUXEDO and the database. The system has 6 processors, which is why CPU utilizations are computed out of 600%. There is some TUXEDO work, some batch jobs, and some requests by PC clients connected directly to the database. To obtain the database resource breakdown, the following steps have been taken.
• The total CPU utilization of all database processes happens to be 309%. The batch jobs present (PSRUNxxxx) are all owned by the UNIX user hrprod; therefore all the database processes owned by that user need to be separated and included in the batch workload (55% CPU). The rest of the database work is broken down based on database measurements.
• As described in the paper, resource consumption of applications is
measured by the database. In this case, we have several PS clients running PSTOOLS and other utilities for a total database CPU consumption of 153% (we have chosen to separate PSTOOLS and combine all the others).
• The rest of the work, as reported by the database, is done by database user SYSADM and belongs to an undefined application. This is the work attributed to TUXEDO requests.
As mentioned in the paper, a finer breakdown of TUXEDO work by servers is available. In addition, a way to match the server work breakdown with the corresponding database work is currently being investigated.
[Figure 3. 2-tier and 3-tier sample resource breakdown (diagram not reproduced). Elements shown on the UNIX system: PC clients (PS Tools), TUXEDO processes, PSRUN* UNIX processes (UID=hrprod), and the database; labels distinguish UNIX measurements from database measurements. Database CPU breakdown: PSTOOLS.EXE 102%; other PS*.EXE 51%; SYSADM (APP undefined) = 3-tier work 101%; ora* UNIX processes (UID=hrprod) 55%.]
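The arithmetic behind this breakdown can be checked directly from the figures quoted in the steps above. The sketch below uses only those numbers; the variable names and the conversion to a share of whole-machine capacity are added for illustration.

```python
# CPU utilization figures quoted in the example (percent of one CPU).
TOTAL_DB = 309          # all database processes
BATCH = 55              # database processes owned by hrprod
PSTOOLS = 102           # PSTOOLS.EXE sessions
OTHER_PC = 51           # other PS*.EXE sessions
CAPACITY = 600          # 6 processors

pc_two_tier = PSTOOLS + OTHER_PC
three_tier = TOTAL_DB - BATCH - pc_two_tier

# Share of whole-machine capacity used by each database workload.
share_of_machine = {
    "batch": BATCH / CAPACITY,
    "pc_2_tier": pc_two_tier / CAPACITY,
    "3_tier": three_tier / CAPACITY,
}
```

The PC 2-tier total is the 153% quoted in the text, and the remainder is the 101% attributed to 3-tier work.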
4.2 Sample workload resource utilization profiles
To obtain resource utilization profiles for specific types of activities, to enable sizing and capacity planning, one needs to conduct controlled experiments and measure
resources as described earlier in this paper. The results of 2 such experiments are presented in Table 1. They include CPU consumption for 2 queries: disability benefits and payroll hours. The queries were run in a 3-tier environment (with TUXEDO). No extraneous activity was present. Resource consumption was obtained from OS and RDBMS data, whereas the query counts were derived from TUXEDO statistics.
Query                  TUXEDO CPU    Oracle CPU
Payroll hours          143 msec      101 msec
Disability benefits    53 msec       1672 msec
Table 1. Transaction CPU utilization profiles
One can see that the benefits query is much heavier on the database than the payroll one, a useful piece of knowledge for configuring and balancing loads between application and database servers. Further analysis, not reflected in the table, shows that most of the app server work for the payroll query was done by a PSAPPSRV process, the main server engine of
TUXEDO. On the other hand, about half of the app server work for the benefits query was due to WSH, the workstation handler. The database used in the tests was the small default sample that comes with PeopleSoft, therefore the amount of I/O was negligible.
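To show how the profiles in Table 1 feed load balancing between the application and database servers, the sketch below computes the database-to-application-server CPU ratio for each query and the total CPU per tier for a query mix. The per-query values are those of Table 1; the mix of 1000 and 100 queries and the function name are hypothetical.

```python
# CPU per query from Table 1, in milliseconds.
TABLE_1 = {
    "Payroll hours":       {"tuxedo_ms": 143, "oracle_ms": 101},
    "Disability benefits": {"tuxedo_ms": 53,  "oracle_ms": 1672},
}


def tier_cpu_for_mix(counts):
    """Total CPU seconds per tier for a given number of each query."""
    tuxedo = sum(n * TABLE_1[q]["tuxedo_ms"] for q, n in counts.items())
    oracle = sum(n * TABLE_1[q]["oracle_ms"] for q, n in counts.items())
    return tuxedo / 1000.0, oracle / 1000.0


# Database-to-application-server CPU ratio for each query.
ratios = {q: p["oracle_ms"] / p["tuxedo_ms"] for q, p in TABLE_1.items()}

# Hypothetical mix: 1000 payroll queries and 100 benefits queries.
tuxedo_s, oracle_s = tier_cpu_for_mix(
    {"Payroll hours": 1000, "Disability benefits": 100}
)
```

Even with ten times fewer benefits queries in this assumed mix, the database tier carries more CPU work than the application server tier.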
The following SQL statements describe the 2 queries:
Disability benefits
    SELECT TO_CHAR(A.DEDUCTION_BEGIN_DT,'YYYY-MM-DD'),
           TO_CHAR(A.COVERAGE_BEGIN_DT,'YYYY-MM-DD')
    FROM PS_DISABILITY_VW A, PS_EMPLMT_SRCH_GBL A1
    WHERE A.EMPLID = A1.EMPLID
      AND A.EMPL_RCD# = A1.EMPL_RCD#
      AND A1.OPRCLASS = 'ALLPANLS'
Payroll hours
    SELECT A.BALNC_LINES, A.BALNC_REG_PAY_HRS, A.BALNC_OTH_HRS, A.BALNC_REG_PAY
    FROM PS_PAY_PAGE A
5. Conclusions
• The PeopleSoft application has a complex multi-tiered structure that needs to be clearly understood and reflected when doing capacity analysis and planning. Structures and components for different types of access, and even different numbers of tiers, can coexist in the same environment and must be considered in the analysis.
• There is a variety of relevant measurements at different levels but currently no comprehensive internal instrumentation to provide resource utilization breakdown by function, a la SAP CCMS instrumentation. OS, RDBMS, TUXEDO and PeopleSoft itself are sources of disparate metrics that need to be pulled together for successful analysis.
• An approach to deriving workload information using the available measurements and suitable for capacity planning is introduced in this paper. Using this approach one can separate activities from different tiers and obtain further breakdown of work. Resource profiles for specific types of activities can be obtained using controlled tests.
6. References
[BEA97] BEA TUXEDO Administrator’s Guide, Rel. 6.3, 1997
[BILB99] Darrel Bilbrey, PeopleSoft Administrator's Guide. The Essential Resource for Implementation and Administration, Cybex, 1999
[CHU98] Yi-Chun Chu, Charles J. Antonelli, and Toby J. Teorey, Performance
Measurement of the PeopleSoft Multi-tier Remote Computing Application, Proc. of CMG’98, 1998
[DING97] Yiping Ding, Performance Modeling with Application Response Measurement (ARM): Pros and Cons, Proc. of CMG’97, 1997
[LAZO84] Edward Lazowska, John Zahorjan, Scott Graham, and Kenneth Sevcik, Quantitative System Performance,