Techniques for Deriving Performance
Models from Software Designs
Murray Woodside
Second Part
Outline
A) Conceptual framework and scenarios
B) Layered systems and models
C) Building models from scenarios
D) Automated model-building
... from Use Case Maps, UML, traces
E) Modeling insights: patterns and anti-patterns
F) Some examples
Techniques for deriving performance models from software designs
© Murray Woodside, July 2002 2 Carleton University
E) Modeling Insights:
Patterns and Anti-patterns
• recurring resource-use patterns
• to achieve overlapping and enhance concurrency
• logical resources and mutual exclusion
• shared pool of logical resources
• peer-to-peer
• communications patterns
• performance pathologies associated with some patterns
• software bottleneck
Pattern: Overlapped execution and
multi-threading or virtual threads
Single-threaded A
... blocks for reply
... server is “busy” while blocked
... long service time
... saturated server, lightly loaded processor
Multi-threaded A can take a second request while blocked on first
... higher capacity
... each thread has a long service time
... in model terms, a multiserver
... many servers, one queue
[Figure: tasks A and B, single-threaded and multi-threaded cases]
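The capacity argument above can be made concrete with a rough bound. This is a minimal sketch, not taken from the slides: the function name and the numbers (10 ms of CPU, 90 ms blocked) are illustrative assumptions, and queueing delays are ignored.

```python
def throughput_bound(cpu_demand, blocked_time, threads):
    """Upper bound on requests/second for task A running on one processor.

    cpu_demand   -- CPU seconds A uses per request (assumed value)
    blocked_time -- seconds A waits for B's reply per request (assumed value)
    threads      -- number of threads of A (1 = single-threaded)
    """
    service_time = cpu_demand + blocked_time  # a thread is held for all of this
    thread_bound = threads / service_time     # multiserver: `threads` servers, one queue
    cpu_bound = 1.0 / cpu_demand              # the processor itself saturates here
    return min(thread_bound, cpu_bound)


single = throughput_bound(0.010, 0.090, threads=1)   # about 10 requests/s
multi = throughput_bound(0.010, 0.090, threads=5)    # about 50 requests/s
cpu_utilization_single = single * 0.010              # about 0.1: lightly loaded
```

With one thread the task saturates while its processor is only about 10% busy; adding threads raises the bound until the processor limit takes over.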
Overlapping: Multi-threading and virtual threads
• Multi-threading is supported in the kernel (Mach,
Solaris, Linux, NT) or by user-level thread packages
• Virtual threads are programmed by the user
• save the context of the blocked operation in a data
table
• accept another message and deal with it
• if it is a reply to a previous request, get out the
state of the blocked operation and resume it
• called an “asynchronous style of programming”
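The virtual-thread steps above can be sketched as a small message handler. This is an illustrative sketch only: the class, the message kinds, and the field names are hypothetical, and a real server would also send the queued requests and replies over a network.

```python
class VirtualThreadServer:
    """One real thread serving many requests by saving and resuming contexts."""

    def __init__(self):
        self.contexts = {}   # data table: request id -> saved state of blocked operation
        self.outbox = []     # requests passed on to a lower-level server
        self.replies = []    # completed replies: (client, result)

    def handle(self, kind, req_id, body):
        if kind == "request":
            # Start the operation, then save its context instead of blocking.
            self.contexts[req_id] = {"client": body["client"]}
            self.outbox.append((req_id, body["query"]))
        elif kind == "reply":
            # A reply to an earlier request: get out the saved state and resume.
            ctx = self.contexts.pop(req_id)
            self.replies.append((ctx["client"], body))


server = VirtualThreadServer()
server.handle("request", 1, {"client": "c1", "query": "q1"})
server.handle("request", 2, {"client": "c2", "query": "q2"})  # accepted while 1 is pending
server.handle("reply", 2, "r2")                               # replies may arrive out of order
server.handle("reply", 1, "r1")
```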
Threading: A web server needs threads
• the server blocks on the
disk, and for the TCP transfer to complete
• CPU time is a small
portion of the thread service time
• if multi-threaded, it can
serve many other requests in the meantime
• threading options: thread-per-request,
fixed thread pool, adaptive pool size
[Figure: layered web server model; labels: Web user, Net [2], Server [.01], [.005], TCP Ack Delay, Srvr Disks [.01], Srvr Processor]
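The effect of a fixed thread pool on a server that mostly blocks can be tried out directly. This is a rough sketch under stated assumptions: `serve` and `run` are hypothetical names, and `time.sleep` stands in for the disk and TCP waits.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def serve(request_id, blocked=0.05):
    time.sleep(blocked)      # stands in for disk and TCP waits; uses almost no CPU
    return request_id


def run(requests, pool_size):
    """Serve all requests with a fixed pool and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(serve, requests))
    return results, time.perf_counter() - start


results_1, elapsed_1 = run(range(8), pool_size=1)   # single-threaded server
results_8, elapsed_8 = run(range(8), pool_size=8)   # fixed pool of 8 threads
```

Because CPU time is a small part of each request, the pooled run finishes in roughly the time of one request, while the single-threaded run takes the sum of all blocked times.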
Overlapping through early response, or second
phase service
• Idea: Give a reply as early as possible
• Do postponable work after the reply, as phase 2
• e.g. database server update operation:
• write to log file before returning,
• execute final writes later.
• Second-phase model may
• (a) place this work right after the return (approx), or
• (b) send an asynchronous message to a clean-up process that queues it and
does it later
[Figure: client and server, showing phase 1 and phase 2 (asynchronous and parallel)]
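A small calculation shows why second-phase service helps the client but not the server. This is a sketch under assumptions: the function name and the example times are hypothetical, and queueing at the server is ignored.

```python
def second_phase_effect(phase1, phase2, arrival_rate):
    """Server with early reply; times in seconds, rate in requests/second."""
    return {
        "client_blocked": phase1,                        # client waits for phase 1 only
        "server_busy_per_request": phase1 + phase2,      # server is held for both phases
        "server_utilization": arrival_rate * (phase1 + phase2),
    }


# Assumed example: 2 ms to write the log, 8 ms of final writes, 50 updates/s.
effect = second_phase_effect(phase1=0.002, phase2=0.008, arrival_rate=50)
```

The client is blocked for about 2 ms per update, but the server is about 50% utilized because it still carries the full 10 ms of work.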
Overlapping through asynchronous RPC
A can execute activity b while the remote call to server B is proceeding. At some point A must wait
for the result.
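[Figure: tasks A and B, with activities a, b, c in A]
Modeling: the entry of A is defined by:
[Figure: activities a, b, c with a parallel call to B, followed by a Join]
This parallel activity defines a subthread that makes a request to B and blocks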
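The fork and join in the entry of A can be sketched with a future. This is an illustrative sketch, not the author's notation: `entry_of_A`, `call_B`, and the activity functions are hypothetical, and the delay formula assumes no queueing at A or B.

```python
from concurrent.futures import ThreadPoolExecutor


def entry_of_A(call_B, a, b, c):
    """Run a, then b in parallel with a blocking call to B, join, then c."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        x = a()
        future = pool.submit(call_B, x)   # fork: subthread makes the request to B and blocks
        y = b(x)                          # activity b overlaps the remote call
        z = future.result()               # join: A must wait for the result here
        return c(y, z)


def overlapped_delay(t_a, t_b, t_call_B, t_c):
    """Delay of the entry when b and the call to B run in parallel."""
    return t_a + max(t_b, t_call_B) + t_c


result = entry_of_A(lambda x: x * 2, a=lambda: 1, b=lambda x: x + 10, c=lambda y, z: (y, z))
```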
Overlapping through forwarding
• typical of
• a service dispatcher
• a pipeline
• overlapped service:
• a second request can enter
taskB as soon as it has
forwarded to taskC
• A is synchronous, B is not
• message-handling pipeline, call
setups
[Figure: taskA, taskB, taskC]
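Forwarding can be sketched with queues: the request carries its reply address, so taskC answers the original caller and taskB is free as soon as it has forwarded. This is a minimal sketch with hypothetical function names, using threads and in-process queues in place of real messaging.

```python
import queue
import threading


def task_b(inbox, to_c):
    while True:
        msg = inbox.get()
        if msg is None:                 # shutdown marker, passed down the pipeline
            to_c.put(None)
            return
        payload, reply_to = msg
        to_c.put((payload + ["B"], reply_to))   # forward; B can now take the next request


def task_c(inbox):
    while True:
        msg = inbox.get()
        if msg is None:
            return
        payload, reply_to = msg
        reply_to.put(payload + ["C"])   # the reply goes straight to the original caller


def call_via_forwarding(payload):
    """taskA: a synchronous request that B forwards to C."""
    b_in, c_in, reply = queue.Queue(), queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=task_b, args=(b_in, c_in)),
               threading.Thread(target=task_c, args=(c_in,))]
    for t in threads:
        t.start()
    b_in.put((payload, reply))
    result = reply.get()                # A blocks until C replies
    b_in.put(None)
    for t in threads:
        t.join()
    return result


path = call_via_forwarding(["A"])
```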
Forwarding in CORBA
• Alternative interaction patterns provided by an ORB:
1. ORB acts as RPC intermediary
2. ORB forwards to method
3. ORB provides a handle (acts as name server)
[Figure, one diagram per pattern: App, ORB, method1, method2]
Forwarding in a web-telephony system
• Support automated call answering by web pages
• speech playout is from a file encoded in a web page
• user enters tones; logic of decoding is also in the web page
• VoiceXML standard (vxmlforum.org)
[Figure: components Call Control, Playout Device, Web Page Cache, Web Page Interpreter, Remote Web Server]
Pattern of logical resources: critical sections
• Tasks A and B must enter a critical
section (call it CS) for some work....
• this shows the call to enter CS
• but it doesn’t express the resource
context effect of CS
• So:
• Separate out the computation
within CS into Shadow Tasks A|CS and B|CS
• to direct the call from A to A|CS,
make CS a pseudo-task with two
pseudo-entries
[Figure: tasks A and B on processors PA and PB: not a suitable model]
[Figure: tasks A and B, shadow tasks A|CS and B|CS, pseudo-task CS, processors PA and PB, CS Manager]
...using the critical section pattern
• the shadow task A|CS is really part
of A
• A is always blocked when A|CS
runs, so separating them does not produce false concurrency
• it does place A|CS in a separate resource context
... this is a general modeling pattern for separating resource contexts
• both A and A|CS can make any
calls to other tasks as servers
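The critical-section pattern has a direct analogue in code: the pseudo-task CS behaves like a lock with one holder, and each shadow task is the work done while holding it. This is an illustrative sketch; the class and function names are hypothetical.

```python
import threading


class CriticalSection:
    """Pseudo-task CS with multiplicity 1; each entry runs one shadow task."""

    def __init__(self):
        self._lock = threading.Lock()
        self.inside = 0
        self.max_inside = 0

    def enter(self, shadow_work):
        with self._lock:                 # callers queue here, as at a single-threaded task
            self.inside += 1
            self.max_inside = max(self.max_inside, self.inside)
            shadow_work()                # the work of A|CS or B|CS
            self.inside -= 1


cs = CriticalSection()
log = []


def task(name):
    for _ in range(100):
        cs.enter(lambda: log.append(name + "|CS"))   # the task is blocked while its shadow runs


workers = [threading.Thread(target=task, args=(n,)) for n in ("A", "B")]
for w in workers:
    w.start()
for w in workers:
    w.join()
```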
[Figure: tasks A and B, shadow tasks A|CS and B|CS, pseudo-task CS, processors PA and PB, server SERV]
Logical resources: buffers
• a set of Application tasks share a pool of B buffers, managed by
BufMgr
• a task needing a buffer queues until it is provided
• while holding the buffer, the Applications execute operations
called App|Buf... the same for all of them... or different
[Figure: N Applications (App1, App2, ...) and BufMgr with B “threads”, entries BufEntry1, BufEntry2, ..., shadow tasks App1|Buf, App2|Buf, ..., processors P1, P2, ...]
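A pool of B buffers behaves like a counting semaphore with B permits, which is why BufMgr is modeled with B “threads”. The sketch below is illustrative only: `BufferPool` and its methods are hypothetical names, and a non-blocking acquire is used so the example shows where a task would have to queue.

```python
import threading


class BufferPool:
    """Pool of `size` buffers; a task that finds none free must queue."""

    def __init__(self, size):
        self._free = threading.BoundedSemaphore(size)   # one permit per buffer

    def try_acquire(self):
        # False means the caller would have to queue until a buffer is released.
        return self._free.acquire(blocking=False)

    def release(self):
        self._free.release()


pool = BufferPool(size=2)
got = [pool.try_acquire() for _ in range(3)]   # the third application finds the pool empty
pool.release()                                 # one application finishes its App|Buf work
got_after_release = pool.try_acquire()
```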
Models with buffered communications
• Source sends a message to a finite buffer pool (space for B
messages)
• if pool is full, Source blocks until there is space
• Buffers replies at once to source, then sends in phase 2 to
Destination
• if Destination needs to hold the buffer, it does its work in phase 1
• if it does not, it “replies” at once to Buffers and does its work
in phase 2.
[Figure: Source, Buffers with B “threads” and label (0,1), Destination (may be several tasks)]
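Whether Destination works in phase 1 or phase 2 changes how long each buffer is held. A rough estimate follows from Little's law (mean number in use = rate × holding time). This is a sketch under assumptions: the function name and the numbers are hypothetical, and waiting in Destination's queue is ignored.

```python
def mean_buffers_in_use(message_rate, forward_time, destination_work, destination_holds_buffer):
    """Mean number of buffers occupied, by Little's law.

    message_rate     -- messages/second from Source (assumed value)
    forward_time     -- seconds Buffers spends passing a message on (assumed value)
    destination_work -- seconds of work at Destination per message (assumed value)
    """
    # If Destination works in phase 1, the buffer stays held for that work too.
    holding = forward_time + (destination_work if destination_holds_buffer else 0.0)
    return message_rate * holding


held = mean_buffers_in_use(100, 0.001, 0.020, destination_holds_buffer=True)
released = mean_buffers_in_use(100, 0.001, 0.020, destination_holds_buffer=False)
```

In this example about 2.1 buffers are busy on average when Destination holds the buffer, against about 0.1 when it replies at once, so a small pool B is far more likely to block Source in the first case.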
Pattern: Peer-to-peer interaction
• Peers are symmetrical
• they make blocking requests to each other
• potential exists for cyclic interactions
• to give layered behaviour:
• each peer has a Main part, and a Responder part that
responds to requests from others
• simplest case: the Responder part does not make requests to
other sites
• more complex, with cycles: a mechanism is needed to handle
possible deadlock due to cycles.
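Whether a set of blocking requests can be layered comes down to whether the request graph has a cycle. The check below is a minimal sketch with hypothetical task names; it shows that splitting each peer into a Main part and a Responder part removes the cycle in the simplest case.

```python
def has_cycle(calls):
    """Return True if the blocking-request graph `calls` contains a cycle."""
    state = {}   # node -> "active" (on the current path) or "done"

    def visit(node):
        if state.get(node) == "active":
            return True                  # reached a node already on the current path
        if state.get(node) == "done":
            return False
        state[node] = "active"
        if any(visit(nxt) for nxt in calls.get(node, ())):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in list(calls))


# Peers calling each other directly: cyclic, so not layered.
peer_view = {"PeerA": ["PeerB"], "PeerB": ["PeerA"]}

# Each peer split into Main and Responder; Responders make no requests.
layered_view = {
    "MainA": ["ResponderB"],
    "MainB": ["ResponderA"],
    "ResponderA": [],
    "ResponderB": [],
}
```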
Example of peer-to-peer interaction: Distributed
database (Sheikh)
• transform a peer-to-peer relationship to reveal layered behaviour
[Figure: Original view (UserA, UserB, DataManagerA, DataManagerB, DataStoreA, DataStoreB) and Layered view (the same tasks plus ApplicationA and ApplicationB)]
Communications patterns
• overhead execution at the sender and the receiver, to
make the messages and execute the protocol
• protocol delays
• for flow control
• network latency as a representation
• network bandwidth limitation
Communications Patterns (1): Overhead “tasks”
• send-request executes the
protocol overhead at the sender, in both directions
• send-reply executes the
overhead at the server, before the logical reply gets back to the client
• send-reply execution is lumped together and possibly misplaced in sequence, but the total work to provide the reply is right
[Figure: client, send-request, server, send-reply]
Communications Patterns (2):
overhead and latency
• net delay is
included in the
send, and again in
the reply
• one latency each
way
[Figure: client, send-request, Net delay (INF), server, send-reply, Net delay (INF)]
Communications Patterns (3):... add a bandwidth
limitation to admit messages to the network
• “Rate Limiter” RL
has zero first phase, second phase is T = 1/rate
• next message is
admitted only after second phase is over
• only shown for send
• if the limitation is shared between the directions, the RL “task” may be shared too.
[Figure: client, send-request, Rate Limiter (admission control), Net delay (INF), server, send-reply, Net delay (INF)]
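The Rate Limiter rule (zero first phase, second phase T = 1/rate, next message admitted only after the second phase ends) can be traced for a burst of messages. This is a minimal sketch; the function name and the example rate are assumptions.

```python
def admission_times(arrivals, rate):
    """Times at which messages are admitted to the network.

    arrivals -- arrival times in seconds at the Rate Limiter
    rate     -- admitted messages per second, so second phase T = 1/rate
    """
    T = 1.0 / rate
    free_at = 0.0            # when the limiter's second phase ends
    admitted = []
    for t in sorted(arrivals):
        start = max(t, free_at)   # zero first phase: admitted at once if the limiter is free
        admitted.append(start)
        free_at = start + T       # the next message waits for the second phase to finish
    return admitted


burst = admission_times([0.0, 0.0, 0.0], rate=2)   # three messages arriving together
spaced = admission_times([0.0, 1.0, 2.0], rate=2)  # arrivals slower than the limit
```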
Communications Patterns (4): .... add layered flow
control
• all packets must
complete their admission and
latency before logical delivery of the
request to the server
[Figure: client, send-request, admission control, W threads for window tokens, Net delay, server, send-reply; labels include n packets and m packets]
Anti-Pattern: software bottleneck
• potential bottleneck pattern
wherever there are multiple clients and multiple servers
• software bottleneck must be
observed via utilizations
(saturated at S and above, not saturated below)
• S is saturated because it is
blocked on other servers
• cure is
• multiplicity (e.g. multiple
threads, partitioned subsystems),
• reducing the blocking times
[Figure: hourglass pattern of tasks: multiple requesters, Server S (saturated) with its Processor, multiple lower servers (unsaturated)]
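The utilization test for a software bottleneck can be written down directly: a task is a candidate if it is saturated while none of the servers it calls is saturated. This is an illustrative sketch; the function name, the 0.9 threshold, and the example utilizations are assumptions.

```python
def find_software_bottlenecks(utilization, calls, threshold=0.9):
    """Tasks that are saturated although every server below them is not.

    utilization -- task name -> utilization in [0, 1]
    calls       -- task name -> list of the servers it makes requests to
    threshold   -- utilization treated as "saturated" (assumed value)
    """
    saturated = {task for task, u in utilization.items() if u >= threshold}
    return sorted(
        task for task in saturated
        if all(server not in saturated for server in calls.get(task, []))
    )


utilization = {"Users": 1.0, "S": 0.98, "DB": 0.40, "Disk": 0.30}
calls = {"Users": ["S"], "S": ["DB", "Disk"]}
bottlenecks = find_software_bottlenecks(utilization, calls)
```

Here Users is saturated only because it waits on S, so S is reported as the bottleneck.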
Examples of software bottlenecks
• in general, due to serialization at the bottleneck
• single threaded web server
• routing table
• single exclusive lock on an entire data base
• small buffer pool
• small flow control window
Multiplicity to cure a software bottleneck
• server level
• multithread a server
• replicated servers (this adds processing and network resources
too)
• data level
• partition the data and lock each partition separately (locks on
objects or pages)
• copies of the entire data service (distributed replicas of a
routing table)
• subsystem level:
Reduced delay to cure a software bottleneck
• batched requests across a network of long latency
• avoid retrieval across a network of long latency
• cache data
• prefetch
• move functions out of this server (minimize the critical
section)
• use worker threads or parallel operations or fetches
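The gain from batching across a long-latency network is easy to estimate, counting one latency each way per round trip. This is a sketch under assumptions: the function name and the numbers are hypothetical, and transmission time and queueing are ignored.

```python
def holding_time(items, latency, work_per_item, batched):
    """Seconds the requesting server stays blocked while fetching `items` items."""
    round_trips = 1 if batched else items
    return round_trips * 2 * latency + items * work_per_item


# Assumed example: 10 items, 50 ms latency each way, 1 ms of remote work per item.
unbatched = holding_time(10, 0.050, 0.001, batched=False)   # about 1.01 s
batched = holding_time(10, 0.050, 0.001, batched=True)      # about 0.11 s
```

The shorter blocking time per request is what raises the capacity of the bottleneck server.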
F) Experience in building and using models
• Examples:
• web integration with telephony
• router software and hardware
• web-based application
• Concerns:
• accuracy
• modeling power
• interpretation
Example: “WebTalk” (web server+telephony)
• Support automated call answering by web pages
• speech playout is from a file encoded in a web page
• user enters tones; logic of decoding is also in the web page
• VoiceXML standard (vxmlforum.org)
• User scenario for one interaction
1. DTMF (tone) or begin call event
2. playout begins and continues
3. user listens and enters next DTMF event
4. next playout OR forward the call
[Figure: scenario states WAIT, PLAYOUT, LISTEN/THINK, STOP, with steps 1, 2, 3 marked]
Web-based telephone dialogue:
Scenario over system components
[Figure: components Call Control, Playout Device, Web Page Cache, Web Page Interpreter, Remote Web Server]