Evidence based performance tuning
of
enterprise Java applications
By Jeroen Borgers
Bad performance: hard to fix
• Call center agent took 20 minutes to work through five screens with customer
– Customer-unfriendly
– High cost
• Speedup needed before rush of number porting
– Major app redesign was needed
• Result
– very slow system when up, mostly system down
– damage to company image
– loss of thousands of customers and $100,000,000
– company crash
Goal
Provide an answer to the question:
Who is Jeroen Borgers?
• sr. consultant with Xebia
• helps customers on enterprise Java performance issues
• instructs Java performance tuning courses
• doing Java since 1996
• various roles in several industries
– developer, architect, team lead, quality officer, auditor, performance tester and tuner
Who is Xebia?
• international IT organization
– NL, FR, IN
• specialized in enterprise Java
• consultancy and projects
• 5 years old
• ±100 employees and growing
• share knowledge
Agenda
• Big picture architecture vs. implementation
• How to achieve bad performance
• Evidence based tuning for high speed
• Tools overview
• Largest Dutch web shop case
• Conclusions
Enterprise Java performance
big picture
• Java is not slow anymore
• What matters is how you do it: algorithms
• Typical enterprise app: most time spent in calls out of Java process
– Remote calls
– Database access
for (int i = 0; i < size; i++) {
    order = orderDao.getOrder(i); // one database call per iteration
}
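A minimal sketch of why the loop on this slide hurts, and of one common way out: fetch all orders in one call. The OrderDao here is hypothetical, and so is its bulk method getOrders(List<Integer>); neither comes from the original code. A counter stands in for database round trips.

```java
import java.util.ArrayList;
import java.util.List;

class BulkFetchSketch {
    /** Hypothetical DAO; the counter stands in for database round trips. */
    static class OrderDao {
        int roundTrips = 0;

        String getOrder(int id) {
            roundTrips++;                       // e.g. SELECT ... WHERE id = ?
            return "order-" + id;
        }

        List<String> getOrders(List<Integer> ids) {
            roundTrips++;                       // e.g. SELECT ... WHERE id IN (...)
            List<String> orders = new ArrayList<>();
            for (int id : ids) orders.add("order-" + id);
            return orders;
        }
    }

    /** The pattern from the slide: one call out of the JVM per order. */
    static int roundTripsPerOrder(int size) {
        OrderDao dao = new OrderDao();
        for (int i = 0; i < size; i++) dao.getOrder(i);
        return dao.roundTrips;
    }

    /** Alternative: collect the ids first, then make a single call. */
    static int roundTripsBulk(int size) {
        OrderDao dao = new OrderDao();
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < size; i++) ids.add(i);
        dao.getOrders(ids);
        return dao.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("per order: " + roundTripsPerOrder(100) + " round trips");
        System.out.println("bulk:      " + roundTripsBulk(100) + " round trips");
    }
}
```

Whether a bulk query is actually faster in a given system still has to be measured; the sketch only counts calls.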
Architecture enables performance
and scalability
• Key choices
– Application distribution
– Persistent data access approach: O-R mapping
– Data conversion: XML, encodings
– Presentation tier technologies
– Use of proprietary features of db, app server
– Data caching
• It is all about:
– Remote calls
– Database access
Architecture enables,
implementation makes it happen
• Minimize remote calls and other I/O
– Direct JDBC: use of batch updates
– Hibernate: use of query caching
• Speed-up data conversion
– XML usage: use JiBX instead of JAXB
– URL encoding: use commons codec instead of java.net
• Code optimizations involving remote calls and data conversion
– Algorithms, collections
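The JDBC batch update bullet can be sketched as follows. PreparedStatement.addBatch() and executeBatch() are standard JDBC; the table name, the class name and the counting stand-in connection are assumptions made so the sketch runs without a database. The stand-in only counts execute calls; how many network round trips a real driver needs per batch depends on the driver.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class BatchUpdateSketch {
    static final String SQL = "INSERT INTO customer (name) VALUES (?)"; // hypothetical table

    /** One execute call per row. */
    static void insertOneByOne(Connection con, List<String> names) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(SQL)) {
            for (String name : names) {
                ps.setString(1, name);
                ps.executeUpdate();
            }
        }
    }

    /** Rows are queued with addBatch() and sent with a single executeBatch(). */
    static void insertBatched(Connection con, List<String> names) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(SQL)) {
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }

    /** Stand-in Connection (no real database) that counts execute calls. */
    static Connection countingConnection(AtomicInteger executes) {
        ClassLoader cl = BatchUpdateSketch.class.getClassLoader();
        Object ps = Proxy.newProxyInstance(cl, new Class<?>[] {PreparedStatement.class},
            (proxy, method, args) -> {
                switch (method.getName()) {
                    case "executeUpdate": executes.incrementAndGet(); return 1;
                    case "executeBatch":  executes.incrementAndGet(); return new int[0];
                    default:              return null; // setString, addBatch, close
                }
            });
        return (Connection) Proxy.newProxyInstance(cl, new Class<?>[] {Connection.class},
            (proxy, method, args) -> "prepareStatement".equals(method.getName()) ? ps : null);
    }

    public static void main(String[] args) throws SQLException {
        List<String> names = Arrays.asList("Ann", "Bob", "Cid");
        AtomicInteger oneByOne = new AtomicInteger();
        insertOneByOne(countingConnection(oneByOne), names);
        AtomicInteger batched = new AtomicInteger();
        insertBatched(countingConnection(batched), names);
        System.out.println("execute calls one by one: " + oneByOne.get());
        System.out.println("execute calls batched:    " + batched.get());
    }
}
```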
Architecture enables performance
& scalability
• The architecture is scalable
Architecture enables performance
& scalability
• How one implements the architecture makes a difference
• But the requirement of 120 km/h will never be met!
Architecture enables performance
& scalability
• More demanding requirements can be met with the right implementation
• Bugatti EB Veyron 16.4
• Top 407 km/h
• 0-300 in 14 s.
How to achieve bad performance
• Many JavaEE apps have hard-to-solve
performance problems
• How to achieve this?
– Architecture not suitable for requirements
– Ignore the problems too long
– Don’t follow best practices
Architecture not suitable
• Architect: “Most important is a pure, flexible, scalable architecture: we distribute our
application components and loosely couple with XML and web services.”
• Nonsense!
– Those who pay the bill care about efficient working, not purity
– Speed is traded for scalability: remote calls and data conversion can kill performance
Ignore problems too long
• Project leader: “First we get the functionality
right, then we start worrying about performance.”
• Performance is often ignored until the final test,
a week before production, or later!
• To meet requirements you may need a major
re-design: costly
• Solved by:
– Verify the architecture in a POC: test a vertical slice
– Test performance continuously
How to achieve bad performance
• How to achieve this?
– Architecture not suitable for requirements
– Ignore the problems too long
– Don’t follow best practices
– Optimize counterproductively
Don’t follow best practices
• Best practices are like:
– Use Map instead of List for lookups
– Use I/O buffering: BufferedInputStream
– Use SQL batch updates
– Use db table indexes
– Use StringBuffer for String concatenation?
• Be aware: a best practice may not work in
your situation
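The first practice on this list, as a minimal sketch. The Customer class and its int id are assumptions for illustration. A list lookup scans elements one by one; a HashMap indexed by id finds the entry by key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LookupSketch {
    /** Hypothetical domain object. */
    static final class Customer {
        final int id;
        final String name;

        Customer(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Linear scan: the cost grows with the size of the list. */
    static Customer findInList(List<Customer> customers, int id) {
        for (Customer c : customers) {
            if (c.id == id) return c;
        }
        return null;
    }

    /** Build the index once; each later lookup is a single map access. */
    static Map<Integer, Customer> indexById(List<Customer> customers) {
        Map<Integer, Customer> byId = new HashMap<>();
        for (Customer c : customers) byId.put(c.id, c);
        return byId;
    }

    public static void main(String[] args) {
        List<Customer> customers = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) customers.add(new Customer(i, "name-" + i));

        Map<Integer, Customer> byId = indexById(customers);
        System.out.println(findInList(customers, 9_999).name);
        System.out.println(byId.get(9_999).name);
    }
}
```

In line with the warning on the slide: for a handful of elements the list scan may be just as fast, and building the map costs time and memory. Measure before switching.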
Best practices can be premature
optimizations with trade-offs
• Developer: “String performs badly, we use StringBuffer for concatenation everywhere.”
• Many books: “use StringBuffer for
public String toString() { return "[" + getName() + "]"; }
• faster is
public String toString() {
    StringBuffer buf = new StringBuffer("[");
    buf.append(getName()).append("]");
    return buf.toString();
} ”
• However: same bytecode!
• Maintainability and developer time are traded for an assumed,
non-existent performance gain
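A small sketch to check the claim yourself. It puts both variants from the slide side by side in a hypothetical class (ToStringSketch and its method names are assumptions). Older javac versions compiled + into an append chain much like the hand-written one; javac from JDK 9 on emits an invokedynamic call instead, so the bytecode you see depends on your compiler. The result is the same either way.

```java
class ToStringSketch {
    private final String name;

    ToStringSketch(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    /** The readable variant. */
    String viaConcatenation() {
        return "[" + getName() + "]";
    }

    /** The hand-optimized variant recommended by "many books". */
    String viaStringBuffer() {
        StringBuffer buf = new StringBuffer("[");
        buf.append(getName()).append("]");
        return buf.toString();
    }

    public static void main(String[] args) {
        ToStringSketch s = new ToStringSketch("order");
        System.out.println(s.viaConcatenation());
        System.out.println(s.viaStringBuffer());
    }
}
```

After compiling, `javap -c ToStringSketch` prints the bytecode of both methods for comparison.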
Optimize counterproductively
• Updating state remotely by sending only deltas like RemoveAttribute, AddAttribute
– many small objects with much serialization and communication overhead
• Object and thread pooling depending on JVM
• Optimizations are very time and environment
dependent
• So: don’t trust your knowledge or intuition, but measure that your optimization solves the
Bad performance
• Architecture not suitable for requirements
• Ignore the problems too long
• Don’t follow best practices
• Optimize counterproductively
Evidence based tuning for high
speed
• Don’t make assumptions but measure
• Have quantified requirements, prove they are met or not
• Prove the architecture performs in a POC
• Continuously test performance during development
– to get a first impression
– for quick feedback on changes
• Test representatively in a dedicated environment
– with production-like load
– on a production-like environment
– with production-like data
• Fix problems when found
• Prove the effect of each optimization
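"Prove the effect of each optimization" can start as small as the sketch below: time the code before and after the change and compare medians. All names are assumptions, and the two string-building methods are only a stand-in workload. This gives a first impression and quick feedback; it does not replace a representative load test in a production-like environment.

```java
import java.util.Arrays;

class MeasureSketch {
    static volatile int sink; // keeps the JIT from discarding unused results

    /** Median wall-clock time of a task in nanoseconds, after some warm-up runs. */
    static long medianNanos(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) task.run();   // give the JIT a chance to compile
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    /** "Before": builds a new String on every iteration. */
    static String concatInLoop(int n) {
        String s = "";
        for (int i = 0; i < n; i++) s += i;
        return s;
    }

    /** "After": appends to one StringBuilder. */
    static String builderInLoop(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append(i);
        return sb.toString();
    }

    public static void main(String[] args) {
        long before = medianNanos(() -> sink = concatInLoop(2000).length(), 20, 51);
        long after = medianNanos(() -> sink = builderInLoop(2000).length(), 20, 51);
        System.out.println("before: " + before + " ns, after: " + after + " ns");
    }
}
```

The numbers differ per JVM and machine, which is the point of the slide: run it where it matters and keep the result as evidence.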
Tools for measuring
• OS profilers to see resource usage: CPU, memory, I/O, threading
• Database analyzers to see query behavior in the database
• Network analyzers to see network traffic
• App server monitors to see app server resource usage: heap, connection pool, stmt cache, ...
• J2EE analyzers to see J2EE behavior under load
• Instrumentation monitors to see timing at the instrumented locations
• Java profilers to see detailed CPU and memory behavior in JVM
• Continuous test tools to quickly get timing feedback of code changes
• Load test tools to generate usage load
Case: speeding up a web shop
Wehkamp TRC architecture
[Architecture diagram: web server (ASP/.Net); TRC-appsrv (Java, labeled "Services here"); Tuxedo; databases]
Speedup results at Wehkamp
• Several services optimized to meet requirements
• Most expensive: voegKlantToe (addCustomer); verwerkenNieuweOrder (processNewOrder): from 7 s. to 1.5 s.
• 45% db-queries, 45% tux-calls, 10% java code
• Top optimizations:
– Reduce number of db queries
– Optimize most expensive db queries
– Reduce number of Tux/mainframe calls
– Optimize most expensive Tux/mainframe calls
• App orchestrates: many Java code changes needed for this
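A back-of-the-envelope check on these numbers, under one loud assumption: the 45/45/10 split describes the 7 s measured before tuning (the slide does not say). Then Java code accounts for 7 × 0.10 = 0.7 s, so even removing it entirely cannot get below 6.3 s. The sketch computes this floor per category; class and method names are made up.

```java
class TimeBudgetSketch {
    /** Response time left if the category with this share cost nothing at all. */
    static double floorSeconds(double totalSeconds, double share) {
        return totalSeconds * (1.0 - share);
    }

    public static void main(String[] args) {
        double before = 7.0;   // seconds, from the slide
        // Assumption: the shares below apply to the 7 s before optimization.
        System.out.printf("no Java code cost:     %.2f s%n", floorSeconds(before, 0.10));
        System.out.printf("no db query cost:      %.2f s%n", floorSeconds(before, 0.45));
        System.out.printf("no db and no tux cost: %.2f s%n", floorSeconds(before, 0.90));
    }
}
```

Under that assumption, a target of 1.5 s is out of reach unless both the db queries and the Tux calls are reduced, which matches the list of top optimizations.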
Tools used
• Load test tool: Apache JMeter
• App level monitor: JAMon API
• History and trends reporter: JARep
• Continuous test tool: CruiseControl + Maven + JMeter
• Java profiler: Quest JProbe
• App server monitor: Tivoli Performance Monitor
• J2EE analyzer: Quest PerformaSure
• Database analyzer: Quest Toad, Oracle Statspack
• Unix OS profiler: vmstat, nmon
JMeter for generating load and
black box testing
JAMon for monitoring statistics
[Screenshot: JAMon statistics page, with callouts "added" and "service name"]
Instrumenting with JAMon API
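The instrumentation pattern, sketched with a stand-in so it runs without the JAMon jar. With JAMon itself the usual idiom is `Monitor mon = MonitorFactory.start("label");` followed by `mon.stop();` in a finally block. The Monitor and Stats classes below are hypothetical and far simpler than JAMon's, and addCustomer is a dummy service method; the label is the service name from the case.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MonitorSketch {
    /** Accumulated statistics per label (stand-in for JAMon's counters). */
    static final class Stats {
        long hits;
        long totalNanos;
    }

    static final Map<String, Stats> STATS = new ConcurrentHashMap<>();

    /** Stand-in monitor: records the elapsed time under its label when stopped. */
    static final class Monitor {
        private final String label;
        private final long start = System.nanoTime();

        Monitor(String label) {
            this.label = label;
        }

        void stop() {
            long elapsed = System.nanoTime() - start;
            Stats s = STATS.computeIfAbsent(label, k -> new Stats());
            synchronized (s) {
                s.hits++;
                s.totalNanos += elapsed;
            }
        }
    }

    static Monitor start(String label) {
        return new Monitor(label);
    }

    static long hits(String label) {
        Stats s = STATS.get(label);
        return s == null ? 0 : s.hits;
    }

    /** Dummy service method, instrumented at its entry and exit. */
    static String addCustomer(String name) {
        Monitor mon = start("voegKlantToe");
        try {
            return "customer:" + name;   // the real work would go here
        } finally {
            mon.stop();                  // always stop, also on exceptions
        }
    }

    public static void main(String[] args) {
        addCustomer("Ann");
        addCustomer("Bob");
        System.out.println("voegKlantToe hits: " + hits("voegKlantToe"));
    }
}
```

The try/finally keeps the statistics correct when the service throws, at the cost of two extra lines per measure point.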
JARep reporting for history
Deployment of JARep
[Diagram: JVM / app server nodes 1 to 4, each with your app, jamonapi.jar and jamon.war (jamonadmin.jsp, ShowJamonCounters.jsp); a JVM on an off-line box with DataFetcher (every 5 minutes) and DataPersister; RDBMS]
PerformaSure for analyses under
load
Analysis tools compared
• Profiler JProbe
– Pro: High detail: code line level
– Con: High overhead, cannot deal with load
• JAMon API + JARep
– Pro: Low overhead < 1%
– Pro: Enables comparing over time, see trends
– Con: No call tree, cannot drill down
– Con: Limited to built-in measure points
• PerformaSure
– Pro: reasonable overhead, slick UI, measures all on JVM
– Con: not cheap, configuration, wait time
Conclusions & recommendations
• Architecture and implementation both determine performance
• Consider whole application life cycle: requirements, POC, continuous testing, representative testing, monitoring
• Biggest gains are in
– Remote calls and other I/O
– Database access or other data conversions
“Meten is weten”
(measuring means knowing)