The Theory And Practice of Testing Software Applications
For Cloud Computing
Mark Grechanik
Cloud Computing Is Everywhere
Global spending on public cloud services estimated to reach $110.3Bil by 2016.
Does it affect engineering software and in what way?
Software Engineers Are Not Sure What Cloud Computing Brings to Them
Testing In the Cloud OR
Testing For the Cloud?
Testing in the cloud means to run multiple test scripts in parallel in different virtual machines (VMs)
• Use cloud to speed up testing
Testing for the cloud means that certain properties of the application are verified and validated to ensure that it works correctly when deployed in the cloud
Is Cloud Computing Orthogonal To Software Testing?
Or The Cloud Will Get Ya If You Ignore It!
Software Engineering
1 2 3
Cloud Compu,ng Is Not A Part of So9ware Development Lifecycle
Requirements
Gathering Design Development
4 Tes6ng 5 Maintenance and Evolu6on
What does it mean to develop and
test software applications for cloud platforms?
Understanding Fundamentals of Cloud Computing
Cloud Is Not A New Thing • Distributed Computing • Grid Computing • P2P computing • Internet computing • Service-oriented computing • Cloud computing
What Is The Fundamental
Nature of Cloud Computing?
Traditional Computing
Forecast infrastructure needs
Purchase infrastructure hardware Own and maintain infrastructure
Cloud
Computing
Infrastructure is rented
Resources are dynamically (de)allocated
Elasticity of Cloud Computing
Cloud infrastructure re-allocate resources on demand and are considered elastic
• Scale out and up – adding resources to VMs as demand
increases
• Scale in and down – releasing resources to VMs as
demand decreases
Highly elastic clouds can re-provision resources rapidly and automatically in response to demand
• Stakeholders pay only for using, rather than for owning and employing the hardware/software infrastructures and
The Dream of Elastic Cloud
I wish to significantly reduce the costs by
renting rather than owning elastic cloud
execution infrastructure!
What Amazon AWS Offers
http://aws.amazon.com/autoscaling/
Auto Scaling enables you to closely follow the demand curve for your applications, reducing the need to provision
Amazon EC2 capacity in advance. For example, you can set a condition to add new Amazon EC2 instances in
increments of 3 instances to the Auto Scaling Group when the average CPU utilization of your Amazon EC2 fleet goes above 70 percent; and similarly, you can set a condition
to remove Amazon EC2 instances in the same increments when CPU Utilization falls below 10 percent. Often, you may
want more time to allow your fleet to stabilize before
Auto Scaling adds or removes more Amazon EC2 instances. You can configure a cool-down period for your Auto Scaling Group, which tells Auto Scaling to wait for some time after taking an action before it evaluates the conditions again. Auto Scaling enables you to run
Google And Other Cloud Providers Offer Heuristics
Cost-Utilization Curve time cost Lost customers U O Lost money O A1 A2
The Problem
Application’s behavior is difficult
to predict
Cloud does not (de)allocate resources effectively Increased cost of the application deployment Decreased performance of the application 60% or companies were not satisfied with performance and 46%
Excellent Performance Is Even More Important!
Balancing Cost And Performance
Resources cost money, per usage
Precise cloud elasticity is an answer to the problem of providing sufficient resources to software applications while minimizing the cost of deployment
Connect Cloud Deployment With Performance Testing
Cloud Deployment Performance Testing Provisioning Strategies
Cloud Computing And Software Engineering
Software Engineering Lifecycle
Modeling Testing Optimizing Maintaining Evolving Cloud Computing Hardware Load Balancer Virtual Machines Interfaces
Key Cloud Concepts
Cost of
Resources Variability of resources
Key Insights
Key software and data artifacts are generated during different stages of the software engineering lifecycle could be, but are not currently being, used by the cloud infrastructures to improve predictive provisioning of resources.
Applications running in the cloud do not provide important feedback to stakeholders about any performance problems their applications may have and that might ultimately require more resources and, thus, increase operational costs.
There is no two-way flow of information between software development and
Model of Cloud Computing VM1 VM2 VM3 VM4 VM5 P P P P P P P P Load Balancer P P
Performance Rules
Descriptive rules for selecting test input data play a significant role in software testing, where these rules approximate the functionality of the application.
• Example rule: some customers will pose a high insurance risk if these customers have one or more prior insurance fraud convictions and deadbolt locks are not installed on their
premises.
Putting Rules And Strategies
Together For Cloud Computing
Learn Performance Rules Synthesize Provisioning Strategies Use Provisioning Strategies
Learn Performance Rules Synthesize Provisioning Strategies Use Provisioning Strategies
Provisioning Resources with performancE Software Test automatiOn (PRESTO)
Performance Rules Are Behavioral Models
Behavioral model is a collection of workload profiles, constraints, performance counters, and various relations among components of the application.
Performance rules are examples of applications behavioral models.
Our goal is to obtain and use these models to synthesize provisioning strategies
Finding Rules That Describe Performance Problems
A goal is to find situations when applications unexpectedly exhibit worsened characteristics for certain combinations of input values.
Finding This Situation Is Like Getting a Perfect Hand!
Example: Bottlenecks
A B
Bottlenecks or hot spots are phenomena where the performance of the application is limited by
one or few components
C
A slow component of the application
Bottlenecks in Nontrivial Applications
Input
s Output
Performance Testing Applications
Server
Client calls Server Client methods Server interacts with back-end data storages Data base
The Client executes methods of the Server using different combination of input parameter values
Performance Testing Applications Server Client Data base Client receives data from Server Server receives data from Database
Performance can be measured in the number of roundtrips per time unit
Automating Performance Testing Applications
Server
Client calls Server Client methods and receives data Server interacts with back-end data storages Data base Data base
Automated test scripts simulate clients who perform transactions against the Server, and they use input data from the backend database
Automating Performance Testing Applications
Server Server interacts
with back-end data storages Data base Data base
Automated test scripts simulate clients who perform transactions against the Server, and they use input data from the backend database
A Problem With Performance Testing of Large Applications
A medium-size renters insurance application has a large universe of input data.
• For 78,000,000 customer profiles it would take close to 1,500 years to test this application on all profile inputs provided that it takes only 10mins per input. • Just “touching” one customer profile entry for 1sec, it takes about 2.5 years to
“touch” them all!
A Problem With Performance Testing of Large Applications
How to select a manageable
subset of the input data without compromising the effectiveness of performance testing.
• We need an insight into selecting
An Example
Most of the renters are good customers, have
hardly any accident/claims or fraudulent activities.
• That’s how insurance companies make money.
• Unfortunately, the data of these “good” customers are not always good for performance testing purpose.
Randomly selecting customer profiles does not work well – keep encountering good customers.
• Yet, exploratory performance testing is one of the main ways of performance testing applications in industry!
The Intuition Behind Performance Testing Trigger more computation Bigger executing footprint Trigger more database queries Bigger database footprint More data to process in programs
Core Idea
Select only these customer inputs that trigger intensive
computations. These computations most likely
involve significant
Performance Testing Is Laborious and Difficult And
A Computing Primitive in PRESTO - FOREPOST
Feedback-ORiEnted PerfOrmance Software
Testing (FOREPOST) finds performance
problems automatically by learning and using rules that describe classes of input data that lead to intensive computations.
• FOREPOST is an adaptive, feedback-directed learning testing system that learns rules from execution traces and uses these learned rules to select test input data
automatically to find more performance problems in applications when compared to exploratory random performance testing.
Compute a Function That Represents the Application
f: Input -> Performance
FOREPOST
Components of FOREPOST
1 2 3
Adap6ve test script uses rules that are issued by machine learning
components to select input data that lead to more intensive computa6ons.
When the test script executes the applica6on, its execu6on traces are collected as part of run6me monitoring, leading to (re)learning rules.
Run6me
Monitoring Machine Learning
4 Informa6on Compression 5 BoPleneck Detec6on Algorithm Adap6ve Test Script
Some Statistics
It takes about 10mins to execute the instrumented renters app end-2-end automatically using a test script.
An average run invokes over 3,000 methods more than 3,000,000 times and it retrieves over 1Mb of data.
An average execution traces contains over 1Gb of data. It takes less than 10mins on average to process and analyze one trace.
Constructing Data For a Learner and Learning Rules
Input_1_Name Input_2_Name … Input_k_Name Class
Value_p Value_q … Value_m Good
Value_p Value_q … Value_m Bad
… … … … …
Finding Bottlenecks
A B
Bottlenecks or hot spots are phenomena where the performance of the application is limited by
one or few components
C
A slow component of the application
Finding Bottlenecks Cluster Execution Traces Rank Methods by Their Significance In Each Cluster Compute Logical Difference Between Clusters
FOREPOST Architecture Test Script Profiler Execution Trace Analyzer Trace Clustering Rules Good test traces Bad test traces Learner Trace Statistics Method and data statistics ICA Method weights Advisor Test Script
PRESTO Architecture FOREPOST … Load Balancer Load Model Provisioning Strategies App Resource Controller Customers Auto Test Scripts 2 3 4 5 6 8 7 1 9 10 11 12 13 16 17 14 14 15 Performance Rules Perturbation Operators
Experimenting With CPU
Bottlenecks Using CloudStack
0 0.5 1 1.5 2 2.5 3 JPetStore -‐ 5
users JPetStore -‐ 15 users JPetStore -‐ 30 users Dell DVD Store -‐ 5 users Dell DVD Store -‐ 15 users Dell DVD Store -‐ 30 users
Th ro ug hp ut (n um be r o f U RL s pe r s ec on d) Random Basic Δ1 Δ2 Δ3 Δ4 Δ5
Experimenting With Database Bottlenecks Using CloudStack
0 0.5 1 1.5 2 2.5 3 JPetStore -‐ 5
users JPetStore -‐ 15 users JPetStore -‐ 30 users Dell DVD Store -‐ 5 users Dell DVD Store -‐ 15 users Dell DVD Store -‐ 30 users
Th ro ug hp ut (n um be r o f U RL s pe r s ec on d) Random Basic Δ1 Δ2 Δ3 Δ4 Δ5
Conclusions
This proposed research program is novel, as to the best of our knowledge, there exists no work that utilizes automatic feedback-directed adaptive test scripts to find improve cloud elasticity.
The results of this work show that the proposed approach can effectively locate performance problems and guide stakeholders to improve the quality of software that is deployed in the cloud.