Rigorous Performance Testing on the Web

Grant Ellis, Senior Performance Architect, Instart Logic


Academic year: 2021

(1)

Rigorous Performance

Testing on the Web

Grant Ellis

(2)

Who is Instart Logic?

§  Software company focused on Application Delivery

§  We work with globally known brands whose business depends on performance, and make their sites and apps really fast

§  Team includes big data, virtualization and web performance experts from Google, Facebook, Akamai, Cisco, Citrix, VMware, and Aster Data

(3)
(4)

§  How was the data collected? Aggregated? Normalized?

§  What is response time? What does that mean for the users?

§  Did any actual human beings see this response time?

§  What devices/browsers were used? Laptop? Phone? Tablet?

§  Where were the users located?

(5)

1.  Methodology matters more than results

2.  Statistical analysis can (and sometimes does) lie.

-  It is really easy to make great results look poor, or to make poor results look great, either deliberately or accidentally.

(6)

Table of Contents

§  The Internet, The Bottleneck, and The Test: A brief history

§  Last-Mile Performance Tools (It’s dangerous to go alone!)

§  Now I have data… Lots of data

§  But, wait, there’s more (data)!

§  Need more? Meet the CDF.

(7)

Need For Speed: Packet Edition, created by Raphaël Luta

http://www.aptiwan.com/packetstory/

(8)

The Dawn of the (World Wide) Web

§  Adoption viable for commerce and business

§  Performance detractors:

-  Weak server hardware

-  Clumsy scaling technology

-  Poor first-mile connectivity

§  Primary Bottlenecks:

-  Hardware

-  First-mile connectivity

(9)

The Internet, The Bottleneck, and The Test: A brief history

[Diagram: delivery path with segments LAST MILE, MIDDLE MILE, FIRST MILE and HARDWARE; labels ISP and ADC; two "Bottleneck" markers]

Repeatedly loads whole pages. Measured performance takes into account the page, the embedded objects, and the server latency introduced by a then-traditional three-tier architecture.

(10)

Data center scale was conquered.

§  Adoption on the web increased again:

-  Google, Facebook, fully-baked e-commerce, others

-  Governments digitized records and moved vital functions to the

Web

§  Performance detractors:

-  Middle-mile copper

-  Congested switches

-  Poorly maintained peering points

§  Primary Bottlenecks:

-  Middle-mile

(11)

The Internet, The Bottleneck, and The Test: A brief history

[Diagram: delivery path with segments LAST MILE, MIDDLE MILE, FIRST MILE and HARDWARE; one "Bottleneck" marker]

Backbone products from Gomez and Keynote

Enables ongoing performance testing (e.g. monitoring) from multiple geographies at the same time.

Beware: Some content delivery networks have taken care to place their nodes on the same network, or even the same rack, as synthetic testing nodes. Look for unrealistically low response times in your embedded objects!
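The warning above can be turned into a simple check. This is only a sketch: the 5 ms floor, the field names `url` and `response_ms`, and the example URLs are assumptions made for illustration, not values from the talk or from any testing product.

```python
def suspiciously_fast(objects, floor_ms=5.0):
    """Return URLs of embedded objects whose response time is below floor_ms.

    floor_ms is an assumed plausibility threshold; pick one that matches
    the round-trip times real users could actually achieve.
    """
    return [o["url"] for o in objects if o["response_ms"] < floor_ms]

# Hypothetical per-object timings from one synthetic test run.
objects = [
    {"url": "https://cdn.example.com/logo.png", "response_ms": 0.8},
    {"url": "https://cdn.example.com/app.js", "response_ms": 42.0},
    {"url": "https://www.example.com/", "response_ms": 180.0},
]

flagged = suspiciously_fast(objects)
print(flagged)
```

Anything flagged this way deserves a second look before the result is trusted as representative of real users.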

(12)

The Internet, The Bottleneck, and The Test: A brief history

[Diagram: delivery path with segments LAST MILE, MIDDLE MILE, FIRST MILE and HARDWARE; one "Bottleneck" marker]

§  Last mile latency, packet loss

§  Browser mechanics

(13)

The Application Delivery Challenge Today

High Performance Browser Networking by Ilya Grigorik, Figures 7-16 and 10-6

Available for free online: http://chimera.labs.oreilly.com/books/1230000000545/index.html

[Figure: latency (ms) by connection type (Wired, LTE, WiFi, 4G, 3G); axis from 0 to 250 ms]

(14)

(15)

Last-Mile Performance Tools

(It’s dangerous to go alone!)

•  JMeter and LoadRunner measure:

    -  From a single geography (usually on-premise)

    -  With a single browser

•  Keynote backbone / Gomez backbone:

    -  Report only on average

    -  Use fixed (backbone) connectivity

    -  Still simulate data

•  None of the above measure:

    -  Multiple devices

    -  Multiple connection types

    -  True user experience

    -  Impact from wireless technologies

(16)

Last-Mile Performance Tools

Synthetic Testing

Pros

•  User Experience metrics

•  Open source!

•  Multiple device types

•  Multiple connection types (traffic shaping)

•  Great reports

•  Captures waterfall diagrams

Cons

•  Limited analysis tools

•  Difficult to monitor performance

•  Platform stability

•  It’s still synthetic

Real User Monitoring (RUM)

Example tool: boomerang.js

Pros

•  True user experience

•  Easy set-up

•  Great browser support

•  Multiple device types

•  Multiple connection types

•  Open source tools available

Cons

•  Requires live traffic: responsive, not preemptive

•  Measurement impacts results

•  Safari data is limited

•  Outliers can be extreme and must be removed
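The RUM caveat about outliers can be sketched as follows. The talk does not say which method to use; this example assumes Tukey's interquartile-range fences with the conventional multiplier of 1.5, and the sample load times are invented for illustration.

```python
import statistics

def remove_outliers_iqr(load_times, k=1.5):
    """Keep samples inside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences).

    The choice of IQR fences and k=1.5 is an assumption; the talk only
    says that extreme outliers must be removed.
    """
    q1, _, q3 = statistics.quantiles(load_times, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [t for t in load_times if low <= t <= high]

# Hypothetical RUM page load times in seconds; one beacon from a
# backgrounded tab reports 240 s.
samples = [2.1, 2.4, 2.6, 3.0, 3.3, 3.9, 4.2, 5.0, 240.0]
cleaned = remove_outliers_iqr(samples)
print(cleaned)
```

Whatever rule is chosen, it is worth reporting it alongside the results, since the trimming rule is part of the methodology.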

(17)

First: New vocab for last-mile tools

§  Fully Loaded

-  Entire page has been loaded

-  Including asynchronous functions like analytics beacons.

-  The browser hasn’t utilized the Internet Connection for a while

-  Generally transparent from a user’s perspective

For a long time, fully loaded is all we had. With mature client-side technologies, the Fully Loaded metric is much less relevant:

•  Does not take into account browser mechanics

•  Fires after connection is disused– nothing to do with

(18)

First: New vocab for last-mile tools

§  Document Complete (or Onload)

-  The page is assembled by the browser and ready for the user.

-  (Almost) always visually complete

-  User can use the scroll bars, click links, or search.

-  The browser may still be doing things in the background.

•  Some sites defer loading of prominent content until after document complete.

•  Some Front-End Optimization (FEO) packages defer script execution until after document complete. In this case, an interactive site may look visually complete at document complete, but won’t actually be responsive or usable until after those scripts execute!

(19)

First: New vocab for last-mile tools

§  Start Render (or Render Start)

-  Browser paints something (anything) on the screen.

-  May be all or most of the page, or a single image, or a single paragraph, or a single pixel.

-  The moment your user knows that the web site is actually working.

(20)

First: New vocab for last-mile tools

Load Time

§  Otherwise known as Document Complete.

First Byte

§  Network latency plus server latency.

Start Render

§  Otherwise known as Render Start.

Visually Complete

§  All visual components of the page are painted on the screen.

Speed Index

§  Loosely, the average time for visual components to be painted on the screen.

Fully Loaded

§  Same as the Fully Loaded metric defined earlier: the browser stops using the connection.

§  Transparent for users.

§  Critical path for all browser functions
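Two of the metrics above can be derived from timestamps that browsers expose through the Navigation Timing API. The sketch below assumes a beacon carrying the `navigationStart`, `responseStart` and `loadEventStart` attributes as epoch milliseconds; mapping Document Complete to `loadEventStart` is an approximation, and the sample values are invented.

```python
def derive_metrics(timing):
    """Derive First Byte and Document Complete (ms) from a timing beacon.

    `timing` is a dict of Navigation Timing style timestamps in epoch
    milliseconds. Treating loadEventStart as Document Complete is an
    assumption made for this sketch.
    """
    start = timing["navigationStart"]
    return {
        "first_byte_ms": timing["responseStart"] - start,
        "document_complete_ms": timing["loadEventStart"] - start,
    }

# Hypothetical beacon from one page view.
beacon = {
    "navigationStart": 1_400_000_000_000,
    "responseStart": 1_400_000_000_350,
    "loadEventStart": 1_400_000_004_200,
}

metrics = derive_metrics(beacon)
print(metrics)
```

Start Render, Visually Complete and Speed Index are not covered here because they depend on observing paint events rather than on these timestamps.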

(21)

First: New vocab for last-mile tools

§  BEWARE: Visually complete is not the same as functional.

Some Front-End Optimizations defer JavaScript execution to make the page look visually complete faster, but users may not be able to click links, scroll the window, or search!

(22)

§  Speed Index, more technically: the integral of the area above the visual-progress curve when all paint events are plotted (lower is better).

§  The same warnings around visual completeness apply.

Sites with great speed indexes are not necessarily functional as quickly as they are visible.
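The "area above the curve" definition can be made concrete with a short calculation. This is a sketch under assumptions: visual progress is treated as a step function that jumps at each paint event, and the two sample pages are invented. It is not the implementation used by any particular tool.

```python
def speed_index(progress):
    """Area above the visual-progress curve, in milliseconds.

    progress: list of (time_ms, percent_visually_complete) paint events,
    sorted by time and ending at 100 percent. Between paint events the
    page is assumed to stay at the previous completeness level.
    """
    total = 0.0
    prev_time, prev_pct = 0.0, 0.0
    for time_ms, pct in progress:
        # Add the unpainted fraction held during this interval.
        total += (1.0 - prev_pct / 100.0) * (time_ms - prev_time)
        prev_time, prev_pct = time_ms, pct
    return total

# Two hypothetical pages that are both visually complete at 2000 ms.
late_painter = [(1000, 50), (2000, 100)]
early_painter = [(500, 50), (2000, 100)]

print(speed_index(late_painter), speed_index(early_painter))
```

Both pages share the same Visually Complete time, yet the page that paints half its content earlier gets the lower (better) Speed Index.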

(23)

(24)

Now I have data… lots of data

Over 6,000 data points.

(25)

Possible interpretations…

            Average    Median    Standard Deviation

blue          8.947     7.323                 4.792

red           9.239     7.168                 5.357

green         8.155     6.977                 4.844

purple       14.104    13.109                 4.397

→ Gross oversimplification

May be useful.

But, look at how the graph changes with slightly different cuts.
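To illustrate why summary figures like the ones above can oversimplify, here is a minimal sketch using Python's standard `statistics` module. The two data sets are invented: they share the same average and median even though the user experiences they describe are very different.

```python
import statistics

def summarize(load_times):
    """Return the three aggregate figures used in the table above."""
    return {
        "average": statistics.mean(load_times),
        "median": statistics.median(load_times),
        "stdev": statistics.stdev(load_times),
    }

# Hypothetical load times in seconds.
steady = [5, 5, 5, 5, 5, 5]    # every user waits 5 s
bimodal = [1, 1, 1, 9, 9, 9]   # half wait 1 s, half wait 9 s

steady_summary = summarize(steady)
bimodal_summary = summarize(bimodal)
print(steady_summary)
print(bimodal_summary)
```

Only the standard deviation hints at the difference, and even that does not reveal that the second site has two distinct groups of users.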

(26)

(27)

But, wait! There’s more (data)!

None of these representations capture the whole picture!

There are hundreds of permutations of variability, with different:

•  Internet connection types

•  Devices

•  Browsers

•  Geographies

•  Wireless connection quality

•  Computing power

And then, there’s the natural variability of the Internet.

Plots over time usually aren’t that relevant for web performance:

•  Oversimplification, sometimes misleading!
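One way to see the dimensions of variability listed above is to split the samples by segment before summarizing them. The sketch below is illustrative only: the field names (`connection`, `device`, `load_time`) and the sample records are assumptions, not data from the talk.

```python
from collections import defaultdict
from statistics import median

def median_by_segment(samples, keys):
    """Group samples by the given keys and return the median load time per group."""
    groups = defaultdict(list)
    for sample in samples:
        segment = tuple(sample[k] for k in keys)
        groups[segment].append(sample["load_time"])
    return {segment: median(times) for segment, times in groups.items()}

# Hypothetical page load samples in seconds.
samples = [
    {"connection": "wired", "device": "laptop", "load_time": 2.0},
    {"connection": "wired", "device": "laptop", "load_time": 3.0},
    {"connection": "3g", "device": "phone", "load_time": 11.0},
    {"connection": "3g", "device": "phone", "load_time": 13.0},
]

overall = median([s["load_time"] for s in samples])
by_segment = median_by_segment(samples, ["connection", "device"])
print(overall, by_segment)
```

The overall median of 7.0 seconds describes none of these users; each segment tells a different story.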

(28)

But, wait! There’s more (data)!

§  We can’t take all these things and distill them into one number, or even one number plotted over time. Enter the histogram:

The histogram expresses how many users experienced a particular page load time.

(29)

Taller bars mean that more users saw the load time in that interval.

(30)

Shorter bars mean that fewer users saw the load time in that interval.

(31)

Faster transaction times are on the left side of the histogram.

(32)

When the taller bars are on the left side, it means that more users saw a fast experience.

If you are comparing two experiences, plot the histograms on the same chart!
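Plotting two experiences on the same chart only works if both use the same buckets. The sketch below counts samples into shared, fixed-width buckets and prints text bars; the bucket width, the cutoff and the sample data are assumptions chosen for illustration.

```python
import math

def histogram(load_times, bucket_width, max_time):
    """Count samples per fixed-width bucket; slower samples go in the last bucket."""
    n_buckets = math.ceil(max_time / bucket_width)
    counts = [0] * n_buckets
    for t in load_times:
        index = min(int(t // bucket_width), n_buckets - 1)
        counts[index] += 1
    return counts

# Hypothetical page load times in seconds for two experiences.
blue = [1.2, 3.5, 3.9, 6.1, 7.7, 12.4]
red = [0.8, 1.5, 2.2, 3.1, 4.0, 9.0]

blue_counts = histogram(blue, bucket_width=2, max_time=10)
red_counts = histogram(red, bucket_width=2, max_time=10)

# Same buckets for both series, so the bars are directly comparable.
for i, (b, r) in enumerate(zip(blue_counts, red_counts)):
    print(f"{i * 2:>2}-{i * 2 + 2:<2} s  blue {'#' * b:<3} red {'#' * r}")
```

Red has more of its samples in the left-hand buckets, which is the pattern to look for when one experience is faster.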

(33)

Red is definitely faster than blue:

•  Fast users got faster

•  Medium users got faster

•  Slow users got faster

(34)

(35)

Need More? Meet the Cumulative Distribution Function (CDF)

§  We all love histograms:

-  Everything is represented

-  Easy to consume

§  But, they still have shortcomings:

-  Finite granularity

-  Arbitrary bucket designations

(36)

Need More? Meet the Cumulative Distribution Function (CDF)

The Cumulative Distribution Function (CDF) expresses the percentage of page loads completed within a given amount of elapsed time.
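An empirical CDF needs no buckets at all: sort the samples and count how many fall at or below a given time. The following is a minimal sketch; the load times are invented for illustration and are not the data behind the charts in the talk.

```python
from bisect import bisect_right

def ecdf(load_times):
    """Return a function giving the fraction of page loads done within t seconds."""
    data = sorted(load_times)
    n = len(data)

    def fraction_done_within(t):
        # bisect_right counts the samples that are <= t.
        return bisect_right(data, t) / n

    return fraction_done_within

# Hypothetical page load times in seconds.
blue = [3, 4, 6, 7, 8, 9, 11, 12, 14, 20]
blue_cdf = ecdf(blue)

print(blue_cdf(5))    # fraction of loads completed in 5 s or less
print(blue_cdf(10))   # fraction of loads completed in 10 s or less
```

Because every sample contributes its own step, the curve avoids the finite granularity and arbitrary bucket choices of a histogram.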

(37)

So, for blue, approximately 20% of page loads were completed in 5 seconds or less.

(38)

Slightly less than 70% of transactions were done in 10 seconds or less.

(39)

As with histograms, a better (faster) CDF is one with a curve to the left and above this one.

(40)

The red line is higher and more to the left. A greater percentage of users are done with their page load at any given time.

(41)

The gap between the lines is the differential. Right here, only 80% of blue users were done with their page load. After the same amount of time, more than 90% of red users were done.

(42)

The red curve is above and to the left of the blue curve in all cases. Red is faster for all users.
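Both the differential and the "in all cases" claim can be checked numerically. The sketch below assumes two invented sample sets and relies on the fact that an empirical CDF only changes value at observed sample times, so comparing the curves at every observed time covers the whole range.

```python
from bisect import bisect_right

def fraction_done(sorted_times, t):
    """Fraction of samples completed within t (empirical CDF at t)."""
    return bisect_right(sorted_times, t) / len(sorted_times)

def differential(a, b, t):
    """How far a's CDF sits above b's CDF at time t."""
    return fraction_done(sorted(a), t) - fraction_done(sorted(b), t)

def is_faster_everywhere(a, b):
    """True if a's CDF is at or above b's CDF at every observed time."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    checkpoints = set(a_sorted) | set(b_sorted)
    return all(
        fraction_done(a_sorted, t) >= fraction_done(b_sorted, t)
        for t in checkpoints
    )

# Hypothetical page load times in seconds.
blue = [3, 4, 6, 7, 8, 9, 11, 12, 14, 20]
red = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12]

gap_at_10s = differential(red, blue, 10)
print(round(gap_at_10s, 2), is_faster_everywhere(red, blue))
```

If the check fails, the curves cross somewhere, and the honest summary is that one experience is faster for some users and slower for others.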

(43)

(44)

Tie it all together…

§  The Internet is a jungle.

§  Methodology matters more than results.

§  Statistics can lie.

§  Pick your tool wisely.

§  Irrelevant metrics mislead.

§  Performance is never a single number.

§  Powerful visualizations trump aggregate figures.

(45)

Thanks!
