
Testing, deployment, and operations in the cloud

7.2 The cloud to the rescue

If you’ve ever worked in a development or operations team, the architecture and purchasing decisions we walked through in section 7.1 are likely familiar. The cloud (or, more generally, virtualization) is changing how people go through that exercise. Although it certainly saves money, it’s also making businesses more efficient at testing and more prepared to scale with customer demand. Let’s look at some of the ways the cloud can help your company.

7.2.1 Improving production operations with the cloud

The most commonly cited reason for moving to cloud computing is its ability to achieve

scale to 100,000 users (10X growth) because the site was mentioned on Oprah, the traditional deployment model we went through earlier wouldn’t work. There’s no way to acquire another 36 web servers and some number of database servers on demand.

Although this scalability argument is one of the best for moving deployments and operations to public or private clouds, there are many good reasons to consider the cloud even if your applications will never get surges of traffic such as this.

ELASTIC BANDWIDTH

Whether you’re building your own data center or renting space from an Internet Service Provider (ISP), you have to pay for bandwidth. Bandwidth is almost always metered, usually by a combination of total bytes transferred per month and peak throughput in terms of megabits per second. If you have your own data center, your data throughput may be limited by the size and type of network connection coming into your data center, possibly limiting the speed at which you can deliver content to your users.

Either way, it’s often impossible or at least costly to quickly surge up to extreme levels of network throughput. In a cloud model, however, resources are pooled, so you have access to far more network throughput than you’ll typically need and can tap into it on occasion.

For example, I’ve seen my own pool of machines on the Amazon EC2 network collectively transfer more than 3 Gbps (gigabits per second). That’s the equivalent of downloading a full, uncompressed CD in less than 2 seconds, or a complete Blu-ray movie in about a minute.
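The arithmetic behind these comparisons can be checked with a short sketch. It reads the quoted figure as 3 gigabits per second and assumes a CD of roughly 700 MB and a single-layer Blu-ray disc of roughly 25 GB; neither size is stated in the text, so treat them as illustrative. At 3 gigabytes per second the times would be eight times shorter.

```python
# Illustrative sizes and rate (assumptions, not taken from the text).
CD_BYTES = 700 * 10**6             # ~700 MB uncompressed CD
BLU_RAY_BYTES = 25 * 10**9         # ~25 GB single-layer Blu-ray disc
POOL_BITS_PER_SECOND = 3 * 10**9   # 3 gigabits per second across the pool


def transfer_seconds(size_bytes: int, bits_per_second: int) -> float:
    """Return the ideal transfer time, ignoring protocol overhead."""
    return size_bytes * 8 / bits_per_second


if __name__ == "__main__":
    cd = transfer_seconds(CD_BYTES, POOL_BITS_PER_SECOND)
    blu_ray = transfer_seconds(BLU_RAY_BYTES, POOL_BITS_PER_SECOND)
    print(f"CD: {cd:.1f} s")            # just under 2 seconds
    print(f"Blu-ray: {blu_ray:.0f} s")  # a little over a minute
```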

Even if the number of machines you need can stay completely constant, the cloud’s massive network infrastructure is a benefit that’s often overlooked. Most network operations teams can easily spend hundreds of thousands of dollars getting only a fraction of the performance that large public and private clouds can provide immediately.

ELASTIC DISK STORAGE

A local, redundant, high-speed storage area network (SAN) is often a massive infrastructure investment. And when you eventually outgrow your original storage space, increasing the size can be extremely difficult. But in the cloud, your storage may be practically unlimited in terms of scalability.

For example, for any data stored in Amazon S3, it’s unlikely you’ll ever need to think about disk space. Amazon’s pricing page lists 5 PB (5,000,000 GB) as one of its pricing tiers, and you can go well beyond that if needed. To benefit, your architecture must work with S3’s basic capabilities and remote nature. See chapter 5 for more on architecting for the cloud.

A local SAN will always offer much faster performance than fetching objects in a remote file store, such as Amazon S3. But consider that it may cost $250,000 or more just for the initial hardware for a 100 TB SAN, plus hundreds of thousands more in personnel and electricity costs. You can store that same 100 TB in Amazon S3 for less than $15,000/month.
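As a rough illustration of these figures, the following sketch computes how many months of S3 fees the SAN’s initial hardware price alone would cover, and the per-gigabyte price the S3 figure implies. It uses only the $250,000 and $15,000/month numbers quoted above and deliberately ignores the personnel and electricity costs, which the text leaves unquantified.

```python
SAN_HARDWARE_USD = 250_000   # initial hardware for a 100 TB SAN (from the text)
S3_MONTHLY_USD = 15_000      # upper bound for 100 TB in S3 (from the text)
CAPACITY_GB = 100 * 1_000    # 100 TB, using 1 TB = 1,000 GB as the text does


def months_covered(upfront_usd: float, monthly_usd: float) -> float:
    """Months of a monthly fee that an up-front amount would pay for."""
    return upfront_usd / monthly_usd


def price_per_gb_month(monthly_usd: float, capacity_gb: float) -> float:
    """Implied monthly price per gigabyte stored."""
    return monthly_usd / capacity_gb


if __name__ == "__main__":
    months = months_covered(SAN_HARDWARE_USD, S3_MONTHLY_USD)
    per_gb = price_per_gb_month(S3_MONTHLY_USD, CAPACITY_GB)
    print(f"SAN hardware alone equals {months:.1f} months of S3 fees")
    print(f"Implied S3 price: ${per_gb:.2f} per GB per month")
```

This is only a comparison of the quoted numbers; it says nothing about the performance difference the paragraph mentions.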

Whereas those cost savings are reason enough for people to re-architect their applications to work with the cloud, there’s another factor to consider. If you ever outgrow the maximum capacity of your SAN (that is, the SAN controller’s CPUs or RAM are fully utilized), the cost of building a new, larger SAN and migrating the data over can be a massive or even crippling expenditure.

RESPONDING TO BAD HARDWARE

Similar to the growing pains of expanding SANs, another area that network operations often spends a lot of time on is responding to emergencies when hardware fails. Practically everything in a server will eventually fail: disks, CPUs, RAM, fans, and so on. How quickly you can respond can greatly affect customer satisfaction.

In the non-cloud world, if a server suffers a crash from hardware failure, it’s taken out of rotation, and replacement parts are installed as quickly as possible. This can take hours, days, or even weeks, depending on where the servers are located and whether you can readily find replacement parts.

In the cloud world, hardware still goes bad. Although it’s just as rare as with physical hardware, we’ve seen hundreds of Amazon EC2 instances fail all at the same time, likely due to hardware problems. The difference is how we responded to the issue. Because our software was designed for the cloud, all we had to do was click a few buttons: those machines were replaced with new ones in a different availability zone where there were no hardware issues.
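The response described here, relaunching failed machines in a location with no hardware problems, can be sketched as a small planning routine. This is a hypothetical illustration, not the tooling the authors used: the `Instance` type, the zone names, and `plan_replacements` are invented for the example, and no real cloud API is called.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Instance:
    instance_id: str   # hypothetical identifier
    zone: str          # hypothetical location name
    healthy: bool


def plan_replacements(instances, healthy_zones):
    """Map each failed instance ID to a healthy zone for its replacement.

    Replacements are spread round-robin across the healthy zones.
    """
    if not healthy_zones:
        raise ValueError("no healthy zone available for replacements")
    zones = cycle(healthy_zones)
    return {
        inst.instance_id: next(zones)
        for inst in instances
        if not inst.healthy
    }


if __name__ == "__main__":
    fleet = [
        Instance("web-1", "zone-a", healthy=False),
        Instance("web-2", "zone-a", healthy=False),
        Instance("web-3", "zone-b", healthy=True),
    ]
    print(plan_replacements(fleet, ["zone-b", "zone-c"]))
```

A real implementation would pass the resulting plan to the provider’s launch API and to the deployment automation, which is what makes the "few clicks" response possible.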

AUTOMATING DEPLOYMENT

Being able to respond to failing servers or instantly provision new ones for scale greatly depends on the software architecture and whether it allows for truly automated deployment. Although there are many benefits of public and private clouds, you can’t take advantage of them if you rely heavily on manual processes.

If your team needs to manually bring up a server, install Apache, copy over your PHP web application, configure the application to point to the MySQL database, and then finally add the new IP address to the load-balancer for production rotation, you probably aren’t ready for cloud scale (or, heaven forbid, a mention on Oprah).

But if you can answer “yes” to some or all of the following questions, you may be cloud-ready:

■ Do you have automated scripts for setting up the operating system and installing all necessary software?

■ Do you package your software in such a way that all the configuration files are bundled with the binary artifacts, ready for one-click deployment?

■ Do you run your software stack inside of virtual machines that can be cloned?

■ Are common maintenance tasks (such as vacuuming the database, announcing maintenance windows, and backing up data) happening automatically or easily automated with a single click?

■ Is your software designed to scale horizontally by adding new web servers or other machines?

By putting some time into automation efforts that allow you to answer “yes” to these questions, you not only prepare yourself to be able to address hardware issues and dynamically scale using the elasticity of the cloud, but also put yourself in position to accelerate your development and testing.
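As a rough sketch of what automating the manual steps described earlier might look like (bring up a server, install Apache, copy the PHP application, point it at the MySQL database, add it to the load-balancer), the following Python outline runs them as an ordered list. Every function name, the IP address, and the host name are hypothetical placeholders; a real script would call your provider’s API and your configuration tooling instead of appending to a log.

```python
from typing import Callable, List


def provision_server(log: List[str]) -> str:
    """Placeholder: a real version would request a machine from the provider."""
    log.append("provisioned server")
    return "10.0.0.12"  # hypothetical private IP address


def deploy_web_server(ip: str, db_host: str, log: List[str]) -> None:
    """Run each setup step in order; any exception aborts the deployment."""
    steps: List[Callable[[], None]] = [
        lambda: log.append(f"{ip}: installed Apache"),
        lambda: log.append(f"{ip}: copied PHP application"),
        lambda: log.append(f"{ip}: configured database host {db_host}"),
        lambda: log.append(f"{ip}: added to load-balancer rotation"),
    ]
    for step in steps:
        step()


def scale_out(count: int, db_host: str) -> List[str]:
    """Bring up `count` identical web servers with no manual intervention."""
    log: List[str] = []
    for _ in range(count):
        ip = provision_server(log)
        deploy_web_server(ip, db_host, log)
    return log


if __name__ == "__main__":
    for line in scale_out(2, "db.internal.example"):
        print(line)
```

The point of the structure is that adding the thirty-seventh server costs the same effort as adding the first: one call, no hand-typed steps.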

7.2.2 Accelerating development and testing

So far we’ve been highlighting the merits of the cloud for production operations; the rest of this chapter focuses on how the cloud changes the way software testing is done. Before diving into specific types of testing, let’s explore the two primary reasons you should consider cloud-based testing: cost savings and test acceleration.

COST SAVINGS

Remember that half of the $22,500 hardware purchase in the earlier hypothetical example was for the testing and staging environments, both used for a variety of QA and testing. But that hardware is unlikely to be used 100 percent of the time. Let’s assume that both environments are needed only 50 percent of the time during normal business hours. With roughly 2,000 business hours in a year, that comes out to approximately 1,000 hours per year of required usage.

Table 7.2 compares the cost of physical hardware utilized 100 percent of the time (24 hours × 365 days) with that of equivalent cloud-based deployments.

Table 7.2 Comparing staging and testing cloud fees to production hardware costs

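| | Production | Staging | Testing | Staging (Alt) | Testing (Alt) |
|---|---|---|---|---|---|
| Servers | 7 | 5 | 2 | 7 | 7 |
| Annual hours | 8,760 | 1,000 | 1,000 | 250 | 1,000 |
| Cores/server | 8 | 8 | 8 | 8 | 8 |
| Hardware cost | $11,250 | - | - | - | - |
| Annual cloud cost | - | $4,025 | $1,600 | $1,406 | $5,625 |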

The costs are estimated at approximately 10 cents per CPU core per hour, plus a fee of 2.5 cents per hour for a load-balancer. These prices reflect the public prices of Amazon’s EC2 service at the time of publication of this book.

As you can see, when the hardware is used only 1,000 hours per year, the combined cost of staging and testing is $5,625 per year, much less than the $11,250 hardware cost of the two smaller environments.

But also consider the alternative deployment layouts represented in the last two columns of table 7.2. In this situation, you’re re-creating a full production environment with all seven servers in both environments for about $7,031 per year, not much more than the $5,625 of the smaller layout. In doing so, you can also use the staging environment less often, because the testing environment is now much larger and can be used for performance testing.
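The cloud figures in table 7.2 can be reproduced from the stated rates. The sketch below assumes the 10-cent rate applies per core-hour and that the two-server testing environment runs without a load-balancer; the second assumption is inferred from the table’s $1,600 figure rather than stated in the text. The function and variable names are illustrative.

```python
from decimal import Decimal

CORE_HOUR_USD = Decimal("0.10")    # per core per hour (from the text)
LB_HOUR_USD = Decimal("0.025")     # load-balancer fee per hour (from the text)


def annual_cloud_cost(servers, cores_per_server, hours, load_balancer=True):
    """Annual fee: core-hours at the compute rate plus optional balancer hours."""
    compute = servers * cores_per_server * hours * CORE_HOUR_USD
    balancer = hours * LB_HOUR_USD if load_balancer else Decimal("0")
    return compute + balancer


# (servers, cores per server, annual hours, has load-balancer)
ENVIRONMENTS = {
    "Staging": (5, 8, 1000, True),
    "Testing": (2, 8, 1000, False),   # assumption: no load-balancer
    "Staging (Alt)": (7, 8, 250, True),
    "Testing (Alt)": (7, 8, 1000, True),
}

if __name__ == "__main__":
    for name, args in ENVIRONMENTS.items():
        print(f"{name}: ${annual_cloud_cost(*args):,.2f}")
```

Summing the pairs gives $5,625.00 per year for the smaller layout and $7,031.25 for the alternative one, which the table rounds to whole dollars.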

Note that to take advantage of these savings you have to be able to quickly deploy and tear down the environments. That’s where the investments put in by the operations staff and developers can help out. Often, you can reuse the tools and processes used for cloud-based disaster recovery and scalability to save thousands of dollars each year.

SPEEDING UP TEST AUTOMATION AND MANUAL TESTING

Although the savings in hardware costs are nice, the largest expense for most businesses is usually employee salaries. As such, anything that can make testers more productive is often worth the effort. That’s why, as agile software methodologies have taken hold over the past decade, automated testing has been a central focus of the agile movement.

Whether it’s for load testing, functional testing, or unit testing, the cloud and various cloud-based tools (commercial and open source) are helping with test automation. Even for manual testing, various cloud-based services are making individual testers more productive.

Before we go deeper into how the cloud is speeding up test automation and manual testing, let’s take a moment to quickly review the various kinds of testing most QA teams do:

■ Unit testing: Involves using tools such as JUnit or NUnit to build and run automated tests that exercise the internal algorithms of your software.

■ Functional testing: End-to-end testing of the entire application, from the end user’s perspective. Also known as acceptance testing.

■ Visual testing: Verifies the user interface on a variety of different platforms. Between mobile devices, several versions of Windows, and at least five major browsers, this is particularly important for most web applications.

■ Load testing and performance testing: Measures the performance of an application from when it’s barely being used all the way up to heavy utilization. Also used to determine the failure point of an application.

■ Usability testing: Collects subjective feedback on how real users react to the application’s interface and functionality.

■ Ad hoc and manual testing: A broad bucket of various types of manual testing efforts that can’t or shouldn’t be automated.

■ Penetration testing: Evaluates the security of a computer system or network by simulating an attack from a malicious source.

Each of these types of testing can benefit from the cloud. Some, such as load testing and functional testing, benefit through the use of new testing tools designed for the cloud. Others, such as manual testing, benefit when the application under test (AUT) can be deployed quickly to the cloud.

For example, suppose two testers need to have exclusive access to the testing environment at the same time—one plans to run a large load test, and the other needs to run the entire suite of automated tests. Without the cloud, one would have to wait for the other to finish. With the cloud, the environment can be cloned, and both testers can get on with their job without interruption. Let’s explore ways the cloud allows for tasks to run in parallel, allowing developers and testers to operate more efficiently.

In document The Cloud at Your Service (Page 177-182)