Maintenance Mode Example: Google Desktop by Jason Arbon

I was asked to take on the gargantuan task of testing Google Desktop with tens of millions of users, client and server components, and integration with Google search midway through the projects. I was the latest in a long line of test leads for this project with the usual quality and project debt. The project was large, but as with most large projects, it had begun to slow down feature wise and risk had been reduced through several years of testing and usage.

When two test buddies and I dropped into the project, Google Desktop had some 2,500 test cases in the old TestScribe Test Case Manager (TCM) database and several smart and hardworking vendors in the Hyderabad office running through these tests cases for each release. These test passes were often week-long or longer test cycles. Some previous attempts were made at automating the product via the UI and accessibility hooks, but that effort had failed under the weight of complex-ity and cost. It wasn’t easy to drive both web pages and the desktop window’s UI via C++, and then there were the issues of timeouts everywhere.

The two test buddies were Tejas Shah and Mike Meade. There weren’t a lot of resources for client testing at Google. As the bulk of Google products were on the Web or moving to the Web quickly, we decided to leverage a Python test frame-work (previously developed for Google Talk Labs Edition) which drove the prod-uct via the web DOM. This quick framework had the basics, such as a test case class, derived from PyUnit. Many TEs and developers knew Python, so we had a quick exit strategy if needed, and a lot of other engineers could help if something broke. Also, Python is amazingly quick to iteratively develop smaller bite-sized chunks of code without a compilation step, and it is installed on everyone’s work-stations by default at Google so the whole test suite can be deployed with a single command line.

Together, we decided to build out the full breadth of a Python API to drive the product, using ctypes to drive the client-side COM APIs for searching, mocking the server responses for testing injection of local results into google.com results (non-trivial!), using quite a few library functions for users, and manipulating the crawl. We also constructed some virtual machine automation for tests that required a Google Desktop index; otherwise, we would have to wait several hours for the indexing to complete on a fresh install. We built a small, automated smoke suite to cover the high-priority functions of the product.

We then moved to investigate the older 2,500 test cases. Many of these were difficult to understand and referred to code words from prototypes or dropped features from the early days of the project, or they assumed a lot about the context and state of the machines. Much of this documentation was unfortunately locked

ptg7759704 away in the minds of the vendors in Hyderabad. This wasn’t a good idea if we

needed to quickly validate a build with a security patch with zero notice. It was also downright expensive. So, we took the brave leap to review all 2,500 tests, iden-tify the most important and relevant ones based on our independent view of the product, and deprecated (deleted) all but around 150 tests. This left us with an order of magnitude fewer test cases. We worked with the vendors to clean up the text of the remaining manual test cases to make the test steps so clear and detailed that anyone who had used Google Desktop for a few minutes could run them. We didn’t want to be the only people who could perform a regression run in the future.

At this point, we had automated coverage for every build that started catching some regressions for us and a very small set of manual tests that could be executed by anyone in an afternoon against a release candidate build. This also freed up the vendors to work on higher value targets, it reduced cost, and it reduced ship latency with close to the same amount of functional coverage.

About this time, the Chrome project began and we started looking in that direction as the future of Google services on client machines. We were just about to reap all the benefits of our rush to automate with a rich test API and we were going to build generated and long-running tests, but we were asked to move resources quickly to the Chrome browser.

With our automated regression suites checking every internal build and the public builds, and with a super light manual test pass, we were in good shape to leave Desktop in maintenance mode and focus our energy on the more volatile and risky Chrome project.

But, for us, there was one nagging bug we kept hearing from the forums: For several versions of Google Desktop and for a few people, Google Desktop was gobbling up drive space. The issue kept getting deferred because there wasn’t a consistent repro. We reached out through our Community Manager to get more machine information from customers, but no one could isolate the issue. We wor-ried that this would impact more users over time, and without a full crew on board, it would never be resolved or it would be painful if it ended up needing to be dealt with later. So, we invested in deep investigations before moving on. The test team kept pushing on PM and Dev to investigate, even pulling a developer from the original indexing code back onto the project from a remote site thinking he’d know what to look for. He did. He noticed that Desktop kept re-indexing tasks if the user had Outlook installed. The index code kept thinking each old item was a new item each time it scanned, slowly but steadily chewing up hard drives and only for users of Outlook who used the Outlook tasks feature. Because the index was capped at 2GB, it took a long time to fill up, and users would only notice because recent documents weren’t indexed. But, diligence in engineering led to its discovery and fix. The last version of Desktop launched with this fix so we would-n’t have a latent issue pop up in 6 to 12 months after shipping with a skeleton crew.

We also time-bombed a feature, giving users the warning it was going away.

We also made it simple and reliable. The test team suggested moving this from a code path of pinging the server for a global flag when the feature should be

ptg7759704 When putting a project in a quality maintenance mode, we need to

reduce the amount of human interaction required to keep quality in check.

A funny thing about code is that when left alone, it gets moldy and breaks of its own accord. This is true of product code and test code. A large part of maintenance engineering is about monitoring quality, not looking for new issues. As with all things, when a project is well funded, you don’t always build the leanest set of tests so the tester will need to deprecate (remove) tests.

When deprecating manual tests, we use these guidelines:

• We look for tests that always pass or tests that are a low priority when you can’t keep up with your higher priority testing. Deprecate them!

• We understand what we are deprecating. We take the time to pick a few representative sample tests from the areas you deprecate. Talk to the original authors if possible to understand their intent so it’s not lost.

• We use the newly freed time for automation or to look at a higher prior-ity test or at exploratory testing.

• We also prune automated tests that might have given false positives in the past or are flaky—they just create false alarms and waste engineer-ing work later.

Following are tips to consider before entering maintenance mode:

• Don’t just drop the hard problems; fix them before leaving.

• Even a small, automated suite focused on E2E can give a lot of assur-ance over the long haul for almost zero cost. Build this automated suite if you don’t have one already.

• Leave a how-to document, so anyone in the company can run your test suite; it takes you off the firing line for a random interrupt and is the right thing to do.

• Ensure you have an escalation path if something were to go wrong. Be willing to be somewhere on that escalation path.

• Always be ready to dive in and help on projects you used to work on.

It’s good for the product, the team, and users.

disabled to a more reliable and robust client-only one. This eliminated the need for a subsequent release without this feature and made the feature more robust through simplicity of design.

We set up a quick doc and how-to for executing the automation and kicking the manual tests (a small release requiring only one vendor a few hours), placed a vendor on call for this work, and moved our testing focus to Chrome and the cloud. Incremental releases went off without a hitch. Those automated runs con-tinue today. Desktop customers are still actively using the product.

ptg7759704 Entering test maintenance mode is a fact of life for many projects,

espe-cially at Google. As a TE, we owe it to our users to take prudent steps to make sure it is as painless for them as possible and as efficient as possible engineering wise. We also have to be able to move on and not be married to our code or ideas.

In document How Google Tests Software (Page 169-172)