Item #1: A product is passed from one maintenance developer to the next. Each new developer discovers that the product's design documentation is out of date and that the build process is broken. After a month of analysis, each pronounces it to be poorly engineered and insists on rewriting large portions of the code. After several more months, each quits or is reassigned and the cycle repeats.
Item #2: A product is rushed through development without sufficient understanding of the problems that it's supposed to solve. Many months after it is delivered, a review discovers that it costs more to operate and maintain the system than it would have cost to perform the process by hand.
Item #3: $100,000 is spent on a set of modern integrated development tools. It is soon determined that the tools are not powerful, portable, or reliable enough to serve a large-scale development effort. After nearly two years of effort to make them work, they are abandoned.
Item #4: Software is written to automate a set of business tasks. But the tasks change so much that the project gets far behind schedule and the output of the system is unreliable. Periodically, the development staff is pulled off the project in order to help perform the tasks by hand, which makes them fall even further behind on the software.
Item #5: A program consisting of many hundreds of nearly independent functions is put into service with only rudimentary testing. Just prior to delivery, a large proportion of the functions are deactivated as part of debugging. Almost a year passes before anyone notices that those functions are missing.
These are vignettes from my own experience, but I bet they sound familiar. It's a truism that most software projects fail, and for good reason: from the outside, software seems so simple, but the devil is in the details, isn't it? Seasoned software engineers approach each new project with a wary eye and a skeptical mind.
Test automation is hard, too. Look again at the five examples above. They aren't from product development projects. Rather, each of them was an effort to automate testing. In the eight years I spent managing test teams and working with test automation (at some of the hippest and richest companies in the software business, mind you), the most important insight I gained was that test software projects are as susceptible to failure as any other software project. In fact, in my experience they fail much more often, mainly because most organizations don't apply the same care and professionalism to their testware as they do to their shipping products.
Strange, then, that almost all testing pundits, practicing testers, test managers, and of course, companies that sell test tools unreservedly recommend test automation. Well, perhaps "strange" is not the right word. After all, CASE tools were a big fad for a while, and test tools are just another species of CASE. From object-orientation to "programmerless" programming, starry-eyed advocacy is nothing new to our industry. So maybe the poor quality of public information and analysis about test automation is less strange than it is simply a sign of the immaturity of the field. As a community, perhaps we're still in the phase of admiring the cool idea of test automation, and not yet to the point of recognizing its pitfalls and gotchas.
Having said that, let me hasten to agree that test automation
is a very cool idea. Most full-time testers and probably all developers
dream of pressing a big green button and letting a lab full of
loyal robots do the hard work of testing, freeing themselves for
more enlightened pursuits, such as playing head-to-head Doom.
However, if we are to achieve this Shangri-La, we must proceed
with caution. This article is a critical analysis of the "script
and playback" style of automation for system-level regression
testing of GUI applications.
"Automated tests execute a sequence of actions without human intervention. This approach helps eliminate human error, and provides faster results. Since most products require tests to be run many times, automated testing generally leads to significant labor cost savings over time. Typically a company will pass the break-even point for labor costs after just two or three runs of an automated test."
The quote above is from a white paper on test automation published by a leading vendor of test tools. Similar statements can be found in advertisements and documentation for most commercial regression test tools. Sometimes they are accompanied by impressive graphs, too. Notice how the same argument could be applied to the idea of using software to automate any repetitive human activity. The idea boils down to just this: computers are faster, cheaper, and more reliable than humans. Therefore, automate.
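To see what that argument implies, here is a minimal sketch of the break-even arithmetic behind the vendor's claim; every dollar figure is a hypothetical placeholder, not data from any study.

    # Minimal sketch of the naive break-even model implied by the vendor's claim.
    # All figures are hypothetical placeholders.

    cost_manual_pass = 1000.0   # labor cost of one manual pass (assumed)
    cost_to_automate = 3000.0   # one-time cost to script the same tests (assumed)
    cost_automated_run = 50.0   # labor to launch and skim one automated run (assumed)

    runs = 0
    while runs * (cost_manual_pass - cost_automated_run) < cost_to_automate:
        runs += 1

    print(f"Break-even after {runs} runs")   # 4 runs under these assumptions

Notice what the model quietly leaves out: maintenance, diagnosis of failures, and the other hidden costs taken up later in this article.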
At least with regard to system testing of Windows applications,
this line of reasoning rests on many questionable assumptions:
1. Everything important that people do when manually testing can
be mapped to a definable "sequence of actions."
Most skilled hand testing is not pre-planned in detail, but is
rather a guided exploration and reasoning process. This process
is therefore not a sequence of actions, but rather a heuristic
search. Sure, this process can be projected into a sequence of
very specific test cases with very specific pass/fail criteria,
but the resulting hundreds or thousands of test cases would be,
at best, an adjunct to skilled hand testing, not a replacement.
2. That sequence of actions is useful to repeat many times.
Once a specific test case is executed a single time, and no bug
is found, there's only a remote chance that same test, executed
again, will find a bug, unless a new bug is introduced into the
system. If there is variation in the test cases, though, as there
usually is when tests are executed by hand, there is a greater
likelihood of revealing problems both new and old. Variability
is one of the great advantages of hand testing over script and
playback testing. When I was at Borland, the spreadsheet group
used to track whether bugs were found through automation or manual
testing; consistently, over 80% of bugs were found manually, despite
several years of investment in automation. Their theory was that
hand tests were more variable and more directed at new features
and specific areas of change where bugs were more likely to be
found.
3. That sequence can be automated.
Some tasks that are easy for people are hard for computers. Probably
the hardest part of automation is interpreting test results. For
GUI software, it is very hard to automate that judgment: noticing
every category of significant problem while ignoring the
insignificant ones.
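To make the difficulty concrete, here is a hypothetical sketch of the kind of golden-master screen comparison at the heart of many script-and-playback verifications; the library, file names, and tolerance are my own assumptions, not features of any particular tool.

    # Hypothetical golden-master screen check. It flags any pixel drift (a new
    # default font, a repaint that lands a moment late) as a failure, yet says
    # nothing about problems that never show up on screen, such as bad data
    # quietly written to disk.

    from PIL import Image, ImageChops   # assumes the Pillow imaging library

    def screens_match(master_path, captured_path, tolerance=0):
        """True if the captured screen differs from the master by no more
        than `tolerance` on any pixel channel."""
        master = Image.open(master_path).convert("RGB")
        captured = Image.open(captured_path).convert("RGB")
        if master.size != captured.size:
            return False
        diff = ImageChops.difference(master, captured)
        worst = max(high for low, high in diff.getextrema())
        return worst <= tolerance

Set the tolerance to zero and every cosmetic change breaks the suite; loosen it and real defects start slipping through, and either way the comparison never notices the wrong number saved to a file.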
The problem of automatability is compounded by the high degree of uncertainty and change in a typical innovative software project. In market-driven software projects, it's common to use an incremental development approach, which pretty much guarantees that the product will change, in fundamental ways, until quite late in the project. This fact, coupled with the typical absence of complete and accurate product specifications, makes automation development something like driving through a trackless forest in the family sedan: you can do it, but you'll have to go slow, do a lot of backtracking, and you might get stuck.
Even if we have a particular sequence of operations that can in principle be automated, we can only do so if we have an appropriate tool for the job. Information about tools is hard to come by, though, and the most critical aspects of a regression test tool are impossible to evaluate unless we create or review an industrial-size test suite using the tool. The factors that matter most when selecting a tool simply can't be judged by perusing the user's manual or watching a trade show demo.
4. Once automated, the process will go faster, because it does
not require human intervention.
All automated test suites require human intervention, if only
to diagnose the results and fix broken tests. It can also be surprisingly
hard to make a complex test suite run without a hitch. Common
culprits are changes to the software being tested, memory problems,
file system problems, network glitches, and bugs in the test tool
itself.
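As a sketch of how much "unattended" execution still leans on people, consider a hypothetical runner that retries environmental failures but can only set everything else aside for a human to diagnose; the exception type and retry policy are illustrative assumptions.

    # Hypothetical unattended runner. It can ride out a few environmental
    # hiccups on its own, but every remaining failure still lands in a pile
    # that a person must sort into product bug, broken test, or glitch.

    import time

    class EnvironmentGlitch(Exception):   # network drop, full disk, etc. (assumed)
        pass

    def run_suite(tests, retries=2):
        needs_human = []
        for test in tests:              # each test is just a callable here
            for attempt in range(retries + 1):
                try:
                    test()
                    break
                except EnvironmentGlitch:
                    time.sleep(30)      # wait out the glitch and try again
                except Exception as failure:
                    needs_human.append((test.__name__, failure))
                    break
            else:
                needs_human.append((test.__name__, "environment never recovered"))
        return needs_human              # the next morning's diagnosis workload

Even this much robustness takes real engineering, and it only postpones the human work; it does not remove it.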
5. Once automated, human error is eliminated.
Yes, some errors are eliminated. Namely, the ones that humans
make when they are asked to carry out a long list of mundane mental
and tactile activities. But other errors are amplified. Any bug
that goes unnoticed when the master compare files are generated
will go systematically unnoticed every time the suite is executed.
Or an oversight during debugging could accidentally deactivate
hundreds of tests. The dBase team at Borland once discovered that
3,000 tests in their suite were hard-coded to report success,
no matter what problems were actually in the product. To mitigate
these problems, the automation should be tested or reviewed on
a regular basis. Corresponding lapses in a hand testing strategy,
on the other hand, are much easier to spot using basic test management
documents, reports, and practices.
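The dBase story is easy to recreate in miniature. Below is a hypothetical illustration of the hard-coded-success failure mode, together with the kind of crude audit, running the suite against a deliberately broken product stub, that a regular review might include; none of it is drawn from the actual Borland suite.

    # Hypothetical illustration of the "hard-coded success" failure mode, plus a
    # crude audit that would catch it. Nothing here reflects the real dBase suite.

    def test_save_file(app):
        # During a frantic debugging session someone stubbed out the comparison
        # and never restored it; the test now passes no matter what the product does.
        # expected = open("masters/save_file.out").read()
        # actual = app.save_and_dump()
        # return expected == actual
        return True

    def audit_suite(test_functions, broken_app):
        """Run every test against a product stub that is known to be broken;
        any test that still passes cannot be checking anything real."""
        return [t.__name__ for t in test_functions if t(broken_app)]

    print(audit_suite([test_save_file], broken_app=object()))
    # ['test_save_file']  (it passes even against a stub, so it checks nothing)

A hand tester who stopped checking results would be noticed within days; a test script that stops checking them can go unnoticed for years.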
6. It is possible to measure the relative costs and benefits of
manual testing versus automated testing.
The truth is, hand testing and automated testing are really two
different processes, rather than two different ways to execute
the same process. Their dynamics are different, and the bugs they
tend to reveal are different. Therefore, direct comparison of
them in terms of dollar cost or number of bugs found is meaningless.
Besides, there are so many particulars and hidden factors involved
in a genuine comparison that the best way to evaluate the issue
is in the context of a series of real software projects. That's
why I recommend treating test automation as one part of a multifaceted
pursuit of an excellent test strategy, rather than an activity
that dominates the process, or stands on its own.
7. The value of automated testing will duplicate or surpass that
of manual testing.
This is true only if all of the previous assumptions are true,
and if there is also no value in having testers spend time actually
using the product.
8. The cost to automate the testing will be less than three times
the cost of a single, manual, pass through the same process.
This loosey-goosey estimate may have come from field data or from
the fertile mind of a marketing wonk. In any case, the cost to
automate is contingent on many factors, including the technology
being tested, the test tools used, the skill of the test developers,
and the quality of the test suite. Writing a single test script
is not necessarily a lot of effort, but constructing a suitable
test harness can take weeks or months, as can deciding which tool
to buy, which tests to automate, and how to trace the automation
to the rest of the test process, not to mention learning how to
use the tool and then actually writing the test programs. A careful
approach to this process (i.e. one that results in a useful product,
rather than gobbledygook) usually takes months of full-time effort,
and longer if the automation developer is inexperienced with either
the problem of test automation or the particulars of the tools
and technology.
9. For an individual test cycle, the cost of operating the automated
tests, plus the cost of maintaining the automation, plus the cost
of any other new tasks necessitated by the automation, plus the
cost of any remaining manual testing, will be significantly less
than the cost of a comparable purely manual test pass.
This is yet another reckless assumption. Most analyses of the
cost of test automation completely ignore the special new tasks
that exist only because of the automation: test cases must be
documented carefully, the test code itself must be tested and
maintained, and someone must review the results of every run to
separate real product failures from broken tests and environmental
glitches.
These new tasks make a significant dent in a tester's day. Every group I ever worked in that tested GUI software tried at one point or another to make all testers do part-time automation, and every group eventually abandoned that idea in favor of a dedicated automation engineer or team. Writing test code and performing interactive hand testing are such different activities that a person assigned to both duties will tend to focus on one to the exclusion of the other. Also, since automation development is software development, it requires a certain amount of development talent. Some testers aren't up to it. One way or another, companies with a serious attitude about automation usually end up with full-time staff to do it, and that must be figured into the cost of the overall strategy.
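Pulling the pieces of assumption 9 together, here is a hedged sketch of the per-cycle comparison those glossy analyses skip; every figure is a placeholder chosen for illustration, not a measurement.

    # Hypothetical per-cycle cost model for assumption 9. The numbers are
    # placeholders; the point is the shape of the sum, not its value.

    automation_cycle_cost = sum([
        300,    # operating the automated tests (launching runs, triaging results)
        800,    # maintaining scripts broken by changes in the product
        400,    # new tasks created by the automation (documentation, reviews)
        1500,   # manual testing that still has to happen (new features, exploration)
    ])

    manual_cycle_cost = 2500    # a comparable purely manual pass (assumed)

    print(automation_cycle_cost, manual_cycle_cost)   # 3000 versus 2500 here

Whether automation wins depends entirely on which placeholders fit your project, which is exactly why the blanket assumption is reckless.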
I've left for last the most thorny of all the problems that we face in pursuing an automation strategy: it's always dangerous to automate something that we don't understand. It's vital to get the test strategy clearly outlined and documented, not to mention the specification of the product to be tested, before introducing automation. Otherwise the result will be a large mass of test code that no one fully understands. As the original developers of the suite drift away to other assignments, and others take over maintenance, the suite gains a kind of citizenship in the test team. The maintainers are afraid to throw any old tests out, even if they look trivial, because they might actually be important. It continues to accrete new tests, becoming an increasingly mysterious oracle, like some old Himalayan guru or talking oak tree. No one knows what the suite actually tests, and the bigger it gets, the less likely anyone will go to the trouble to find out.
This situation has happened to me personally (more than once, before I learned my lesson), and I have seen and heard of it happening to many other test managers. Most don't even realize that it's a problem until one day a development manager asks what the test suite covers and what it doesn't, and no one is able to give an answer. Or until the day it's needed most, when the whole test system breaks down and there's no manual process to back it up. The irony of the situation is that an honest attempt to do testing more professionally can end up ensuring that it's done blindly and ignorantly.
A manual testing strategy can suffer from confusion too, but when
tests are created dynamically from a relatively small set of principles
or documents, it's much easier to review and adjust the strategy.
It's a slower testing method, yes, but it's much more visible,
reviewable, and flexible, and it can cope with the chaos of
incomplete and changing products and specs.
Despite the concerns raised in this article, I do believe in test automation. Just as there can be quality software, there can be quality test automation. To create good test automation, though, we have to be careful: the path is strewn with pitfalls, as a story from my own experience illustrates.
One day, a few years ago, there was a blackout during a fierce evening storm, right in the middle of the unattended execution of our wonderful test suite. When my team arrived at work the next morning, we found that our suite had automatically rebooted itself, reset the network, picked up where it left off, and finished the testing. It took a lot of work to make our suite that bulletproof, and we were delighted. The thing is, we later found, during a review of the test scripts in the suite, that out of about 450 tests, only about 18 of them were truly useful. It's a long story how that came to pass (basically the talking oak tree scenario), but the upshot of it was that we had a test suite that could, with high reliability, discover nothing important about the software we were testing. I've told this story to other test managers, who shrug it off. They don't think this could happen to them. Well, it can happen if the machinery of testing distracts you from the craft of testing.
Make no mistake. Automation is a great idea. To make it a good investment, as well, the secret is to think about testing first and automation second. If testing is a means to the end of understanding the quality of the software, automation is just a means to a means. You wouldn't know it from the advertisements, but it's only one of many strategies that support effective software testing.
Copyright (c) 1996 ST Labs, Inc.