
Unit Tests Don't Find Bugs: the Death of QA

Unit tests don't find bugs. They find regressions. This is a painful lesson I learned when I first started doing TDD (test-driven development), and it's well known in most TDD circles.

TDD's goal is to prevent programmers from introducing new bugs into working code. However, when you're writing code from scratch, your tests won't help you find all the bugs in your code. That's because you can't possibly write tests for all the ways your software will be used (or abused). When I first started doing TDD, I had really good tests, but I was too tired to do much exploratory QA. However, my boss wasn't, and I was very embarrassed to find that my software had lots of bugs. Simply put, he used my software in ways that I hadn't intended.
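To make that concrete, here is a minimal sketch of the situation. The `average` function and its test are hypothetical, made up for illustration; they are not from any real project. The test covers the uses the author had in mind and passes, yet the first person to try an empty list hits a crash the test never exercised.

```python
def average(values):
    """Hypothetical helper: return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def test_average():
    # These are the cases the author thought of, and they all pass.
    assert average([1, 2, 3]) == 2
    assert average([10]) == 10


if __name__ == "__main__":
    test_average()
    print("unit tests passed")

    # Exploratory use: nobody wrote a test for an empty list.
    try:
        average([])
    except ZeroDivisionError as exc:
        print("found by trying it, not by the tests:", repr(exc))
```

The test still earns its keep: if someone later breaks `average` for the ordinary cases, it will catch the regression. It just says nothing about inputs nobody thought to try.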

I've seen a lot of companies that don't bother writing any tests or doing any QA. They just let their users find all the bugs. Needless to say, I've never had respect for those companies.

However, it's growing more and more popular to destaff the QA department and just require engineers to write lots and lots of tests. Often, these are in the form of unit tests. Even though integration tests can conceivably catch more bugs, they take much longer to run. Hence, even integration tests are often deprioritized.
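As a rough illustration of why integration tests can catch things unit tests miss, consider the sketch below. All of the names (`price_in_cents`, `format_price`, `price_label`) are hypothetical. Each unit passes its own test, but the two disagree about units of currency, and that only shows up when they are wired together.

```python
def price_in_cents(item):
    """Hypothetical catalog lookup: returns the price in cents."""
    prices = {"coffee": 350}
    return prices[item]


def format_price(dollars):
    """Hypothetical formatter: expects an amount in dollars."""
    return f"${dollars:.2f}"


def test_price_in_cents():
    assert price_in_cents("coffee") == 350


def test_format_price():
    assert format_price(3.5) == "$3.50"


def price_label(item):
    # The integration point: cents are passed where dollars are expected.
    return format_price(price_in_cents(item))


if __name__ == "__main__":
    test_price_in_cents()
    test_format_price()
    print("unit tests passed")
    # An integration test asserting "$3.50" here would fail.
    print(price_label("coffee"))  # prints $350.00
```

Even so, an integration test only checks the paths someone thought to write down, so it narrows the gap rather than closing it.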

What I'm discovering is that a lot of projects have both lots of unit tests and lots of bugs. These are bugs that could have been found manually by a QA engineer, but it seems that manual QA testing (i.e. exploratory testing) has gone out of vogue.

I used to think that code that was well-documented, well-styled, well-tested, and code reviewed would rarely have bugs. Sadly, I no longer believe that to be the case. I think we need to go back to the days when we had decently-sized QA departments, perhaps in addition to all the other things we do.

To tweak what Knuth said, "Beware of the above code. I have only tested that it works. I haven't actually tried it."

Comments

Anonymous said…
Great post. I like it. Just want to add that, in my experience, QA guys also don't find all the bugs. When they have to do the same thing every day, they tend to do it the same way over and over. They find bugs in the first few runs, those bugs get fixed, and then manual QA testing turns from exploratory to regression. :)
- Mykola
jjinux said…
Good point. Thanks Mykola ;)
Anonymous said…
I think it's a mistake to imagine that any process could identify all of the bugs in a piece of code. Other than design flaws that prevent the execution of expected use cases and simple mistakes made while writing, most bugs are cases of unexpected inputs generating unhandled states. Unless the code you've written can be deterministically and exhaustively exercised, something that's really only possible at the unit level, the accretion of units of that size will generate untestable complexity very quickly.
Bugs of this sort are only bugs in the particular context of a user having produced such an unexpected use case. Testing, by devs or QA specialists, can only hope to mitigate the risk of these inevitable outcomes and protect you from the known knowns. Intelligent testing is the practical management of risk. As a QA professional, I feel well trained to provide that service, but I've met plenty of developers who do that well, and lots of QA staff who don't.
I also agree with the above post that QA staff left to rot in mindless repetition of mechanical tasks are probably less useful than well-written unit and integration tests. Still, any QA manager worth the name will find the means to prevent this abuse of his staff's faculties and make more effective use of their talents.
Anonymous said…
By writing tests first you were supposed to be able to write the bare minimum of code necessary to do the job. Less code equals fewer bugs, at least in theory. Depending on the technology at hand you might be able to restrict your code to how it is used in the unit test. So if a new use case pops up, additional tests and code have to be written. Wishful thinking, I admit.
As soon as the code base grows larger, regression tests will provide a better safety net for refactoring than no tests at all.
The lack of automated QA testing is frustrating and a waste of money. QA tackles a completely different class of bugs due to use-case-based testing. And even if it turns into regression at some point, it will at least improve the safety net.

Frisian
