The dark side of failing tests

Published on and tagged with testing

[rant]
Yesterday, I opened some tickets because some core tests failed when I ran them with the new test suite shell. As I didn’t attach any patches to the tickets, I got “attacked” on Twitter by some people from the core team, and I even got a nice mail from gwoo (the project manager of CakePHP) saying I should stop submitting tickets…
[/rant]

Anyway, I think this is a good opportunity to think about failing tests (by “tests” I primarily mean unit tests).

Even though “failing tests” may sound quite negative, it isn’t, at least if you practice test-driven development (or a similar approach). There, failing tests are an integral part of the process: you start with a failing test, and then you implement the functionality so that the test no longer fails. Or if you make a change and break something by accident, a failing test informs you about it. So, a failing test is actually quite positive.
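
To illustrate the red-green cycle, here is a minimal sketch (it assumes SimpleTest, the library CakePHP’s test suite builds on, is installed; slugify() is a made-up example function, not CakePHP code):

    <?php
    require_once 'simpletest/autorun.php';

    // Step 1 (red): write a test for a function that doesn't exist yet,
    // run it, and watch it fail.
    class SlugifyTest extends UnitTestCase {
        function testSlugifyLowercasesAndReplacesSpaces() {
            $this->assertEqual('failing-tests', slugify('Failing Tests'));
        }
    }

    // Step 2 (green): implement just enough to make the test pass.
    function slugify($text) {
        return strtolower(str_replace(' ', '-', $text));
    }
    ?>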

On the other hand, failing tests also have a dark side.

The longer you wait to fix them, the higher the “costs”. If you fix a failing test immediately, the costs are almost zero: you know what you changed, and so you can easily fix it. But if you wait a while, the code is no longer present in your mind and it takes much more effort to fix the failing test. A side effect of this waiting is that it encourages others to do the same: if X doesn’t fix the tests he has broken, why should I care whether my tests run? Another side effect is that it may undermine the motivation of bug reporters to contribute tests if they see that the project members don’t care about tests…

A special “behavior” of failing tests is that they don’t tell you whether they fail because of an incorrect test case or because of a bug in the code under test (in practice, this is sometimes obvious, sometimes not, depending on various factors). And they don’t tell you whether they fail only in your environment or everywhere. This means that if I run your tests and some of them fail, all I can say may be: test X fails in my environment (the reason for saying “in my environment” is that I assume you don’t give me broken code, i.e. code with tests you know are failing). As it seems to be an environment-specific issue, I will inform you about it so it can get fixed. But if you do provide broken code, then everyone using it will bother you with the same issue, at least theoretically ;-)

I think it is obvious what helps to deal with those dark sides: don’t provide code with tests you know are failing. It’s that simple!

I hope it wasn’t too confusing, and sorry about the rant at the beginning, but sometimes it’s necessary to vent one’s anger ;-)

12 comments baked

  • Mariano Iglesias

    Are you in need of traffic?

  • Brendon Kozlowski

    One might also look at it as an already known issue (assuming you’re using an SVN checkout or a nightly): the tests simply inform the user that there is an issue in the current version and where to be cautious within their own applications. :) I suppose it’s one of those point-of-view issues.

    Speaking of “point-of-view”, thanks for providing it from a user’s perspective, I rarely take that perspective. ;)

  • nate

    Hi Daniel,

    You make some excellent points about the cost of failing tests and the importance of testing across different environments. However, you seem to be mischaracterizing the situation in this case.

    The particular tests to which you’re referring aren’t actually broken, in fact, they all pass just fine. However, those tests were written to run in the context of an HTTP request, and you’re running them on the command line. This isn’t just a different “environment,” this is an entirely different context.

    Because of this, and because of the fact that the new test suite shell command is as-yet-unreleased, “fixing” those tests (i.e. setting up the necessary environmental conditions to run them on the CLI) hasn’t been a high priority, because as I mentioned, nothing is actually broken.

    Therefore, your idea about the “costs” doesn’t really apply, since the necessary environmental bootstrapping is a fixed cost, and will apply the same across the board to all affected tests.

  • Tarique Sani

    @Nate well put – things are now in perspective.

    @Daniel – when you are on the team, taking the stance “the person who breaks it fixes it” is correct, but if you are in the community, then not supplying patches when it is well known that you *can* will be considered “whining”

    @Mariano – That was a rhetorical question, right? :P

  • cakebaker

    @all: Thanks for your comments!

    @Mariano: No.

    @Brendon: Sure, it is a point of view thing, and you can see it like you described.

    @nate: Well, I just set up the “normal” test environment and ran the tests, and the same tests fail again (plus some other tests); even the test suite fails with a fatal error ;-)

    @Tarique: I think taking the stance “the person who breaks it [the tests] fixes it” is independent of whether I’m a team member or not. Sure, I may be able to supply patches, but it is not satisfying to clean up the “mess” of others only because they are too lazy to do it themselves. Somehow it would be like giving them a fish instead of teaching them how to fish…

  • Tarique Sani

    Ummm… using “mess” and “lazy” is, I would say, a bit presumptuous. Not providing patches makes you look like someone who “wants to complain but not help”.

    IMO submitting patches is teaching precisely how to fish. Providing patches is the cornerstone of OSS development. If every bug in every OSS project was left for the developers to fix, we would not have come this far.

    To paraphrase Linus, “talk is cheap – show me the code” is universally true.

  • nate

    Again you twist the issue to appear in your favor. You should come to the US and run for president, you’d make a great politician. ;-)

    I just ran all the tests, and had others run them as well. There are very few failing tests, on code which is still in development (as you say, it is part of the process). There are no fatal errors.

    Also, none of the tests that currently fail were mentioned in the tickets you opened, and none of the tickets you opened have any failing tests (when run from the web test interface).

    This very debate illustrates the crux of my issue with you: you’re an intelligent programmer, and we could always, always use more people like you to help improve the Cake core, but in my experience you’re not able to work with other people in a constructive way. Your main interest always seems to be in blaming others while doing everything possible to avoid blame yourself.

    I hope that in the future you’re able to prove me wrong.

  • Tarique Sani

    @Daniel – this is an invitation from Nate :)

  • Jonathan Snook

    While I’m unaware of what tickets you filed and why, I think you’d want to be careful that the tickets aren’t about the failed tests themselves but rather about possible bugs that they uncover (which is the whole point of the tests, right?).

  • cakebaker

    @Tarique: Yes, in general you are right that submitting patches teaches others how to fish. But what I tried to explain is an exception to this general case ;-)

    Let’s say we have an application where all tests run fine. Now you make a change which breaks the tests, you commit it to the repository, and you go on to work on other stuff. Later, I run the tests and see that they fail. What should I do? Should I be the “good” guy who fixes them myself and provides a patch for you? Or should I be the “bad” guy who simply says: “Hey, you broke the tests, please fix them”? The first approach teaches you it’s OK to be lazy, as there is someone else who cleans up, whereas the second approach hopefully teaches you to run the tests the next time before committing code ;-)

    @nate: Well, maybe my test environment is not set up correctly. That’s what I did to make the tests run: I changed the value of CAKE_CORE_INCLUDE_PATH in test.php (I’m using an advanced installation) and defined the $test database connection. I don’t know whether there are additional steps necessary. The fatal error I get is because I don’t have the xdebug extension installed, see my ticket.
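
    In case it helps to reproduce: the define for CAKE_CORE_INCLUDE_PATH sits near the top of webroot/test.php, and the $test connection goes into app/config/database.php, roughly like this (a sketch in the CakePHP 1.2 style; driver, credentials, and database names are placeholders, not my actual values):

        <?php
        // app/config/database.php
        class DATABASE_CONFIG {
            var $default = array(
                'driver'     => 'mysql',
                'persistent' => false,
                'host'       => 'localhost',
                'login'      => 'user',
                'password'   => 'secret',
                'database'   => 'my_app',
                'prefix'     => ''
            );

            // the test suite uses this connection instead of $default
            var $test = array(
                'driver'     => 'mysql',
                'persistent' => false,
                'host'       => 'localhost',
                'login'      => 'user',
                'password'   => 'secret',
                'database'   => 'my_app_test',
                'prefix'     => ''
            );
        }
        ?>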

    I agree with you, that I’m not always an “easy” guy to work with. I’m very analytical and critical, and not everyone can deal with that. In the past I even got fired because I gave my boss negative feedback ;-) And I’m probably not always that diplomatic.

    Regarding “… work [..] in a constructive way”, I think we have different definitions of what that means. For example, I opened the tickets to encourage you to fix the failing tests (as I think it is bad to have failing tests), whereas from your point of view those tickets were useless.

    Anyway, thanks for your feedback!

    @Jonathan: The tickets in question are #4596-#4600. In those tickets I simply reported the failing tests, as I didn’t know whether the problems were in the test cases, in the tested classes, or in the new test suite shell.

  • dr. Hannibal Lecter

    Did I get this right? You were asked *not* to post tickets? That seems just silly; am I missing something? How is that helpful?

  • cakebaker

    @Hannibal Lecter: Yep, I was asked to not post tickets. But I think it was an overreaction ;-)


© daniel hofstetter. Licensed under a Creative Commons License