Agile Project Planning

Monday, June 19, 2006

Managing Defects on an Agile Software Team

I've been thinking about one of the under-discussed issues with Agile software development, which is how a team actually manages defects discovered after release.

Much of the discussion has centered around preventing defects in the first place, through Test Driven Development, Customer Acceptance Tests, and other three-word phrases intended to improve quality. These are all extremely valuable, but I've never seen or heard of a software team that didn't let a defect out into the wild.

So, I'll describe a process I've used for managing defects, and I hope you'll share yours as well.

Carl the Customer sends an email about an issue he's been having with the software. At this point, it doesn't matter whether it's a software bug, a documentation problem, or a user error; I just want to understand what Carl is trying to do, and why he is frustrated.

For example, Carl could say "I want to reschedule all appointments to next month, but the software makes me do one at a time." This is clearly an enhancement to the product in software terms, but to Carl, it may as well be a defect, since he can't do what he wants, and it hurts.

In another scenario, Carl might say "I tried to reschedule an appointment to next year, but I get a message that says Internal Error -376." OK, we seem to have a software defect now. If the software was never meant to reschedule items more than 365 days away, maybe it's just a bad error message; if the spec says it should be able to handle any date, maybe it's a true failure. Either way, Carl is in pain.

So my approach is to treat all of these situations similarly, regardless of the label you might apply: "bug", "enhancement", "spec problem", "cosmetic issue", or "feature request".

In a typical agile process, there is a "backlog" of features, enhancements, and defects. As part of the planning process, the Customer/Product Owner/Decider will prioritize the items on the backlog for the next release.

Now, in the case of defects, teams have often resorted to a DIFFERENT backlog of "bugs", and managed it in some other manner. I believe this can be problematic, especially if the Customer is not involved in choosing the priority of these items. Without customer involvement, the team may be spending large chunks of the development budget on issues that aren't critical or destabilizing to the codebase.
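To make the single-backlog idea concrete, here is a minimal sketch in Python. It is only an illustration, not a description of any particular tool: the `BacklogItem` fields, the `plan_release` helper, the sample items, and the estimates are all made up for the example. The one point it tries to show is that defects sit in the same list as features and enhancements, and the Customer's priority (not the item's label) decides what gets into the release.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    FEATURE = "feature"
    ENHANCEMENT = "enhancement"
    DEFECT = "defect"


@dataclass
class BacklogItem:
    # All fields are hypothetical, chosen for this sketch.
    title: str
    kind: Kind
    customer_priority: int  # 1 = most important, assigned by the Customer
    estimate_days: float


def plan_release(backlog, capacity_days):
    """Walk the backlog in customer priority order and take every item
    that still fits in the remaining capacity. The item's kind is never
    consulted, so defects compete on the same terms as everything else."""
    planned = []
    remaining = capacity_days
    for item in sorted(backlog, key=lambda i: i.customer_priority):
        if item.estimate_days <= remaining:
            planned.append(item)
            remaining -= item.estimate_days
    return planned


# Sample data, invented for illustration.
backlog = [
    BacklogItem("Bulk reschedule appointments", Kind.ENHANCEMENT, 2, 5),
    BacklogItem("Internal Error -376 on reschedule to next year", Kind.DEFECT, 1, 2),
    BacklogItem("Export calendar", Kind.FEATURE, 3, 4),
    BacklogItem("Misaligned footer", Kind.DEFECT, 4, 1),
]

for item in plan_release(backlog, capacity_days=8):
    print(item.customer_priority, item.kind.value, item.title)
```

With eight days of capacity, the sketch picks up Carl's error, then the bulk reschedule enhancement, skips the four-day feature that no longer fits, and fills the last day with the small cosmetic defect. A real team might prefer to stop at the first item that doesn't fit rather than skipping ahead; that is a planning choice, not something this sketch settles.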

Note that I don't consider problems found while developing or testing new code to fall into this category. Why? Because the Customer has already prioritized that new code in the form of a user story, and it's not done until you've got appropriate tests that all pass. So we're strictly referring to issues found in the wild.

There is one challenging aspect to this approach. As a developer, project manager, and a leader, I am driven to produce the highest quality software possible. It bothers me to see unfixed bugs in any system I am involved with. But there is a fine line between conscientious quality management, and obsessive, wasteful fixing of every defect that arrives.

As an example, I've had defects on my backlog that become irrelevant when we implement a new feature, or modify an existing one. The time we would have spent fixing those would have been wasted because: 1) The customer wasn't impacted by them, 2) The features affected were deprecated or removed, or 3) The code was replaced by a new approach or implementation.

So it's important to consider the opportunity cost to your customer of excessive defect fixing, and let them be the judge of what's important, with your guidance. Whether or not you choose to fix a defect, it is still useful to try to understand the root cause in all cases - you may be sitting on a time bomb if you don't know why or how this defect got into the system.

As an example, I once encountered a defect that would show up only once every 2 weeks in production, but when it did, the system needed to be restarted. This was painful, and we tried to find the root cause, but were unable to. After a few months of this, it hurt enough to invest the time in fully researching it. After a couple of weeks of one developer's time during a slow period, we were able to isolate the problem to an issue in the underlying platform software (a specific Java library).

In this case, it was still worth the effort to understand where the problem was rooted, even though ultimately the code was not within our control (we did ultimately find a tolerable workaround). So even though I'm advocating prioritization of defects, some due diligence of the root cause is important so that you can communicate the impact to your customer, as well as gain a better understanding of what the fix would require.

Even within this process, my teams have certainly fixed small problems without prioritizing, but there is a slippery slope here. We were occasionally burned by seemingly inconsequential fixes that changed the behavior of some obscure part of the system, even though all of the tests passed. That's another part of the cost/benefit equation. Fixes, just like new features, can introduce new problems into the system. But unlike new features, which typically just break themselves, defect fixes can break critical existing functions in subtle ways. Although solid test coverage can minimize this kind of regression, it can and does happen.

I believe that holistic management of your backlog, including new features, enhancements, and defects, is the most consistent way to stay on top of priorities, and effectively deliver quality software that meets your customer's needs.

How do you manage your defects?



  • Since we use XP-stories all bugs are posted just as normal stories.

    The logic behind it is that if a user tries to print a report and it fails, then the program is unable to print reports and thus needs the feature "print report".

    All bugs/problems/errors are posted as stories so that both the customers and the manager can set priority to them.

    We use "The List" philosophy: all things that need to be done should be on the list, and developers do the top items first.

    By Anonymous, at 3:11 PM
