It's simple (TOC is incredibly obvious once you know it)
It's brilliant (only yesterday, our internal LRQA auditor told me a story of a company doubling its turnover in two years by applying it).
So what's it got to do with software development? EVERYTHING! Basically, practically every human endeavor to make something (anything) new is a sequence of linked operations with statistical variation in them. This means that TOC applies. Simple as that: no arguments, no buts, no "we're different". It applies. End of.
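That claim about linked operations with statistical variation can be made concrete with a quick simulation of Goldratt's dice game from the book (the stage count and round count below are my own arbitrary choices): every stage has the same average capacity, yet dependency plus variation drags the chain's throughput below that average.

```python
import random

random.seed(1)

def dice_game(n_stages=5, n_rounds=1000):
    """Each stage rolls a die per round, but can only pass on
    min(roll, inventory queued in front of it). Stage 0 draws from
    an unlimited supply of raw work."""
    inventory = [0] * n_stages      # WIP waiting in front of each stage
    completed = 0
    for _ in range(n_rounds):
        # Process downstream-first so one round's output can't flow
        # through several stages at once.
        for i in reversed(range(n_stages)):
            roll = random.randint(1, 6)
            moved = roll if i == 0 else min(roll, inventory[i])
            if i > 0:
                inventory[i] -= moved
            if i + 1 < n_stages:
                inventory[i + 1] += moved
            else:
                completed += moved
    return completed / n_rounds

# Each stage averages 3.5 per round in isolation, yet the linked chain
# completes noticeably fewer than 3.5 per round.
print(dice_game())
```

Run it with more stages and the effect gets worse, which is exactly why the slowest link, not the average link, sets the pace.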
So what do we do as software project managers? Well, as the book puts it, we "find Herbie" (the slow-coach in the pack). It's a vital message in the book: your total throughput is the same as the throughput of your bottleneck. There are then five focusing steps involved in dealing with your bottleneck:
- Locate the bottleneck
- Maximally exploit the bottleneck
- Subordinate everything else to exploiting the bottleneck
- Elevate the bottleneck (add capacity some other way)
- Avoid inertia and keep checking your bottleneck hasn't moved
And what have I done? Well, I don't know for sure yet, but I have a strong suspicion that QA is our bottleneck, and my numbers are piling up to let me make a proper judgement. Testing and QA have traditionally been held to the end of a project in the waterfall development lifecycle (the worst place for Herbie to be; he should be at the front, controlling the pace!). In addition, QA has the terrible problem of its throughput being a slave to the shipping decision. It's like the average pregnancy term in the UK: unknowable, because we prevent pregnancies from running more than two weeks over! In the software gestation period, the worst offenders cut their losses and call all of the remaining bugs "known issues"!
With all of this considered, what should I do about the test team? Basically, make sure they're never sat idle! Now I first learned this from Kanban by David Anderson, but it's a TOC thing really. By applying the teachings of Anderson and Goldratt, I've been through a very enjoyable learning curve, passing through a number of phases on the way:
First, I simply loaded the testers up with all of the bugs we'd addressed so far, plus a few feature tests for code we'd completed. That went through quickly and generated a few new development items, but the testers were left twiddling their thumbs again almost immediately. The lesson here was that at this stage in the process, test is not a constraint; coding is. Overall, though, I know test is likely to become the constraint, because of the QA push at the end, when the developers sit idle waiting for bugs.
OK, I thought to myself: we'd better load them up with some feature tests for the stuff we're developing now. After all, we have finished specs for the tests to be written against, and this is all good Agile practice. On balance, this got us closer to keeping the test team busy, but they could still drain the test buffer pretty quickly if they were all in the office and working well. The lesson I learned at this stage was a basic Agile one, as already stated: a bunch of user stories and a detailed spec is best expressed as a set of tests. There's still more to do here, and only today I've been talking with people about whether my Product Manager should sign off a feature after it's been tested, or whether he should sign off tests as representative of his requirements up front.
The interesting thing I now felt I could see on my Kanban board was a test buffer "vacuum" that just wanted to consume bugs or features. The clock was also ticking in my head: every day that goes by conjures up fears of delivering a day late, or of quality dropping by the equivalent of a "test day". The natural thing to release to the "Testing Hoover" seemed to be bugs, so we stopped feature development and addressed a load of legacy bugs. In fact, we now do this on a regular basis: not just the bugs in the feature we're writing (that's plain Agile good practice, to keep quality up and maintain regular delivery of release-quality code), but historical bugs and new issues found in features not under development. This was pretty good at feeding the hungry testers, but still they seemed to need more.

Although almost everything I'd done so far to keep the test team busy had been about "exploiting" the bottleneck, this step involved me realizing that a bug backlog is a type of inventory in the process, and TOC tells me I should reduce inventory as well as increase throughput. Although a bug backlog doesn't quite fit the simple definition of inventory (items we've invested time and money in but haven't yet sold), it does represent a pile of work-in-process in front of the bottleneck: it has to go through at some point, so why wait until the bottleneck is maximally loaded?!
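The inventory point can be put in numbers via Little's Law (average wait = items queued / throughput). All the figures below are made up for illustration, but the shape of the result is general: shrinking the backlog directly shortens how long any given bug sits in front of the bottleneck.

```python
# Little's Law sketch: a bug backlog is inventory, and the average time
# a bug waits is backlog size divided by the rate bugs are cleared.
# All numbers below are hypothetical.

bugs_closed_per_week = 5     # test team's clearing rate (assumed)
backlog_before = 60          # bugs queued before the burn-down (assumed)
backlog_after = 20           # after regular legacy-bug sessions (assumed)

print(backlog_before / bugs_closed_per_week)  # → 12.0 (weeks an average bug waits)
print(backlog_after / bugs_closed_per_week)   # → 4.0
```

Same throughput, a third of the backlog, a third of the waiting time: that's the payoff of burning down inventory before the bottleneck is maximally loaded.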
The last thing I tried was to enter bugs into test before they were fixed. This might seem odd to some, as testers are used to verifying that something behaves the way it should, rather than writing a test they know is going to fail. I was a bit worried about this, if I'm honest, as a broken feature prevents a bit of exploratory testing around it to add to the tests. However, the great advantage of this approach was that once a developer fixed a bug, they got almost immediate feedback about whether it was really fixed. What did this teach me? That TDD can "reach outside" development and extend to the test team, such that they create independent, "QA-minded" tests to support development.
An overall lesson I have learned is one that Kanban can teach us all: I need an appropriately sized buffer for my test team. Particularly as I'm in the UK and the testers are in the US. And the test manager's in Australia. It's also worth noting that, for me, test is not only a bottleneck constraint; it's a non-instant-availability constraint too, as the test engineers also do customer support and site installs, so they can disappear for days or weeks at a time. Basically, as soon as their backside is back in their chair and they think of me, I need them to be able to start work straight away!
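A crude way to size that buffer, sketched below with entirely hypothetical figures: it should hold enough work to keep the testers fed through the longest plausible gap in the flow of testable items, with some padding for variation and the time-zone spread.

```python
# Rough buffer sizing (all figures assumed for illustration): the test
# buffer must cover the testers' clearing rate over the longest likely
# dry spell from development.

items_tested_per_day = 4     # test team's clearing rate (assumed)
max_feed_gap_days = 3        # longest likely gap in dev output (assumed)
safety_factor = 1.5          # padding for variation and time zones (assumed)

buffer_size = round(items_tested_per_day * max_feed_gap_days * safety_factor)
print(buffer_size)  # → 18
```

The exact numbers matter far less than the habit of measuring them: the clearing rate and the feed gaps are both observable from the board.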
So, what's left to sort out? Well, one huge flaw in all of this is that I don't know for sure that testing is the bottleneck, so I need to gather and scrutinize some data to root out where Herbie really is. The other thing I need to do is bring in the remaining three focusing steps: all I've done so far is try to maximally exploit the bottleneck. I'll need time to see whether this has a positive effect on our QA process beyond code complete, but if there's more to be done, then I need to look at other strategies. I suspect, though, that the next thing to do will be to sniff out a new "Herbie" within the process. That shouldn't be too difficult in principle, as the process is very linear and composed of very few operations (compared to a complex manufacturing plant). However, I can see that gathering good statistical data to support this learning process will be a significant challenge. I'm looking forward to it.
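As a sketch of what that data-gathering might look like (the records below are invented, and the field names are my own), one simple approach is to log how long each work item queues in front of each stage: the stage where work waits longest is the likely Herbie.

```python
from collections import defaultdict

# Invented log records: (item, stage, days_working, days_queued_before_stage)
records = [
    ("F1", "code", 3, 0), ("F1", "test", 2, 6),
    ("F2", "code", 4, 1), ("F2", "test", 3, 8),
    ("B1", "code", 1, 0), ("B1", "test", 1, 5),
]

queued = defaultdict(list)
for _item, stage, _work, wait in records:
    queued[stage].append(wait)

for stage, waits in queued.items():
    print(f"{stage}: average queue {sum(waits) / len(waits):.1f} days")

# The stage where work piles up in front is the likely constraint.
bottleneck = max(queued, key=lambda s: sum(queued[s]) / len(queued[s]))
print("Suspected Herbie:", bottleneck)  # → Suspected Herbie: test
```

In a process this linear, a few weeks of honest queue data per stage should be enough to confirm (or demolish) my suspicion about QA.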