This discussion between Laurent Bossavit and Steve McConnell makes for very interesting reading: Bossavit critiques McConnell’s Making Software chapter on differences in programming productivity (original in French here), arguing that the studies it cites do not establish as a fact that some programmers are an order of magnitude better than others; McConnell responds, nobly and patiently, justifying the citations and the order-of-magnitude claim that they support.
Bossavit’s critique seems slightly tinged with indignation at discovering how scientific sausages are made. (Incidentally, Bruno Latour, whom he discusses at some length throughout his piece, is a prime exponent of such sausage making.) Bossavit goes back to some of the studies cited by McConnell and finds that they are not controlled laboratory experiments, or that their sample size is fairly small, or that the participants were debugging rather than programming proper, or some other problem. He therefore finds McConnell’s litany of citations suspect: none of them conclusively establishes that some programmers are an order of magnitude better than others, yet taken together they form an intimidating wall of academic texts that encourages the reader to accept McConnell’s summary, erroneously, as fact. In his reply, McConnell convincingly shows that the scientific evidence for an order-of-magnitude difference in individual programming productivity is far more solid than Bossavit makes it out to be. Not conclusive, perhaps, but as strong as it gets in our field to date.
However, even with the issue of McConnell’s chapter settled, Bossavit’s critique, read more generally as a discussion of software development research (which was, I believe, Bossavit’s intent), still yields several important observations that were overlooked in the subsequent discussion.
The first is our tendency to protect some of our questionable claims with a layer of citations. Whole subfields of software research have sprouted from such clever gardening; by the time they wither, their creators will have long secured themselves, having achieved tenure and respectability years before. It is truly a pain, on some occasions, to dig through the list of references in an initially exciting paper, only to find that it rests on the flimsiest empirical evidence. Even if in this case Bossavit’s criticism was unwarranted, it holds for many other academic papers in our area.
On the other hand, paired with this tendency to offer questionable citations is the tendency to demand them in the literature we read and review, even to support fairly obvious statements. In a sense, we’ve become rather lazy, preferring an (Author, Year) string over the (slight) effort of considering whether a claim makes sense through simple argumentation or experience. We demand statistical significance, rather than clarity of thought.
This is the case with the whole productivity issue. I know that there are people who are at least an order of magnitude better programmers than others; I have seen them, and I suspect most other software developers or researchers have, too. It’s just part of the difficulty of the task and of the variety of human nature. I also know runners, jugglers, writers, cooks, managers, and scientists who, by any sensible criteria, are far beyond the abilities of some of their peers. We don’t really need a series of double-blind controlled experiments with thousands of participants around the globe to establish this; our resources are better spent otherwise, and in the case of programmer productivity the sources that McConnell offers, methodologically weak as they might be, are more than enough to convince ourselves that there are no surprises here, and to move on.
The real question, the thorny issue, is the nebulousness of our constructs: in this case, the construct of development productivity. Bossavit gets to this near the end of his critique, but his arguments appear to have been ignored in the subsequent debate. What is productivity? For starters, Bossavit reminds us (and again, there’s no need to demand citations or studies here; our experience confirms the statement) that some people have a net negative productivity. Initial measurement efforts were naive: lines of code have long been discredited as an accurate indicator of anything. Other programming-centric measures (such as function points) risk missing the essence of productivity: as Greg Wilson likes to say, “a week of work saves an hour of thought,” and such hours of thought are not amenable to straightforward measurement, as they tend to produce very little code. And what about the more subtle components of productivity? Perhaps you, like me, have worked with someone who may not be particularly skilled, technically speaking, but who has some other attribute (charisma, empathy, drive, a sense of purpose) that amplifies the productivity of the whole team many times over. How can we include such considerations in our productivity construct, entangled as they are with our understanding of it?
The problem is that, for as long as a construct is as weakly built as that of development productivity, any experimentation we carry out is bound to be unsatisfactory. We know that some people are more productive than others; whether they are exactly five, ten, or twenty-seven times more productive is not a question we can settle at this point (or, perhaps, ever; and by the way, I’m not sure this is really where we want to go as a society, but that’s a different topic). If we can spare research effort to explore productivity in more detail, I suggest we aim it first at settling these theoretical and conceptual issues, rather than at more careful and methodical experimentation.