Pair programming evaluated

A controlled experiment with three hundred paid subjects participating for a full day would be flimsy evidence in the medical field, but for software engineering researchers it is huge. In the February 2007 issue of IEEE Transactions on Software Engineering, Arisholm et al., from the Simula Research Laboratory in Norway, report on a pair programming study with 295 professional Java consultants [1]. One third of the participants worked alone; the other two thirds worked in pairs on the same problems.

Pair programming (that is, working shoulder to shoulder with another developer at the same computer) is one of those practices that has a few vocal, convinced believers and a large majority of outsiders who remain silently skeptical. The standard claims are that these pairs are as efficient as individuals, or more; that they produce code of higher quality, since everything is being verified as it is written; and that knowledge about the project flows better in the team, since at least two people know how every bit of the system works, leading to all sorts of qualitative improvements.

The study by Arisholm et al. explores the first two of those claims. From the abstract:

“The results of this experiment do not support the hypotheses that pair programming in general reduces the time required to solve the tasks correctly or increases the proportion of correct solutions. On the other hand, there is a significant 84 percent increase in effort to perform the tasks correctly”

So, overall, no improvements in time or quality were observed. The results do indicate, however, that pair programming produced higher-quality code on the more complex problems, and that junior consultants working in pairs benefit mostly in terms of correctness, while more senior consultants benefit mostly in terms of effort.
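
As a rough illustration of how the 84 percent figure relates to elapsed time, here is a minimal Python sketch. It assumes that effort is counted in person-hours, that is, elapsed time multiplied by the number of people working on the task; the function names are hypothetical and not taken from the paper.

```python
def effort_person_hours(elapsed_hours: float, people: int) -> float:
    """Effort under the assumed definition: elapsed time times headcount."""
    return elapsed_hours * people


def pair_time_ratio(effort_increase: float, pair_size: int = 2) -> float:
    """Elapsed-time ratio (pair / solo) implied by a relative effort increase.

    Assumes solo effort equals solo time and pair effort equals
    pair time multiplied by pair_size.
    """
    return (1.0 + effort_increase) / pair_size


# An 84 percent increase in effort, as quoted from the abstract.
ratio = pair_time_ratio(0.84)
print(round(ratio, 2))  # 0.92
```

Under that assumption, an 84 percent increase in effort would correspond to pairs needing roughly 0.92 of the elapsed time of an individual. This is only a reading of the quoted number, not a figure taken from the paper.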

Although it’s an unusually strong experiment, mainly due to its size, it has two important biases. The first works against pair programming: most subjects had no pair programming experience (while every developer has solo programming experience), and they had to work with strangers, so the benefits of knowing your partner’s work style and habits did not have a chance to ramp up. Anyone who has seen team members develop trust and collaboration over time would agree that evaluating recently formed pairs is artificial.

The second bias works in favour of pair programming: the real comparison should not be against solo programmers, but against pairs of developers working on the same problems, as a team, but on separate computers. For instance, Joe may choose to work on problems 1, 2, and 3, while Jane works on the much more complex problem 4 all along. These pairs of solo programmers would need to be familiar with each other as well, of course, so that their long-time collaboration has a chance to kick in.

Still, this is the heaviest data point in pair programming research to date, and it’s worth taking a look at it if the topic interests you.

—

[1] Erik Arisholm, Hans Gallis, Tore Dybå, and Dag I.K. Sjøberg. “Evaluating Pair Programming with Respect to System Complexity and Programmer Expertise.” IEEE Transactions on Software Engineering, 33(2), Feb 2007.

About Jorge Aranda

I'm currently a Postdoctoral Fellow at the SEGAL and CHISEL labs in the Department of Computer Science of the University of Victoria.

20 Responses to Pair programming evaluated

juan says:

March 12, 2007 at 9:35 pm

I know we’ve had that very same discussion at the office: even if the 2 developers were able to solve the same problem with higher quality, would this increase in quality be worth twice the value? It costs that much.

Jorge says:

March 12, 2007 at 10:41 pm

Exactly: this study tracks both time and effort (which, for a pair, is basically time × 2). Pairs took about as long as individuals, for almost twice as much total effort.

The idea that two people at one computer are more efficient than the same people at separate desks has always seemed odd to me. If pair programming outperforms solo programming, it would only be because of small qualitative factors (better shared understanding, less impact of turnover, etc.), not because of direct efficiency.

mcyclops says:

March 13, 2007 at 10:53 am

Two cents: What about compatibility? The study assumes that the pair is going to like each other, and that is, at least, debatable. Second thought: What about solo programmers who want to work alone? One of the kicks of the computer science field is the amount of work that an individual can get done by himself, at his own pace. I would hate to share a desk with another guy with another programming style and another pace. This is why God created modularity and encapsulation…

Jorge says:

March 13, 2007 at 12:35 pm

Those are probably the reasons why most people won’t even try pair programming.

In reality, those who have told me they apply this practice don’t do it 8 hours a day. They’ll work on their own, separately, for most of the day, and get together for a pair programming session of 2-3 hours.

riseagain says:

March 14, 2007 at 10:41 am

Nice article, keep up the good work!

I think solo programming is better but the development environment can change the balance.

Jorge says:

March 14, 2007 at 12:06 pm

Thanks, riseagain!
I have never tried pair programming in a disciplined way myself, but from my brief experiences, I don’t think I could do it for more than a couple of hours per day.

You’re right that the development environment may make one or the other better.

Riki says:

May 13, 2007 at 2:56 pm

Thanks for the interesting info for my sister; she studies in Norway. R

Sean says:

February 5, 2008 at 5:15 pm

Once again the study screws it up. They compare pair programming versus a single programmer. They don’t compare pair programming versus two programmers working on separate things.

196 people, working in pairs, versus 98 people working alone, and the fact that 196 people didn’t take LONGER is a success?!

Jorge says:

February 5, 2008 at 5:24 pm

Sean,

Yes, the comparison we described would be better. But note that the study never claims that pair programming was a ‘success’. If anything, it comes up as a failure: nearly twice the effort for a slight improvement in quality.

Chris says:

October 1, 2008 at 4:28 pm

Was there an explanation given as to why the individual programmers averaged 1, 2, 5 years of experience for the classifications of junior, intermediate, senior respectively, yet the pair programmers averaged 1, 6, 10 years for those same categories?

The test environment also removed one of the big benefits of pair programming — keeping people focused. Everyone will stay focused for one day as part of a graded experiment.

Given the lack of pairing experience, never mind jelled teams, the slowness is no surprise. I tell people who want to try pair programming that they will be noticeably slower for the first ~3 weeks.

Jorge says:

October 2, 2008 at 7:29 am

“The test environment also removed one of the big benefits of pair programming — keeping people focused. Everyone will stay focused for one day as part of a graded experiment.”

That’s a good point, Chris. I don’t see how the researchers could work around it in a controlled experiment, but doing so would seem to favour pair programmers.

Johan Nilsson says:

February 4, 2012 at 11:53 am

Now, after the first day, take in a completely new crew of solo programmers that will perform additional tasks on the work of their predecessors. Also replace half of the pair-programmers with new ones and let the new pairings continue.

Pair programming is intended as an investment for the long-term. The researchers should have used 1/10th of the subjects for 10 days instead to get some hint about the long-term effects.

- Jorge Aranda says:
  
  February 4, 2012 at 12:36 pm
  
  Johan, the more I think about this matter (and the more I see pair programming in practice), the more I agree with your argument. The problem is that it’s really hard (and expensive) to test a long-term investment such as pair programming in the controlled environment of a lab.
  