The Secret Life of Bugs: a brief summary

Well, my ICSE talk has come and gone, and I’m very happy with it and with the feedback I got. Now, with the talk behind me, I feel it’s a good idea to give a brief summary of what the paper Gina and I wrote is all about. I should say, though, that if you’re interested in the topic, it’s much better to read the paper than my summary. You can also download the slides I used during the talk, although they miss much of what I (and the paper) said.

To understand this work you may need a bit of context. In recent years there has been a lot of research that mines repositories to discover stuff about software projects. For instance, you can extract a list of people that changed the code of one file, then you can extract the number of emails that they’ve sent to each other during the same period, and through some math magic you can find out if these people coordinated properly (that is, emailed each other enough) to work on the code – because if they didn’t, they should’ve.
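
To make the idea concrete, here’s a minimal sketch in Python of that kind of analysis. Everything in it is made up: the commit and email data are toy in-memory lists standing in for what a real study would mine from version control and mailing-list archives, and real studies use far more sophisticated measures than this.

```python
from collections import defaultdict
from itertools import combinations

# (file, author) pairs, standing in for data mined from commit history
# (entirely hypothetical)
commits = [
    ("parser.c", "alice"), ("parser.c", "bob"),
    ("lexer.c", "bob"), ("lexer.c", "carol"),
]

# (sender, recipient) pairs, standing in for an email archive covering
# the same period (also hypothetical)
emails = [("alice", "bob"), ("carol", "dave")]

# Who changed each file?
authors_per_file = defaultdict(set)
for filename, author in commits:
    authors_per_file[filename].add(author)

# Treat email as symmetric: a pair "communicated" if either wrote the other.
talked = {frozenset(pair) for pair in emails}

# Flag co-committer pairs with no recorded email between them: the kind
# of pair this style of analysis would label as poorly coordinated.
for filename, authors in authors_per_file.items():
    for pair in combinations(sorted(authors), 2):
        if frozenset(pair) not in talked:
            print(f"{pair[0]} and {pair[1]} both changed {filename}",
                  "but never emailed each other")
```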

This is a simplification, of course, but that’s the main idea: mine software repositories to discover how teams (ought to) coordinate and communicate. A lot of this research is quite interesting and useful, and the people that do it are extremely smart. But I often find their conclusions questionable because they make one important, dangerous assumption: that the electronic repositories they mine provide an accurate, or at least appropriate, picture of what really happened in the projects they come from. For example, can we really assume that if two developers didn’t send each other an email during a particular period, then they were working uncoordinated? It’s an (until recently) untested assumption, but that doesn’t seem to give much pause to the researchers in the area.

[Image: dragonfly fossil]

This is the assumption Gina Venolia and I set out to test. In my talk I made an analogy between paleontologists and entomologists: when it comes to studying bugs, the former are stuck with fossils as the only evidence they have, while the latter have relatively easy access to live specimens. Entomologists have a much easier time confirming or refuting simple hypotheses because of this; paleontologists would love to have that kind of opportunity. And yet, it seemed to us, our community is more comfortable handling fossils than venturing to study live specimens.

We wanted to find out the consequences of this focus on fossils (or electronic records) over the real deal. We investigated in depth the stories of ten bugs at Microsoft – a friend rather accurately described the process as one of investigative journalism. We tried to find out all possible information about the bugs in question. At the same time, we kept track of what the story of each bug would look like if we hadn’t been able to interview the people involved, or if we hadn’t spent time making sense of the little oddities of each story. In effect, we compared our in-depth stories with the (rather plain) stories you’d get if you only mined the same data. When we were almost done with our case study we conducted a survey, also at Microsoft, to validate its findings.

[Image: yellow-winged darter dragonfly]

The differences between the two kinds of stories are huge. We spend a good deal of the paper describing them. In every case the electronic records are missing important information; in most cases they actually contain erroneous information. In the paper we make the case that much of the information that matters most for each bug is the kind that is hardest to document or to extract automatically: personal, political, social, tacit knowledge.

We also go into some depth describing what the stories of bugs actually look like. We do not narrate the stories we found (each of them fills several pages); rather, we list and describe the most important patterns we observed in them, as well as the goals that people appear to have when they work on bug-fixing activities.

That’s what we did. After the talk, a friend told me that he thought our message was mainly that qualitative studies are better than quantitative studies. Though I do prefer qualitative studies, I hope that’s not the message most people got: it’s not that qualitative studies are better, but that for questions of communication and coordination, many quantitative studies rely on data that are wrong or misleading in important ways, and the researchers that carry them out need to be extremely careful with their assumptions.

Photo credits: John Cancalosi (fossil), Andre Karwath (yellow-winged darter dragonfly)

About Jorge Aranda

I'm currently a Postdoctoral Fellow at the SEGAL and CHISEL labs in the Department of Computer Science of the University of Victoria.
This entry was posted in Academia, ICSE, Software development.

4 Responses to The Secret Life of Bugs: a brief summary

  1. Pingback: Blogging from ICSE - Thursday | Serendipity

  2. Neil says:

    Great analogy!

    I think the explosion of interest in MSR was motivated by frustration with the difficulty of doing something ‘empirical’ in software engineering (vs. designing YAFSF — yet another f— software framework). I believe you said your study, for example, avoided ethics review because you were an MSFT employee. Furthermore, it seems nearly impossible for independent researchers to get access to corporate (‘representative’?) data (viz. IBM). Arguably you were not ‘independent’ in this study.

    On a wider scale, I think you are of course quite correct. Looking at raw numbers will never produce the same kind of broad insight that longitudinal case studies do. But aren’t we comparing apples and oranges? I’m thinking of the diagram in the Pink Yin book (2.2), which discusses generalization. An experiment that mines data from an OSS repository is only giving us one data point towards a theoretical explanation (despite the quantity of data surveyed). One might argue that surveys and interviews are fraught with bias as well: respondents might change their answers, be ignorant of actual work habits, etc. These are common problems in historiography too.

    There’s a great article in the Atlantic Monthly — http://www.theatlantic.com/doc/200906/happiness — on a related topic. It discusses a longitudinal study of a few hundred Harvard males from 1945-ish, and follows them throughout their lives. It comes up with some interesting observations on happiness and fulfilment that arguably could not be identified with a simpler, short-term psychology experiment. On the other hand, the conclusions are drawn from a group of white, wealthy males from 1945, most of whom were in the war, all of whom got into Harvard, etc.

    Finally, I think your conclusion that “the researchers that carry them out need to be extremely careful with their assumptions” applies to ANY research methodology and researcher.

    I’ve been pondering this for a few days, and I don’t want to suggest that you are wrong: on the contrary I think you make some very valid points. I just want to throw out some observations as someone who does MSR work (or has done, anyway).

  3. Jorge says:

    Neil — these are all good points, thanks!

    You’re right that I had much easier access to Microsoft data thanks to being an intern there. But although many companies are very secretive, there are always some (Microsoft is only one example) that will open their doors for your studies with the right assurances; I’ve relied on them, and I continue to. Ethics approval is definitely a pain.

    I don’t think this is comparing apples and oranges though: we are interested in the same research questions and our methods try to answer them; our answers should be comparable. What I have is evidence that the mining method often gives a wrong, and always incomplete, answer to these questions. Whether that’s a risk that data miners want to take is not up to me.

  4. Pingback: The CRU emails and the secret life of bugs | Catenary
