Catenary

Entries from May 2009

Andy Ko and the semblance of objectivity in numbers

May 31, 2009 · 2 Comments

Andy Ko blogged yesterday about having a paper rejected at FSE, the Foundations of Software Engineering Conference, because it used qualitative research methods. His post includes depressing snippets from his reviews, and his replies are fitting and well worth reading:

Transforming empirical observations into numbers does NOT make them objective, nor does it prevent bias and misinterpretation.

I’ve written about this before. As a community, our excessive focus on quantitative methods is preventing us from developing the right constructs to study. It’s a shame that when researchers address this fundamental problem they’re shot down by their peers.

Categories: Academia

Online climate simulator

May 30, 2009 · 1 Comment

C-Learn is a simplified online climate simulator you can use to play with possible CO2 emission scenarios. It may take you a few minutes to figure out how to change the input variables, and what exactly they mean, but once you get the hang of it it’s a very sobering exercise.

I just spent some time trying to bring us down to a temperature increase of “only” 2 degrees Celsius. Getting there pretty much requires a reduction in emissions of at least 90% from 1990 levels across the board, plus huge efforts in stopping deforestation and increasing afforestation, which is what our best current models demand as well. Once I got there I started playing with delay tactics (what if we wait a few years to start getting serious about stopping our emissions? what if developing countries get a few years’ handicap?) and mostly hit a wall in every case, with one exception: action in developed countries is far more urgent than in developing ones.

So give it a try and pass it around. The more people know about the true magnitude of the mess we’re in, the better.

(via Jon Pipitone)

Categories: Activism

ICSE 2009 highlights

May 24, 2009 · 2 Comments

Here’s the list of my highlights from ICSE 2009. It is, of course, a very partial list, focused on the “human” or “soft” or “cooperative” or “whatever-its-name-is” side of software development that interests me.

The CHASE workshop

The Cooperative and Human Aspects of Software Engineering workshop is the one I always look forward to, and if most of ICSE was like this it would be my Disneyland.

Rob DeLine shared this view in his CHASE keynote. He called for CHASE to go mainstream, offering two definitions of CHASE studies: the tongue-in-cheek (“CHASE studies software development as though software were created by people working together”) and the more elegant (“CHASE studies those aspects of software development from which people cannot be usefully abstracted away”), and he proposed that the ICSE committee should reject papers that neither (a) deal at least in part with the people involved in software development nor (b) provide a convincing argument that, for their topic, people can be usefully abstracted away.

The STC workshop

The Socio-Technical Congruence workshop is another favourite of mine, and as usual had interesting papers and discussions. Much of this research mines software repositories in ways that I criticized during my talk, but I still find their work and interests very close to mine. Marcelo Cataldo’s work suggesting that there are software design patterns to deal with social/organizational issues was inspiring, as was Andy Begel’s observation that if we’re to design software built to last perhaps we should design it so that it can be supported by social systems that have proved to be durable through civilization.

The main conference

Steve McConnell gave the opening keynote. His topic was a rather arbitrary list of “the 10 most powerful ideas in software engineering.” You can find his list, along with a well-deserved critique of the talk, at Steve Easterbrook’s blog. Personally, I liked part of the message (software development is performed by human beings, incrementalism and iteration are good, different projects call for different kinds of software development) and it’s worth repeating it often to an ICSE community that would rather focus on other things. On the other hand, most of his top ten ideas were platitudes for some of us, and they were very uninspiring in terms of what we as researchers should be doing.

I didn’t attend the keynote by Carlo Ghezzi (I was preparing my own talk), but I hear it was quite good, and I wish I’d made it.

Some papers now on my to-read list: Timo Wolf’s study on predicting build failures by studying the communication networks of software developers, Christoph Treude’s analysis of how and why tagging is useful in a development platform, Emerson Murphy-Hill’s study of how people refactor code, and Christian Bird’s case study of distributed development and software quality in Windows Vista.

The greatest highlight for me was Steve Easterbrook’s call to arms for using our abilities and resources to help save the planet from climate change. He gave a clear, whirlwind tour of the latest climate science and its horrible implications for life as we know it (if you’re not scared you’re not paying attention), and offered a list of challenges we can tackle and a list of skills that we can use to do so. The reaction from the attendees was far better than I’d expected: no time wasted with denial or sci-fi “solutions”; a lot of sincere concern and involvement; many great ideas. The challenge now is to keep the momentum going.

The SECSE workshop

The Software Engineering for Computing Science and Engineering workshop (yes, it’s pronounced “sexy”), after the main conference, was another source of inspiration. But I was completely drained by then, and didn’t get as much out of it as I could’ve otherwise. Parmit Chilana’s interviews of biologists and computer scientists seem interesting, as do Daniel Hook’s work on testing scientific code and Jeff Overbey’s presentation of an Eclipse plug-in for Fortran (“photran”) for easy refactoring.

Miscellaneous

The banquet’s setting (Vancouver’s aquarium) was fantastic, the weather cooperated, the Internet access sucked, the vegetarian options at lunchtime sucked as well, and there was a panel on judgment in software estimation and a werewolf night that I had to miss for a chance to talk with the people at the STC workshop.

Overall it was a great week – inspiring and fun. But, really, I’m glad it’s over.

Categories: Academia · ICSE · Software development

Bibliometrics

May 23, 2009 · 1 Comment

Hmmm…

[Image: numbersgame]

Should we heed David Parnas’ call for dismissing bibliometric information? After all, how can he be right? His call has only been cited twice so far…

Categories: Academia

The Secret Life of Bugs: a brief summary

May 23, 2009 · 3 Comments

Well, my ICSE talk has come and gone, and I’m very happy with it and with the feedback I got. Now, with the talk behind me, I feel it’s a good idea to give a brief summary of what the paper Gina and I wrote is all about. I should say, though, that if you’re interested in the topic, you’re much better off reading the paper than my summary. You can also download the slides I used during the talk, although they miss much of what I (and the paper) said.

To understand this work you may need a bit of context. In recent years there has been a lot of research that mines repositories to discover stuff about software projects. For instance, you can extract a list of people that changed the code of one file, then you can extract the number of emails that they’ve sent to each other during the same period, and through some math magic you can find out if these people coordinated properly (that is, emailed each other enough) to work on the code – because if they didn’t, they should’ve.
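To make that concrete, here is a minimal sketch of the kind of analysis I mean. It is not taken from any particular study: the function name, the input format (pairs of author and file, pairs of sender and recipient) and the threshold are all made up for illustration. It flags every pair of developers who changed the same file but exchanged fewer emails than the threshold.

```python
from collections import Counter
from itertools import combinations


def coordination_gaps(commits, emails, min_emails=1):
    """Flag developer pairs who changed the same file but emailed too little.

    commits: iterable of (author, path) pairs (hypothetical format).
    emails: iterable of (sender, recipient) pairs (hypothetical format).
    Returns a dict mapping each flagged pair to the files they share.
    """
    # Who touched each file?
    authors_by_file = {}
    for author, path in commits:
        authors_by_file.setdefault(path, set()).add(author)

    # How many emails did each unordered pair exchange?
    email_counts = Counter(
        frozenset((sender, recipient))
        for sender, recipient in emails
        if sender != recipient
    )

    gaps = {}
    for path, authors in authors_by_file.items():
        for a, b in combinations(sorted(authors), 2):
            # The assumption baked in here: no email means no coordination.
            if email_counts[frozenset((a, b))] < min_emails:
                gaps.setdefault((a, b), []).append(path)
    return gaps


# Made-up data, for illustration only.
commits = [
    ("alice", "parser.c"),
    ("bob", "parser.c"),
    ("carol", "parser.c"),
    ("alice", "ui.c"),
]
emails = [("alice", "bob"), ("bob", "alice")]

print(coordination_gaps(commits, emails))
```

With this toy data, carol is flagged as uncoordinated with both alice and bob, even though nothing in the records says whether she talked to them in the hallway.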

This is a simplification, of course, but that’s the main idea: mine software repositories to discover how teams (ought to) coordinate and communicate. A lot of this research is quite interesting and useful, and the people that do it are extremely smart. But I often find their conclusions questionable because they make one important, dangerous assumption: that the electronic repositories they mine provide an accurate, or at least appropriate, picture of what really happened in the projects they come from. For example, can we really assume that if two developers didn’t send each other an email during a particular period then they were working uncoordinated? It’s an (until recently) untested assumption, but that doesn’t seem to give much pause to the researchers in the area.

[Image: dragonfly fossil]

This is the assumption Gina Venolia and I set out to test. In my talk I made an analogy between paleontologists and entomologists: when it comes to studying bugs, the former are stuck with fossils as the only evidence they have, while the latter have relatively easy access to live specimens. Entomologists have a much easier time confirming or refuting simple hypotheses because of this, and paleontologists would love to have that kind of opportunity. And yet, it seemed to us, our community is more comfortable handling fossils than venturing to study live specimens.

We wanted to find out the consequences of this focus on fossils (or electronic records) over the real deal. We investigated in depth the stories of ten bugs at Microsoft (a friend rather accurately described the process as one of investigative journalism). We tried to find out all possible information about the bugs in question. At the same time, we kept track of what the story of each bug would have looked like if we hadn’t been able to interview the people involved, or if we hadn’t spent time making sense of the little oddities of each story. In effect, we compared our in-depth stories with the (rather plain) story you’d get if you only mined the same data. When we were almost done with our case study we conducted a survey, also at Microsoft, to validate its findings.

[Image: dragonfly]

The differences between the two kinds of stories are huge. We spend a good deal of the paper describing them. In every case the electronic records are missing important information; for most they actually contain erroneous information. In the paper we make the case that a lot of the information that matters the most for each bug is the kind that is the hardest to document or to extract automatically: personal, political, social, tacit knowledge.

We also go to some depth in describing what the stories of bugs actually look like. We do not narrate the stories we found (each of them fills several pages); rather, we list and describe the most important patterns that we observed in them, as well as the goals that people appear to have when they are working on bug-fixing activities.

That’s what we did. After the talk, a friend told me that he thought our message was mainly that qualitative studies are better than quantitative studies. Though I do prefer qualitative studies I hope that’s not the message most people got: it’s not that qualitative studies are better, but that for questions of communication and coordination, many quantitative studies rely on data that are wrong or misleading in important ways, and the researchers that carry them out need to be extremely careful with their assumptions.

Photo credits: John Cancalosi (fossil), Andre Karwath (yellow-winged darter dragonfly)

Categories: Academia · ICSE · Software development

Presenting at ICSE 2009

May 21, 2009 · 2 Comments

If you’re attending ICSE 2009, do come to the talk on The Secret Life of Bugs tomorrow (Thursday) at the Maintenance session, 11am!

If you’re curious about the paper, and don’t have access to the proceedings, you can find it at this link. I’ll write some more about the study after I’m done with the presentation.

Wish me luck!

Categories: Academia · Software development

Mini-update: All is well

May 4, 2009 · 13 Comments

I’ve been away from my blog for a long time so I thought an update was in order.

In the last few days I presented a poster at the Consortium of Software Engineering Researchers, in Montreal, which I think was well received; I stole four bananas out of necessity and toured the beautiful Old Montreal (thanks Manuel!); I began to prepare for my ICSE presentation in Vancouver; Spring arrived; the swine flu arrived and I became addicted to maps of its outbreak; I marked a seemingly endless stream of assignments and final exams; my colleagues and I discovered that we have a book burglar with access to the lab; I ran a 10km race at an excellent (for me) time of 50:16, despite starting to develop tendonitis two weeks ago; and I celebrated my seventh anniversary with Val (today!). So all is well. But I’m happy to be able to focus on my research proposal more intensely from now on.

Categories: Uncategorized