Catenary

Entries categorized as ‘ICSE’

ICSE 2009 highlights

May 24, 2009

Here’s the list of my highlights from ICSE 2009. It is, of course, a very partial list, focused on the “human” or “soft” or “cooperative” or “whatever-its-name-is” side of software development that interests me.

The CHASE workshop

The Cooperative and Human Aspects of Software Engineering workshop is the one I always look forward to, and if most of ICSE was like this it would be my Disneyland.

Rob DeLine shared this view in his CHASE keynote. He called for CHASE to go mainstream, offering two definitions of CHASE studies: the tongue-in-cheek (“CHASE studies software development as though software were created by people working together”) and the more elegant (“CHASE studies those aspects of software development from which people cannot be usefully abstracted away”). He also proposed that the ICSE committee should reject papers that neither (a) deal at least in part with the people involved in software development nor (b) provide a convincing argument that, for their topic, people can be usefully abstracted away.

The STC workshop

The Socio-Technical Congruence workshop is another favourite of mine, and as usual had interesting papers and discussions. Much of this research mines software repositories in ways that I criticized during my talk, but I still find their work and interests very close to mine. Marcelo Cataldo’s work suggesting that there are software design patterns to deal with social/organizational issues was inspiring, as was Andy Begel’s observation that if we’re to design software built to last perhaps we should design it so that it can be supported by social systems that have proved to be durable through civilization.

The main conference

Steve McConnell gave the opening keynote. His topic was a rather arbitrary list of “the 10 most powerful ideas in software engineering.” You can find his list, along with a well-deserved critique of the talk, at Steve Easterbrook’s blog. Personally, I liked part of the message (software development is performed by human beings, incrementalism and iteration are good, different projects call for different kinds of software development) and it’s worth repeating it often to an ICSE community that would rather focus on other things. On the other hand, most of his top ten ideas were platitudes for some of us, and they were very uninspiring in terms of what we as researchers should be doing.

I didn’t attend the keynote by Carlo Ghezzi (I was preparing my own talk), but I hear it was quite good, and I wish I’d made it.

Some papers now on my to-read list: Timo Wolf’s study on predicting build failures by studying the communication networks of software developers, Christoph Treude’s analysis of how and why tagging is useful in a development platform, Emerson Murphy-Hill’s study of how people refactor code, and Christian Bird’s case study of distributed development and software quality in Windows Vista.

The greatest highlight for me was Steve Easterbrook’s call to arms for using our abilities and resources to help save the planet from climate change. He gave a clear, whirlwind tour of the latest climate science and its horrible implications for life as we know it (if you’re not scared you’re not paying attention), and offered a list of challenges we can tackle and a list of skills that we can use to do so. The reaction from the attendees was far better than I’d expected: no time wasted with denial or sci-fi “solutions”; a lot of sincere concern and involvement; many great ideas. The challenge now is to keep the momentum going.

The SECSE workshop

The Software Engineering for Computational Science and Engineering workshop (yes, it’s pronounced “sexy”), after the main conference, was another source of inspiration. But I was completely drained by then, and didn’t get as much out of it as I could’ve otherwise. Parmit Chilana’s interviews of biologists and computer scientists seem interesting, as do Daniel Hook’s work on testing scientific code and Jeff Overbey’s presentation of an Eclipse plug-in for Fortran (“Photran”) for easy refactoring.

Miscellaneous

The banquet’s setting (Vancouver’s aquarium) was fantastic, the weather cooperated, the Internet access sucked, and the vegetarian options at lunchtime sucked as well. There was also a panel on judgment in software estimation and a werewolf night, both of which I had to miss for a chance to talk with the people at the STC workshop.

Overall it was a great week – inspiring and fun. But, really, I’m glad it’s over.

Categories: Academia · ICSE · Software development

The Secret Life of Bugs: a brief summary

May 23, 2009

Well, my ICSE talk has come and gone, and I’m very happy with it and with the feedback I got. Now, with the talk behind me, I feel it’s a good idea to give a brief summary of what the paper Gina and I wrote is all about. I should say, though, that if you’re interested in the topic, you’re much better off reading the paper than my summary. You can also download the slides I used during the talk, although they miss much of what I (and the paper) said.

To understand this work you may need a bit of context. In recent years there has been a lot of research that mines repositories to discover stuff about software projects. For instance, you can extract a list of people that changed the code of one file, then you can extract the number of emails that they’ve sent to each other during the same period, and through some math magic you can find out if these people coordinated properly (that is, emailed each other enough) to work on the code – because if they didn’t, they should’ve.
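To make that recipe concrete, here is a minimal sketch of the kind of computation I am describing. It is not taken from any particular study: the function name `coordination_gaps`, the sample data, and the `min_emails` threshold are all hypothetical, and real congruence measures are considerably more elaborate. The sketch treats any two people who changed the same file as a pair that "should" have coordinated, and flags the pairs whose email count falls below the threshold.

```python
from collections import Counter
from itertools import combinations


def coordination_gaps(file_authors, emails, min_emails=1):
    """Return developer pairs who changed the same file but exchanged
    fewer than `min_emails` emails (hypothetical, simplified measure)."""
    # Pairs that "should" coordinate: any two people who changed the same file.
    needed = set()
    for authors in file_authors.values():
        for pair in combinations(sorted(set(authors)), 2):
            needed.add(pair)

    # Count emails per unordered pair, ignoring who sent and who received.
    exchanged = Counter(tuple(sorted(pair)) for pair in emails)

    # A "gap" is a needed pair with too few emails on record.
    return sorted(pair for pair in needed if exchanged[pair] < min_emails)


# Made-up data: who changed which file, and who emailed whom.
file_authors = {
    "parser.c": ["ana", "bob", "cy"],
    "lexer.c": ["bob", "dee"],
}
emails = [("ana", "bob"), ("bob", "ana"), ("cy", "ana")]

gaps = coordination_gaps(file_authors, emails)
print(gaps)
```

Note what the sketch cannot see: if bob and cy sit next to each other and talk all day, they still show up as a coordination gap, because only the email record counts.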

This is a simplification, of course, but that’s the main idea: mine software repositories to discover how teams (ought to) coordinate and communicate. A lot of this research is quite interesting and useful, and the people that do it are extremely smart. But I often find their conclusions questionable because they make one important, dangerous assumption: that the electronic repositories they mine provide an accurate, or at least appropriate, picture of what really happened in the projects they come from. For example, can we really assume that if two developers didn’t send each other an email during a particular period then they were working uncoordinated? It’s an (until recently) untested assumption, but that doesn’t seem to give much pause to the researchers in the area.

[Image: dragonfly fossil]

This is the assumption Gina Venolia and I set out to test. In my talk I made an analogy between paleontologists and entomologists: when it comes to studying bugs, the first are stuck with fossils as the only evidence they have, while the second have relatively easy access to live specimens. Entomologists have a much easier time confirming or refuting simple hypotheses because of this. Paleontologists would love to have that kind of opportunity. And yet, it seemed to us, our community is more comfortable handling fossils than venturing to study live specimens.

We wanted to find out the consequences of this focus on fossils (or electronic records) over the real deal. We investigated in depth the stories of ten bugs at Microsoft; a friend rather accurately described the process as one of investigative journalism. We tried to find out all possible information about the bugs in question. At the same time, we kept track of what the story of each bug would look like if we hadn’t been able to interview the people involved, or if we hadn’t spent time making sense of the little oddities of each story. In effect, we compared our in-depth stories with the (rather plain) story you’d get if you only mined the same data. When we were almost done with our case study we conducted a survey, also at Microsoft, to validate its findings.

[Image: dragonfly]

The differences between the two kinds of stories are huge. We spend a good deal of the paper describing them. In every case the electronic records are missing important information; for most they actually contain erroneous information. In the paper we make the case that a lot of the information that matters the most for each bug is the kind that is the hardest to document or to extract automatically: personal, political, social, tacit knowledge.

We also go to some depth in describing what the stories of bugs actually look like. We do not narrate the stories we found (each of them fills several pages); rather, we list and describe the most important patterns that we observed in them, as well as the goals that people appear to have when they are working on bug-fixing activities.

That’s what we did. After the talk, a friend told me that he thought our message was mainly that qualitative studies are better than quantitative studies. Though I do prefer qualitative studies I hope that’s not the message most people got: it’s not that qualitative studies are better, but that for questions of communication and coordination, many quantitative studies rely on data that are wrong or misleading in important ways, and the researchers that carry them out need to be extremely careful with their assumptions.

Photo credits: John Cancalosi (fossil), Andre Karwath (yellow-winged darter dragonfly)

Categories: Academia · ICSE · Software development

My ICSE 2007 picks

June 26, 2007

Before letting the ICSE 2007 topic rest, I wanted to write some notes about some papers that I found satisfying and relevant to my research interests. ICSE is a huge conference, and although it has something for everyone, it’s easy to get lost in its proceedings. Greg Wilson has already posted an initial list of his picks; if you’re into the social side of software development, these are my recommendations, in no particular order:

‘Good’ Organisational Reasons for ‘Bad’ Software Testing, by David Martin, John Rooksby, Mark Rouncefield, and Ian Sommerville. In this ethnographic study, Martin & Co. show that testing in agile software development depends heavily on the organizational context of the team. The motivation of the research team is that there’s very little real world data on how companies do testing and verification:

“Many descriptions in Software Engineering research of the application of testing and verification methods to real world problems, whilst welcome, should be considered more-so as demonstrations than as reports of how testing is done in practice.”

The research team spent a month’s worth of fieldwork observing testing and verification in a small software company. They found that the company largely dismissed test plans and textbook “best practices”, using instead a “thoroughly pragmatic” approach for apparently good reasons:

“W1REsys are not ignorant of best practice in software testing [...] but, for good organisational reasons, chose not to adopt this but to orient their testing to meet organizational needs. Their driving priorities are:

  1. The dynamics of real customer relationships [...]
  2. Using limited effort in the most effective way [...]
  3. The timing of software releases [...]
  4. The need to ‘grow the market’ for the system [...]”

These priorities, they argue, influence their selection of software development practices far more than technical testing concerns.

All in all, although the participating company seems to dismiss some low-effort, high-payoff testing tools and practices, and although the research team does not delve into these possibilities, this is an engaging, insightful paper. It’s also necessary: as I’ve argued before, software development academia and industry are now so disconnected that we need this sort of work to tell it like it is and bring them back together.

The Social Dynamics of Pair Programming, by Jan Chong and Tom Hurlbutt. Another ethnographic study of agile software development. The researchers observed professional pair programmers from two software development teams, and documented their dynamics and interactions. Being an ethnographic study, the paper does not offer any quantitative results (such as those presented by Arisholm & Co. four months ago). However, the paper is chock-full of insights about the way pair programming really works. Some of the findings:

  1. Drivers and navigators: Although almost every description of pair programming emphasizes a division of labor between “driver” and “navigator”, Chong and Hurlbutt found that “pair programmers behave in ways inconsistent with the driver/navigator division of labor that is described in the pair programming literature”, and suggest that trainers and agile development texts should just drop this artificial division that people are bound to ignore anyway.
  2. Keyboards: It seems innocuous, but having the capability of switching keyboard control within the pair appears to be a much more productive set-up than having just one keyboard and one typist.
  3. Expertise: Pair programming is often presented as a way to spread expertise between team members — you pick up stuff from experts you work with, and pass on your expertise to your own pairs. But Chong and Hurlbutt warn us to be careful about pairing people with considerable expertise differentials: the expert tends to drive the bulk of the programming discussions, makes decisions without consulting, and does not seem to explain the rationale behind his actions.

Some of these findings are perfect hypotheses for confirmation/rejection through quantitative, controlled studies. I hope to see more on this in the short term…

Software Development Environments for Scientific and Engineering Software: A Series of Case Studies, by Jeffrey Carver, Richard Kendall, Susan Squires, and Douglass Post. This paper reports some technical and organizational details of the developers of five high performance computing projects. For outsiders, it’s easy to forget some of the concerns of these groups –portability and maintainability are goals as important as performance, as the code written today will most likely still be in use thirty years from now.

Carver & Co. discuss lessons learned from talking to these groups. The lessons range from the very insightful (e.g., validation and verification is extremely difficult when you don’t know what the right answer is) to the exasperating (some experienced developers reject IDEs because they feel they impose “rigidity” on their development activities; don’t ask me why).

I would have liked more information on the organizational configurations of these teams, and I disliked the authors’ confusion between agile development and the anything-goes approach to development in these projects. However, this is still a very valuable window into the dynamics of these high-performance computing and engineering teams.

Information Needs in Collocated Software Development Teams, by Andrew Ko, Robert DeLine, and Gina Venolia. One more of the very few ICSE studies of real developers doing real work. The authors observed 17 Microsoft developers for 90 minutes each, and recorded the tasks performed by these developers, their interruptions, and the reasons why they often could not complete the task they were working on.

From these observations, the research team came up with a list of types of “information needs” of developers (“what is the program supposed to do? which changes are part of this submission? what is the purpose of this code?”). They report on the frequency and the outcome of the searches for each of these information needs. They suggest that their list is representative of the problems developers have in their daily work, and that it should drive the creation of new tools, processes, and notations from the software engineering community.

Perhaps unfortunately, too many developers seemed to be doing debugging work at the time of the observations, and so the list is biased towards debugging tasks. However, it is still valuable as a roadmap for software development research.

Global Software Engineering: The Future of Socio-technical Coordination, by James Herbsleb (requires ACM subscription). After explaining what is different about global software engineering (less communication, less effective communication, lack of awareness, and team incompatibilities), Herbsleb discusses a series of challenges that are relevant to the area and are likely to see action from the research community. It’s an informative and candid discussion, with bits such as this one about offshoring:

“Most development organizations seem to be placing large bets that the correct configuration [of software teams] is to have an analyst and marketing group physically near the customer, but are also willing to locate one or more development groups remotely. As yet we have little evidence of the effectiveness of this arrangement.”

Re-reading this post, I notice it would seem as if ICSE as a conference was full of relevant, exciting, reality-based papers about how to understand and improve software development. Unfortunately, it’s not. I found these papers to be the exception rather than the rule. Here’s to seeing the trend improve at ICSE 2008!

Categories: Academia · ICSE · Software development

Ed Yourdon on the Peopleware panel

June 24, 2007

Ed Yourdon, one of the participants of the ICSE Peopleware panel I blogged about, has an extremely informative description of the panel’s discussion on his blog.

Here’s an insightful bit: During the panel, Tom DeMarco half-jokingly blamed Barry Boehm (another panelist) for the relatively slow adoption of agile development. Boehm, after all, had reported that the cost of fixing software defects rises exponentially with the time between their injection and their discovery. Yourdon describes:

[DeMarco] said that as a result, the commandment “get the requirements right!” was drummed into the heads of a generation of software engineers. Tom turned towards Barry, smiled, wagged his finger, and said, “And I have never forgiven you!”

Barry Boehm relieved the tension in the air by agreeing with Tom. He explained that, back in the 1970s, he had linked up with Win Royce at TRW, where the two of them found that the waterfall methodology worked pretty well. But he acknowledged that they were working in an application domain (aerospace systems, military systems), and in a time, when the end-user’s requirements were fairly well-defined; consequently, it made a great deal of sense to capture those requirements early, rather than discovering later on that a great deal of software had been built to implement the wrong requirements. But Boehm acknowledged that by the 1980s, things had begun to change drastically … and obviously this continues to be true today.

Well worth a read.

Also, Yourdon left a short comment on my ICSE post. I think it’s the first time that the author of books I read more than a decade ago as an undergrad has commented on my blog. It’s quite an honour.

Unfortunately, he called me Chris.

Categories: Academia · ICSE · Software development

ICSE 2007, Day 3 (and last)

May 26, 2007

Some pointers to interesting research presented on the third –and last– day of the International Conference on Software Engineering:

  • Information needs of software teams: Andrew Ko reported on a nice paper about the kinds of information needed by developers in a Microsoft team. The list of “information needs” that the study offers can easily be treated as a very comprehensive set of urgent research topics on the human aspects of software development. (You can access the paper, and the list, on Andrew’s website).
  • Pair programming: Jan Chong presented a very insightful ethnographic study of pair programming. One of the most striking take-away messages was that having two keyboards makes a huge difference in the dynamic of the pair. (The paper, as well, is available on Jan’s website).
  • Roles in open source: Chris Jensen discussed the ‘advancement’ of roles of open source developers, based on his trawling of the archives of three major OSS projects. (Yup, the paper is on his website too.)
  • Requirements engineering research directions: Betty Cheng presented a list of challenges for future requirements engineering research. Issues of scale, self-managing systems, and tolerance of software of non-provable reliability were thrown in the mix.
  • Programming environments: Andreas Zeller set out a vision of programming environments that join several tools that developers depend on (bug databases, version control, e-mail, chat, requirements descriptions, etc.) and mine their data to find useful, non-intrusive information.
  • Search-based software engineering: In what for me was one of the few “so wacky it could work” talks in this conference, Mark Harman convincingly argued that using search-based approaches (such as genetic algorithms) to find good solutions to software engineering problems is a viable strategy (“not a silver bullet, but at least a machine gun with lots of nickel bullets”).
  • Empirical software engineering: Lionel Briand presented a paper by Dag Sjoberg & Co. on the future of empirical methods in software engineering. The need for third-party evaluation of proposals was made evident: when evaluation studies are conducted by the proponents of the technology, the results are almost always positive. When they are conducted by a different party, they are almost always negative. Although Briand did a great job presenting the paper, I was disappointed that I was not able to meet its authors, whose research I’ve blogged about before. Apparently, the intended presenter was denied entry to the US for some minor problem with his passport. Way to go. Fortunately, the next two ICSEs will take place outside the US (Germany in 2008, and Canada in 2009), and away from overzealous immigration officers.

That was it for the conference. It was an extremely productive week, but I’m tired, homesick and glad to be heading back home!

Categories: Academia · ICSE · Software development

ICSE 2007, Day 2

May 24, 2007

Second day at ICSE, and again a nice batch of paper presentations:

  • Testing: John Rooksby & Co. presented an ethnographic study of testing in a small company. They pointed out how “rigour” in testing is constrained by dynamics of customer relationships, strategic decisions, and testing effort.
  • Software Coordination: Jim Herbsleb listed the challenges and research areas for upcoming studies in global software engineering. Among the items in his list were assessing the impacts of requirements changes (who is affected, how should they be notified?), development environment issues (utility of being aware of others’ actions vs. each developer’s privacy), and defining an architecture/organization “fit”: “Can this organization (with its particular combination of geographic distribution, skills, and structure) produce software that conforms to this architecture?”.
  • Code Mobility: As part of a retrospective on the winner of the Most Influential Paper Award from ICSE 1997, Antonio Carzaniga, Gian Pietro Picco, and Giovanni Vigna talked about the various ways of thinking about “mobile code” (code that is sent over a network to be executed by a different party), and about the impact that these various alternatives (mobile agents, remote evaluation, and others) have had over the years.

There was also a very nice banquet where we were able to try (a) archery, (b) tomahawk axe throwing, (c) Segway riding (it doesn’t feel as dorky as it looks), and (d) s’mores by the fire. Good stuff!

Categories: Academia · ICSE · Software development

ICSE 2007, Day 1

May 24, 2007

Some notes on Day 1* of the 29th International Conference on Software Engineering, from Minneapolis:

To me, the high point of the day, and quite possibly of the conference, was a panel session on “Retrospectives on Peopleware”, with Barry Boehm, Fred Brooks Jr., Tom DeMarco, Tim Lister, Linda Rising, and Ed Yourdon. I don’t know how much these names mean to you, but their work is practically the reason why I came back to grad school. Two of the few books I brought with me when I left Mexico were DeMarco and Lister’s Peopleware and Brooks’ The Mythical Man-Month. Back then I thought that’s what software engineering should be all about, and I was very disappointed when I found so many people ignoring the essence of the problem of software development (the sociology and psychology of programming) to focus on far more comfortable problems that can be “proved” with math or logic, but that have little connection to real software work.
Incidentally, the panel touched on that topic: Fred Brooks ascribed the lack of Peopleware-related papers in ICSE simply to tenure, and DeMarco said he always had difficulty getting papers into ICSE because they were “squishy”. When his papers got in, he said, it was as “experience reports”; he invited everyone with “experience report” papers to call all other papers “non-experience reports”. They also discussed why it took so long for something like the Agile movement to come up (DeMarco: “I blame Barry [Boehm]”), the need for “hard play” (creative, deeply satisfying activities) in software development, and how terms from the Peopleware book can now be used to refer to complex concepts in everyday geek language: team jellying and furniture police, among others.

It was a very entertaining session –one could easily listen to these people all day. By the way, you can find Linda Rising (“the token female on the panel”) on InfoQ discussing, of all things, primate sex and how it relates to agile development.

Other notes on papers presented today:

  • Andrea Capiluppi & Co. presented what they claim is the first evolutionary study of agile development. They investigated the code-base of a project developed using all of XP’s practices. They found that “code complexity is low, and that the relative amount of complexity control work (e.g. refactoring) is higher than in other systems we have studied”. Most of the code in the repository (70%) was test code, and 46% of the “touches” to files were devoted to decreasing complexity.
  • Jeffrey Stylos and Steven Clarke reported on a study to assess the usability of two types of constructor calls (parameterized vs. parameter-less, default constructors) in APIs. “Contrary to expectations, programmers strongly preferred and were more effective with APIs that did not require constructor parameters”. They speculate that this is because parameterized constructors force developers to commit prematurely to decisions they have not worked out yet.
  • Jayakanth Srinivasan and Kristina Lundqvist described an educational simulation game they use to teach software process models to their students. The game follows a constructivist approach, in that it enables students to discover the strengths and weaknesses of software process models for themselves.
  • Jeffrey Carver & Co. had a paper on the software development environments used by five scientific and engineering groups. Among several interesting insights, they found that these groups distrust higher-level languages, IDEs, and almost all external software, and that they value the portability and stability of older platforms and languages.
  • Ekwa Duala-Ekoko and Martin Robillard presented an approach to manage code clones instead of trying to eliminate them through refactoring. They describe “clone regions” in a manner that is independent from the exact text of the clones, and are able to track changes and simultaneously edit all clones in the “region”. Their paper won a Distinguished Paper award.
  • Another well-deserved Distinguished Paper award went to my colleagues and professors, Shiva Nejati, Mehrdad Sabetzadeh, Marsha Chechik, Steve Easterbrook, and Pamela Zave, for their work on matching and merging statechart specifications. Congratulations!

* Although this is Day 1 of the conference, its collocated events started 4 days ago. I came here to present a paper I co-authored with Neil Ernst, Jennifer Horkoff, and Steve Easterbrook, on evaluating the comprehensibility of modeling languages, as part of the Models in Software Engineering workshop. If you’re interested, you can take a look at the paper and the presentation slides.

Categories: Academia · ICSE · Software development