Naur’s “Programming as Theory Building”

A critique from Alistair Cockburn on how the agile movement is under attack from Taylorism led me to an essay by Dave West on the philosophical incompatibilities between lean and agile techniques, and this in turn led me to finally give a read to Peter Naur’s 1985 text, “Programming as Theory Building.”  (Also available in Appendix B of Cockburn’s “Agile Software Development” book, and here.) I don’t know why I had not read it earlier. Not only did I find it a brilliant example of the kind of clear argumentation that I think is missing from much software research today, I also found that it should have been a key building block of my Ph.D. thesis: for the first time since I finished it, I felt the urge to go back and tinker with it some more. Perhaps I did read it at some point, absorbed it, and forgot about it.

Naur explains what he’s after in the abstract to his paper:

(…) it is concluded that the proper, primary aim of programming is, not to produce programs, but to have the programmers build theories of the manner in which the problems at hand are solved by program execution.

The actual code that the programmers deliver is not the point of programming. That code will probably soon need to be changed again: it lives in a state of constant flux. Instead, the real goal of the members of a development team is to understand in depth the problem that they are trying to solve and the solution that they are developing to solve it. If the team builds an appropriate theory, its software will be a better fit to the context in which it will perform, and the team members will find it easier to carry out the inevitable modifications and enhancements to its software. In fact, Naur makes the primacy of the theory over the code quite stark: he claims that the code, in isolation from its developers, is dead, even though it may remain useful in some ways:

During the program life a programmer team possessing its theory remains in active control of the program, and in particular retains control over all modifications. The death of a program happens when the programmer team possessing its theory is dissolved. A dead program may continue to be used for execution in a computer and to produce useful results. The actual state of death becomes visible when demands for modifications of the program cannot be intelligently answered. Revival of a program is the rebuilding of its theory by a new programmer team.

Reading Naur’s paper I felt a very deep connection to the ideas I put forward in my thesis: Naur’s programmer’s theories are essentially mental models in the sense I (and many others before me) present them, and both he and I claim that the overarching goal of a software development organization is to build those models (or theories) during the life of the project. I could actually restate my thesis contributions as extensions to Naur’s sketch in two ways: first, I explore what I think is the main challenge that software team members face today: to build consistent mental models (or in the terms of the thesis, to develop a shared understanding) of the world, among potentially large groups of people, in the face of abundant, shifting, and tacit information, and unclear or exploratory goals. Second, I outline some attributes of team interaction that make such a challenge easier to overcome.

I was glad to see that several of my conclusions mirror Naur’s. He argues that programming methods (taken as sets of work rules for programmers that tell them what to do next) are unhelpful from a Theory Building view because we can’t really systematize theory production: like other knowledge construction endeavours, it is an organic process. Developers can, and perhaps should, have a set of techniques and tools at their disposal, but they are ultimately in charge of choosing the actions that will best help them build their theories at any given time. Naur also argues that documentation is not an appropriate mechanism to transmit knowledge in software projects, an observation that I explore when I discuss the differences between the Shared Understanding paradigm and the more prevalent paradigms in software research (which I named Process Engineering and Information Flow). He claims that since the main end result of a development effort is the inarticulated theory that the programmers have built, “the notion of the programmer as an easily replaceable component in the program production activity has to be abandoned,” an observation that I think is better received now than it was at the time (it was taken as one of the organizing principles of the agile movement), and that in my own analysis I labeled proportionality of action and responsibility.

I really enjoyed reading someone far smarter than I am presenting these arguments clearly and concisely. I only wonder, how is it that more than 25 years later we still need to be making roughly the same points—how is it that they still feel fresh, mainly uncharted, and in need of advocacy?

About Jorge Aranda

I'm currently a Postdoctoral Fellow at the SEGAL and CHISEL labs in the Department of Computer Science of the University of Victoria.
This entry was posted in Academia, Conceptual Models, Philosophy, Software development.

17 Responses to Naur’s “Programming as Theory Building”

  1. Kai Stapel says:

    Wow, interesting to keep finding more and more people sharing my understanding of what software development really is.

    I share your conclusion. Why is it that software engineers do not have a shared understanding of software development being mainly an activity of developing shared understanding? 😉

    You might also be interested in Phil Armour’s view of software development as mainly a knowledge acquisition task. You can read more about his theory in his book “The Laws of Software Process: A New Model for the Production and Management of Software”.

    Best wishes
    Kai

  2. Neil says:

    Just reading this paper for my thesis too … quick question: what is the evidence that theory building cannot be systematized? Isn’t that exactly what pedagogy tries to establish? Now, people may have different learning styles, but teaching, say, long division is about theory building, and there are well-documented ways to induce that.

    • Jorge Aranda says:

      I think the argument is that programming is building theories about new or different domains than previously encountered—otherwise you don’t program, you reuse software. If plenty of development teams encountered the same problems as kids do when learning long division, we could identify the obstacles and systematize ways to address them, but they face problems far more complex than that.

  3. Neil says:

    Presumably design patterns or architectural styles represent this type of “theory re-use” then.

    Any thoughts on why it has had so little impact?

    • Jorge Aranda says:

      The paper? Probably because it deals with constructs that we don’t yet know very well how to study systematically, and probably because it argues against the mainstream view of software development as a manufacturing process.

  4. I may be criticizing thinking in 1985 more cogently than thinking about software development now, but:

    In so much of what I read, “the program” is referenced, where this seems to mean something close to “the source code” as a stored object (I’m exaggerating only a little if you take a look at others’ precise wording). Taken literally this suggests that the bottom line reality being referenced and which we are to create theories about is that stored code – you might think this was an object that didn’t move or become active in any way in between modifications of the code itself.

    Maybe I’m just complaining about sloppy talk, but IMHO that’s usually *not* the correct referent, meaning the most useful referent. Far more often what’s in question – or ought to be – is “the program” meaning instead the “program-in-execution” or “the program-as-an-executing-entity”. That is, *what the program is actually doing* (and to the extent we can predict, would do). Obviously it’s what is happening when the program is active that matters, and this is what most obviously needs to be theorized about (I hope it is not too tedious to say). But, systematically, we never seem to talk that way, and I think that’s kept us from developing some of the most useful and practical tools to aid software development.

    The best way to correct our notions about what the program could do is to first correct our notions about what the program-in-execution is in fact doing. But thinking of programs as code heaps tends to prevent making this a priority.

    I expect a common response to what I’ve just written about confusing two meanings of the word program will be “oh c’mon, we mean *both* when we say program.” But from my point of view that isn’t a solution, that is the problem – that we conflate the two meanings; and I happen to think that this has been a very costly conflation, keeping back systematic methodological improvements in software development for decades. (I’m an old guy.)

    • Jorge Aranda says:

      This is a good point, but I take it one step further: more than analyzing what a program is doing, we should analyze how it takes part in, and affects, the socio-technical environment in which it is executed.

  5. Both are important – but how do you gauge its future effect on the socio-technical environment if you don’t know it has some loose pointers that are busily invalidating everyone’s theory of how it’s working? Plus think of all the arguments and speculation that can be cut short by better observations, sooner, of what’s actually taking place – making for a lot of roads that don’t need to be traveled by the whole team.

    If we’re going to have big projects, and we are, we do need to figure out how to work in teams, and how to work in teams well. I agree. Maybe one day, even to really scale well!

    But from my narrow perspective, it’s discouraging that we aren’t anywhere near where we should be merely in providing tools and fast feedback even to the individual programmer, especially when the program-in-execution is violating his most explicit expectations.

    In my experience, one vague comment from a programmer (which he doesn’t yet know is vague) easily multiplies on its way through a whole team, as if we were all in a dysfunctional children’s game of “Telegraph”, too often with truly ugly results. Part – I admit only part – of what makes working in teams difficult is that the individual programmers so often don’t have just ONE clear point of view themselves. I am hopeful that better tools observing execution and automatically comparing it to expectations can help there. (Although I may be behind the times here, and of course, may be too optimistic).

  6. I might also have taken a slightly different tack. I hope I’m not merely defending a very eccentric way of writing code here, but here goes:

    I programmed best (in C) when I coded massive numbers of ASSERTions amongst the code; one for every single expectation I had about every wee part of the program. Including but not limited to a filter of assertions at the start of every function listing every expectation I had about the input variables, and another such filter for the output variables just before the function ended. I didn’t just want to discover bad behavior, I wanted to find out ASAP, whenever some detail of my theory didn’t match what was actually happening in the program-in-execution. (We could do so much better now, and maybe we do for all I know, but even this crude tool was remarkably helpful.)
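    A minimal C sketch of the assertion-filter style described above might look like this (the function and all names are invented for illustration; the point is the entry filter of assertions about the inputs and the exit filter about the outputs):

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical example of the style described above: assert every
     * expectation at function entry and exit, so any mismatch between the
     * theory and the program-in-execution surfaces as soon as possible. */
    static size_t checked_copy(char *dst, size_t dst_cap, const char *src)
    {
        /* Entry filter: every expectation about the input variables. */
        assert(dst != NULL);
        assert(src != NULL);
        assert(dst_cap > strlen(src)); /* room for the terminator too */

        size_t n = strlen(src);
        memcpy(dst, src, n + 1);

        /* Exit filter: every expectation about the output variables. */
        assert(dst[n] == '\0');
        assert(strcmp(dst, src) == 0);
        return n;
    }
    ```

    The same discipline would also have caught the buffer overflows the commenter mentions: the entry assertion on `dst_cap` fires at the call site, not several frames later.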

    This meant that two-thirds of my time was spent merely debugging assertions. Other programmers thought I was nuts, and when I finished ahead of time, with far fewer errors, particularly logic errors, they didn’t change that assumption; they just knew that I must have had some other special sort of pixie dust, or had just lied about the time put in. But it turned out that no moment spent revising my theory of how the program was – that is, how it was performing in execution – was wasted. I really *did* need to know, in great and accurate detail, what was up inside the metal box. As a bonus, buffer overflows were caught quickly, and didn’t have to be traced back very far. Not a small point in C of course. As a further bonus, I could dare to be more ambitious, knowing there was a safety net under me that would let me know when I was in fact being too ambitious.

    If every programmer in a team were experiencing the same constant check on their expectations and their understanding of the program, surely that just has to help the team, and the communication within the team.

    I’m sure you would agree that GIGO goes for teams too. The environment, and the team, can’t competently handle, or even understand, what they aren’t yet aware of. One programmer unknowingly but systematically providing false information (or assumptions labeled as facts) can obviously bring a team to its knees, backtracking through every assumption and every bit of apparent “knowledge” and theory, looking for the one loose brick, as it were. Which is to say that having teams means that it’s even more important that every single team member have great intellectual integrity: integrity enforced by tools that are automatically (and constantly) measuring his or her expectations against the actual execution. This is perhaps not a panacea, but I do think it’s a necessary condition for large teams to be functional, project after project.

    The power of mechanically and repeatedly checking all expectations to the extent possible is obviously highly synchronous and proximate. Why find out your model or theory is wrong next month when you can find out today, without leaving your seat? I think it fulfills proportionality by tending to keep problems from spreading beyond one programmer. Maturity: well, it makes it far more likely that most of the unfamiliar gets flagged early and pinched off early – which should keep team discussions on more familiar, and mutually important, ground.

    Much of my knowledge of programming is out of date. I just wish any experience with the difficulty of successfully coding large projects, or the difficulties that teams of programmers encounter, were as much out of date. I suspect from what I read, that it’s not.

    Theories are as good as our ability to test them moment-to-moment, in my experience, and I have trouble understanding why more effort hasn’t gone into creating tools that make execution more transparent to the programmer, and providing automated checks of programmers’ detailed expectations.

  7. Pingback: My new gig at UBC « Semantic Werks

  8. John Keklak says:

    Your effort to promote Naur’s idea of “programming as theory building” is of great service. Even after decades of software development, it seems there is still little understanding of what programming really involves. This apparent lack of understanding is evidenced in the parade of fads that profess that developing software can be turned into a systematized process that produces optimal results if only its prescription is followed. It seems much work remains, given that one of the comments to your post contains the following: “…quick question: what is the evidence that theory building cannot be systematized?”.

    The main point I’d like to address here is Naur’s notion of “theory”. You write,

    “If the team builds an appropriate theory, its software will be a better fit to the context in which it will perform, and the team members will find it easier to carry out the inevitable modifications and enhancements to its software.”

    Naur bases his concept of “programming as theory building” on Ryle’s definition:

    “…a person who has or possesses a theory in this sense knows how to do certain things and in addition can support the actual doing with explanations, justifications, and answers to queries, about the activity of concern.” [Naur, 1985]

    The term “theory”, as used by Naur (and originally by Ryle), refers not to something programmers try to find or to optimize, but rather to knowledge of decisions that accumulates as programs are written. Nothing requires these decisions to be particularly “good”. By Naur’s definition, code is “dead” when this knowledge is lost. [Naur, 1985]

    Once made, each decision comprises a bit of knowledge in the possession of the programmer who made the decision, and perhaps in the possession of others with whom the programmer has shared this information. (In the vast majority of cases, however, this information is known solely to the single programmer who created it.) This knowledge goes by a number of terms, such as “the thinking behind the code”, the “concepts underlying the code”, the “mental model”, and Naur’s/Ryle’s “theory”.

    Thus “theory” in Naur’s and Ryle’s sense is largely a kind of collected folklore, not rigorous mathematical constructs or carefully-considered designs.

    Source code is typically rife with incomprehensible statements based in such folklore. For instance, a line of code may append a space character to a string for seemingly inexplicable reasons. The thinking behind this code might be, “I’ll append a space character to this string here, so the code below — which assumes this string always has one or more characters — does not crash.” Perhaps this “hack” is rationalized with, “The space character will not cause any problems, since code later on strips off all white space from the string before it is used for further processing.”
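    A hypothetical C sketch of the kind of “hack” described above (all names invented for illustration):

    ```c
    #include <string.h>

    /* Hypothetical illustration of the folklore 'hack' described above:
     * append a space so that later code, which assumes the string is
     * non-empty, does not crash. Rationalized because a later pass strips
     * trailing whitespace anyway -- reasoning that lives only in the head
     * of the programmer who wrote it, unless it is written down. */
    static void pad_if_empty(char *buf, size_t cap)
    {
        if (buf[0] == '\0' && cap >= 2)
            strcpy(buf, " "); /* guarantees strlen(buf) >= 1 downstream */
    }
    ```

    Without the comment, a reader has no way to recover the theory behind that space character; the code works, but the knowledge is one resignation away from being lost.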

    As easy as it may be to criticize such choices with a “holier than thou” attitude, it is necessary to admit that such reasoning occurs routinely. Moreover, it is clear that keeping a large software development organization informed of all such nuggets of reasoning is a Herculean task (while it is much easier to proliferate knowledge about rigorous mathematical constructs and carefully-considered designs).

    The Naur/Ryle definition of “theory” gives us an insight that might lead us to improve programming practices somewhat. For instance, we might pose the question: “How might we reduce the frequency with which ‘hacks’ occur?” Two remedies come to mind: (a) code reviews with the specific goal of identifying questionable decisions, and (b) training for programmers to use more professional judgment in decision-making.

    It seems such measures will lead to cleaner “theory”, and thus will reduce the frequency of anguish felt by programmers who encounter code based on arcane reasoning. They will also reduce the volume of obscure “facts” underlying code that need to be remembered and communicated. Although these measures are unlikely to substantially increase the speed of software development (they are no “silver bullet”, as it were), they seem worthwhile nonetheless.

    [Naur, 1985] Naur, P. Programming as Theory Building, 1985, reprinted in Computing: A Human Activity: Selected Writings from 1951 to 1990, ACM Press/Addison-Wesley, New York, 1992.

  9. Perhaps the following two links may be of interest to readers of your post on Naur:

    + A short bio of Peter Naur on the ACM site of Turing Award winners: http://amturing.acm.org/award_winners/naur_1024454.cfm

    + An extensive conversation with Naur in 2011:
    http://www.lonelyscholar.com/node/7

    best wishes,
    Edgar

  10. Pingback: Evidence in Software Engineering | Semantic Werks

  11. Anyone who is serious about constructing a theory for a problem domain should use the literate programming paradigm.

  12. marvap says:

    Yes. That’s the point. Thanks. Today’s theories of “better programming”, like TDD or the SOLID principles, aim at the coding itself, overlooking the fact that it is not the code that matters, but the idea behind it.
