Representing information means mapping it into a particular medium: focusing on certain elements of the original data, ignoring the irrelevant ones, and, ideally, making the information easier to understand and use. Unfortunately, the resulting information ‘maps’ are sometimes inappropriate: they may be ambiguous, unintuitive, or downright misleading. To illustrate what I mean, here are some examples of good and bad mappings:
Numbering systems: Our Arabic numeral system is tremendously effective. It’s not just that there are only ten digits to learn. Assessing magnitudes is easy: a glance at a number tells us whether we’re talking about a small (3) or a large (320150297) quantity. And since we have ten fingers on our hands, it’s natural for us to count in base 10.
In comparison, Roman numerals suck (is MXLI less than CCCXLXXI?). They’re only really useful for sounding pretentious, as in my title above. And our standard scientific notation, although it uses Arabic numerals, can be very misleading to novices: 3.0×10^1 is humongous when compared to 3.0×10^-10, yet the notation makes it difficult to grasp the difference in magnitude.
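To make the comparison concrete, here is a minimal Python sketch. The function names are made up for illustration, and the Roman numeral reader is deliberately lenient: it subtracts any symbol that is smaller than its right-hand neighbour and adds everything else. That leniency is an assumption, needed because CCCXLXXI is not a well-formed numeral under the usual rules.

```python
from decimal import Decimal

# Values of the seven Roman numeral symbols.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Lenient reading (an assumption): a symbol smaller than its
    right-hand neighbour is subtracted, every other symbol is added.
    No check is made that the numeral is well formed."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        next_value = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        total += -value if value < next_value else value
    return total


def orders_of_magnitude_apart(a: str, b: str) -> int:
    """Difference between the positions of the leading digits of a and b,
    i.e. how many powers of ten separate the two quantities."""
    return Decimal(a).adjusted() - Decimal(b).adjusted()


print(roman_to_int("MXLI"), roman_to_int("CCCXLXXI"))
print(orders_of_magnitude_apart("3.0E1", "3.0E-10"))
```

Under this reading MXLI comes out as 1041 and CCCXLXXI as 361, so the shorter numeral is the larger number, which is exactly what a glance at an Arabic numeral would never get wrong. The two quantities in scientific notation are eleven orders of magnitude apart.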
Spatial, n-dimensional information: It’s much easier to represent two-dimensional information, such as the Physics problem below, in diagrammatic form rather than as sentences.
The equivalent textual problem statement would have to be several lines long (“There are two masses hanging on a frictionless pulley located at the top of a…”), and it can’t convey the simplicity of the picture. The diagram gives an instantaneous overview of the problem’s information that is perhaps impossible to beat with text. However, things start to break down in three dimensions. Is this structure
representing the left or the right cube below?
For complex three-dimensional structures, a two-dimensional representation is rarely satisfactory. Incidentally, this three-dimensional ambiguity is the central motif in many optical illusions and Escher drawings such as this one, Belvedere:
Geographical maps are effective at conveying spatial information because, even though they are a two-dimensional representation of a three-dimensional world, data on the height of landmarks and streets is not normally necessary. But we sometimes require three-dimensional information to make sense of our environment, and maps won’t capture this effectively. For example, in Toronto, many people use the CN Tower as a compass of sorts (“the tower is to my right, so I’m facing East”), yet street maps don’t give it the relevance we do.
Can we represent more than three dimensions in a two-dimensional diagram? Not really. We can try, using colours, animations, and so on, but the results are never entirely satisfactory, even for a small number (four, five) of dimensions. Mapping high-dimensional structures onto a two-dimensional diagram is, for all practical purposes, impossible.
Musical notation systems: We have a notation system for musical compositions that is quite effective for Western music. Since Western music is based on a 12-tone scale, with relatively rigid rhythms, the notation does a pretty good job of recording this information. Unfortunately, it’s useless for describing some kinds of more flexible, traditional Asian and African music. People have attempted to represent these in other systems, with varying degrees of success. For some types of music, repetition and imitation are still the most successful strategies for passing on musical knowledge.
Phonetic notations: One of my early embarrassments when learning English was finding out that recipe is not pronounced like recite. And I was already living in Toronto while still pronouncing wheat with the same ending as sweat, not sweet (so asking, naturally, for wheat bread always led to puzzled looks). Written English is terrible as a phonetic notation. Spanish is much better, but it still doesn’t help me when I try to pronounce non-Spanish sounds, such as those in even the most basic Polish (czesc!) or Chinese (xie xie) words.
In comparison, the International Phonetic Alphabet is precise and comprehensive. Once you learn it (not an easy task!) you can use it to find out, exactly, how to pronounce words in practically any language.
I write about all of these examples because in my own field, software engineering, there is a strong community convinced that models and diagrams are the best way to represent software constructs. We have modelling languages to represent almost any software-related concept you can think of: objects, scenarios, states, classes, goals, beliefs, design rationales, threats, risks, you name it. What we don’t have is any real indication that our diagrams map satisfactorily to constructs in the world.
The problem is that we’re dealing with concepts that are very abstract and very difficult to represent, not with the two-dimensional structures of high-school physics problems. Where did we get the idea that use-case diagrams are an appropriate high-level mapping of human-computer system interactions? What leads us to believe that goal analysis diagrams are an accurate depiction of the real goals of stakeholders? Is a sequence diagram really better than pseudocode for representing the logic of a scenario? Simply put, there is no convincing evidence justifying any of these beliefs.
(Ontological analyses help us address these issues by pointing out, for instance, that a weak point of entity-relationship diagrams is their difficulty in expressing entities with fuzzy boundaries and non-entities (such as fluids, thoughts and intentions). But ontological analyses do not answer the question of whether a representation appropriately conveys to other humans what it should convey: entities, in the case of entity-relationship diagrams.)
I am not claiming that software engineering diagrams are inadequate mappings to the real world. I’m claiming that we don’t know, that we should know (considering we’re using these diagrams as communication artifacts), and that our community is giving too little thought to these matters.