Catenary

Entries from October 2006

Donald Duck gets involved in Mexican riot

October 30, 2006 · 1 Comment

I’ve been anxiously following the riots in Oaxaca, and the latest takeover of its main plaza by the federal police. I’ve been hoping for a clean way out of the conflict, though I know the chances are pretty low.

Anyway, just found this picture of some protesters in Mexico City:

Machete protester

Must be a bit hard for a tough machete-wielding frontman to get some respect while wearing a cute Donald Duck t-shirt, but I guess he’s just got a lot of self-esteem.

(Photo from reforma.com; site in Spanish.)

Categories: Off Topic

Because they can – Democamp Toronto 10

October 24, 2006 · 1 Comment

MaRS entrance

If there was a theme in yesterday’s Democamp 10, it was people building stuff just because they could. Most demos were of software that doesn’t really fulfill a need, but that its developers felt would be cool to build.

The first demo was an exception to this, and one of the high points of the night. Sana Tapal and Andrey Petrov, from the University of Toronto, presented their Online Marking tool, which allows Profs and Teaching Assistants to mark students’ code through a browser. You can highlight a piece of code and add comments to it, which really has more applications than just assignment grading – code reviews, for instance, could really benefit from this. Jason Doucette apparently liked it too.

We then heard the people from 3 terra present Quotiki.com, a website to find famous quotes, select your favourites, and vote for them a la digg.com. Nicely done. Part of the functionality they built is frankly strange for a quotes website: you can subscribe to other people’s favourite quotes RSS channel and see what pearls of wisdom they’ve been thinking about. Why someone would want to do this is beyond me, but if this is just what you always dreamed of using, the website is already fully functional and you can give it a go.

Unfortunately, the third presenter, from Broken Tomb (“the world’s first commercial Smalltalk host”), had plenty of problems connecting to his server, and had to face the uphill battle of convincing 100+ people, with no real working demo and no slides, that they can use Smalltalk with Seaside for web development just as easily as they can use Turbogears or Rails. I did not envy him.

Jonathan Lung, again from U of T, was next, presenting PBJ, some sort of JavaScript framework he built as a side project over a couple of weekends. He pulled off something I wouldn’t dream of doing: programming live in front of all of us, while presenting. Very cool performance, but other than the demo he’s not promoting or polishing his framework anymore; he seems to have done it for the sake of it.

Finally, the naturally-caffeinated Sacha Chua dazzled everyone with her Livin’ la Vida Emacs demo. She’s a great speaker and it was nice to see the fireworks of her using Emacs (a very extensible text editor) as a therapist, a text-to-voice converter, a powerpoint-substitute, a life organizer, and a Towers of Hanoi solver. Why would anyone go through the trouble of extending Emacs to do all of this? I don’t know – because they can, perhaps. When you have a hammer everything looks like a nail, but if you use the hammer as well as Sacha, that probably doesn’t matter so much.

Categories: democamp · torcamp

Cheap shots at the Gartner Hype Curve

October 22, 2006 · 8 Comments

The Gartner Hype curve, or Hype cycle, summarizes the visibility and the maturity of currently hot technologies and forecasts the productivity they will have. At both of the workshops in CASCON that I went to, presenters showed us the most current curve, pointing out that web 2.0 is currently at the “peak of inflated expectations”. They claim we should expect it to descend to the “trough of disillusionment”, only to see it triumph in its recovery through the “slope of enlightenment” and, ultimately, the “plateau of productivity” (click on the image for a better view).

Gartner Hype Curve for 2006
Apparently, this curve is the distillation of thousands of hours of work of expert forecasters and technologists. This is expensive work – the Gartner Group charges US$495 for a 16-page document that helps you understand it. However, the underlying idea is pretty simple: things get hot before they mature, and it’s only after a technology goes through a period of disappointment that we truly learn how to apply it. It’s an intuitive concept, but I found the curve fallacious and untrustworthy for two reasons:

Irrational optimism: The curve tells you that, no matter how wacky your technology is, and how unachievable its goals, after it fails to live up to its hype things are gonna get better, always! You’ll see the light at the end of the bad-press tunnel.

I find this happy ending scenario very implausible, partly because some proposed technologies do simply crash without recovering, and partly because forecasters have mistaken their job for that of cheerleaders in the past. The late Otto Eckstein, from Data Resources, once told the Wall Street Journal that “Data Resources is the most influential forecasting firm in the country… If it were in the hands of a doom-and-gloomer, it would be bad for the country.”

Disappearing acts: If you compare the curve from 2005 (below, click for better view) with the most recent one from 2006, you’ll see a number of technologies that have simply fallen off the radar. SOA is gone. Videoconferencing is gone. Podcasting is gone. Are they past the plateau? Are they not worth a mention? Other technologies appear in 2006 out of nowhere, such as the Smartphone, which is already safe in the “slope of enlightenment” seemingly without any hype.

Gartner Hype Curve for 2005
I wish someone had been keeping score of the effectiveness of Gartner’s predictions so we could tell how skeptical we should be of their most recent curves. Anyway, the hype curve inspired me to create my own, and I’d now like to offer to the public domain something I call the “Aranda Ignominy curve”, which elegantly conveys my deep wisdom and predictive powers. A 2-page document explaining how to read it is available at the discount introductory rate of US$995. Let me know if you’re interested.

The Aranda Ignominy Curve
Aranda Ignominy Curve

Categories: Hype · Strategic Forecasting

CSER and CASCON

October 19, 2006 · 4 Comments

This past Sunday and Monday I went to a meeting of the Consortium for Software Engineering Research (CSER). Popular topics there were empirical software engineering, research ethics, diagnostics, and models and visualization. There were a couple of talks from Peggy Storey and Ian Bull, from the University of Victoria’s Chisel Group, which has built some very cool tools for information visualization that I’ll talk about later. From our own group, Greg Wilson presented Dr Project as a tool to manage undergraduate software teams, which sparked an interesting discussion on student data collection for research. Aaand I presented one of the 22 student posters that graced the reception.

The rest of my week was for CASCON, a free conference put together by the Center for Advanced Studies at IBM. There I went to two social computing workshops – for me, the highlight of the first was Joey deVilla’s refreshing no-powerpoint, democamp-style speech on “Failures 2.0” (“go out there and flop!”, he yelled); and I very much enjoyed the second workshop’s panel discussion between web 2.0 evangelists and some healthily skeptical audience members. The downside, however, is that I’m still numb from drinking so much web 2.0 kool-aid. My delicate brain must have heard the term (and its crazy Enterprise 2.0, Office 2.0, and Collaboration 2.0 variants) literally hundreds of times. Yuk.

Categories: General · Information visualization · Software development

Ben Shneiderman on Creativity and Visualization

October 13, 2006 · 2 Comments

Ben Shneiderman, a professor at the University of Maryland’s Human-Computer Interaction Lab and author of Leonardo’s Laptop, gave a talk at Ryerson University yesterday and at the University of Toronto today, on two different topics:

At Ryerson he talked about creativity support tools. I was a bit frustrated by his approach to the topic: the creativity segment of the talk was inconclusive, and although Shneiderman stressed the importance of careful empirical case studies to evaluate tools (with which I fully agree), he never really described those studies or the way his team addressed the challenges of doing good empirical work in the area.

Treemap

The talk at the University of Toronto, on visualization of high dimensional information (which I talked about last week), was more satisfying. The visualization tools Shneiderman presented (mostly the same ones as at Ryerson, but with more time, or so I felt) are really, really appealing – very polished in comparison to most academic prototypes. I’ll cover them in future posts, but if you want a glimpse, you can check the already famous Treemaps (with applications in the stock market) and the Hierarchical Clustering Explorer, which is way more impressive than this link suggests.

The demos were fun, but since they took most of the talk’s time there was not a lot of room to discuss the principles behind them. Shneiderman didn’t really explain the theory that supports the tools, the criteria used to evaluate their effectiveness, or the limits of these approaches. I actually got the feeling that the tools were built at least partially on a hunch, and that some of them have no articulated information visualization theory behind them, but I might be wrong. I’ll expand on this when I learn more.

Categories: External cognition · Information visualization

Tufte’s “The Visual Display of Quantitative Information”

October 12, 2006 · 2 Comments

Tufte Visual Display cover

It feels a bit strange to discuss a book that you probably have heard about already if you cared about the topic, and if you haven’t heard of it, you might not care anyway. Still, I did not know about Edward Tufte until I started grad school, when I read a couple of his books and his short demolition of PowerPoint, and I imagine there’s some use in spreading the word a bit more.

Some months ago, Tufte published his new book, “Beautiful Evidence”. He also offered a package of his four books for sale at a slight discount, and I couldn’t resist the temptation: I went ahead and bought the bundle. I’ll be talking about the books as I read them.

Minard Napoleon campaign

“The Visual Display of Quantitative Information”, which deals precisely with what its title suggests, has a huge reputation behind it. It’s been labelled “a visual Strunk and White” and a “timeless classic”. With some caveats, I think the praise is well deserved. Tufte has an unrelenting clinical eye for quality in graphics, and throughout the book he shares his judgments with the reader, explaining what it is that makes Minard’s graphic of Napoleon’s Russian campaign so effective, and what makes so many of our newspapers’ graphics so lame. His guided tour through the history of displays of quantitative information is entertaining and tremendously insightful.

I’ve heard people accuse Tufte of vagueness in his advice, of basically arguing that to create appealing displays you need to (a) become a genius like himself, and (b) then everything else will follow. That might be a fair criticism of other books of his (I’ll get there eventually), but not of this one. Here you get advice on how to choose, for example, between tables and charts, and guidelines such as “Show data variation, not design variation” and “The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data”, which, once understood, could easily be applied by us mortals.

I’m actually worried about the opposite: Tufte here has a set of principles that he follows to their ultimate consequences, and I’m not sure this is always advisable. I’m particularly thinking of his “erase non-data ink” principle. Essentially, the advice boils down to “eliminate everything in a graphic that does not convey data”; and for most graphics that’s desperately needed advice. But it seems you can go too far in this direction too. For example, Tufte transforms “range bars”, which show the four quartiles of a distribution, emphasizing the middle two and the median:

Range bar

into “quartile plots”:

Quartile plot

that eliminate non-data ink, but also make it harder to locate the median and to perceive the bulk at the middle of the distribution. Wasted ink should not be our main concern – honesty and wasted human attention should be, and we address those by maximizing comprehensibility, not necessarily by eliminating non-data ink.
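Both the range bar and the quartile plot encode the same five numbers of a distribution; they only differ in how much ink they spend on them. As a rough illustration (not from Tufte’s book; the function name `five_number_summary` and the sample data are made up for this sketch), those five numbers could be computed in Python like this, assuming Python 3.8 or later for `statistics.quantiles`:

```python
from statistics import quantiles


def five_number_summary(values):
    """Return the five values that a range bar or quartile plot draws.

    Hypothetical helper for illustration only. The "inclusive" method
    treats the data as the whole population, so the extremes are the
    0th and 100th percentiles.
    """
    data = sorted(values)
    # n=4 asks for the three cut points that split the data into quartiles.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],      # bottom of the lower whisker
        "q1": q1,            # lower edge of the emphasized middle
        "median": median,    # the mark Tufte's quartile plot de-emphasizes
        "q3": q3,            # upper edge of the emphasized middle
        "max": data[-1],     # top of the upper whisker
    }


summary = five_number_summary([9, 1, 8, 2, 7, 3, 6, 4, 5])
print(summary)
```

Whichever plot style is used, the question in the paragraph above is how easily a reader can recover the median and the middle half (q1 to q3) from the drawing.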

The book feels slightly dated for one reason. In these past few years, the capabilities of the printing and display devices available to the general public have increased dramatically. We can now use thinner lines and subtler colours to create graphics that, only a decade ago, required professional equipment, blunter lines, and sight-damaging primary colours. Whether we do use the capabilities that we now have available, however, is up to us.

It is actually disheartening, considering the exposure this book has had over the years, to see how little our commercial packages have progressed in following Tufte’s guidelines. I have a personal grudge against Microsoft Excel in this regard. Excel must be the most powerful enabler of graphic disasters in the world. Most people don’t have the time, or the dedication, or the skill, to improve Excel’s default graphic settings. So even the simplest data tables become hideous charts with a couple of clicks:

Bar Chart

And if we’re feeling creative, Excel kindly allows even more terrible beasts with the same number of clicks: 3-D bar charts! 3-D pie-charts!

Bar Chart 3D

Pie Chart

Tufte rightly calls these “chartjunk”, and we are being fed this junk constantly. If you want to help stop this, you should definitely check out this book and follow his advice. The price tag is slightly high, but it’s worth it.

Categories: Books

Manufactured Landscapes – Beautiful nightmares

October 6, 2006 · 3 Comments

Burtynsky - Three Gorges

Manufactured Landscapes is a documentary film that follows photographer Ed Burtynsky at work, while he shoots fascinating pictures of the underbelly of our beast: industrial waste, gigantic manufacturing plants, oil wells, shipyards and mines. Most of the pictures concern China’s extraordinary recent growth and its ecological and social consequences.

Burtynsky doesn’t attempt to polemicize. He claims he just portrays our interactions with the rest of nature as they are, with the same detachment with which we would observe a beaver dam or a bird’s nest. It’s the viewer’s role to judge them. And I must say I had a very hard time processing his images: on one hand, I can’t help but think how absolutely gorgeous they are; on the other hand, it is depressing and horrifying to see this is who we are and this is the footprint we, as humans, leave on the planet.

The film premiered at the Toronto International Film Festival, and it is now showing at several theatres in Toronto, at least during this week. It was well received, so I guess the chances of it running in other places as well are pretty high.

Ed Burtynsky - Plant

Categories: Off Topic

Fun with representations V – Maps of the abstract world

October 4, 2006 · 10 Comments

Representing information means mapping it into a particular medium –focusing on certain elements of the original data, ignoring the irrelevant ones, and, ideally, simplifying the process of understanding and using it. Unfortunately, our resulting information ‘maps’ are sometimes inappropriate: they may be ambiguous, unintuitive, or downright misleading. To illustrate what I mean, here are some examples of good and bad mappings:

Numbering systems: Our Arabic number system is tremendously effective. It’s not just that there are only ten digits to learn. Assessing magnitudes is easy: a simple glance at a number can tell us if we’re talking about a small (3) or a large (320150297) quantity. And since we have ten fingers on our hands, it’s natural for us to count in base 10.

In comparison, Roman numerals suck (is MXLI less than CCCXLXXI?). They’re only really useful to sound pretentious, as in my title above. And our standard scientific notation, although it uses Arabic numerals, can be very misleading to novices: 3.0×10^1 is humongous when compared to 3.0×10^-10, yet it’s difficult to conceptualize the difference in magnitudes.
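To put numbers on these two comparisons, here is a minimal Python sketch (not part of the original argument; the names `roman_to_int` and `orders_of_magnitude` are made up for illustration). It reads a Roman numeral with the usual left-to-right rule, subtracting a symbol when it precedes a larger one, and it measures the gap between two values in powers of ten. Note that CCCXLXXI is not a well-formed numeral, so the value this simple parser assigns to it is only one plausible reading.

```python
import math

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Sum symbol values, subtracting any symbol that precedes a larger one.

    Illustrative only: it does not validate that the numeral is well formed.
    """
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        # Look one symbol ahead; past the end, treat the next value as 0.
        next_value = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        total += -value if value < next_value else value
    return total


def orders_of_magnitude(a, b):
    """How many powers of ten separate two positive numbers."""
    return math.log10(a / b)


print(roman_to_int("MXLI"))      # 1041
print(roman_to_int("CCCXLXXI"))  # 361 under this lenient reading
print(round(orders_of_magnitude(3.0e1, 3.0e-10)))  # 11
```

The Roman comparison needs a parser to settle, while the Arabic equivalents (1041 versus 361) can be compared at a glance. The scientific-notation pair differs by eleven orders of magnitude, which the two near-identical strings do little to convey.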

Spatial, n-dimensional information: It’s much easier to represent two-dimensional information, such as the Physics problem below, in diagrammatic form rather than as sentences.
Pulley problem
The equivalent textual problem statement would have to be several lines long (“There are two masses hanging on a frictionless pulley located at the top of a…”), and it can’t convey the simplicity of the picture. The diagram gives an instantaneous overview of the problem’s information that is perhaps impossible to beat with text. However, things start to get messy in three dimensions. Is this structure
Cube
representing the left or the right cube below?
Cube 3 Cube 2

For complex three-dimensional structures, a two-dimensional representation is rarely satisfying. Incidentally, this three-dimensional ambiguity is the central motif in many optical illusions and Escher drawings such as this one, Belvedere:
M.C. Escher - Belvedere
Geographical maps are effective at conveying spatial information because, even though they are a two-dimensional representation of a three-dimensional world, data on the height of landmarks and streets is not normally necessary. But we sometimes require three-dimensional information to make sense of our environment, and maps won’t capture this effectively. For example, in Toronto, many people use the CN Tower as a compass of sorts (“the tower is to my right, so I’m facing East”), yet street maps don’t give it the relevance we do.

Can we represent more than three dimensions in a two-dimensional diagram? Not really –we can try, but the results (by using colours, animations, etc.) are never entirely satisfactory, even for a small number (four, five) of dimensions. Mapping structures with many dimensions onto a two-dimensional diagram is impossible for all practical purposes.

Clef

Musical notation systems: We have a notation system to represent musical compositions that is quite effective for Western music. Since Western music is based on a 12-tone scale, with relatively rigid rhythms, the notation does a pretty good job of recording this information. Unfortunately, it’s useless to describe some kinds of more flexible, traditional Asian and African music. People have attempted to represent these in other systems, with varying degrees of success. For some types of music, repetition and imitation are still the most successful strategies to pass on musical knowledge.

Phonetic notations: One of my early embarrassments when learning English was finding out that recipe is not pronounced like recite. And I was already in Toronto while still pronouncing wheat with the same ending as sweat, not as sweet (so asking, naturally, for wheat bread always led to puzzled looks). Written English is terrible as a phonetic notation. Spanish is much better, but it still doesn’t help me when I try to pronounce non-Spanish sounds, such as those in even the most basic Polish (czesc!) or Chinese (xie xie) words.

In comparison, the International Phonetic Alphabet is precise and comprehensive. Once you learn it (not an easy task!) you can use it to find out, exactly, how to pronounce words in practically any language.

I write about all of these examples because in my very own Software Engineering field, there is a strong community convinced that models and diagrams are the best way to represent software constructs. We have modelling languages to represent almost any software-related concept you can think of: objects, scenarios, states, classes, goals, beliefs, design rationales, threats, risks, you name it. What we don’t have is any real indication that our diagrams map satisfactorily to constructs in the world.

The problem is that we’re dealing with very abstract, very difficult to represent concepts, not with the two-dimensional structures of High School Physics problems. Where did we get the idea that use-case diagrams are an appropriate high-level mapping of human-computer system interactions? What leads us to believe that goal analysis diagrams are an accurate depiction of the real goals of stakeholders? Is a sequence diagram really better than pseudocode to represent the logic of a scenario? Simply put, there is no convincing evidence justifying any of these beliefs.

(Ontological analyses help us address these issues, by pointing out, for instance, that a weak point of entity-relationship diagrams is their difficulty in expressing entities with fuzzy boundaries and non-entities (such as fluids, thoughts and intentions). But ontological analyses do not answer the question of whether a representation appropriately conveys what it should convey to other humans –such as entities in the entity-relationship diagram case.)

I am not claiming that software engineering diagrams are inadequate mappings to the real world. I’m claiming we don’t know, that –considering we’re using these diagrams as communication artifacts– we should know, and that we’re giving too little thought to these matters in our community.

Categories: Cognition · External cognition · Software development · XCog