You’re probably familiar with Wordle. It’s a neat application that picks up the most common words in a text and arranges them in a pretty word cloud. As a toy, it’s quite fun.
But there’s an idea seeping in among the blogs I read, suggesting that Wordle clouds are actually useful as a communication tool. For instance, here is a flyer produced for a talk by Deepak Singh. He liked it a lot:
They used my delicious feed to build a tag cloud using Wordle. The results are wonderful and give you a better insight into things that I get interested about that I could have told you in 5 minutes myself.
Michael Nielsen liked it too, wondering “if only we could replace all speaker bios this way…”
I, however, thought it shows much of what is wrong with our ultra-caffeinated Web 2.0. Looking at the flyer, I can guess that Deepak is interested in a number of topics: the semantic web, bioinformatics, programming, science, etc. And yet I can’t guess, at all, what his opinions are on any of these topics, nor how he relates them. What does he think of the state of the art of bioinformatics? Is he particularly interested in applying cloud computing to healthcare systems? I have no idea. It’s as if the word cloud processed Deepak’s ideas to take out all of their nutritional value and gave me just the pure sugar: exciting, yummy, and devoid of nourishment. Oddly, the title of his talk (“Science Big. Science Connected”) fits the flyer; I picture myself reading it as a caveman or as an ape, jumping all over at the sight of big, pretty, meaningless words, basking in the glory of scientific progress.
More recently, some students at my department have been producing Wordle clouds of their Master’s theses. Here’s Christian Muise’s, and here’s Neil Ernst’s, for instance. The result is just as problematic; I guess Christian studies something to do with implicants and Neil built something called “Jambalaya”, but I can’t go any further.
To compare, here are two summarizations of my Master’s work. The first is a Wordle cloud of my 2005 thesis; the second is the abstract of the paper that goes along with it:

Anchoring and adjustment is a form of cognitive bias that affects judgments under uncertainty. If given an initial answer, the respondent seems to use this as an ‘anchor’, adjusting it to reach a more plausible answer, even if the anchor is obviously incorrect. The adjustment is frequently insufficient and so the final answer is biased. In this paper, we report a study to investigate the effects of this phenomenon on software estimation processes. The results show that anchoring and adjustment does occur in software estimation, and can significantly change the resulting estimates, no matter what estimation technique is used. The results also suggest that, considering the magnitude of this bias, software estimators tend to be too confident of their own estimations.
And a second comparison (which is a blatant plug for my recent work with Gina Venolia, to be presented at ICSE this May). Here is the Wordle cloud of “The Secret Life of Bugs”, and the abstract of the corresponding paper:

Every bug has a story behind it. The people that discover and resolve it need to coordinate, to get information from documents, tools, or other people, and to navigate through issues of accountability, ownership, and organizational structure. This paper reports on a field study of coordination activities around bug fixing that used a combination of case study research and a survey of software professionals. Results show that the histories of even simple bugs are strongly dependent on social, organizational, and technical knowledge that cannot be solely extracted through automation of electronic repositories, and that such automation provides incomplete and often erroneous accounts of coordination. The paper uses rich bug histories and survey results to identify common bug fixing coordination patterns and to provide implications for tool designers and researchers of coordination in software development.
In both cases, the second version actually gives you some information beyond “estimation stuff” or “bugs stuff”; it’s not the full story, but perhaps it’s enough to tell you whether you should read the paper or not.
There are some cases where playing around with Wordle may be a bit illuminating. I once heard that the most common words in Bernal Díaz del Castillo’s “The Truthful History of the Conquest of New Spain” were God, blood, and gold. This is probably false, though it is still a pretty good symbolic summary of the Conquest. But in general we shouldn’t fool ourselves into thinking that disconnected pretty words are a good substitute for actual sentences and ideas.