Measuring something as simple as the size of a software organization turns out to be a tricky problem. It’s easy to see that while IBM, Microsoft, and Google are “large”, the company my friends and I started when we finished our undergrads was “small”. But as I will show, beyond these basic comparisons it’s hard to come up with a satisfactory metric. Consider:
First, there’s the number of members of the organization. This is perhaps the most intuitive metric we can come up with, as knowing the number of people in a group immediately tells us much about the group’s characteristics. It also tends to be a very accessible metric: get the headcount and you’re done. I’ve actually used it in the past, precisely for its intuitiveness and its accessibility. But it’s not as simple as it looks:
- Do we count all members, or just developers? If we count all members we might get in trouble. Let’s say that Beehive Inc. has 99 developers and one boss in charge of everything else, while Peacock Inc. has one developer and 99 people in charge of advertising, selling, schmoozing, and charming customers. Although both have the same number of members, in terms of software productivity they are probably quite different: perhaps Beehive produces software like mad, so it needs lots of developers, and Peacock has a single product that it just fine-tunes for its high-rolling clients, so it needs only one. And if we count only developers, what do we do with all the borderline roles, such as testers, maintainers, user interface designers, and hybrid positions?
- What counts as a member? For instance, who counts as a member of Mozilla? If we only count people who have contributed code, we may exclude a significant fraction of the community, such as everyone who has tested and reported issues with the software but has never submitted a patch. And if we count all bug reporters we may be stretching the meaning of the term “organization”.
- And on a related note, what counts as an organization? Should the people in the Xbox and Office divisions of Microsoft really be considered part of the same organization? They get a paycheck from the same company, and they may be working across the street from each other, but if there’s little to no interaction between the groups, should we still consider them subsets of a larger group?
So the number of members is a troubling metric. Unfortunately, the alternatives look just as bad, or worse. Let’s switch to financial metrics and say that an organization is large in proportion to its revenue. This is an acceptable metric for many sociologists, but it’s questionable for software organizations for several reasons, most notably the open source movement. And profit is no better: mammoths often sustain heavy losses for significant periods, and we’d still like to classify them as large. (What would we do with organizations with gigantic financial losses if we used profit as our metric? Treat them as negatively huge?)
Perhaps an important component of what we mean when we talk about size is power. “Large” organizations exert plenty of influence and we cannot dismiss them as easily as “small” organizations. But trying to measure power is even more difficult than trying to measure size: it’s not even clear how to get started. Are we talking about power over the public sector? Over the company’s customers? Over its own employees? What happens to our metric when small companies associate and increase their power? And how do we account for the mixture of individual and organizational power? All else being equal, any “small” organization that Steve Jobs joins will become significantly more powerful (“larger?”) than if I were to join it. I can’t see how a discussion of power could leave us with a clearer mind than any of the other approaches to measuring size.
Some sociologists refer to size in terms of the outputs of the production process. For instance, a brickworks plant is larger if it produces more bricks. But this works much better for the manufacturing sector than for the service sector, and for software most inputs and outputs don’t make much sense. Let’s look at outputs. First, lines of code are an infamously bad metric. Function points are better, but the metric is very inaccessible: unless the organization itself is keeping track of the number of function points it produces, one can spend an extraordinary amount of time deriving the metric when all one wanted was a simple way to place the organization on a size continuum. The number of modules and the number of products the organization releases are too vague to be satisfactory. And as for inputs, besides human effort (which gets us into the problems I went through a few paragraphs above when I discussed number of members), what could we use? Electricity, paper? Coffee?
We’re running out of options here, but we’re not through yet. Going back to easily accessible numbers, perhaps we could look at size in geographical terms, and claim that a software organization is large to the extent of its physical area. Like most of the previous metrics, physical area makes some intuitive sense. Google’s area, including server farms and cafeterias and plastic ball pools, is quite large, whereas a basement start-up is tiny. Could we claim that a simple square-footage count is an appropriate measure of an organization’s size? I don’t think so, since we run into problems we’ve struggled with before: open source projects, home offices, and the fact that a firm can cram lots of people into a space, increasing its totals on every other metric we’ve gone through, while still remaining just as large from a physical-area point of view.
Nevertheless, physical area is still important, as is physical proximity. The number of sites an organization uses, as well as some calculation of their geographic distance, could give a good indication of some of the pathological consequences of size: cumbersome coordination, loss of cohesion, formalization of structure. But as our foremost size metrics they’d be very flawed.
Two more possibilities and I’m done. One has to do with the organization’s intellectual capital, perhaps represented as the number of specializations it masters through one or more of its members. I frankly don’t like this metric at all: it is, again, too ambiguous, and too inaccessible. But it seems to me that there is something to the insight that “larger” organizations have a greater number of experts in a variety of fields. The basement start-up may have little more than a couple of software-oriented generalists, while Microsoft has experts in internationalization, compilers, user interfaces, databases, security, performance, and many other domains, not to mention non-software expertise in law, marketing, human resources, public relations, and more. So the number-of-specializations metric shouldn’t be dismissed completely.
Finally, we can try to assess size through the number and kinds of external agents that an organization has to deal with. By “external agents” I mean mostly users and paying customers, but also policy makers, advocates, community members, and so on. According to this metric, an organization that only produces software for its own use is far smaller than one that produces software that is highly (un)popular across wide sectors of the population. If you are your only customer, your software will probably not be as complex as it would be if people from all over the world needed to use it, and that complexity in the software and its requirements would translate into complexity in the structure of the organization that produced it.
This, like the others, is a deficient metric. Though it is somewhat accessible, it doesn’t fully match the intuition we want to capture. It would deem a massive and massively inefficient organization equivalent, size-wise, to a lean, brash, innovative start-up, which may not be what we want.
So. I’ve gone through a lot of possible metrics, and I’ve listed the reasons why they’re imperfect. I suspect the reason why it’s so hard to pin down organizational size, which sounds simple in principle, is that there’s a lot of confusion about what it means to organize in the first place—I mentioned this above, and will expand on it later.
I don’t think this means that we shouldn’t use size metrics at all. John Kimberly, an organizational scientist, explored a similarly thorny issue, in a more general setting, and argued, back in 1976, that:
“It is perfectly possible that the important variable is not really size at all, and that this has simply been a heading under which researchers lacking any sharper theoretical perspective have lumped many variables together. Even if it is felt that jettisoning the concept would be premature, at the very least a more differentiated approach would lead to the identification of several aspects of the global construct. Researchers would be constrained to think less about the question of how big a given organization is and how this bigness might be the cause or consequence of other organizational characteristics, and more precisely about under what conditions particular aspects of size are important for what other organizational characteristics.”
Each of the aspects of size I discussed is useful for some purposes, but we need to be careful to select the one(s) that work best for our problem, and to be aware that each carries flaws that may bias our results.