How can diverse ways of thinking contribute to the ability of a collection of people to make accurate predictions and forecasts? This predictive phenomenon is sometimes called the “wisdom of crowds.” It exists or occurs when a crowd of people is more accurate than the people in it.
In examining the wisdom of crowds, we can see that it depends on two things: Talent—good predictors—and diversity. Both talent and diversity matter in equal parts. We’re going to learn to make this argument in a formal way using some pretty straightforward mathematics.
Before we get to that mathematics, let’s use an example with cattle. Cattle are large creatures that are bought and sold by the pound. When I bought my cattle, I paid a price per pound; a few months later, after fattening them up on Iowa pasture land, I auctioned them off again at a price per pound. If you are going to trade in cattle, you have to be able to estimate their weight. It turns out that one of the most famous examples of the wisdom of crowds involves guessing the weight of cattle.
An Astonishing Example of the Wisdom of Crowds
This example is due to Sir Francis Galton, and the story opens Jim Surowiecki’s book The Wisdom of Crowds. Surowiecki tells us that the great scientist Francis Galton had collected some data from the 1906 West of England Fat Stock and Poultry Exhibition; 787 people at this exhibition guessed the weight of a steer. Their average guess was 1,197 pounds. The actual weight of the steer: 1,198 pounds.
Amazing, right? That’s why Surowiecki’s book is called The Wisdom of Crowds. It’s incredible. But let’s not get too excited: Galton’s cattle example is a one-shot case, a single example. By no means does it imply that a crowd is going to be incredibly accurate like that in every case. In fact, crowds often make big mistakes. Just as individuals can err, so can crowds.
This is a transcript from the video series The Hidden Factor: Why Thinking Differently Is Your Greatest Asset. Watch it now, on Wondrium.
Are Crowds Better at Making Decisions All the Time?
But the weight of the evidence, not just from cattle guessing and jelly bean jar contests but from the trenches of the business and policy worlds, is that although Galton’s example has a big wow factor, it contains a grain of truth: Groups tend to be more accurate at prediction than individuals. They are more likely to be wise, at least in a predictive context. That crowds are generally more accurate than individuals, and yet can also be horribly wrong, resonates with my own experience as well.
For a decade now, students in my classes have made predictions on everything from my weight (almost always within a pound) to the number of floors in the tallest building in Rio; last year they were off by fewer than two floors. They even guessed the number of chairs in a coffee shop that was just opening, and they were off by only three. I also had them guess the height of the Saturn V rocket, where they were off by 1,000 feet the first time, and the number of pizza places in Ann Arbor, where they were off by nearly a factor of two. Later on, I taught them how to make better estimates, and they got within a few percentage points on both of those questions. They became a wise crowd.
Crowds can make more accurate predictions most of the time, but sometimes they can be way off. What we want to do is make sense of this by understanding when crowds can predict accurately. When are they wise and when are they not? The key to this is going to be that diversity plays a big role.
The Value of Diversity
Let’s consider two mathematical results. The first is going to show that the collective accuracy of a crowd depends in equal measure on the accuracy of its members—that’s talent—and on their diversity. More diversity is going to be better. The second result is going to be a corollary of the first and that’s going to say that a diverse crowd will always be more accurate than its average member, not sometimes, but always. The crowd is always more accurate than the people in it.
As a way to introduce the formal statistical terminology needed to state these mathematical results, we’ll look at a fairly simple example. Suppose you have three people: Amy, Belle, and Carlos, and they are making predictions regarding the number of new clients that their firm is going to attract in the next year. We are going to work through this example to build some understanding of how the statistics work.
The Crowd’s Prediction
Our three people, Amy, Belle, and Carlos, have each predicted the number of new clients. Amy predicts 12, Belle predicts six, and Carlos predicts 15. We first want to figure out the crowd’s prediction. The crowd’s prediction is going to be an average of these three predictions. If we sum these up, we get 33; the crowd’s prediction is equal to 11, which is the average of the three people.
Let’s suppose that the actual number of new clients turns out to be 10. This example is set up so the crowd is pretty accurate—it is only off by one, but they are not perfect. They’re a smart crowd but not an incredibly wise crowd. What we want is we want some way of measuring the accuracy of these individuals as well as the accuracy of the crowd. How do we do it?
The Accuracy of Individuals and Crowds
This is a problem statisticians have thought about for a long time. They typically handle it by taking the difference between a prediction and the true value and squaring that amount. They call the result the squared error. In the case of Amy, we take 12 − 10, where 10 is the true value, and square the difference to get four.
Why do we square it? Here is the simple reason: Suppose one person had an error of plus five and another had an error of minus five. If we added those up, we would get zero, that is, no error at all. Think of it this way. Suppose I am shooting arrows and I shoot one high and then one low. I can’t average those and declare “bull’s eye!” What I want is the distance from the center. Squaring gives us that distance, and it keeps errors in opposite directions from canceling.
Let’s compute the errors for the other people as well. Belle predicts six and the actual value is 10, so (6 − 10)², which is 16. What about Carlos? Carlos predicted 15; (15 − 10) is five, and squaring that gives 25. Amy’s squared error is four, Belle’s is 16, and Carlos’s is 25. Averaging the squared errors, we sum 4 + 16 + 25 to get 45, and dividing by three gives 15. The average individual squared error, in this case, is 15. What is the crowd’s squared error? That’s easy. The crowd guessed 11, and 11 − 10 is one; 1² is just one. The crowd’s squared error is one. So the crowd is more accurate than the individuals are on average. Notice that, here, the crowd is also more accurate than anybody in it.
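As a sketch, the arithmetic above can be reproduced in a few lines of Python; the variable names are my own, not from the lecture:

```python
# Predictions from Amy, Belle, and Carlos, and the true number of clients.
predictions = [12, 6, 15]
truth = 10

# The crowd's prediction is the simple average of the individual predictions.
crowd_prediction = sum(predictions) / len(predictions)  # 11.0

# Each person's squared error: (prediction - truth) squared.
individual_errors = [(p - truth) ** 2 for p in predictions]  # [4, 16, 25]

# The average individual squared error.
avg_individual_error = sum(individual_errors) / len(individual_errors)  # 15.0

# The crowd's squared error: how far the average prediction is from the truth.
crowd_error = (crowd_prediction - truth) ** 2  # 1.0
```

The crowd’s error of 1 is far smaller than the individuals’ average error of 15, matching the numbers worked out above.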
That last part isn’t always going to be true. Sometimes someone in the crowd can be more accurate than the crowd. But the first part always will be: The crowd will be more accurate than the average member in it. Remember this point for later. For the moment, let’s focus on the idea that individuals make mistakes, the crowd makes mistakes, and yet the crowd is smarter than the individuals in it.
How to Determine Diversity
Next, here is what we need to do. We need some way to think about why the crowd is smarter. What is making the crowd good? We have already figured out the accuracy of the people, their squared error; now I need some way of measuring the diversity of the crowd. How different are those people? Here again we can turn to statistics, which has a standard approach: Instead of computing the difference between each prediction and the true value, we look at the variation in the predictions, that is, the difference between each prediction and the crowd’s prediction.
One quick aside, statisticians use this expression to compute what they call the variance of a data-generating process, or some process that generates numbers. By variance, what they typically mean is how much noise or error is produced by the process. Here we are using it to mean diversity. Let’s think about variance for a second.
Cookies and Variance
Suppose I have a machine that’s producing cookies that are supposed to weigh six ounces, some are going to weigh a little more and some are going to weigh a little bit less. How much more or less is the variation? What causes that variation? The variation can be caused by vibrations in the machine or clumping of the dough, all sorts of things.
In our case, we are getting variations in these predictions. The cause of the variation isn’t the shaking of machines or lumping of cookie dough; it is differences in how people think. When we think about this variation in the predictive context, we are going to call it diversity of predictions because that’s what it is. It is differences in how people predict. Remember in our case, people’s average prediction was 11.
What we can do is figure out, in some sense, the diversity of those predictions. Again, let’s do some simple math. Amy’s prediction was 12 and the crowd’s prediction was 11, so Amy’s contribution to diversity is (12 − 11)², which is one. Belle’s contribution to diversity is (6 − 11)², that is, (−5)², which is 25. And Carlos’s contribution to diversity is (15 − 11)², that is, 4², which is 16.
Added up, we get 42. Dividing by three, we find that the average squared difference from the crowd’s prediction is 14. We can call this 14 the diversity of the predictions because it measures how different people’s predictions are from one another.
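The diversity calculation mirrors the error calculation, just measured against the crowd’s prediction instead of the truth. A minimal Python sketch (names mine):

```python
predictions = [12, 6, 15]

# The crowd's prediction is the average of the individual predictions.
crowd_prediction = sum(predictions) / len(predictions)  # 11.0

# Each person's contribution: (prediction - crowd prediction) squared.
contributions = [(p - crowd_prediction) ** 2 for p in predictions]  # [1.0, 25.0, 16.0]

# Diversity is the average of those contributions.
diversity = sum(contributions) / len(contributions)  # 14.0
```

Statisticians would recognize this quantity as the (population) variance of the predictions.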
Let’s look at these three numbers that we have calculated. So far we have calculated the average individual squared error; that’s 15, so that’s on average how far off people’s estimates have been. We have calculated the diversity of the predictions (that’s 14) and then we have calculated the crowd’s squared error (that’s 1). Do you notice anything? That’s correct: The crowd’s squared error equals the average individual squared error minus the diversity of the predictions. This can be called the diversity prediction theorem.
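The identity is easy to check in code. Here is a small Python sketch (the function name is mine) that computes all three quantities and verifies the diversity prediction theorem on our example:

```python
def crowd_stats(predictions, truth):
    """Return (crowd squared error, average individual squared error, diversity)."""
    n = len(predictions)
    crowd = sum(predictions) / n
    crowd_error = (crowd - truth) ** 2
    avg_error = sum((p - truth) ** 2 for p in predictions) / n
    diversity = sum((p - crowd) ** 2 for p in predictions) / n
    return crowd_error, avg_error, diversity

crowd_error, avg_error, diversity = crowd_stats([12, 6, 15], 10)

# Diversity prediction theorem: crowd error = average error - diversity.
assert abs(crowd_error - (avg_error - diversity)) < 1e-9
```

For this example the three values come out to 1, 15, and 14, and the assertion holds; because the theorem is a mathematical identity, the assertion would hold for any list of predictions and any true value.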
This happens not only in this example; it is true in every example. It is a mathematical identity, like the Pythagorean theorem. Remember, if you take any right triangle, the hypotenuse squared equals the sum of the squares of the two sides. The diversity prediction theorem is like the Pythagorean theorem in this sense: It’s always true.
Counterintuitive Logic of Diversity
Now, this is a really important result, and it is counterintuitive. What it tells us is that the crowd’s ability depends in equal measure on ability (that’s the average individual error) and on diversity.
Let’s think for a minute about how unintuitive this is. I asked what would make the crowd smarter. You might have said it is having smart people, and that’s captured in average error, but it is also diversity. The interesting thing here is that diversity matters just as much as ability; it matters just as much as average error. For many of the ideas we discuss, I am going to ask you to take them on faith.
The wisdom of crowds, by which I mean the ability of crowds to make accurate collective predictions, depends in equal measure on the crowd’s ability (their average individual squared error) and on the diversity of their predictions.
Let me give you a really interesting corollary, a second main result: The crowd’s error has to be less than the average individual error. The diversity prediction theorem says that the crowd’s squared error equals the average individual squared error minus the diversity of the predictions. So if the crowd has any diversity at all in its predictions, the crowd’s error is strictly less than the average squared error of the people in the crowd. In other words, the crowd is better than the average person in it. I call this the crowd beats the average law, and it is really easy to see why it is true: The crowd’s error is the average individual error minus the diversity, so if the diversity is positive at all, the crowd’s error has to be smaller than the average individual error. If there is any diversity in the room, the crowd will beat the average person in it; it is just a mathematical fact. Crowds are better than the people in them, at least on average.
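To see the law in action beyond our three-person example, here is a short Python sketch on a made-up crowd of noisy predictors; the true value and the noise range are invented purely for illustration:

```python
import random

random.seed(0)
truth = 100

# A hypothetical crowd of 50 predictors, each off by a random amount.
predictions = [truth + random.uniform(-30, 30) for _ in range(50)]

n = len(predictions)
crowd = sum(predictions) / n
crowd_error = (crowd - truth) ** 2
avg_error = sum((p - truth) ** 2 for p in predictions) / n
diversity = sum((p - crowd) ** 2 for p in predictions) / n

# Crowd beats the average law: with any diversity at all,
# the crowd's error is strictly below the average individual error.
assert diversity > 0
assert crowd_error < avg_error
```

Rerunning with any other seed, crowd size, or noise level gives the same result, because the law follows directly from the identity.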
As the diversity prediction theorem is a mathematical identity, it has to be true in every single case, and it is. It has to be true in Galton’s data, so let’s check. If we take his data, we get that the crowd’s squared error equals 0.6. The average individual squared error, in that case, was 2,956. Wait a minute, that’s huge! The steer only weighed 1,198 pounds; how could the error be 2,956? Remember, this is the squared error. If we take the square root of that number, we get about 54 pounds. That isn’t bad. Cattle weigh about five times as much as people, and we can guess the weight of a person to within about 10 pounds, so 54 pounds isn’t that far off. But wait: If the crowd was off by only 0.6 and people were off by 2,956 on average, then there must have been a lot of diversity. In fact, there was. The diversity was 2,955.4. That’s how the crowd error can equal the average error minus the diversity.
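Using the rounded figures quoted above for Galton’s data, the bookkeeping looks like this in Python:

```python
import math

avg_individual_error = 2956  # average squared error of the 787 guesses
crowd_error = 0.6            # squared error of the crowd's average guess

# The theorem forces the diversity to make up the difference.
diversity = avg_individual_error - crowd_error  # 2955.4

# A typical individual was off by roughly the square root of 2,956 pounds.
typical_miss = math.sqrt(avg_individual_error)  # about 54.4 pounds
```

So a typical guesser missed by about 54 pounds, yet the crowd as a whole missed by less than a pound, because the guesses were diverse enough to cancel.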
Two Ways to Arrive at a Wise Crowd
If you want a wise crowd, this suggests two options. We could find brilliant people who all know the answer, so the individual error would be zero, the crowd error would be zero, and the diversity would be zero; or we could find a bunch of fairly smart people with moderate errors who happen to be diverse, giving moderate diversity as well. If you take any example from one of the books on wise crowds, such as Surowiecki’s, from jelly bean guessing contests to cattle weight guessing to predicting the NFL draft, you will see that it almost always looks like Galton’s data: a wise crowd made of moderately accurate people who happen to be diverse.
Let’s see why this works. Here is another way to think of it. We have this equation: Crowd error equals average error minus diversity. How do we get a small crowd error? Let’s think of it this way. It always looks like small equals big minus big. This is what we see in these books. Why is that the case?
The Wisdom of Crowds
Let’s think about it. How does an example make it into a book called The Wisdom of Crowds? It can only make the book if the crowd error is small, that is, if the crowd doesn’t make a big mistake. What else has to be true? The average individual error has to be pretty big, because if the average error were small, then everybody got it right, and the example wouldn’t be surprising; we could just say it was an easy question.
So to make a book like The Wisdom of Crowds, an example needs a small crowd error and a big average error. Guess what: That means the diversity must also be big. The only examples that make the books about wise crowds look like this: high average individual error and high diversity. Therefore, The Wisdom of Crowds is, in most cases, explained by diversity.
Let’s think about this intuition for a second. When doing the math, we can see how each line follows from the next, but that does not mean we necessarily intuitively understand why this diversity prediction theorem works.
Let’s go back and look at another example of why diversity matters so much. Let’s look at 100 people guessing the weight of a steer. Suppose that each person is off by exactly 20 pounds. If each person is off by exactly 20 pounds then what we are going to get is an average error of 400. Let’s first suppose there is no predictive diversity so everybody guesses 20 pounds too high. That means the diversity is going to be zero, which means the crowd error is going to be 400 as well.
So we don’t get a wise crowd; the crowd is no better than the average person in it because the diversity is zero. Let’s suppose instead that people are still off by an average of 20, but now we have a lot of diversity: Half the people are 20 too high and half are 20 too low. That means the diversity is also 400.
If the diversity is 400 and the average individual error is 400, what we are going to get is a crowd error of zero. We get 0 = 400 − 400. Here is what is interesting. If we compare case one to case two, we see that the people did not get any smarter, but the crowd got smarter. How did the crowd get smarter? It got smarter because it got more diverse. What we see in this simple example is that collective wisdom comes from diversity.
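The two cases can be checked in a few lines of Python; the true weight of 1,000 pounds is an arbitrary stand-in, since only the errors matter:

```python
def crowd_stats(predictions, truth):
    """Return (crowd squared error, average individual squared error, diversity)."""
    n = len(predictions)
    crowd = sum(predictions) / n
    crowd_error = (crowd - truth) ** 2
    avg_error = sum((p - truth) ** 2 for p in predictions) / n
    diversity = sum((p - crowd) ** 2 for p in predictions) / n
    return crowd_error, avg_error, diversity

truth = 1000  # stand-in true weight of the steer

# Case 1: all 100 people guess 20 pounds too high -- no diversity.
same = [truth + 20] * 100
case1 = crowd_stats(same, truth)   # (400.0, 400.0, 0.0)

# Case 2: half guess 20 high, half 20 low -- same individual error, full diversity.
split = [truth + 20] * 50 + [truth - 20] * 50
case2 = crowd_stats(split, truth)  # (0.0, 400.0, 400.0)
```

The individuals are equally accurate in both cases; only the diversity changes, and it alone drives the crowd’s error from 400 down to zero.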
Now that we have the core intuition, let’s drive it home. Suppose we have a crowd that isn’t wise; it has a big crowd error. A big crowd error means the average individual error must be big as well: The people can’t be getting it right. It also means the diversity has to be relatively small, because otherwise that diversity would cancel out the errors. Remember, at the beginning I said that sometimes crowds get things right, sometimes they get things wrong, and sometimes crowds are mad. For a crowd to be mad, the equation has to read big equals big minus small, because if the diversity were not small relative to the error, the crowd couldn’t possibly be mad. Wise crowds come from diversity; mad crowds come from a lack of diversity and a lack of talent.
Where does that diversity come from? We have done all the mathematics, and so we have seen the intuition for why diversity improves the ability of a crowd to make predictions, but we haven’t explored at all what causes this diversity of predictions.
The Tallest Building In Rio
To get us started on this question, I want to go back to some of the examples we have talked about in terms of the crowds making accurate predictions. In the first, I had my students predict the height of the tallest building in Rio. The tallest building in Rio was only 60 stories high. The individual guesses from my students went from around 30 up to 90 stories. I went back to my students and asked them how they came up with these predictions. The students who predicted 90 floors said Rio is the second-largest city in Brazil, it is one of the largest cities in all of the Americas, and it is beautiful. Since there is a lot of money there it must be the case that they have huge skyscrapers. Why wouldn’t there be tall buildings?
The people who guessed 30 floors did something different. They used different logic. They said Rio is a beach city; you don’t want huge skyscrapers in a beach city. It isn’t even the capital of Brazil, so the tallest building is probably going to be a hotel. Beach hotels tend to be about 30 stories high. Other people predicted short buildings and they said remember, there is that Christ the Redeemer statue that sits on Corcovado Mountain, which is huge, and you don’t want anything that detracts from that, so there’s probably nothing above 40 floors.
This is kind of funny, because people made different predictions based on having different models and different understandings of what Rio was like. They had different conceptual models of how the world works. Those different models, one based on capital and wealth, one on beach culture, and one on aesthetics, all led to different predictions.
In reality, all of these ideas probably contributed to the tallest building only being 60 floors. Rio does have lots of money and Rio is a major city, but it is also a beach city. It isn’t a city of bankers, so that argues against something being extremely tall. As for the Corcovado Mountain arguments, it turns out that’s 2,300 feet high, and a 1,200-foot tower wouldn’t block out the view. It remains true that Christ the Redeemer serves as this iconic image of Rio and no one would want some glass tower to detract from it. But the fact is that probably does not restrict the height. The truth, like the average prediction, in this case, lies in between what people thought.
Let’s go back to Galton’s steer, because this one is more puzzling. How in the heck do people look at a steer in a whole bunch of different ways? Here, the explanation comes less from a diversity of mental models than from a diversity of experiences. Each of these people in the West of England probably had steers at home and knew their weights. So, if I have a steer at home that weighs 1,300 pounds and the contest steer looks a little smaller, I will guess a little less than 1,300 pounds. If I have a steer at home that weighs 1,050 pounds and it looks a little smaller than the contest steer, maybe I guess a little north of 1,050.
This tendency for people to base predictions on what they know has a formal name: base rate bias. We are influenced by how we start thinking about a problem. In the case of Galton’s steer, the crowd gets it right because the idiosyncratic errors of the individuals are equally likely to be high or low and cancel out; the guesses are diverse, and we get a wise crowd.
We have two primary reasons for predictions to be diverse: different models and different idiosyncratic errors. In the case of guessing the tallest building in Rio, different models explain the wise crowd. In the case of Galton’s steer, it seems to be a classic case of errors canceling: The guesses happened to be diverse because each person anchored on a different base rate.
A Final Observation
I want to finish up with an observation that concerns disagreement. Let’s suppose you go to a meeting and you are asked to predict something, like maybe to invest in some business opportunity. It could be something as simple as the number of attendees you are going to get at some event. It could be the sales of a new product, the price of a stock, or, if you are sitting on the Federal Reserve’s Open Market Committee, it could be how much unemployment is going to change in the next month, or what you should do with the money supply.
Let me lay out two scenarios. Scenario 1: You go to this meeting and everybody agrees, you all think the same thing, you make the same predictions, you use similar logic. Scenario 2: Your predictions differ because people use different models or they have different sets of experiences.
In scenario one, there are two possibilities: Either we are all right, or we are all wrong. If it is a hard problem, it is probably not that likely that we are all right. The only way you should feel good about the outcome is if it was an easy task. But if it was easy, why did you have the meeting? Everybody could have gotten it right on their own. There was no reason to bring the group together.
In scenario two, there is disagreement. Some people might think sales will be high; others might think sales will be low. We know from our corollary to the diversity prediction theorem, the crowd beats the average law, that the crowd is going to be more accurate than its average member. When you leave this meeting, you should feel good; you should feel like, “Wow, the crowd probably made a better prediction than a random person, myself included, would have made alone. We did better.”
In other words, if you go to a meeting and you are predicting something and people disagree: That’s good. It is good because it means there is diversity in the room, and that diversity in the room improves performance. This isn’t a metaphor; we just worked through the math. It is a mathematical fact.
What is the key lesson? The key lesson is this: Within your organizations, you should include multiple, diverse models when you are making a forecast. In your daily life, you should do the same thing. You should open your mind to new and diverse ways of looking at the world. Not only is it going to be fun and enriching, but it is also going to make you better at predicting what lies ahead.
Common Questions About the Wisdom of Crowds
Q: How is the wisdom of crowds different from crowdsourcing?
The wisdom of crowds is based on aggregating and synthesizing the crowd’s individual judgments, while crowdsourcing isn’t concerned with aggregating the crowd’s input.