Misleading statistics

I am writing a series of posts on the theme of misleading statistics. As a quantitative analyst, my business is using statistics to inform decisions. Previously, when I worked as a scientist for NASA, I focused on statistical models as well. It is not surprising, then, that I often find myself in situations where a writer or speaker cites statistics in support of an argument and I immediately see problems. A British prime minister, Benjamin Disraeli, is famously quoted as saying, “There are three kinds of lies: lies, damned lies, and statistics.” This does not quite get to the heart of my concerns. I prefer the observation that the average American has one testicle. This is essentially true, but it is not informative, because it describes almost no actual American. The same may be said of a range of other statistics about the average American. In a widely distributed population, averages may tell you very little. False statistics are a problem, but my particular focus is on statistics that are, strictly speaking, correct but lead to bad decisions or conclusions.
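The point about averages can be sketched in a few lines of Python. This is only an illustration under a loud assumption: the population below is hypothetical and deliberately simplified, with exactly half of the people holding a value of 2 and half holding 0. The names `population`, `average`, and `share_at_average` are mine, not drawn from any dataset.

```python
from collections import Counter
from statistics import mean

# Hypothetical, simplified population (an assumption for illustration only):
# half of the people have a value of 2, the other half have a value of 0.
population = [2] * 500 + [0] * 500

# The average across the whole population.
average = mean(population)

# How many people actually have the "average" value?
counts = Counter(population)
share_at_average = counts[average] / len(population)

print(f"average value: {average}")
print(f"share of people who match the average: {share_at_average:.0%}")
```

The average is a correct summary of the total, yet no one in this made-up population matches it, which is exactly why it says so little about any individual.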

Quite often, misleading statistical arguments start from a premise that is not inherently reasonable. What does the statistic that the average American’s real income has not increased over the past thirty years actually suggest? Constant real incomes are often referred to as wage stagnation and are considered to be a negative for workers. Just Google the phrase “wage stagnation” to see a sample of the discourse. A recent article on Bloomberg states, for example, “One of the most vexing and puzzling problems in the U.S. economy is wage stagnation.” Why do we believe that the average worker’s salary should, or even could, outpace inflation over long periods of time? Why should people expect, as a matter of course, to be better off than their parents or grandparents? So-called ‘stagnant wages’ mean that an average worker today has the same purchasing power as he or she would have had in the past. In one of the wealthiest countries in the world, why is that a bad thing?

A similarly misleading statistical argument relates to the observation that workers are receiving a declining share of corporate income. This metric is used to argue that workers are being paid a smaller percentage of the value of what they produce than in the past. This statistic needs to be considered very cautiously. Consider the case of a ditch digger whose only tool is a shovel. We can all agree that he should be paid the full value of the ditches that he digs. Now imagine that he gets hired by a construction company that trains him to use an excavator. When he is deployed on a job, using the company’s excavator, he can do twenty times as much digging as before. But when he is paid, should he expect a wage that is twenty times what he made when all he had was a shovel? Of course not. The company has provided the capital to buy technology that allows the worker to be far more productive, and he should expect to earn a smaller share of the value of the work that he performs. As a skilled worker (he knows how to use an excavator), he should expect to earn more than when he used a shovel, but his higher productivity is primarily due to the company’s better tools.
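The ditch-digger arithmetic can be made concrete with a small Python sketch. Every number here is an assumption chosen only to mirror the story: the daily value of shovel work, and especially the excavator wage, are hypothetical, and `labor_share` is simply a helper name I made up for wage divided by the value produced.

```python
def labor_share(wage: float, output_value: float) -> float:
    """Fraction of the value produced that is paid out as wages."""
    return wage / output_value


# All figures are hypothetical and exist only to illustrate the argument.
shovel_output = 100.0                   # daily value of ditches dug by shovel
shovel_wage = 100.0                     # he is paid the full value of his work
excavator_output = 20 * shovel_output   # twenty times as much digging
excavator_wage = 300.0                  # assumed: pay triples for a skilled operator

shovel_share = labor_share(shovel_wage, shovel_output)
excavator_share = labor_share(excavator_wage, excavator_output)

print(f"shovel:    wage {shovel_wage:.0f}, share of value {shovel_share:.0%}")
print(f"excavator: wage {excavator_wage:.0f}, share of value {excavator_share:.0%}")
```

With these made-up figures the worker's pay triples while his share of the value produced falls sharply, so a declining share, taken alone, does not show that he is worse off.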

As my final example for this post, I want to address an emerging statistical argument that I believe is incredibly dangerous. I am increasingly seeing statistics used to suggest that individuals don’t need to reduce their carbon footprints because it’s the corporations that are the really big emitters. I have now seen this argument in many mainstream publications. A recent article in Fast Company, titled “Focusing on how individuals can stop climate change is very convenient for corporations,” argues that “it’s morally good to reduce your footprint–but don’t let that deflect attention from who is really to blame” and “If just a few companies and countries are responsible for so much of global greenhouse gas emissions, then why is our first response to blame individuals for their consumption patterns?” These arguments are based on the oft-quoted statistic that 100 companies are responsible for about 70% of global carbon emissions. Exxon Mobil, Shell, BP, and Chevron are among the biggest culprits. Can individuals seriously believe that they can blamelessly drive their big SUVs and fly on as many vacations as they want, and that the blame for the resulting carbon emissions falls on the big oil companies and the airlines? This is patently ridiculous.

We live in a world of data, and data-driven policy is often a very good thing. Interpreting data to make decisions requires care, however, and I am often reminded of Disraeli’s quote. Statistics are often used to lend legitimacy to arguments, yet they are easily framed in misleading ways. As our technologies for collecting and analyzing data practically explode, we must always be mindful that translating statistics into meaningful conclusions is a tricky business.