Friday, September 17, 2010

Lies, damn lies, and statistics

Mark Twain popularized the saying, "There are three kinds of lies: lies, damned lies, and statistics," which he attributed to Benjamin Disraeli. The line was ahead of its time – the development of statistics, and our ability to aggregate and analyze data, have come a long way over the past hundred-plus years.

One of my favorite books of all time is "How to Lie With Statistics" by Darrell Huff – a classic first published in 1954. In it, Huff clearly explains the different ways one can lie, or at least mislead, with statistics. It is a must-read for anyone involved in politics, where statistics are thrown around on just about every issue to justify one position or another.

I am a big fan of charts and find them to be a great way to display data. However, I was reminded of one of the main methods of misleading with statistics when I came across this chart the other day:


This chart, which appeared on the site Downsizing the Federal Government, clearly shows a huge increase in the number of Federal Subsidy programs over the past forty years. Or does it?

The answer is yes and no. This chart is a perfect example of how to mislead with statistics.

A first glance at this chart would lead one to believe that the number of federal subsidy programs has increased by about a factor of ten over the past forty years. But a closer look shows where the misrepresentation comes from. Take a look at the Y-axis scale. The Y-axis crosses the X-axis at a value of 900 rather than zero, so the chart draws only the portion of each value that lies above 900. Simply increasing or decreasing that starting value leads to completely different representations of the data. Consider what happens when we change the value so that the Y-axis crosses at zero, which is more typical:



Same data, but very different visual conclusions. Yes, the chart still clearly and effectively shows that the number of federal subsidy programs has in fact roughly doubled over forty years (still a great concern), but it no longer appears to show a tenfold increase, as the previous chart did.
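The arithmetic behind the illusion is simple enough to sketch in a few lines of Python. The counts below are hypothetical round numbers chosen only to match the "roughly doubled" description (about 1,000 programs growing to about 2,000); they are not read from the chart, and `apparent_ratio` is a helper name made up for this illustration.

```python
def apparent_ratio(start, end, baseline):
    """Ratio of the drawn heights of two values when the Y-axis starts at `baseline`."""
    if start <= baseline:
        raise ValueError("start value must lie above the axis baseline")
    return (end - baseline) / (start - baseline)


# Hypothetical values, NOT taken from the chart: roughly 1,000 programs
# at the start of the period and roughly 2,000 at the end.
start_count, end_count = 1000, 2000

# Axis starting at 900: only the part of each value above 900 is drawn.
truncated = apparent_ratio(start_count, end_count, baseline=900)

# Axis starting at zero: drawn heights are proportional to the values.
zero_based = apparent_ratio(start_count, end_count, baseline=0)

print(f"Y-axis from 900: last point looks {truncated:.1f}x the first")
print(f"Y-axis from 0:   last point looks {zero_based:.1f}x the first")
```

Under these assumed numbers, the same doubling is drawn as roughly an elevenfold jump when the axis starts at 900, which is in line with the "factor of ten" impression the first chart gives.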

But let's play with the chart a little more. Let's keep the Y-axis crossing at zero, but increase the upper limit from 2500 to 5000. That yields the following chart:


The higher limit smooths out the increase over time so that it doesn't appear as dramatic. In fact, a first glance at this chart would make one think that, yes, federal subsidy programs have increased over time, but the increase has been fairly slow and gradual – nothing of great concern.
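The flattening effect can be put in numbers as well. This is a minimal sketch, again assuming hypothetical counts of about 1,000 and 2,000 programs (not values read from the chart); `fraction_of_plot_height` is an illustrative helper, not part of any charting library.

```python
def fraction_of_plot_height(start, end, y_min, y_max):
    """Share of the plot's vertical extent taken up by the rise from start to end."""
    if y_max <= y_min:
        raise ValueError("y_max must be greater than y_min")
    return (end - start) / (y_max - y_min)


# Hypothetical values, NOT taken from the chart.
start_count, end_count = 1000, 2000

# The three axis ranges discussed in the post: (lower limit, upper limit).
original_axis = fraction_of_plot_height(start_count, end_count, 900, 2500)
zero_to_2500 = fraction_of_plot_height(start_count, end_count, 0, 2500)
zero_to_5000 = fraction_of_plot_height(start_count, end_count, 0, 5000)

for label, share in [("900 to 2500", original_axis),
                     ("0 to 2500", zero_to_2500),
                     ("0 to 5000", zero_to_5000)]:
    print(f"Y-axis {label}: rise fills {share:.1%} of the plot height")
```

With these assumed numbers, the same increase fills well over half the plot on the original axis, less than half on a zero-based axis, and only a fifth once the upper limit is doubled to 5000.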

I think these three examples really highlight how simple it is to visually depict different "pictures" of the same data. But just because it is easy to do this doesn't mean you have to mistrust every chart or piece of data you see. For example, I came across the following chart this morning that I think does a fair job of representing the data and the problem without misleading:


This chart clearly shows how public school employment has drastically outpaced enrollment over the past forty years. There has been no monkeying with the axis limits or other portions of the chart to create a misleading picture of the changes that have occurred over time. The conclusion that one might draw from this chart, which I find valid, is that part of our funding problem in education is that the employee-to-student ratio has grown drastically over time. This growth could be justified had student achievement also increased significantly over that period, but unfortunately it has not. And even then, there are more factors to consider, such as the make-up of the student body and other influences on outcomes that may have changed over time.

It's wise to always keep these issues in mind when studying policy and looking at statistics. In short, don't believe everything you see, and always take a second look at data and charts to make sure your initial conclusions are valid.

Public policy debates would be much more cordial and trustworthy if those involved would simply remember this: While you can sometimes make a more dramatic statement by tweaking the numbers, the most convincing arguments have always been and always will be those that play it fair. If your case is strong enough, you shouldn't need to create any illusions.
