This tutorial covers how graphics can be used in misleading ways.
At certain times, people create graphics designed to make you think a certain way. To sway you toward a belief, they distort information or draw the graph in a misleading way.
A misleading graph is one meant to mislead a reader or to make a reader feel or believe a certain way.
Graphs can be distorted or misleading in a variety of ways.
Suppose three buddies, Paul, Hector, and Juan, were comparing the number of baseball cards they had. This graph of their card counts is a pretty obvious visual distortion.
Notice that the y-axis has unequal scaling.
Scale is the way an axis on a graph is measured. Inappropriate scaling can lead to a misleading graph.
In this graph, one step on the y-axis apparently represents a 50 point increase (between 0 and 50), whereas another apparently represents a 70 point increase (between 50 and 120). But the gap drawn between the tick marks is the same size in both cases, even though 70 is larger than 50. Even more ludicrous, there's another step on the y-axis that apparently represents a 5 point increase (between 120 and 125), and its gap is also the same size!
So some graphs are drawn with unequal scales so as to represent things disproportionately.
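One way to catch this kind of distortion is to read the tick labels off the axis and check whether evenly spaced ticks all stand for the same increase. The sketch below does that for the labels described above (0, 50, 120, 125). The function names are made up for illustration, and the second axis is a hypothetical example of equal scaling, not something taken from the graph.

```python
def tick_increments(labels):
    """Return the value change between each pair of neighboring tick labels."""
    return [b - a for a, b in zip(labels, labels[1:])]


def has_equal_scaling(labels):
    """True when evenly spaced ticks all represent the same value change."""
    return len(set(tick_increments(labels))) <= 1


# Tick labels read off the baseball card graph, bottom to top.
card_axis = [0, 50, 120, 125]
print(tick_increments(card_axis))    # [50, 70, 5]
print(has_equal_scaling(card_axis))  # False

# A hypothetical evenly scaled axis, for comparison.
print(has_equal_scaling([0, 50, 100, 150]))  # True
```

If the increments differ while the tick marks are drawn the same distance apart, the axis is unequally scaled.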
Who do you think created this graph? Because the distortion of the data makes Juan appear to have more cards than his friends, it was probably drawn by Juan.
Consider this presentation of data about the preferred brand of dish soap.
Let's say that 15 people selected Brand A, 8 people selected B, and 20 people selected C. Both of these graphs can show us that.
Which graph is more accurate?
As it turns out, the one on the top is more accurate. The one on the bottom is clearly drawn to exaggerate the difference between B and C. Notice how much taller C is than B in the bottom graph compared with the top graph. In the top graph, C appears about two and a half times as tall as B, which matches the data (20 versus 8), whereas in the bottom graph it seems to be about five times as tall as B.
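The exaggeration can be worked out with a little arithmetic: a bar's drawn height is its value minus the value where the vertical axis starts. The sketch below uses the dish soap counts for B and C. The baseline of 5 is an assumption chosen only because it reproduces the "about five times" impression; the tutorial does not say where the bottom graph's axis actually starts.

```python
def apparent_ratio(value_a, value_b, baseline=0):
    """Ratio of drawn bar heights when the vertical axis starts at `baseline`."""
    return (value_a - baseline) / (value_b - baseline)


brand_b, brand_c = 8, 20

# Axis starting at zero: the drawn ratio equals the true ratio.
print(apparent_ratio(brand_c, brand_b))              # 2.5

# Assumed axis starting at 5: the same data looks far more lopsided.
print(apparent_ratio(brand_c, brand_b, baseline=5))  # 5.0
```

The data never changed between the two calls. Only the starting point of the axis did.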
If you use a bar graph or a histogram, it's a good idea to start the vertical axis at zero, unless there's a good reason not to start there.
If you were tracking the change in home prices on an axis starting from zero and going all the way up to 300,000, the graph wouldn't show a big difference between 300,000 and 280,000.
However, to the homeowner, that drop of $20,000 is significant.
Graphs beginning anywhere besides zero have a tendency to exaggerate differences. But graphs starting at zero sometimes can minimize very real differences. It's important to know what you're trying to emphasize and not create perceptual distortion.
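To see the trade-off in numbers, you can ask what share of the plotted vertical range a change takes up. The sketch below uses the home price figures from above. The zoomed-in axis from 270,000 to 310,000 is a hypothetical choice for illustration, as is the function name.

```python
def share_of_axis(change, axis_min, axis_max):
    """Fraction of the plotted vertical range taken up by `change`."""
    return change / (axis_max - axis_min)


drop = 300_000 - 280_000

# Axis from zero to 300,000: the drop is a small sliver of the graph.
print(round(share_of_axis(drop, 0, 300_000), 3))  # 0.067

# Hypothetical axis from 270,000 to 310,000: the drop fills half the graph.
print(share_of_axis(drop, 270_000, 310_000))      # 0.5
```

Neither axis is automatically right. The first can hide a drop that matters to the homeowner, and the second can make it look like a collapse.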
Visual distortion means using area or three-dimensional visual tricks to make certain values appear bigger or smaller than they are.
The images you use to represent bins in your graphs can also distort meaning.
Suppose a class of 18 students was asked their favorite sport. Three said soccer, five said baseball, and the remaining 10 said basketball.
Suppose a student drew this graph:
According to this graph, twice as many students chose basketball as chose baseball: 10 students chose basketball and five chose baseball.
So what's the problem? The height of the basketball is twice the height of the baseball, but so is its width. Something that is twice as tall and twice as wide covers about four times the area, so the basketball takes up about four times as much space on the page as the baseball.
In fact, you can even see that more clearly by putting the box that represented the baseball inside the box that represented the basketball. And it's clearly only about ¼ the size.
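The arithmetic behind that ¼ is short: scaling a picture by a factor in both height and width multiplies its area by the square of that factor. The sketch below checks this for the basketball and baseball counts. The second function is an aside that is not in the tutorial: it gives the linear scale you would need if you wanted the area, rather than the height, to match the data. Both function names are made up for illustration.

```python
import math


def area_ratio(linear_scale):
    """Area ratio when a picture is scaled by `linear_scale` in height and width."""
    return linear_scale ** 2


def honest_linear_scale(value_ratio):
    """Linear scale factor whose resulting area ratio equals `value_ratio`."""
    return math.sqrt(value_ratio)


value_ratio = 10 / 5  # basketball votes divided by baseball votes

# Doubling both height and width quadruples the area.
print(area_ratio(value_ratio))                     # 4.0

# To make the area only twice as big, scale each side by about 1.414.
print(round(honest_linear_scale(value_ratio), 3))  # 1.414
```

A simpler fix is to keep every picture the same width and change only the height, which is what a plain bar graph does.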
To make matters worse, technology has introduced us to lots of different misleading graphs. This is kind of a ridiculous graph:
Because the previous graph is three-dimensional, there's no way to easily compare heights. Is the cone supposed to be taller, shorter, or the same height as the cylinder behind it? It's going to be very hard for anyone to tell, which makes this an incredibly misleading graphic.
Technology, like certain spreadsheet programs, has made it easy to create many different kinds of graphs.
Using so many technological tricks, like three-dimensional cones and cylinders, distorts your data. If you're going to use bar graphs, the best choice is the simple two-dimensional ones at the top.
So the better choice, since this graph is meant to compare across time, would be something like a time series.
This is a lot more useful to anyone reading it than the previous example would be. It isn't as flashy, but none of the information is hidden or distorted.
Graphical displays can be manipulated in many different ways. If you use an inappropriate scale, you can exaggerate the differences. Or you can use areas to make differences seem larger than they actually are. Or you can use three-dimensional displays that aren't really clear at all. Misleading scales, misleading shapes, and technological tricks are all ways to create misleading graphics.
As a statistician, your goal is to make the complicated simple, to make the data easy to understand. Statisticians are trying to clean up a messy world. The goal is clarity, and misleading graphics work against it.
Thank you and good luck.
Source: THIS WORK IS ADAPTED FROM SOPHIA AUTHOR JONATHAN OSTERS