Misleading Graphs and Data 2.ppt

MISLEADING
STATISTICS
Twisting information to your
advantage…
Statistical thinking will one day be as necessary for efficient
citizenship as the ability to read and write. – H.G. Wells
Indeed, statistics may be one of our most effective and
efficient vehicles for communicating information. It is the
natural inclination of people to trust numbers over words,
and statistics present numbers in an attractive format that
even the most innumerate man can follow. In addition,
statistics can be presented in a wide variety of forms, from
line graphs to tables to pie charts. Each performs its own
unique function and offers information from a new
perspective.
Yet with every benefit comes a drawback. Many
people do not realize that numbers in a graph
can be easily manipulated to reflect the author’s
own wishes. The problem with graphs is that
even with missing information, incomplete
figures, and vague captions, they can still look
convincing. People have grown so accustomed to
seeing graphs that they accept the information
in them unquestioningly.
In the following presentation, we will show you two such
misleading graphs, point out their errors, and attempt to
recreate the same graph using more accurate forms of
presentation. You will see how the same set of information
can produce two completely different graphs, and learn
about the many ways in which statistics can deceive you.
This graph is misleading in many ways. Here are
some examples of the most commonly used
graph-manipulation tactics.
First of all, there is the title
to consider. While retail sales
do go down in April 2000,
the title doesn’t accurately
reflect what the rest of the
graph shows. Yes, sales
rise and fall over a period
of a year and a half, but in
general, they have been
steadily rising since
November 1998.
Second, notice that the y-axis
does not begin at zero,
but at $225 billion. This has
the unfortunate effect of
making the rising slope
shown in the graph look much
steeper than it actually is.
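To see how much a truncated axis can exaggerate a change, here is a minimal Python sketch. Only the $225 billion baseline comes from the slide. The two sales figures are hypothetical placeholders, not values read from the graph, and the helper name drawn_height_ratio is our own.

```python
def drawn_height_ratio(later, earlier, baseline=0.0):
    """Ratio of the drawn bar heights when the y-axis starts at `baseline`."""
    if earlier <= baseline or later <= baseline:
        raise ValueError("both values must lie above the axis baseline")
    return (later - baseline) / (earlier - baseline)


# Hypothetical monthly sales in billions of dollars (illustrative only).
earlier_sales = 240.0
later_sales = 270.0

# Axis starting at zero: bar heights are proportional to the values.
honest = drawn_height_ratio(later_sales, earlier_sales)
# Axis starting at $225 billion, as on the original graph.
truncated = drawn_height_ratio(later_sales, earlier_sales, 225.0)

print(f"Axis from $0:   later bar is {honest:.3f}x the earlier bar")
print(f"Axis from $225: later bar is {truncated:.3f}x the earlier bar")
```

With these made-up numbers, a real increase of 12.5% is drawn as a bar three times as tall once the axis starts at $225 billion.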
Third, the little white box that
shows the rate of change from
previous months only includes the
last three months in the graph.
This immediately biases the graph
in favor of the title, as it shows
that sales have actually gone down
since February. A reader just
looking at the box will not know
that sales also went down in
May and September 1999, and
that these dips did not affect the
overall rise in sales one bit.
Fourth, note that the year
1999 is written under June
and July, and not January.
This may be a minor
transgression, but it will
certainly lead some readers
to believe that the time
period spans three whole,
consecutive years and not
fragments of a year.
One final observation: Is it fair
to compare the retail sales of all
the months of a year together?
Christmas in December, for
example, would prompt gift
buying, but slower months like
February might not have any
at all. Wouldn’t it be much
fairer to compare the same
months and calculate how much
sales have grown over the year?
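A same-month comparison of this kind can be sketched in a few lines of Python. The figures below are hypothetical placeholders rather than values from the graph, and the function name year_over_year_growth is our own.

```python
def year_over_year_growth(first_year, second_year):
    """Percent growth for each month that appears in both years."""
    growth = {}
    for month, before in first_year.items():
        if month in second_year:
            after = second_year[month]
            growth[month] = (after - before) / before * 100.0
    return growth


# Hypothetical sales in billions of dollars (illustrative only).
first_year = {"Mar": 230.0, "Apr": 225.0}
second_year = {"Mar": 253.0, "Apr": 247.5}

for month, pct in year_over_year_growth(first_year, second_year).items():
    print(f"{month}: {pct:+.1f}% versus the same month a year earlier")
```

In this made-up data, April is lower than March in both years, yet each month is up by 10% on the same month a year earlier. A month-to-month dip and a year-over-year rise can both be true at once.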
On this second graph, the y-axis begins at zero, which makes the rising slope much less
dramatic. When presented like this, it is also harder to tell which bars are higher and lower. The
last two bars, March and April, look almost exactly the same on this graph. If
the reader weren’t told that sales had actually gone down from March to April, he would
never know. The title, likewise, has been changed to something that encompasses all aspects
of this graph. In addition, instead of labeling a group of months with one year, we have given
each month its own year so that it’s easier to read.
[Figure: bar chart titled "Retail Sales from November 1998 to April 2000". Y-axis: Billions, $0.00 to $300.00. X-axis: Month, Nov-98 through Apr-00.]
However, this graph still does not address the problem of comparing all months together as
equals. In our next graph, we will show you what that might look like.
Comparing the same months of consecutive years adds yet another perspective to the picture.
From this graph, it is easy to see that sales have risen for each month, and by a
fairly predictable percentage at that. Nearly all the months rise by the same margin
from one year to the next. Even April sales, which the original graph proclaimed were falling,
have risen compared to April of the previous year.
[Figure: grouped bar chart titled "Retail Sales Rise". Y-axis: Billions, $0.00 to $300.00. X-axis: Month, November through April. Series: First Year, Second Year.]
Once again, while the original graph seems to be trying to convince us that April sales
have very obviously fallen, these two graphs tell us the opposite. Appropriately, the title for
this third graph has been changed completely to give the opposite message.
Of course, there are many different ways to lie with
statistics, and now we’ll show you how it can be done
with a pictograph.
The most deceptive aspect of this graph
is the way in which it was drawn. Firstly,
the perspective puts barrel 1979 at the
forefront and barrel 1973 at the back. This
effectively draws the reader’s eye to the 1979
barrel first and then forces him to read the
rest of the years in descending order.
Supporting this deceptive tactic is the fact
that only the foremost barrels show the
complete year. The rest are
indicated with only the last two digits, as in
‘76. Obviously, the makers of the graph
intend for the audience to read in reverse
chronological order, which has the effect of
making oil prices seem to fall.
Secondly, the perspective makes
it hard to judge the numerical
difference between each barrel.
For example, even though barrel
1975 appears to be over two
thirds the height of 1976, in
reality, the difference between
them is only $0.95. Likewise,
barrel 1973 seems less than half
the height of 1974, yet they differ
by a whopping $8.54!
A third misleading aspect is that
this pictograph doesn’t contain a
scale or axes of any kind. Without
them, the reader’s attention might be
directed to the area of each barrel
instead. Numerically, the smallest
barrel should only be about
1/5 of the largest barrel, but in
terms of area, the ratio is about
1/25. This makes the difference
between the two seem much larger than
it actually is.
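The jump from 1/5 to 1/25 follows from how area scales. As a minimal Python sketch, assuming each barrel is drawn as the same shape scaled uniformly in both width and height (the helper name area_ratio is our own):

```python
from fractions import Fraction


def area_ratio(linear_ratio):
    """Area ratio of two similar shapes whose heights differ by `linear_ratio`."""
    return linear_ratio ** 2


# Smallest barrel versus largest barrel, measured by height.
height_ratio = Fraction(1, 5)
print(area_ratio(height_ratio))  # prints 1/25
```

Because both dimensions shrink together, a fivefold difference in the numbers reads to the eye as a twenty-five-fold difference in size.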
Lastly, the way in which the
barrels are labeled seems
somewhat awkward.
Shouldn’t the prices be on
the barrels instead of the years?
Prices written on the barrels
would make it clear that it is the
cost that is changing, not the
years. And with more space
to indicate the years, readers
won’t be forced to read in
reverse.
[Figure: bar chart titled "Price per barrel of crude oil leaving Saudi Arabia on Jan. 1". Y-axis: Price, $0.00 to $14.00. X-axis: Year, 1973 to 1979.]
This graph neatly depicts the steadily rising prices of crude oil, and
shows sudden rises or drops. Each bar represents a number by its
height without using fancy images to distract the reader. The presence
of the x- and y-axes also makes it much more organized. While the
original graph tended to overstate small differences and gloss over wide
gaps, this graph is much more honest. One can see that the largest rise
occurs between 1973 and 1974, and that the price continues to rise by smaller
amounts over the next five years. The years on the x-axis are
all clearly marked in chronological order as well so that it is easy for
readers to understand.
For this type of information, using a line graph may be even more
useful than a bar graph. With a line to define the rise or fall of oil
prices, the shape of the change becomes all the more obvious.
This graph even seems to accentuate the huge rise between
1973 and 1974. The biggest benefit of using a line graph, however,
lies in the fact that each point is marked with a small, accurate dot.
These are much easier to read than bars, and the line between them
outlines the contour of the rise.
[Figure: line graph titled "Price per barrel of crude oil leaving Saudi Arabia on Jan. 1". Y-axis: Price, $0.00 to $16.00. X-axis: Year, 1973 to 1979.]
What makes statistical information reliable and accurate?
To make sure statistics are accurate and reliable, one must keep a
number of things in mind. Here are some of the most important
points to remember:
The first and most important is the collection of information. It’s
alarmingly easy to make graphs with missing figures, and this only
produces inaccurate results. Before making any graph, it is wise to make
sure that the data is sufficient. This is especially true in surveys, where the
accuracy of the results improves with the number of people
surveyed. Next to quantity in importance is quality. There is little point
in making a graph with inaccurate information.
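As a rough illustration of how sample size affects a survey, the sketch below computes an approximate margin of error for a surveyed proportion. It rests on assumptions the slides do not state: a simple random sample, the normal approximation, and a 95% confidence level (z = 1.96). The function name margin_of_error is our own.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a surveyed proportion.

    Assumes a simple random sample and the normal approximation;
    z = 1.96 corresponds to a 95% confidence level.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1.0 - proportion) / sample_size)


for n in (100, 400, 1600):
    points = margin_of_error(n) * 100
    print(f"n = {n:5d}: about +/- {points:.1f} percentage points")
```

Under these assumptions the margin shrinks with the square root of the sample size, so surveying four times as many people roughly halves the error rather than cutting it to a quarter.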
Even with accurate information, however, you must know which is the
best way of using and presenting it. Many perfectly accurate statistics
become misleading when they are unfairly compared. You would not, for
example, compare the average grades of a small school to the average
grades of a large school without making allowances for the larger
diversity of students. Therefore, when presenting data, care must be
taken to prevent this.
Although this graph is pleasing to look at, it can also
be confusing. The author meant for the Number of
Buyers to be read from the height of each picture,
but the reader’s attention will be more focused on
area. What makes it even more biased is that each
monitor on the graph is a Macintosh.