Using Only the Simplest of Statistics

Many people believe that misleading the public with statistics requires a great deal of special expertise in the field. Nothing could be further from the truth. Using the five basic Rules of Befuddlement, any thoughtful person can fool many of the people much of the time using the most elementary kinds of statistics.

The naive and simple will believe that you cannot mislead them merely by counting. Either (they think) you will count accurately or you will be in error, and that is that. But what is it that you count? It is easy to mislead by following the first rule of befuddlement:

1. In order to confuse an issue, count unlike things together.

For example, I ate one piece of fruit for lunch yesterday. Today
I ate 21 pieces.

It should be obvious that today's lunch was
larger – that is, unless I mention that yesterday I had a banana
and today I brought grapes. Bananas and grapes are enough alike
that you can group them into one category (fruit) but they are
much more unlike than alike.

There is a corollary which goes with the first rule of befuddlement:

1a. When counting unlike things together, it is more confusing to supply irrelevant classifications.

The preceding example can be made more misleading by applying
this corollary as follows: Yesterday I ate 2 sandwiches, 4 cookies,
and only 1 piece of fruit for lunch. Today I had 2 sandwiches,
4 cookies, and 21 pieces of fruit.

Even a moderately careful audience will fall for this one
because it appears as though you have separated your counts into
meaningful categories. You *have* classified your counts,
but the classifications used have little to do with the inference
that the audience will draw.

Here's another example taken from our work at the hospital. In order to track how much business we're getting, we count patients. Now, obviously, a patient is a patient (just like a piece of fruit is a piece of fruit). If your audience is knowledgeable enough to know that inpatients create more revenue than outpatients, you should classify your counts in this way. Now everyone will accept your counts – but what have you really done? You've counted short-stay surgery beds, lab samples, and long-term renal dialysis together as if they were somehow equivalent. By counting in this or a similar way, you can say virtually anything about the hospital's current business.

To increase confusion and mislead even more people, the next logical step is to use percentages. A surprising number of people don't understand percentages at all and most of the rest can be quickly thrown off the scent. The second rule of befuddlement is especially appropriate to work with percentages:

2. Numbers can be made meaningless by including irrelevant data or omitting significant data.

Be careful not to omit only the irrelevant data; that would make your numbers more meaningful, not less. Although this rule seems simple, a careful application of it can produce quite subtle results.

A classic example of the second rule comes from the data
processing department whose computer was unexpectedly unavailable
for two hours on Tuesday morning, again for three hours on
Thursday afternoon, and yet again for three more hours on Friday
morning. The next week, they reported that the computer was
available 95.2% of the time.

This was true, but nobody cared
because most of that available time was during evenings, nights, and
weekends. The percentage looks good because a lot of irrelevant
time was included.
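The arithmetic behind that figure can be sketched in a few lines of Python. This is only an illustration: the 40-hour working week and the assumption that all eight outage hours fell inside it are mine, not the department's.

```python
# Outages from the example, in hours (Tuesday, Thursday, Friday).
outage_hours = [2, 3, 3]

HOURS_IN_WEEK = 7 * 24   # 168: every hour, including nights and weekends
BUSINESS_HOURS = 5 * 8   # 40: ASSUMPTION, an 8-hour day, Monday to Friday


def availability(total_hours, down_hours):
    """Percentage of total_hours during which the computer was up."""
    return 100.0 * (total_hours - down_hours) / total_hours


# ASSUMPTION: every outage hour fell within business hours.
down = sum(outage_hours)
whole_week = availability(HOURS_IN_WEEK, down)
working_week = availability(BUSINESS_HOURS, down)

print(f"Whole week:     {whole_week:.1f}% available")
print(f"Business hours: {working_week:.1f}% available")
```

Under these assumptions the first figure is the reported 95.2% and the second is 80.0%. Nothing changed except the denominator.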

The preceding example might fool a lot of people, but it probably won't fool those who were present on Tuesday, Thursday, and Friday. People tend to have a good intuitive grasp of realities that they live through and aspiring befuddlers are well advised to aim their statistics at people who are somewhat removed from the matter at issue.

The third rule of befuddlement also relates to work with percentages:

3. The impact of a percentage increases as the number of cases decreases.

Of course, the validity of your statistic decreases as you have fewer cases, but if nobody knows the size of the sample they will probably just accept the percentage.

Suppose we were to examine certain reported incidents at the hospital. For example, we might be concerned with the patients who received incorrect medication because the nurse misread the label. Here's the basic data:

| | April | May | June |
|---|---|---|---|
| Misread label | 1 | 4 | 2 |
| All med errors | 21 | 17 | 35 |

To show how terribly serious the problem is, we should calculate
the percentage only for May: In May, 23.5% of medication errors
were due to misreading the label.

Similarly, to show how
ridiculous it is to worry about this problem, we would use only
April: In April, less than 5% of medication errors were due to
misreading the label.

Which is correct? Neither, of course;
misreading labels is more likely to be about 10% of the problem
(7 out of 73 cases).
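For anyone who wants to check the selection effect, here is a minimal Python sketch using the counts from the table. The variable and function names are my own.

```python
# Counts taken from the table of reported incidents.
misread = {"April": 1, "May": 4, "June": 2}
all_errors = {"April": 21, "May": 17, "June": 35}


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# One percentage per month: pick whichever suits your argument.
monthly = {month: share(misread[month], all_errors[month]) for month in misread}

# The pooled figure uses all three months at once.
pooled = share(sum(misread.values()), sum(all_errors.values()))

for month, pct in monthly.items():
    print(f"{month}: {pct:.1f}% of medication errors were misread labels")
print(f"All three months: {pooled:.1f}%")
```

The monthly percentages range from under 5% to over 23%, while the pooled value sits just below 10%.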

Notice that in this example it does not seem as though we were being overly selective. A month is a natural period of time to use, and why should one month be different from another? Avoid being too obvious about the selection you use. Don't use "the errors which occurred every other Tuesday in even numbered months." Such obviousness just invites your audience to think about what you've done, which otherwise they may not do.

Averages are a good way to mislead because you can apply so many of the rules of befuddlement to them. For example, the third rule can be easily adapted for use with averages:

3a. The potential of an average to mislead increases as the number of cases decreases.

This can be illustrated by using the data on medication errors
that we used in the previous section. During one sample period,
there was, on average, one misread label every 7.75 days.

(It helps not to specify the length of the sample period since
that points out the paucity of the data involved.) Of course, if
we use April instead of May as our sample period, we find an average
of 30.00 days between misread labels.
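The same trick can be sketched in Python. The calendar lengths of the months are the only data added to the table's counts, and the function name is my own.

```python
# Misread-label counts from the table, plus calendar days in each month.
misread = {"April": 1, "May": 4, "June": 2}
days_in_month = {"April": 30, "May": 31, "June": 30}


def days_between(days, events):
    """Average number of days per event over a sample period."""
    return days / events


for month in misread:
    avg = days_between(days_in_month[month], misread[month])
    print(f"{month}: one misread label every {avg:.2f} days")

# Using the full three months gives a less dramatic answer.
quarter = days_between(sum(days_in_month.values()), sum(misread.values()))
print(f"April through June: one every {quarter:.2f} days")
```

May yields 7.75 days and April 30.00, as quoted above. The full three months give 13.00, a number far too dull to publish.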

The fourth rule of befuddlement, although it may be used with percentages, is ideally suited to work with averages:

4. A statistic is more befuddling if it is calculated to three decimal places.

A typical example of this is the grade-point average, which almost every school calculates to three or more decimal places. The average is calculated from letter grades, which are only accurate to the nearest 7% or 10%, but the extra digits in the average are usually accepted by everyone. (Actually, letter grades are only assigned for the passing range, often 70% to 100%, so the data is not even accurate to one decimal place.) By calculating the grade-point averages to three decimal places, it is possible to set up an entire industry of ranking students for financial aid, employment, and so on. The same principle can be applied when you want to evaluate employees or departments without having enough information to do it.
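A rough Python sketch shows how little the third decimal place means. Everything in it is hypothetical: the transcript is invented, and the 4.0 scale is an assumption, since no particular scale is named here.

```python
# ASSUMPTION: the common 4.0 scale; the article does not name one.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

# Hypothetical transcript, invented purely for illustration.
transcript = ["A", "B", "B", "A", "C", "B", "A"]


def gpa(grades):
    """Unweighted mean of the grade points for a list of letter grades."""
    return sum(GRADE_POINTS[g] for g in grades) / len(grades)


reported = gpa(transcript)

# Nudge one course down by a single letter, the smallest change
# the underlying data can register.
nudged = transcript.copy()
nudged[0] = "B"
shift = reported - gpa(nudged)

print(f"Reported GPA: {reported:.3f}")
print(f"One letter grade in one course moves it by {shift:.3f}")
```

With seven courses, the smallest change the data can register moves the average by about 0.143, more than a hundred times the 0.001 resolution of the reported figure.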

Once you've computed your misleading statistics, you will want to present them in a form which is clear, memorable, and convincing. One of the best devices for this purpose is the graph. (This is not a rule of befuddlement; graphs can be just as effective with good statistics as with misleading ones.) Besides simply graphing misleading statistics, you can increase the effectiveness of your data by applying the fifth rule of befuddlement:

5. To make people feel good (or bad) about data, let them think there is a trend.

This can be easily done by using a graph with the points all tied together. (The effect is much reduced in bar charts.) In the following example, even the most cursory glance will confirm that there is a distinct, increasing trend. The example graphs the number of bascule leaves on the Main Street bridge, the average number of meals per day, the number of rules of befuddlement, and the days of the week:

That the trend exists only in the picture is not important; it is there and people will be influenced by it.

These five simple rules can be applied by almost anyone in almost any situation. A practiced application is adequate to mislead and befuddle the vast majority of people, especially in areas outside of their normal field of experience.

It may be useful to summarize our five rules into an overarching rule, the Cardinal Rule of Befuddlement:

* To confuse, mislead, and befuddle your audience, distract them from the real issues.

Of course, this presupposes that you yourself are aware of the
real issues. It would be a disaster for your case if you
accidentally "distracted" people *to* the
real issues instead of away from them!

May 1991

October 2002