Published on July 18, 2019
Most people see the headline “90% of Drivers Consider Themselves Above Average” and think “wow, other people are terrible at evaluating themselves objectively.” What you should think is “that doesn’t sound so implausible if we’re using the mean as the average in a heavily negatively skewed distribution.”
Although a headline like this is often used to illustrate the illusion of superiority (where people overestimate their competence), it also provides a useful lesson in clarifying your assertions when you talk about statistics. In this particular case, we need to differentiate between the mean and median of a set of values. Depending on the question we ask, it is possible for 9 out of 10 drivers to be above average. Here’s the data to prove it:
Driver Skill Dataset and Dot Plot with Mean and Median
The distinction is whether we use mean or median for “average” driver skill. Using the mean, we add up all the values and divide by the number of values, giving us 8.03 for this dataset. Since 9 of the 10 drivers have a skill rating greater than this, 90% of the drivers could be considered above average!
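To make the arithmetic concrete, here is a minimal sketch in Python. The original dataset is not reproduced in the text, so the `skills` list below is a hypothetical set of ratings chosen only so that it matches the summary figures quoted here (a mean of 8.03 with 9 of 10 drivers above it). It is not the actual data behind the plot.

```python
from statistics import mean

# Hypothetical skill ratings (NOT the article's data): one very poor driver
# drags the mean down, which is what a negatively skewed distribution looks like.
skills = [1.0, 8.1, 8.3, 8.5, 8.6, 8.7, 8.9, 9.2, 9.4, 9.6]

avg = mean(skills)  # sum of the values divided by the number of values
above_mean = sum(s > avg for s in skills)  # count drivers rated above the mean

print(f"mean = {avg:.2f}")  # mean = 8.03
print(f"{above_mean} of {len(skills)} drivers are above the mean")  # 9 of 10
```

A single low outlier is enough to pull the mean below almost everyone else, which is why "above average" can be true for nearly the whole group.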
The median, in contrast, is found by ordering the values from lowest to highest and selecting the value where half the data points are smaller and half are larger. Here it’s 8.65, with 5 drivers below and 5 above. By definition, 50% of drivers are below the median and 50% exceed it. If the question is “do you consider yourself better than 50% of other drivers?” then 90% or more of drivers cannot all truthfully answer in the affirmative.
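The same check can be sketched for the median. As before, the `skills` list is a hypothetical dataset built to match the figures quoted in this article (median of 8.65), not the original data.

```python
from statistics import median

# Hypothetical skill ratings (NOT the article's data), already sorted.
skills = [1.0, 8.1, 8.3, 8.5, 8.6, 8.7, 8.9, 9.2, 9.4, 9.6]

# With an even number of values, the median is the average of the two
# middle values, here 8.6 and 8.7.
mid = median(skills)
below = sum(s < mid for s in skills)
above = sum(s > mid for s in skills)

print(f"median = {mid:.2f}")  # median = 8.65
print(f"{below} drivers below, {above} drivers above")  # 5 below, 5 above
```

However extreme the outlier, the split around the median stays at half and half, so the median is the right yardstick for "better than 50% of other drivers."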
(The median is a particular case of a percentile, which is a type of quantile: the value below which a given percentage of the numbers fall. The median is the 50th percentile: 50% of the numbers in a dataset are smaller. We could also find the 90th percentile, where 90% of values are smaller, or the 10th percentile, where 10% of values are smaller. Percentiles are an intuitive way to describe a dataset.)
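Percentiles can be sketched with Python's standard library as well. The `percentile` helper below is a hypothetical convenience wrapper around `statistics.quantiles`, and the `skills` list is again made-up data that matches this article's summary figures. Note that there are several conventions for interpolating percentiles, so on a dataset this small other tools may report slightly different values for the 10th and 90th percentiles.

```python
from statistics import quantiles

# Hypothetical skill ratings (NOT the article's data).
skills = [1.0, 8.1, 8.3, 8.5, 8.6, 8.7, 8.9, 9.2, 9.4, 9.6]

def percentile(data, p):
    """Return the p-th percentile (p from 1 to 99), linearly interpolated.

    Hypothetical helper: quantiles(n=100) returns the 99 cut points that
    split the data into 100 equal-probability groups.
    """
    cuts = quantiles(data, n=100, method="inclusive")
    return cuts[p - 1]

p10, p50, p90 = (percentile(skills, p) for p in (10, 50, 90))
print(f"10th: {p10:.2f}, 50th (median): {p50:.2f}, 90th: {p90:.2f}")
```

The 50th percentile computed this way agrees with the median, which is a quick sanity check that the helper is doing what we expect.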