Originally posted Apr 22, 2015
A nagging little question finally gets my attention
In a recent post on measurement accuracy and the use of supplemental measurement data, the measured accuracy in the figure was given in terms of the mean and standard deviation. Error bounds or statistics are often provided in terms of standard deviation, but why that measure? Why not the mean or average deviation, something that is conceptually similar and measures approximately the same thing?
I’ve wondered about standard and average deviation since my college days, but my curiosity was never quite strong enough to compel me to find the differences, and I don’t recall my books or my teachers ever explaining the practicalities of the choice. Because I’m working on a post on variance reduction in measurements, this blog is the spur I need to learn a little more about how statistics meets the needs of real-world measurements.
First, a quick summary: Standard deviation and mean absolute deviation (also called mean deviation or average deviation) are both ways to express the spread of sampled data. If you average the absolute values of the sample deviations from the mean, you get the mean or average deviation. If you instead square the deviations, the average of the squares is the variance, and the square root of the variance is the standard deviation.
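The two definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are made up for this example, and the variance here divides by N (the population form) to match the plain "average of the squares" wording.

```python
import math

def mean_absolute_deviation(samples):
    """Average of the absolute deviations from the mean."""
    mean = sum(samples) / len(samples)
    return sum(abs(x - mean) for x in samples) / len(samples)

def standard_deviation(samples):
    """Square root of the average squared deviation (divides by N)."""
    mean = sum(samples) / len(samples)
    variance = sum((x - mean) ** 2 for x in samples) / len(samples)
    return math.sqrt(variance)

# Hypothetical sample values, chosen so the arithmetic is easy to follow.
samples = [2, 4, 4, 4, 5, 5, 7, 9]

print(mean_absolute_deviation(samples))  # 1.5
print(standard_deviation(samples))       # 2.0
```

Both numbers describe the same spread, but the standard deviation comes out larger because squaring gives the big deviations more weight.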
For the normal or Gaussian distributions that we see so often, expressing sample spread in terms of standard deviations neatly represents how often certain deviations from the mean can be expected to occur.
This plot of a normal or Gaussian distribution is labeled with bands that are one standard deviation in width. The percentage of samples expected to fall within that band is shown numerically. (Image from Wikimedia Commons)
Totaling up the percentages in each standard deviation band provides some convenient rules of thumb for expected sample spread:
- About one in three samples will fall outside one standard deviation
- About one in twenty samples will fall outside two standard deviations
- About one in 370 samples will fall outside three standard deviations
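These rules of thumb can be checked directly from the normal distribution. The sketch below assumes an ideal Gaussian and uses the complementary error function from Python's standard library; the helper name is just for illustration.

```python
import math

def fraction_outside(k):
    """Fraction of an ideal normal distribution lying more than
    k standard deviations from the mean (both tails combined)."""
    return math.erfc(k / math.sqrt(2))

for k in (1, 2, 3):
    p = fraction_outside(k)
    print(f"outside {k} sigma: {p:.4f} (about 1 in {1 / p:.0f})")

# outside 1 sigma: 0.3173 (about 1 in 3)
# outside 2 sigma: 0.0455 (about 1 in 22)
# outside 3 sigma: 0.0027 (about 1 in 370)
```

Real measurement data is rarely perfectly Gaussian, so these fractions are best treated as approximations.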
Compared to mean deviation, the squaring operation makes standard deviation more sensitive to samples with larger deviations. This sensitivity to outliers is often appropriate in engineering, because outliers are rare but can have disproportionately large effects.
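One way to see the effect of squaring is to compare two data sets with the same mean deviation, one with the deviation spread evenly and one with it concentrated in fewer samples. The values below are hypothetical and chosen only to make the comparison clean; `statistics.pstdev` is the standard library's population standard deviation.

```python
import statistics

def mean_absolute_deviation(samples):
    """Average of the absolute deviations from the mean."""
    mean = sum(samples) / len(samples)
    return sum(abs(x - mean) for x in samples) / len(samples)

# Hypothetical data: both sets have a mean of 10 and a mean deviation of 1.
evenly_spread = [9, 11, 9, 11]   # every sample is off by 1
concentrated = [8, 10, 10, 12]   # two samples are off by 2, two are exact

for name, data in (("evenly spread", evenly_spread),
                   ("concentrated", concentrated)):
    mad = mean_absolute_deviation(data)
    std = statistics.pstdev(data)
    print(f"{name}: mean deviation = {mad:.3f}, standard deviation = {std:.3f}")

# evenly spread: mean deviation = 1.000, standard deviation = 1.000
# concentrated: mean deviation = 1.000, standard deviation = 1.414
```

The mean deviation cannot tell the two sets apart, while the standard deviation rises when the same total deviation is carried by fewer, larger excursions.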
Standard deviation is also friendlier to mathematical operations because squares and roots are generally easier to handle than absolute values in operations such as differentiation and integration.
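As a small illustration of that friendliness (my own example, not something from the earlier posts), consider finding the single value c that best represents N samples x_i. With squared deviations, the sum is smooth and ordinary calculus gives the answer directly:

```latex
\frac{d}{dc}\sum_{i=1}^{N}(x_i - c)^2
  = -2\sum_{i=1}^{N}(x_i - c) = 0
\quad\Longrightarrow\quad
c = \frac{1}{N}\sum_{i=1}^{N} x_i
```

The result is simply the mean. The corresponding sum of absolute deviations has a kink wherever c equals one of the samples, so its derivative is undefined at those points and the same approach does not go through as cleanly.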
Engineering use of standard deviation and Gaussian distribution is not limited to one dimension. For example, in new calculations of mismatch error, the complementary elements of the reflection coefficient both have Gaussian distributions. Standard deviation measures, such as the 95% or two standard deviation limit, provide a practical representation of the expected error distribution.
I’ve written previously about how different views of data can each be useful, depending on your focus. Standard and mean deviation measures are no exception, and it turns out there’s a pretty lively debate in some quarters. Some contend, for example, that mean deviation is a better basis on which to draw conclusions if the samples include any significant amount of error.
I have no particular affection for statistics, but I have lots of respect for the insight it can provide and its power in making better and more efficient measurements in our noisy world.