When to Report the Standard Deviation vs. the Standard Error
“I've always been confused about the difference between the standard deviation and the standard error, specifically when you use one or the other. You see SD in a paper of some data, then see SE in another paper.”
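The standard deviation (SD) is a measure of the natural amount of variation in whatever you are measuring. A common misconception in statistics is that increasing the sample size decreases the SD. That is incorrect. When you estimate the standard deviation of a population from a sample, the quantity you are estimating does not depend on sample size. The sample variance (s², calculated with n - 1 in the denominator) is what is called an unbiased estimate of the population variance (σ²): on average, it will equal σ² no matter how many individuals you measure. Strictly speaking, the sample standard deviation (s) slightly underestimates the population standard deviation (σ) in very small samples, but that bias is small and shrinks quickly. The key point is that s does not systematically get smaller as you measure more individuals; a larger sample simply gives you a more stable estimate of the same σ.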
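To make this concrete, here is a minimal simulation sketch. The population is hypothetical: 1,000 weights drawn from a normal distribution with a made-up mean of 450 kg and SD of 50 kg, and the helper name average_sample_sd is mine. It averages the sample SD over many random samples of different sizes.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1,000 adult polar bear weights in kg.
# The mean (450) and SD (50) are invented values for illustration only.
population = [random.gauss(450, 50) for _ in range(1000)]
sigma = statistics.pstdev(population)  # SD of the whole population


def average_sample_sd(n, repeats=2000):
    """Average the sample SD (n - 1 denominator) over many random samples of size n."""
    sds = [statistics.stdev(random.sample(population, n)) for _ in range(repeats)]
    return statistics.mean(sds)


avg_sd = {n: average_sample_sd(n) for n in (5, 20, 100)}
for n, value in avg_sd.items():
    print(f"n={n:>3}: average sample SD = {value:.1f} (population SD = {sigma:.1f})")
```

If you run this, the average sample SD should stay close to the population SD at every sample size (a little low at n = 5, where the small-sample bias is largest) rather than shrinking as n grows.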
The standard error (SE) is a measure of the precision of an estimated parameter (e.g. the precision of the estimate of the mean). In contrast to the standard deviation, the SE depends on sample size: as sample size increases, the standard error decreases. To understand why this happens, you need to understand what the SE really is. The SE is the SD of the sampling mean distribution (more formally, the sampling distribution of the mean). For the mean, it is estimated from a single sample as s/√n, where n is the sample size, which is why it shrinks as n grows even though s does not.
What is a sampling mean distribution? Let’s say you have a population of 1,000 polar bears. If you want to estimate the mean and variability in the weight of adult polar bears, are you going to go out and weigh all 1,000 polar bears? Absolutely not! Instead, you are going to weigh a subset, or “sample”, of polar bears. Let’s say you have decided that you will weigh 20 polar bears. Is every polar bear going to have the same weight? Certainly not. So if you weigh one set of 20 polar bears and calculate a mean, will that be the same exact mean you calculate from measuring 20 different bears? No. What we do know is that both calculated means (x̄) will be unbiased estimates of the true population mean (µ). But, how confident are we that we are close to the true population mean? If you took all possible combinations of 20 polar bears from the population of 1,000 bears, calculated a mean for each combination, and plotted the probability distribution of all of those sample means, you would have the sampling mean distribution. So again, I will remind you that the SE is the SD of the sampling mean distribution.
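The thought experiment above can be approximated in a short sketch. Listing every possible combination of 20 bears out of 1,000 is not feasible, so this version draws 5,000 random samples instead. The population values (mean 450 kg, SD 50 kg) and all variable names are assumptions made for illustration.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population of 1,000 adult polar bear weights (kg).
# Mean 450 and SD 50 are invented for illustration.
population = [random.gauss(450, 50) for _ in range(1000)]
mu = statistics.mean(population)       # true population mean
sigma = statistics.pstdev(population)  # true population SD

n = 20
# Stand-in for "all possible combinations": 5,000 random samples of 20 bears,
# each drawn without replacement, with one mean calculated per sample.
sample_means = [statistics.mean(random.sample(population, n)) for _ in range(5000)]

mean_of_means = statistics.mean(sample_means)
empirical_se = statistics.stdev(sample_means)  # SD of the sample means
theoretical_se = sigma / math.sqrt(n)          # SD divided by sqrt(n)

print(f"population mean = {mu:.1f}, mean of sample means = {mean_of_means:.1f}")
print(f"SD of sample means = {empirical_se:.1f}, sigma / sqrt(n) = {theoretical_se:.1f}")
```

The SD of the simulated sample means should land close to σ/√n. Because the bears are sampled without replacement from a finite population, the simulated value will tend to be very slightly smaller than σ/√n.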
Now let’s say that you could only weigh 5 polar bears. What do you think the sampling mean distribution would look like then? It will be wider, because some samples will have calculated means that are farther away from the true population mean, and therefore the SE will be larger. The natural variation in polar bear weights (the SD) hasn’t changed: polar bears are still polar bears. But we’ve measured fewer bears, and a mean based on fewer bears is more strongly affected by any single unusually heavy or light bear, so our estimated mean is less likely to fall close to the true population mean. The SE is therefore larger (we have less precision in our estimate).
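Here is a rough sketch of that comparison, again using a hypothetical population (mean 450 kg, SD 50 kg, both invented) and a helper I have named empirical_se. It estimates the spread of the sample means for samples of 5 bears and for samples of 20 bears.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical population of 1,000 adult polar bear weights (kg).
population = [random.gauss(450, 50) for _ in range(1000)]


def empirical_se(n, repeats=5000):
    """SD of the means of many random samples of size n (an empirical SE)."""
    means = [statistics.mean(random.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)


se_by_n = {n: empirical_se(n) for n in (5, 20)}
ratio = se_by_n[5] / se_by_n[20]

for n, se in se_by_n.items():
    print(f"n={n:>2}: SD of sample means (SE) = {se:.1f} kg")
print(f"ratio of SEs (n=5 vs n=20) = {ratio:.2f}")
```

Because the SE scales with 1/√n, going from 20 bears to 5 bears should roughly double it (√(20/5) = 2), while the population SD stays exactly the same.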
Now that I have hopefully helped clarify the difference between the standard deviation and the standard error, let’s revisit the question: when should you report the SD vs. the SE in a publication? In my opinion, if your paper is more descriptive, and you are trying to communicate information about the mean and variability among individuals for the trait (or other variable) of interest, then you should report the standard deviation. In contrast, if you have run an experiment and you are comparing means from different groups (e.g. multiple treatments), then it is important to communicate the amount of precision you have in your estimates of the different means. Thus, when you compare the means and declare a difference between those means (reject the null hypothesis), you want to include a measure of how much confidence you have that they are truly different. Therefore, in this case, I would report the standard errors.
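As a small illustration of the two reporting styles, the sketch below calculates the mean, SD, and SE for two groups. The group names and weights are made up, and summarize is a hypothetical helper, not a function from any library.

```python
import math
import statistics


def summarize(values):
    """Return (mean, SD, SE) for a list of measurements."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, n - 1 denominator
    se = sd / math.sqrt(n)         # standard error of the mean
    return mean, sd, se


# Hypothetical weights (kg) for two invented groups.
groups = {
    "control": [412, 455, 470, 431, 495, 446, 462, 480],
    "treatment": [468, 501, 489, 455, 520, 476, 493, 510],
}

for name, values in groups.items():
    mean, sd, se = summarize(values)
    print(f"{name}: mean = {mean:.1f}, SD = {sd:.1f}, SE = {se:.1f} (n = {len(values)})")
```

In a descriptive paper you would report the mean with the SD; when comparing the two group means you would report the mean with the SE. Either way, saying which one you are reporting and giving n lets a reader convert between them.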
Still having trouble wrapping your brain around the difference between the standard deviation and the standard error? Check out this video.