Bias
The bias of an estimate is the difference between the expectation value of the point estimate and the value of the parameter,

$$\text{bias} = \langle \hat{\theta} \rangle - \theta.$$

Note that the expectation value of $\hat{\theta}$ is computed over the (unknown) generative distribution whose PDF is $f(x)$.
Bias of the plug-in estimate for the mean
We often want a small bias because we want to choose estimates that
give us back the parameters we expect. Let’s first investigate the bias of the plug-in estimate of the mean. As a reminder, the plug-in estimate is

$$\hat{\mu} = \bar{x},$$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$ is the arithmetic mean of the observed data. To compute the bias of the plug-in estimate, we need to compute $\langle \bar{x} \rangle$ and compare it to the true mean, $\mu$.
Because

$$\langle \bar{x} \rangle = \frac{1}{n}\sum_{i=1}^n \langle x_i \rangle = \frac{1}{n}\,n\mu = \mu,$$

the bias in the plug-in estimate for the mean is zero. It is said to be unbiased.
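As a quick numerical sanity check (a sketch, not part of the derivation; the Exponential distribution, sample size, and number of trials are arbitrary choices), we can approximate the bias of the plug-in estimate for the mean by simulating many data sets:

```python
import numpy as np

rng = np.random.default_rng(3252)

# Assumed generative distribution: Exponential with mean mu = 2.0
mu = 2.0
n = 10             # number of measurements per data set
n_trials = 100_000 # number of simulated data sets

# Plug-in estimate of the mean for each simulated data set
x_bar = rng.exponential(mu, size=(n_trials, n)).mean(axis=1)

# The average of the plug-in estimates minus mu approximates the bias,
# which should be close to zero
bias_estimate = x_bar.mean() - mu
print(bias_estimate)
```

The estimated bias hovers near zero regardless of the generative distribution chosen, consistent with the calculation above.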
Bias of the plug-in estimate for the variance
To compute the bias of the plug-in estimate for the variance, first recall that the variance, as the second central moment, is computed as

$$\sigma^2 = \langle x^2 \rangle - \langle x \rangle^2 = \langle x^2 \rangle - \mu^2.$$

The plug-in estimate replaces expectations over the generative distribution with averages over the data,

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2 = \overline{x^2} - \bar{x}^2,$$

where $\overline{x^2} = \frac{1}{n}\sum_{i=1}^n x_i^2$. So, the expectation value of the plug-in estimate is

$$\langle \hat{\sigma}^2 \rangle = \langle \overline{x^2} \rangle - \langle \bar{x}^2 \rangle = \langle x^2 \rangle - \langle \bar{x}^2 \rangle.$$

We now need to compute $\langle \bar{x}^2 \rangle$, which is a little trickier. We will use the fact that the measurements are independent, so

$$\langle x_i x_j \rangle = \langle x_i \rangle \langle x_j \rangle = \mu^2$$

for $i \ne j$. Thus, we have

$$\langle \bar{x}^2 \rangle = \frac{1}{n^2}\sum_{i=1}^n \sum_{j=1}^n \langle x_i x_j \rangle = \frac{1}{n^2}\left(n\,\langle x^2 \rangle + n(n-1)\,\mu^2\right) = \frac{\langle x^2 \rangle}{n} + \frac{n-1}{n}\,\mu^2,$$

so that

$$\langle \hat{\sigma}^2 \rangle = \langle x^2 \rangle - \frac{\langle x^2 \rangle}{n} - \frac{n-1}{n}\,\mu^2 = \frac{n-1}{n}\left(\langle x^2 \rangle - \mu^2\right) = \frac{n-1}{n}\,\sigma^2.$$

Therefore, the bias is

$$\text{bias} = \langle \hat{\sigma}^2 \rangle - \sigma^2 = -\frac{\sigma^2}{n}.$$
If $\hat{\sigma}^2$ is the plug-in estimate for the variance, an unbiased estimator would instead be

$$\frac{n}{n-1}\,\hat{\sigma}^2 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^2.$$
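We can check both results numerically (again a sketch; the Normal generative distribution and the parameter values are arbitrary choices). NumPy's `np.var` computes the plug-in estimate with `ddof=0` (the default) and the unbiased estimate with `ddof=1`:

```python
import numpy as np

rng = np.random.default_rng(3252)

# Assumed generative distribution: Normal with variance sigma^2 = 4.0
sigma2 = 4.0
n = 10
n_trials = 200_000

x = rng.normal(0.0, np.sqrt(sigma2), size=(n_trials, n))

# Plug-in (biased) estimate: divides the sum of squares by n
var_plugin = np.var(x, axis=1, ddof=0)

# Unbiased estimate: divides by n - 1
var_unbiased = np.var(x, axis=1, ddof=1)

# Expectation of the plug-in estimate is (n - 1) / n * sigma^2
print(var_plugin.mean())    # ≈ 3.6, i.e., (9/10) * 4.0
print(var_unbiased.mean())  # ≈ 4.0
```

The average plug-in estimate comes out low by the factor $(n-1)/n$, exactly as the derivation predicts, while the `ddof=1` estimate averages to the true variance.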
Justification of using plug-in estimates
Despite the apparent bias in the plug-in estimate for the variance, we will normally just use plug-in estimates going forward. (We will use the hat, e.g., $\hat{\theta}$, to denote an estimate, which can be either a plug-in estimate or not.) Note that the bootstrap procedures we lay out in what follows do not need to use plug-in estimates, but we will use them for convenience. Why do this? The bias is typically small. We just saw that the biased and unbiased estimators of the variance differ by a factor of $n/(n-1)$, which is negligible for large $n$. In fact, plug-in estimates tend to have much smaller error than the confidence intervals for the parameter estimate, which we will discuss next.
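To see how quickly the distinction fades (a brief sketch; the sample sizes shown are arbitrary), we can print the correction factor $n/(n-1)$ for a few values of $n$:

```python
# The biased and unbiased variance estimators differ by a factor of n / (n - 1)
for n in (10, 100, 1000):
    print(n, n / (n - 1))
```

Already at $n = 100$ the two estimators differ by about one percent, and at $n = 1000$ by about a tenth of a percent.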