Homework 6.1: Bootstrapping “theory” with hacker stats [25 pts]


Say we have a data set with \(n\) unique measurements.

a) Show that, on average, a fraction \((1-1/n)^n\) of the measurements do not appear in a bootstrap sample, that is, a sample of \(n\) measurements drawn with replacement from the data set.

Note that for large samples, this is approximately \(1/e \approx 1/2.7 \approx 0.37\), since

\begin{align} \lim_{n\to\infty} (1-1/n)^n = 1/e. \end{align}
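To get a feel for how quickly this limit is approached, the expression can be evaluated for a few values of \(n\). This is only a numerical sanity check, not a derivation; the function name `missing_fraction_theory` and the chosen values of \(n\) are illustrative assumptions, not part of the assignment.

```python
import math


def missing_fraction_theory(n):
    """Evaluate (1 - 1/n)^n, the claimed average fraction of
    measurements absent from a bootstrap sample of n unique points.
    (Hypothetical helper name, chosen for illustration.)"""
    return (1 - 1 / n) ** n


# Compare the finite-n expression with its large-n limit, 1/e.
for n in [2, 10, 100, 1000, 10000]:
    value = missing_fraction_theory(n)
    print(f"n = {n:>5d}: (1 - 1/n)^n = {value:.5f}, 1/e = {1 / math.e:.5f}")
```

The gap to \(1/e\) shrinks roughly in proportion to \(1/n\), so even modest data sets sit close to the limiting value.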

b) Use hacker stats to demonstrate that the result you derived in part (a) is indeed true. Hint: Think about a convenient “data set” to use for drawing samples.

This is kind of fun; you’re investigating some theory behind hacker stats with hacker stats! If you couldn’t solve part (a), you could still make a plot using results from part (b) and get the basic idea.
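One possible way to set up the simulation for part (b) is sketched below. It assumes NumPy is available and uses the integers \(0, 1, \ldots, n-1\) as the convenient "data set," since they are guaranteed to be unique. The function name `frac_missing`, the number of replicates, and the seed are all illustrative choices, not prescribed by the problem.

```python
import numpy as np


def frac_missing(n, n_reps=1000, rng=None):
    """Estimate the average fraction of n unique measurements that
    do not appear in a bootstrap sample, using n_reps replicates.
    (Hypothetical helper; the name and defaults are assumptions.)"""
    if rng is None:
        rng = np.random.default_rng()

    # Any n unique values will do; integers are the simplest choice.
    data = np.arange(n)

    fracs = np.empty(n_reps)
    for i in range(n_reps):
        # A bootstrap sample: n draws with replacement from the data.
        bs_sample = rng.choice(data, size=n, replace=True)

        # Fraction of the original measurements absent from the sample.
        fracs[i] = 1 - len(np.unique(bs_sample)) / n

    return fracs.mean()


# Compare the simulated average with (1 - 1/n)^n for a few n.
rng = np.random.default_rng(3252)  # arbitrary seed for reproducibility
for n in [5, 20, 100, 500]:
    simulated = frac_missing(n, n_reps=2000, rng=rng)
    theory = (1 - 1 / n) ** n
    print(f"n = {n:>3d}: simulated = {simulated:.4f}, theory = {theory:.4f}")
```

Plotting the simulated fractions against \(n\), together with the curve \((1-1/n)^n\) and a horizontal line at \(1/e\), would show the same comparison graphically.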