mindstalk | The power of small samples. (Reply)

I'm reading How To Measure Anything and it's had some surprising revelations.

Rule of Five: if you take 5 random samples from a population, there's a 93.75% chance the population median lies between the minimum and maximum of the sample. For it not to, all 5 samples would have to land on the same side of the median, e.g. all below it. They're random, so each one has a 0.5 chance of being below, and the chance of all five is 0.5**5 = 3.125%. They could equally all be above the median, so that's another 3.125%, leaving 100% - 6.25% = 93.75%.

The same math, 1 - 2 * 0.5**n, gives that a sample set of 3 has a 75% chance of bracketing the population median! A set of 7, 98.4%.
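Here's a quick sketch to check those numbers. The function names are mine, not the book's, and the simulation assumes a population that's uniform on [0, 1) (so the median is 0.5), though any continuous distribution should behave the same way.

```python
import random


def bracket_probability(n: int) -> float:
    """Exact chance that n random samples straddle the population median."""
    # All n land below the median with probability 0.5**n, and likewise all above.
    return 1 - 2 * 0.5 ** n


def simulate_bracket(n: int, trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate, assuming uniform(0, 1) draws (median 0.5)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        # A hit means the sample range contains the true median.
        if min(sample) < 0.5 < max(sample):
            hits += 1
    return hits / trials


for n in (3, 5, 7):
    print(n, bracket_probability(n), simulate_bracket(n))
```

The exact column should read 0.75, 0.9375 and 0.984375, with the simulated column landing close to each.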

Single Sample Majority: This one's a bit trickier. Say you have a bunch of urns, each containing red and green marbles, and the fraction of red marbles is uniformly distributed across the urns, anywhere from 0% to 100%. If you draw a single marble from each urn and bet that the majority color in that urn matches the marble you drew, you'll be right 75% of the time. Bayes:

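p(urn|draw) = p(draw|urn) * p(urn)/p(draw)

where "urn" means the urn is majority red and "draw" means the marble drawn is red.

Uniform, so p(urn) = p(draw) = 0.5: half the urns are majority red, and overall half the draws come up red. The two cancel.

Uniform, so p(draw|urn) = 0.75. For individual urns it varies: if the urn is 51% red, 51% chance of drawing a red; if 95% red, 95% chance of a red draw. But the majority-red urns range evenly over all the percentages from 50% to 100%, and the average of that range is 75%. So p(urn|draw) = 0.75 * 0.5/0.5 = 0.75.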
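A minimal simulation of the urn bet, as a sanity check. It assumes each urn's red fraction is a continuous uniform draw on [0, 1) and ignores exact 50/50 ties; the function name is just for illustration.

```python
import random


def single_sample_majority(trials: int = 200_000, seed: int = 0) -> float:
    """Estimate how often one drawn marble matches the urn's majority color."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        red_fraction = rng.random()              # this urn's share of red marbles
        drew_red = rng.random() < red_fraction   # draw a single marble
        majority_red = red_fraction > 0.5
        # The bet wins when the drawn color is the urn's majority color.
        if drew_red == majority_red:
            correct += 1
    return correct / trials


print(single_sample_majority())
```

The estimate should come out near 0.75, matching the Bayes argument.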

Honestly this one seems less generally useful than the Rule of Five, but it's still impressive -- if you don't know much, even a single sample can be meaningful.

