mindstalk wrote 2020-03-15 12:12 am

The power of small samples.

I'm reading How To Measure Anything and it's had some surprising revelations.

Rule of Five: if you take 5 random samples from a population, there's a 93.75% chance the population median lies between the minimum and maximum of the sample. For it not to, all 5 samples would have to fall on the same side of the median, e.g. all below it. Each random draw independently has a 50% chance of landing below the median, so the chance of that is 0.5**5 = 3.125%. They could also all be above the median, so that's another 3.125%, leaving 100% - 6.25% = 93.75%.

The same math gives that a sample set of 3 has a 75% chance of bracketing the population median! A set of 7, 98.4%.
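If you'd rather check this than trust the arithmetic, here's a minimal Python sketch. The function names are mine, not the book's, and the simulated population is assumed to be uniform on [0, 1) (so its median is 0.5); any continuous distribution should behave the same way, since only the side of the median matters.

```python
import random

def bracket_probability(n):
    """Exact chance that n independent random draws straddle the population median."""
    # The sample misses the median only if all n draws land below it
    # (0.5**n) or all land above it (another 0.5**n).
    return 1 - 2 * 0.5 ** n

def simulate_bracket(n, trials=100_000, seed=0):
    """Estimate the same chance by brute force on an assumed uniform population."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]  # uniform on [0, 1), median 0.5
        if min(sample) < 0.5 < max(sample):
            hits += 1
    return hits / trials

for n in (3, 5, 7):
    print(n, bracket_probability(n), simulate_bracket(n))
```

The simulated column should land close to the exact one for each sample size.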


Single Sample Majority: This one's a bit trickier. Say you have a bunch of urns, each containing red and green marbles, and the fraction of red marbles in an urn is uniformly distributed: any percentage from 0% to 100% is equally likely. If you draw a single marble from each urn and bet that the urn's majority color matches the marble you drew, you'll be right 75% of the time. Bayes:

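p(urn|draw) = p(draw|urn) * p(urn)/p(draw)

Here "urn" means the urn is majority red and "draw" means the marble drawn is red; green works the same way by symmetry.

Uniform, so p(urn) = p(draw) = 0.5. The two cancel, leaving p(urn|draw) = p(draw|urn).

Uniform, so p(draw|urn) = 0.75. For individual urns it varies: if the urn is 51% red, 51% chance of drawing a red; if 95% red, 95% chance of a red draw. But the majority-red urns range evenly over all the percentages from 50% to 100%, and the average of those is 75%.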
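The 75% figure is also easy to sanity-check by simulation. This is a rough sketch under the same assumption as above (each urn's red fraction is drawn uniformly from [0, 1)), with each urn treated as large enough that one draw is red with probability equal to that fraction. The function name is made up for illustration.

```python
import random

def single_sample_majority(trials=200_000, seed=0):
    """Estimate how often one drawn marble matches its urn's majority color."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        red_fraction = rng.random()             # urn's share of red, uniform on [0, 1)
        drew_red = rng.random() < red_fraction  # one draw from that urn
        majority_red = red_fraction > 0.5
        if drew_red == majority_red:            # the bet "majority matches the draw" wins
            correct += 1
    return correct / trials

print(single_sample_majority())
```

The result should hover around 0.75, matching the Bayes argument.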

Honestly this one seems less generally useful than the Rule of Five, but it's still impressive -- if you don't know much, even a single sample can be meaningful.
