Minimally Sufficient

Estimating Prevalence using Imperfect Classifiers

How to correct prevalence estimates using PPI

Cerberus Race Report

I did a three-day adventure racing stage race

Well-calibrated predictions are not enough

Calibration is an important idea in statistical prediction. However, it’s not the only thing that matters.

Diagnosing Bad Hypothesis Tests

I find a strange plot of p-values from a hypothesis test and investigate.

Correlation Chains

SMBC had an interesting comic recently on correlation chains. It suggests the new “Funtime Activity” of creating correlation chains. As a statistician, I naturally wanted to evaluate the methodology. The basic idea is that you start with a given variable X1 (“Amount of Sex” in the comic) and link it to a positively correlated variable X2 (“Happiness”). Then X2 is positively correlated with X3 (“Income”), so you expand the chain to X1 → X2 → X3. Eventually you end up at a variable Xn (“Likelihood that you are, in fact, J.K. Rowling”) that you conclude is positively correlated with the variable X1. ...

Reservoir Sampling

How do you sample from a stream of unknown length?