Do read the whole thing here…
“In the wrong hands, though, backtesting can go horribly wrong. It once found that the best predictor of the S&P 500, out of all the series in a batch of United Nations data, was butter production in Bangladesh. The nerd webcomic xkcd by Randall Munroe captures the ethos perfectly: It features a woman claiming jelly beans cause acne. When a statistical test shows no evidence of an effect, she revises her claim—it must depend on the flavor of jelly bean. So the statistician tests 20 flavors. Nineteen show nothing. By chance there’s a high correlation between jelly bean consumption and acne breakouts for one flavor. The final panel of the cartoon is the front page of a newspaper: ‘Green Jelly Beans Linked to Acne! 95% Confidence. Only 5% Chance of Coincidence!’”
“[Campbell] Harvey’s term for torturing the data until it confesses is ‘p-hacking,’ a reference to the p-value, a measure of statistical significance. P-hacking is also known as overfitting, data-mining—or data-snooping, the coinage of Andrew Lo, director of MIT’s Laboratory of Financial Engineering. Says Lo: ‘The more you search over the past, the more likely it is you are going to find exotic patterns that you happen to like or focus on. Those patterns are least likely to repeat.’”
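To put a rough number on the jelly bean joke: if each of the 20 flavor tests uses a 5% significance threshold and none of the flavors has any real effect, the chance that at least one test comes up "significant" anyway is far higher than 5%. The sketch below is only an illustration. It assumes the 20 tests are independent, and the function names are made up for this example rather than taken from the article or the cartoon.

```python
import random


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent tests of a true
    null hypothesis falls below the significance threshold alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_false_positive_rate(num_tests: int = 20, alpha: float = 0.05,
                                 trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo check of the formula above (assumes independent tests)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # When there is no real effect, a p-value is uniform on [0, 1),
        # so each test is "significant" with probability alpha.
        if any(rng.random() < alpha for _ in range(num_tests)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # 1 - 0.95**20 is roughly 0.64
    print(f"analytic:  {familywise_error_rate(20):.3f}")
    print(f"simulated: {simulate_false_positive_rate():.3f}")
```

Under those assumptions, roughly 64% of such 20-flavor studies would turn up at least one "linked to acne" headline by luck alone, which is Lo's point about searching the past for patterns.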