The Franken formula

There are plenty of polls and political tea leaves, and plenty of people who don’t know how to read them. Nate Silver, who started fivethirtyeight.com, isn’t one of them, as the New York Times rightly pointed out in a lengthy article today.

In March, he introduced FiveThirtyEight.com, and it quickly became a go-to site for readers whose interest in raw numbers had grown after the close (and miscalled) elections in 2000 and 2004. As his reputation grew online — there’s a Facebook group called “There’s a 97.3 Percent Chance That Nate Silver Is Totally My Boyfriend” — the mainstream media he disparaged for sloppy reporting came calling.

Political predictions are “big this year because of Nate Silver,” said Sam Wang, who runs the rival site Princeton Election Consortium. “He loves discussing the details of the data, and his commentary is quite good. He’s made this hobby mainstream.”

In other words, he gets it right. So a lot of people listen up when he reports, as he did today, that a recount of the U.S. Senate election in Minnesota will likely favor Al Franken.

Why? Silver’s a stat geek, so he’ll explain it with terms like correctable error rate, binomial distribution, matrices, and average value.

But the bottom line? He calculates that Franken was the overwhelming choice of the people most likely to make a mistake on the ballot, the kind of mistake that would have kept an optical scan machine from counting the vote.

Later on Monday, Silver added another log to the fire. I pointed out last week that a large number of people voted for president but not for Senate. Some sharp-eyed readers pointed out that the percentage wasn’t far removed from previous elections, but how do we know voters in those previous elections intentionally left a race blank?

Silver’s research suggests that up to a third of them did not mean to do so.

Let’s assume that in most of these cases, the voter intentionally skipped the senate race, but that in one-third of cases he did not. This equals another 8,277 votes, or a total of 15,001 cases in which the voter intended to vote for the senate race, but his vote was not recorded.

In not all of these 15,001 cases, however, will the voter’s intention be clear. Let’s assume that one-quarter of these ballots will be unresolvable, even upon a hand recount. This means that 11,251 ballots will actually be reclassified during the recount, or about 0.4% of the total cast.
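For anyone who wants to check the arithmetic in that passage, here is a minimal sketch. It uses only the figures Silver quotes (8,277, 15,001, and the one-quarter assumption); the variable names are mine, and the "previously identified" figure is simply what falls out of subtracting one quoted number from the other, not a number Silver states in this excerpt.

```python
# Figures quoted in Silver's post
additional_unintended = 8_277   # the "another 8,277 votes" from the one-third assumption
total_unrecorded = 15_001       # cases where a Senate vote was intended but not recorded
unresolvable_share = 0.25       # Silver's assumption: one-quarter can't be resolved by hand

# Derived by subtraction (not stated in the excerpt): cases counted before the 8,277 were added
previously_identified = total_unrecorded - additional_unintended

# Ballots expected to be reclassified during the recount
reclassified = round(total_unrecorded * (1 - unresolvable_share))

print(previously_identified)  # 6724
print(reclassified)           # 11251
```

The second figure matches the 11,251 in the quote, so the numbers hang together.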

Bitwise notes, however, that Franken did in fact perform better — really, quite a bit better — in precincts with more undervotes. If undervotes follow the pattern of the recorded votes, then Franken would win 52.5% of recounted ballots (excluding any ballots cast for third parties). This is a significant finding, as these are the first numbers I have seen to break the undervote down to the precinct level.
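What would a 52.5% share be worth in votes? A rough sketch, and only that: it assumes all 11,251 reclassified ballots split between the two major candidates, which overstates things because Silver's percentage excludes third-party ballots. The function name and the two-way simplification are mine, not his.

```python
reclassified = 11_251    # reclassified ballots, from Silver's estimate
franken_share = 0.525    # Franken's share if undervotes follow the recorded-vote pattern


def net_gain(ballots, share):
    """Net gain for a candidate who wins `share` of `ballots` in a two-way split.

    Assumption (mine): every ballot goes to one of two candidates.
    """
    return ballots * share - ballots * (1 - share)


franken_net = net_gain(reclassified, franken_share)
print(round(franken_net))
```

Under those assumptions the net swing is a bit more than 560 votes toward Franken. Treat it as an upper-end illustration rather than a projection.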

Still not buying it? Silver got his start in the stat geek business by developing PECOTA, a formula that projects the performance of baseball players and teams.

In February, he wrote this head-shaking headline for an article in Sports Illustrated about the worst team in baseball the previous season:

Thanks to improved pitching and (especially) defense, the Rays won’t merely be better in ’08, they’ll be 22 wins better

Silver was wrong. They were 31 wins better.

Nate Silver has a lot riding on the upcoming recount. So does Minnesota.