Instability of the probability of winning in election prediction (with some R)

We’ve been talking a lot about election predictions lately:

Election Prediction Markets: What Happens Next?

Why do we make probabilistic election predictions? (And why don’t we put a lot of effort into it?)

What will happen from now until November 5th?

What do you think will actually happen in November, based on polling averages and political predictions?

The election is coming up. What predictions should we believe?

One thing that comes up when communicating election predictions is that people confuse the probability of winning with the expected vote share. Not always, of course: no one thinks a candidate will get 90% of the vote when the probability of winning is 90%. But in settings like the current election, where both numbers are close to 50%, it becomes a problem. Even if Harris is given a 60% chance of winning the Electoral College, that doesn’t mean she’s predicted to win 60% of the electoral votes or 60% of the popular vote.

There are several ways to think about this. We can draw an S-curve showing Pr(win) as a function of expected vote share. If the expected share of the two-party vote falls below 40% or above 60%, the probability of winning is essentially 0 or 1. Indeed, a candidate who actually gets 54% of the two-party vote will almost certainly win the Electoral College. However, an expected 54% does not translate into a 100% chance of winning, because there is uncertainty about the election outcome: if the forecast is 54% with a standard deviation of 2%, there is still a real chance of losing.
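To make the S-curve concrete, here is a minimal sketch in R. It assumes normally distributed forecast uncertainty and, as a simplification, treats 50% of the two-party vote as the winning threshold; the function name win_prob and the grid of vote shares are made up for illustration.

```r
# Sketch: Pr(win) as a function of the expected two-party vote share,
# assuming normal forecast uncertainty. The 0.5 threshold is a
# simplification (popular-vote majority), not an Electoral College model.
win_prob <- function(expected_share, sd, threshold = 0.5) {
  1 - pnorm(threshold, mean = expected_share, sd = sd)
}

# Evaluate the curve on a grid from 40% to 60%, with sd = 2 points
expected_share <- seq(0.40, 0.60, by = 0.02)
s_curve <- data.frame(
  expected_share = expected_share,
  win_prob = round(win_prob(expected_share, sd = 0.02), 3)
)
print(s_curve)

# A forecast of 54% with a standard deviation of 2%:
# the chance of ending up below the threshold
p_lose_54 <- 1 - win_prob(0.54, sd = 0.02)
print(p_lose_54)
```

Under these assumptions the curve is flat near 0 and 1 at the ends of the grid and steep in the middle, and a 54% forecast with a 2% standard deviation still leaves a chance of roughly 2% of falling below 50%.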

A few years ago, we did some calculations based on the assumption that the national popular vote can be forecast to within a standard deviation of 1.5 percentage points, with normally distributed uncertainty. So if Harris is currently predicted to get 52% of the two-party vote, say the forecast gives her a two-thirds chance of getting between 50.5% and 53.5% and a 95% chance of getting between 49% and 55%. These numbers aren’t exact, but if you change them a bit you get the same overall picture. This forecast gives her about a 90% chance of winning the popular vote (in R, 1 - pnorm(0.5, 0.52, 0.015) = 0.91). But her chance of winning the Electoral College is only about 60%, because the state-by-state distribution of votes means that she likely needs a little more than a majority of the national vote to win a majority of the Electoral College. Roughly speaking, she would need about 51.6% of the two-party vote to have a 50-50 chance of winning the Electoral College (in R, qnorm(0.4, 0.52, 0.015) = 0.516).
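The numbers in this paragraph can be reproduced in a few lines of R. This is just a sketch of the same normal approximation; the variable names are mine, and the 60% Electoral College probability is taken as given rather than derived from any state-level model.

```r
mu    <- 0.52   # forecast two-party vote share (from the text)
sigma <- 0.015  # forecast standard deviation (from the text)

# Coverage of the +/- 1 sd and +/- 2 sd intervals
p_within_1sd <- pnorm(mu + sigma, mu, sigma) - pnorm(mu - sigma, mu, sigma)
p_within_2sd <- pnorm(mu + 2 * sigma, mu, sigma) - pnorm(mu - 2 * sigma, mu, sigma)

# Probability of winning the popular vote (share above 50%)
p_popular <- 1 - pnorm(0.5, mu, sigma)

# Vote share she must exceed for the win probability to be 60%:
# the 40th percentile of the forecast distribution
ec_threshold <- qnorm(0.4, mu, sigma)

# Sanity check: probability of exceeding that threshold
p_ec <- 1 - pnorm(ec_threshold, mu, sigma)

print(round(c(p_within_1sd, p_within_2sd, p_popular, ec_threshold, p_ec), 3))
```

The intervals come out at about 68% and 95%, the popular-vote probability at about 0.91, and the implied threshold at about 0.516, matching the rounded figures above.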

Now, what if the forecast changed? Increasing Harris’s expected vote share by 0.1 percentage points (from 52% to 52.1%) increases her probability of winning by about 2.5 percentage points (in R, pnorm(0.516, 0.52, 0.015) - pnorm(0.516, 0.521, 0.015) = 0.025).

Increasing (or decreasing) Harris’s expected vote share by 0.4 percentage points raises (or lowers) her probability of winning by about 10 percentage points. Another way to change her win probability (moving it toward 50%) would be to increase the uncertainty in the forecast.
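Here is a rough sketch of that sensitivity in R, using the rounded 51.6% threshold and the 1.5% standard deviation from above. The helper ec_win_prob and the particular shifts and standard deviations tried are illustrative choices, not part of any actual forecasting model.

```r
# Win probability under the normal approximation, treating 51.6% of the
# two-party vote as the (assumed, rounded) Electoral College threshold
ec_win_prob <- function(mu, sigma, threshold = 0.516) {
  1 - pnorm(threshold, mean = mu, sd = sigma)
}

base <- ec_win_prob(0.52, 0.015)

# Shift the expected vote share by -0.4, -0.1, +0.1, +0.4 percentage points
shift <- c(-0.004, -0.001, 0.001, 0.004)
change_in_prob <- ec_win_prob(0.52 + shift, 0.015) - base
print(round(data.frame(shift, change_in_prob), 3))

# Hold the expected share at 52% and widen the forecast uncertainty
sds <- c(0.015, 0.02, 0.03, 0.05)
prob_by_sd <- ec_win_prob(0.52, sds)
print(round(data.frame(sd = sds, win_prob = prob_by_sd), 3))
```

With these inputs, a shift of a tenth of a point moves the win probability by about 2.5 points, a shift of 0.4 points moves it by about 10, and larger standard deviations pull the probability down toward 50%.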

So one reason I don’t trust the precision of reported win probabilities is that, in a close race, they are very sensitive to small changes in the inputs. Such small changes can matter: a 0.4% vote swing could be decisive in a close election. But that’s kind of the point: it is the fact that such a swing could be decisive that drives the change in win probability.

One thing I like about The Economist’s display (see the image at the top of this post) is that it reports the probability as “3 out of 5.” That’s good because it’s rounded: it’s 60%, not 58.3%. It’s also better to say “3 out of 5” than “60%,” because it’s less likely to be confused with a predicted vote share.

P.S. This all ties into Jessica’s recent post, because we co-authored a paper a few years ago (with Chris Wlezien and Elliott Morris), “Information, Incentives, and Goals in Election Forecasting,” and, more specifically, because binary predictions are difficult to evaluate empirically (see here). So this is a real-world example of a common scientific problem where you have to make choices that cannot be evaluated purely by empirical or statistical criteria.
