You're a podcast listener, and this is a podcast ad. Reach great listeners like yourself with podcast advertising from Libsyn Ads. Choose from hundreds of top podcasts offering host endorsements, or run a pre-produced ad like this one across thousands of shows to reach your target audience with Libsyn Ads. Go to LibsynAds.com now. That's L-I-B-S-Y-N-Ads.com.
All right, I'm good to go. I have the spreadsheet up and I have you up and I have coffee. Do you need anything else? I couldn't tell if you were being cheeky. Always being cheeky. Are we British? Is that an overly British word to use? I don't know if you missed this in all of the polling averaging you've been doing, but this is America. These polls are of Scotland. That's why they look so weird.
Hello and welcome to the FiveThirtyEight Politics Podcast. I'm Galen Druk and happy FiveThirtyEight polling average day. Yes, that's right. Today we are launching our state and national polling averages. From this day forth, there will be no more poll by poll freakouts. Everyone will say, put it in the average and
move on. And that will be that. Of course, I kid. There will still be many freakouts, but you, dear listener, need not participate because you have the 538 averages at your fingertips. In fact, you can go check them out right now at 538.com.
But if you're more of an auditory learner, as of 3 p.m. on April 24th, Trump leads Biden nationally by half a percentage point, so roughly a tie. In the Sunbelt states of Arizona, Georgia, Nevada, and maybe add North Carolina to the Sunbelt states, Trump leads Biden by on average four to six percentage points. In the northern battleground states of Pennsylvania and Michigan, Trump leads by about a point,
And Wisconsin is the only state where Trump doesn't lead. There, it is a dead heat. Nationally, RFK Jr. pulls in about 10 percentage points of support, and he's in the high single digits in the battleground states. Some quick caveats before we dive in. We are, of course, more than six months away from Election Day, so these numbers are liable to change, and possibly quite a bit between now and November 5th.
There are also margins of error around these numbers, which we're going to get into. These numbers also don't take into account the possibility of a systemic polling error. That is the job of an election forecast, which is, of course, forthcoming. So without further ado, here with me to discuss is Director of Data Analytics, Elliot Morris. Welcome to the podcast, Elliot. Hey, Galen. Happy to be here and unleashing our average upon the world.
It's one of those things where like you raise a child and then you like set it out in the world and you don't know what impact it's going to have.
You hope for the best, but any worries? The model could be a sociopath. We're just not sure yet. We have to see how it behaves around animals. Oh, boy. Okay, I also want to say that if you have questions about these averages specifically or about the election in general, you know where to reach me, podcasts at FiveThirtyEight.com or, of course, on Twitter. So, Elliot, first, a more personal question. Did any of these averages...
surprise you? Or did you learn something that you didn't know while doing this exercise of putting together the averages? Well, in terms of the results, I think the most striking finding for most people is going to be this divergence between the more Sunbelt, as you call them, or non-white states and those whiter battlegrounds. I mean, there's a four to five percentage point difference between those two groups of states on average, and it's counterintuitive.
Democrats have been gaining ground with non-white voters for the last decade, really until this sort of U-turn in 2020 and potentially in 2024. So that is a narrative breaker. Trump up just one point in Wisconsin, Pennsylvania and Michigan on average, but up five in Georgia and North Carolina. I do think that's going to be surprising to people. And is that specifically about racial dynamics, which is to say Biden's numbers are holding up better amongst white voters and deteriorating amongst voters of color? Or is that also education polarization? I mean, I know places like Nevada don't have high rates of four-year college degree attainment, for example. Is that playing into this? Or is this really a story of actually racial polarization decreasing?
We don't have crosstab averages, so I can only speculate. I can use some anecdotes. What I'll say is there seem to be three things going on. Yeah, there's Democrats losing a little bit of ground with non-white voters, including Black and Hispanic voters. It seems to be less than what was sort of hyped up a couple of months ago as maybe a 40 percentage point change or something among this group. I mean, maybe it's five.
The change in margin in all these states is only five points total. So just by definition you can't have a tsunami of change in any one subgroup. So for racial polarization, yeah, there's less racial polarization. It does seem to be coming disproportionately from the less educated voters in the population. So that's why you might get bigger decreases in Nevada and Arizona, where overall educational attainment is lower than in the Midwestern states.
And then white voters, yeah, they're different in different regions of the country. They're more secular in the North than they are in the South. So maybe Biden's doing better with those. They're, of course, slightly better educated. And they have closer ties to, like, working class politics that have been traditionally associated with democratic gain. Something like a union, for example, to use a heuristic. Yeah.
I think you add all that up together and the Democratic Party today looks a little bit whiter and it looks more working class with white voters, but not with non-white voters. We're far out from the election. Who knows if that'll persist? But that seems to be the explanation today. To be clear, if today were Election Day and these polls were accurate, Trump would win the Electoral College and for that matter, would win the national popular vote by a half a percentage point. Of course, as we've mentioned, there are six months plus until Election Day.
But when we look at the Electoral College math, it looks like Biden would have to win Wisconsin, where it's a dead heat, Pennsylvania and Michigan, and could lose the rest of the states that we've discussed here in order to reach 270. Which of those three upper Midwestern slash Rust Belt states would be the tipping point state in this scenario? Yeah.
So today it's Michigan where Trump leads Biden by 1.3 percentage points. And again, right, like don't take that to the bank. We're going to get into that. But at least Michigan's the most Republican of the northern battlegrounds right now. It's the state that Trump wins sort of first.
in ordering, and that's the one that gets him the 270th electoral college vote. Pennsylvania is right behind that at minus one, so it could be either of those. But Wisconsin looks to be the bluest, which I think is a little surprising if you were looking at those tipping point probabilities in 2016 and 2020. Well, Elliot, it may not be surprising if you're familiar with the history of polling error in Wisconsin. Triggered. I mean, we can...
dive right into that if we want to. I mean, we have seen some pretty messed up polling in Wisconsin of all states. You know, I'm sure we will discuss this more as we get closer to Election Day. But the error nationally in the past two presidential elections has been somewhere in the range of two to four points. And part of the reason people haven't realized that the polling error is larger than that, or that there have been bigger upsets than that would suggest, is what's happening in the Electoral College. We've seen pretty big polling errors in some states, particularly Wisconsin. How are you feeling about the state of polling accuracy today? What the averages try to do is find the most likely trend line through all the polling data we're getting.
You know, we want to know what is the state of public opinion broadly today. That is to speak nothing about the quality of those measurements. We should do another podcast about this.
There's a big risk factor when all the polls are sampling the same population of very highly politically engaged people and response rates are less than 1%. There's a big chance of a misfire. People are going to have to keep that in mind, but that is a quantitative problem for our forecast, not the averages.
To be clear, there are margins of error around these averages, but that uncertainty doesn't come from a possible systematic miss in the polls or from the fact that we're six months away. It comes from the fact that polls have a margin of error associated with them to begin with. So how big are those bands, just to give folks an idea before we dive into the numbers further?
So if you go to the website today, you'll see about a two percentage point uncertainty interval around the polling averages nationally and closer to four or five percentage points at the state level. And this uncertainty represents our uncertainty about the state of public opinion as revealed by those polls without accounting for any systematic bias or the time remaining until the election day. It's not a forecast.
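To put rough numbers on where those bands come from, here's a back-of-the-envelope sketch in Python of the textbook margin of error for one poll and the way averaging several independent polls shrinks it. The sample size and percentages are invented for illustration; the actual averages model uncertainty in a more sophisticated way.

```python
import math

def moe(p: float, n: int, z: float = 1.96) -> float:
    """95% margin of error for a proportion p from a simple random sample of n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# One poll of 800 likely voters with a candidate at 41%:
print(round(moe(0.41, 800) * 100, 1))   # 3.4 points

# Averaging k independent polls of similar size shrinks that by roughly sqrt(k):
k = 6
print(round(moe(0.41, 800) * 100 / math.sqrt(k), 1))   # 1.4 points
```

That sqrt(k) shrinkage is part of why a roughly two-point national band can coexist with four-to-five-point state bands: there are far fewer polls to average in any one state.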
You know, we only have six polls or something in Michigan. Each one of those polls has their own margin of error. So what would happen if we took one of those polls out? Or if the result of that poll were different, how does that impact the average? And we'll get into all the adjustments we make to these polls as well to try to account for biases like pollster and mode. And all of those adjustments also come with some uncertainty. So what we're trying to do is express, right, that...
Even though we have one most likely average, because there's noise in the data, and there's a lot of noise in this polling data this year, let me tell you, there is uncertainty in the average as well. What's the noise sound like? Bang. I don't know. All right, clip that. We're going to make it a button. I'm going to keep pushing it from now until election day.
Elliot, there had previously been some talk about a decrease in the Electoral College popular vote gap. And that's in part because of this split between the Rust Belt and the Sun Belt that we've been talking about, which is that if the Rust Belt holds up better for Biden, then there may not be as big of a Republican advantage in the Electoral College. Does it seem as though that has materialized based on these averages?
So the polling averages today do show a decrease in the Electoral College advantage for Republicans, for Trump. Wisconsin was the tipping point in 2020. Biden won it by, you know, 0.7 percentage points or thereabouts. And he won nationally by 4.5. So the bias there was in the high three percentage points. Right.
Right now, the difference between the tipping point state in Michigan and nationally is just about a percentage point, although it has been jumping around a lot. Again, caveat emptor about all this noise. So a couple weeks ago, it was closer to two percentage points. Now it's one. It might go back up.
So, that is a decrease, that is a relative improvement in the electoral college outlook for Biden if you keep the national popular vote the same. In other words, if everything is the same in 2020 and you have this level of electoral college popular vote divide, then Biden would have been favored to win more electoral votes than he did.
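The tipping-point arithmetic here can be sketched as a toy calculation: sort the battlegrounds from bluest to reddest, hand out electoral votes until someone crosses 270, and compare that state's margin to the national one. The margins, electoral-vote counts, and the 226 safe-state baseline below are illustrative stand-ins loosely echoing the numbers in the conversation, not the actual 538 averages.

```python
# Hypothetical battleground margins (Biden minus Trump, in points) and electoral votes.
states = [
    ("WI", 0.0, 10), ("PA", -1.0, 19), ("MI", -1.3, 15),
    ("NV", -4.0, 6), ("AZ", -5.0, 11), ("GA", -5.0, 16), ("NC", -5.0, 16),
]
biden_base = 226        # electoral votes assumed safe for Biden outside these states
national_margin = -0.5  # Trump up half a point nationally

# Give Biden states from bluest to reddest; the state that pushes him past 270
# (equivalently, the one Trump wins "first") is the tipping point.
total = biden_base
for state, margin, ev in sorted(states, key=lambda s: -s[1]):
    total += ev
    if total >= 270:
        print(f"Tipping point: {state} at {margin:+.1f}")
        print(f"Electoral College gap vs. national: {margin - national_margin:+.1f}")
        break
```

With these made-up inputs, Michigan lands as the tipping point and the Electoral College gap comes out just under a point, matching the shape of the discussion above.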
And this is in part because of that decrease in racial polarization that we've been talking about, right? Yeah, it's because of this increase in Biden's standing in those northern battlegrounds. If you take all the Sunbelt swing states, Nevada, Arizona, Georgia, and North Carolina, and you give them to Trump, he's at 268 electoral votes. That is a very close election, by the way. That is like Veep material. Are you trying to manifest a tie in the electoral college? Uh,
No, I would never hope for a particular election outcome, especially not a chaotic one. But if it happened... But if it happened, you'd have no choice but to listen to this podcast every day for two months. I'm sure people would clip it at me ad nauseum. He doesn't care about chaos.
So, right. So that's 268. Trump has to win one more of those states to win the Electoral College. But Biden is doing better in those than you would expect. He's doing about three percentage points better in those northern battlegrounds than you would expect based on how well he's doing in those southern battlegrounds. So it's all really up to those northern states. And hey, you pointed out earlier, the polls have been pretty bad there recently. So don't take that to the bank. But that's at least what the polls are saying. Yeah.
Yeah, which to the point of keeping an open mind about polling error, I think I always tell people we should keep an open mind about what the battleground states will be. It's a pretty narrow scope for understanding the election, all things considered. We can include North Carolina. We can include Minnesota, Maine, New Hampshire. There are a lot of states that would be in play if, say, Trump's standing improves by a point or two.
There are some states where, I mean, Biden would have to increase his standing significantly for him to start putting places maybe once considered battleground states, like Iowa, Ohio or Florida, in play. What does that polling look like? Is Minnesota closer at this point than, say, Arizona or Georgia?
So we don't have any polls in New Hampshire. We do have a couple polls in Maine and Minnesota. These polls are like not counterintuitive. If you just swung every state...
five points towards Trump based on what is implied by the national vote, you get the same answer. And that is a tied race in New Hampshire and Maine and like a 50-50 race in Minnesota, maybe slightly leaning towards Democrats based off of the residual Biden overperformance in the neighboring states. So I think when we get to the point where we're like simulating electoral college outcomes based off of these polls and other factors,
There's not going to be super wide tails, because there aren't a whole lot of competitive states, but there will be a lot of what we would call probability mass: a very high probability of a close election going either way, just because of the sheer number of competitive states. Look, it leans Trump right now, the election, as we pointed out. But yeah, there's a lot of uncertainty around that, especially in states where we have no polls yet.
Quantitatively, how much stock should we put in polling six months and two weeks out from election day on average, which is to say, how much should we expect these polls to move? Yeah, if you're trying to forecast an election with polls this early, you're going to have a bad time. What this is useful for is for us to anchor our understanding of the campaign as it exists right now and try to figure out
Like in a newsroom, we could make coverage decisions based off of this. Oh, it looks like Maine is in play. We should probably send a reporter there. If you're a campaign, this is useful for you because you want to know whether or not you need to allocate resources in certain places. I think if you're a reader, this is important to you because...
The public opinion in general at any given moment is important. It impacts... We live in a democracy. What people think matters. We live in a society. But it's not useful for forecasting election outcomes early. I mean, look, if the polls were all like plus 10 Biden, then you should calibrate your expectation for November to be a little bit more pro-Biden, not too much. If Trump was up 10 in all these states, then it'd be like, oh, Biden has an uphill battle.
Right now, it's all close enough that anything between Biden plus five and Trump plus five nationally, totally reasonable guess. Anything between, you know, 350 electoral votes either way, also totally plausible. It's close. So it can be useful as a signal for that, but not really as a signal for who's going to win.
Well, speaking of polling movement, folks will see if they go to 538.com that the averages are backdated to the beginning of March. So you can see how they have been moving over time already. And you'll see that Trump led Biden by two points nationally, according to the averages, back on March 1st. Now, the lead is closer to half a percentage point. And we've already talked on the podcast about some movement towards Biden, both in terms of
approval and whatnot. Is it clear what happened here? I mean, in the same way that we shouldn't be making a huge deal of a half a percentage point lead, we probably shouldn't be making a huge deal of a point and a half movement. But if you did have to pinpoint trends over the past month and a half, what is it?
Look, I wouldn't read too much into a movement of four or five points in one poll or in one state, but the averages are designed to strip out the noise. So a two percentage point increase in Biden's vote margin or decrease in Trump's is probably significant. I think we can read into that and say, oh, this is real movement. This is real change in public opinion. So...
Why has it occurred? I can think of a couple of hypotheses. All of these are obviously just speculation. Trump being in the news again. If you go back and you look over the past 10 years at a chart showing media coverage in general for either the Democratic or Republican candidate, including Hillary Clinton, you see a pretty clear inverse relationship between coverage and their margin in the polls. Maybe Biden's seeing a little bit of a State of the Union bump, or like a primary-winning bump,
or a bounce, if it fades. But that's all speculation. The averages really are not a tool for causal inference; they help us discuss politics and identify trends, but not the why. Yeah, that's the difficulty. The data always tells us the what, but it's harder for it to tell us the why. I guess another way of getting at the why is where is the movement? Is it amongst...
Self-identified Democrats who had an unfavorable view of Biden in March now saying they're going to vote for Biden? Is it amongst independents who genuinely didn't have an opinion a month and a half ago? Is it amongst RFK Jr. voters — we're about to talk about him next, and we've already seen some decline in his standing — who have said, I thought I was going to vote for RFK Jr., but I discovered over the past month and a half that he doesn't align with my views on
X or Y policy. Is it clear who the movement has happened amongst? Yeah, it is pretty parallel across all groups. Like I said, there's a little bit more movement in Michigan and Pennsylvania and Wisconsin, and to some extent Iowa and Ohio as well. So, you know, you could read into that, I think, and say the traditionally moderate swing group of voters that the press really liked to pay attention to in Trump's first election, maybe
they're moving, the moderates. But equally, we have evidence from the crosstabs that Biden has increased his vote share, has consolidated support among those traditionally Democratic groups, especially Black and young Americans. I think I was on this podcast four or five months ago, and we were talking about whether or not we
believed the horse race numbers among young Americans. And our theory at the time was that when people are answering a stranger on the phone or an empty box on the internet on their phone or their computer, they're like accessing different information in their head than they will actually be putting onto a ballot in November. And at the time, that information was
Look how poorly Biden is doing in general: on the economy, on Israel and Gaza, on immigration; he's old; what have you. The tone was just negative. So that tone seems to be a little better now, and similarly, we seem to be seeing more recovery among those voters who may have been channeling that tone more in their responses. But
That would be a really good survey experiment for someone to do. It's not necessarily answerable with the crosstabs. Okay, so RFK Jr. is pulling in about 10 percentage points of support nationally. It's a little bit less in the battleground states. It's more like eight points or so.
I said the margin, but Biden is polling at about 41 percent to Trump's 41 percent, roughly, nationally. So clearly, neither of those candidates has a majority. You get about 82 percent of the electorate by adding those together. You get maybe another nine for RFK Jr. That takes you close to, you know, 91 percent. There's other folks who say maybe they're not voting, they're voting third party, whatever.
How much does it change the race when you take RFK Jr. out altogether? What kinds of margins are we looking at? When Kennedy is included in a survey, support for both Trump and Biden falls by about three percentage points, but it falls more for Biden. It falls by about 3.4 percentage points for Biden and about three percentage points for Trump. So that's about 4% who...
generally say, I'm going to vote for someone else. And then if you name RFK, more people go to him. And when that happens, there's a penalty for Joe Biden of about a quarter to a half a percentage point; 0.4 percentage points is the median estimate, and there is uncertainty there.
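As back-of-the-envelope arithmetic, the Kennedy penalty described here is just the difference between the two drops. The numbers are the rough point estimates from the conversation, treated as illustrative:

```python
biden_drop = 3.4  # points Biden loses when RFK Jr. is named in a poll
trump_drop = 3.0  # points Trump loses when RFK Jr. is named

# Net effect on Biden's margin (a median estimate; the uncertainty interval includes zero).
net_penalty = round(biden_drop - trump_drop, 1)
print(net_penalty)  # 0.4
```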
Often when we talk about RFK Jr. on this podcast, folks will say, OK, well, maybe he's pulling at 10% today, but that is probably not sustainable. He'll collapse as we get closer to Election Day. That's what always happens with third-party candidates. Do you agree with that? Do you think there's any reason to expect his numbers to hold up better than past third-party candidates? In the research for our forecast, just to put a number on this, in late April, early May, we find that support for third-party candidates tends to decrease by about half.
So we should expect five percentage points of those votes to flow back to either candidate. And again, the inference from the average today is that Joe Biden would gain a little bit more from that, about 0.4 percentage points. So we don't know if that's going to be true. Like in November, the actual composition of these voters is up for grabs. So that effect is obviously within the margin of error of zero. We don't really know. You know, if you operate under the assumption that RFK is your typical independent candidate,
then you read more into those numbers. Personally, I think we see a little bit more dissatisfaction, disaffection in politics today, especially among young voters. I've been sort of operating under this assumption, using some of the crosstabs as some light anecdotal evidence for this.
that he does have a little more staying power among the Biden voters, among those young voters especially, than maybe your Gary Johnson or your traditional Jill Stein candidacy would have, which didn't seem to really take off with the same media ecosystem, the same online ecosystem that Kennedy has today.
You know, one of the big takeaways from the averages for me is that neither of the major party candidates is anywhere close to 50%. And it seems like we may well end up in a situation where after November 5th, neither candidate gets a majority mandate from the public.
And if you look at sort of the approval ratings of Trump and Biden today and compare it to where their polling average is, roughly, you find that the people who approve of Trump say that they're going to vote for him. The people who approve of Biden say they're going to vote for him. And that means that there's about 20% of the electorate that roughly is not into either. So that gets at, you know, this double-hater group, people who have a negative view of both Trump and Biden and what they're going to do. They can vote for RFK Jr.,
or another third-party candidate, they can reluctantly vote for Biden or Trump, or they can not vote at all. Let's address that first. Do you think that this is liable to be a low-turnout election, given just how poor Biden and Trump's numbers are at this moment in time? Totally. People say they're less interested in the election now and less enthusiastic about voting for either candidate than they were in 2020, especially the Biden supporters, the ones that voted for him last time. So honestly, I think our modal assumption should be that turnout will be lower than in 2020. People are kind of just tired of it. It's the same race. There's not a huge novelty factor with Kennedy. If it turned out to be closer to a six percentage point decline, I mean, that'd be a lot, but it wouldn't be historically unprecedented.
It's interesting that you say that it seems particularly like Biden voters are not enthusiastic to vote. When pollsters conduct a poll, they can look at all American adults, they can look at registered voters specifically, or they can sort of tighten the lens even further and just look at likely voters.
And it appears that when you look at likely voters, Biden does a little better than if you look at registered voters, which is to say amongst the most reliable voters, amongst the likeliest of the people in the electorate to actually turn out, Biden has more support. How large is that advantage? And is there any sense of why?
There are two correlations going on here. One is that the more educated you are, the more likely you are to vote. Maybe you were closer to some civic education that instilled in you this virtue of voting. Maybe you're more exposed to social groups that vote, or media advertising, or whatever; you're just more aware. Similarly, there is a correlation, especially among white voters but across the whole country: the more educated you are, the more likely you are to vote for Democrats.
So, you know, that is a robust finding. So it's not surprising at all that as Democrats have become more reliant on educated voters, they would be doing better in elections that are lower turnout. That finding makes sense. It would almost be surprising if we didn't see that these days. All right. So in case this hasn't been nerdy enough, now we're going to get a little nerdier.
Today's podcast is brought to you by Shopify. Ready to make the smartest choice for your business? Say hello to Shopify, the global commerce platform that makes selling a breeze.
Whether you're starting your online shop, opening your first physical store, or hitting a million orders, Shopify is your growth partner. Sell everywhere with Shopify's all-in-one e-commerce platform and in-person POS system. Turn browsers into buyers with Shopify's best-converting checkout, 36% better than other platforms. Effortlessly sell more with Shopify Magic, your AI-powered all-star.
Did you know Shopify powers 10% of all e-commerce in the U.S. and supports global brands like Allbirds, Rothy's, and Brooklinen? Join millions of successful entrepreneurs across 175 countries, backed by Shopify's extensive support and help resources.
Because businesses that grow, grow with Shopify. Start your success story today. Sign up for a $1 per month trial period at shopify.com slash 538. That's the numbers, not the letters. Shopify.com slash 538.
Today's podcast is brought to you by GiveWell. You're a details person. You want to understand how things really work. So when you're giving to charity, you should look at GiveWell, an independent resource for rigorous, transparent research about great giving opportunities whose website will leave even the most detail-oriented reader
busy. GiveWell has now spent over 17 years researching charitable organizations and only directs funding to a few of the highest-impact opportunities they've found. Over 100,000 donors have used GiveWell to donate more than $2 billion.
Rigorous evidence suggests that these donations will save over 200,000 lives and improve the lives of millions more. GiveWell wants as many donors as possible to make informed decisions about high-impact giving. You can find all their research and recommendations on their site for free. And you can make tax-deductible donations to their recommended funds or charities. And GiveWell doesn't take a cut.
Again, that's givewell.org to donate or find out more.
The model that you use to calculate the averages here is formally called a Bayesian multilevel dynamic linear model, which is fit using a statistical method called Markov chain Monte Carlo, which I want to be clear, I totally understand what that is. Galen, please, you gotta tell people. For lay listeners, what the f*** does that mean?
Galen, what do you know about the no U-turn sampler? - Everything. - NUTS. - Elliot, in the layest terms possible, what are we talking about here? - Polling averages typically have followed one of three methods. One is you take all the polls and you put them in an Excel spreadsheet and you average the ones that came out in the last 30 days. Maybe you put a little bit more weight on the ones that came out in the last week, or whatever. Or you can draw a trend line through points.
Imagine, from your undergraduate statistics class, the graph that has people's
height on the x-axis and their weight on the y-axis. As you get taller, you get heavier, typically. And to know exactly that relationship, you would fit a straight line through those points. Or you can combine the two approaches, which does a pretty good job; it allows you to put more weight on a more aggressive trend line closer to an election, which is novel and important. All three of these suffer from a problem in statistics: if you're doing your model in multiple steps, where you're taking an average and then making an adjustment and then doing an average again and putting an adjustment on that, stacking all these things together, you lose the ability to reliably measure the uncertainty in your data. So we've opted for a method that's a little bit more complicated, so that we can take proper account of that uncertainty in the polling average.
At the end of the day, we are curve fitting. We are drawing a trend line through points. In fact, it's really cool the way our model works: it draws a trend line through support for all parties in all states simultaneously. And we make our various adjustments, like for house effects, which I'll list properly in a second. But the important thing for people to know is that it is designed to take proper account of the uncertainty. And that's what you get out of the Markov chain Monte Carlo simulation.
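For intuition, the simpler methods Elliot contrasts with the full model — an average that puts more weight on recent polls — can be written in a few lines. This is a toy sketch, not the Bayesian multilevel model 538 actually fits, and the polls and half-life below are invented:

```python
# Toy polls: (days old, Trump-minus-Biden margin in points). Illustrative numbers only.
polls = [(20, 1.5), (14, 0.2), (10, 1.0), (6, -0.5), (2, 0.8)]

def weighted_average(polls, half_life=7.0):
    """Recency-weighted mean: a poll loses half its weight every `half_life` days."""
    num = den = 0.0
    for age, margin in polls:
        w = 0.5 ** (age / half_life)
        num += w * margin
        den += w
    return num / den

print(round(weighted_average(polls), 2))
```

The multi-step problem Elliot mentions shows up as soon as you bolt adjustments onto a pipeline like this: each step throws away its uncertainty, which is what fitting everything jointly via MCMC is meant to avoid.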
And so what are you trying to take into account simultaneously? The sort of bias of the pollster, the rating according to our pollster ratings, what else? When every poll comes out, like imagine you are a pollster right now and you are trying to generate a poll.
You'd have to get a list of people to call, and you call them, and you write an interview questionnaire, and you interview them. You weight your data somehow to make it representative, to account for the aforementioned less-than-1% response rate, and you publish it. And all of those choices can impact the result of the poll. So we want to take into account the effect of which pollster has done the poll, which we typically call a house effect. We also find that there are systematic differences
in polls that are published from online sources and those that are published over the phone, especially lower quality online sources, those that don't sort of control for the people that are entering into the sample. So some online pollsters use representative sampling from a larger panel of people who have signed up. That's really interesting.
That's a lot better. You know, we want to take into account the population that's being sampled. Likely-voter polls are better for predicting an election, so we adjust the polls to the population of likely voters. There's the third-party adjustment, which you've talked about for Kennedy. And then there's randomness. A poll can be off due to any number of factors regarding the sample that was interviewed. Like, maybe the people who answered the phone yesterday were just weirdos. That's called sampling error. And then there's a bunch of other sources of randomness that aren't biased but can just enter your poll as the process goes along. That's called non-sampling error. And we take that stuff into account, too.
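Sampling error, at least, has a textbook formula. For a simple random sample (real polls, with their weighting, are messier) the 95% margin of error on a proportion is roughly:

```python
def margin_of_error(p, n, z=1.96):
    """95% margin of error for a proportion p from a simple random
    sample of size n. This captures sampling error only; non-sampling
    error (frame problems, weighting choices, and so on) is extra.
    """
    return z * (p * (1 - p) / n) ** 0.5

# A 1,000-person poll showing 50% support:
moe = margin_of_error(0.5, 1000)  # about 0.031, i.e. +/- 3.1 points
```

That plus-or-minus three points on a single poll is exactly why averaging many polls helps: the sampling noise partly cancels, even though shared non-sampling biases do not.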
Honestly, if you're hearing this and you're thinking, well, that sounds like a lot of different ways a poll could go wrong: yeah, that's the point. That's why we do a bunch of different polls and average them together. And we just want to make sure that we're properly measuring the impact of all these different things that could go wrong, so that we can tell stories about them and communicate about what's going on in the polls. So one of the considerations the model makes
blunts the effect that a possible outlier poll could have on the average. And just out of curiosity, does that have the potential to, like, overemphasize the conventional wisdom? Right, tale as old as time: the week before the election, Ann Selzer puts out a poll in Iowa showing ... winning by eight percentage points when the polling average shows a dead heat, and actually Ann Selzer's right. You know, it goes on to be a much closer election than previously expected. Of course, I'm talking about 2020 here. How does the model consider that? You don't want to overfit to the conventional wisdom and end up with a sort of herded average; you want to be open to the possibility of outliers being leading indicators or something like that. Yeah, there are multiple ways you could do this. We also account for the 538 pollster rating of the pollster. So if you're a good pollster historically, and you've demonstrated empirically that your outliers are good, reliable signals, then the model will take that into account and move closer to you than it otherwise would have. But
Like, philosophically, there's just no real way out of discounting outlier polls. If your belief, and I think this is empirically correct, is that the average of polls over the long run is more accurate than any one individual poll over that entire period, similar to how an index fund typically beats hedge funds in the stock market, then you want to average. And if you don't believe that, then you should become a pollster and do it yourself.
We tend to believe, and again this is sort of empirically demonstrated in our work, that the average of all polls, accounting for all this other stuff and reacting properly to the amount of movement in a race as the model decides, is the better way to go about it.
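The "move closer to good pollsters" idea can be caricatured as a quality-weighted average. This is a deliberately crude sketch with invented weights, not the pollster-ratings formula 538 actually uses:

```python
def weighted_average(polls):
    """polls: list of (margin, weight) pairs; a higher weight stands in
    for a better historical track record (the weights are invented)."""
    total = sum(weight for _, weight in polls)
    return sum(margin * weight for margin, weight in polls) / total

# Four polls near a tie, plus a +8 outlier from a well-rated pollster:
# the outlier pulls the average up, but nowhere near all the way to 8.
polls = [(0.0, 1.0), (1.0, 1.0), (-1.0, 1.0), (0.5, 1.0), (8.0, 2.0)]
avg = weighted_average(polls)  # 2.75
```

The point of the sketch is the behavior, not the numbers: a well-rated outlier shifts the average more than a poorly-rated one would, but the average still discounts it relative to taking the outlier at face value.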
Which brings us back to: in many ways, the averages are only as good as the polls. And in 2016, obviously, the polls were off nationally by about two percentage points, but the larger error in some of the battleground states meant that the election didn't go the way that the national polls, or even the upper-Midwestern battleground state polls, had indicated. In 2020, it was more like four percentage points. So the election still went the way the polls indicated, but people felt like, wow, this is a lot closer than I was expecting based on the averages and the forecasts going in. And the way that we sort of explained 2016 is that
Well, we weren't weighting by education. We missed this polarization along educational lines that ended up having a really big impact. And so while we may have been getting a bunch of Republicans in polls, or a bunch of Democrats, even in proportion to their existence within the electorate, we were getting the wrong kind. We weren't getting enough non-college-educated voters in our samples, and so we missed something that was happening underneath the surface.
In 2020, there are various explanations again, but one of them is COVID, right? You have an environment where one party is encouraging its voters, to a much greater extent, to stay home and not participate in social activities and whatnot. So the people most inclined to vote for Biden were also more available and more inclined to pick up the phone, and you may have gotten some error there.
It's impossible for you to look at the environment right now and... Yeah, let me get my crystal ball out of the closet. Get your crystal ball and sort of see where potential concerns would be in this moment. But...
I would like to give you that opportunity. Like, if you were to say, this is what I'm concerned about in polling at this moment. And we're not going to be able to have this kind of postmortem until it actually happens. But do you have any concerns going in? Because after 2016, more folks started weighting by education. And at this point, the pandemic is over, so presumably, if that was what caused it, then it won't be an issue in 2024. Where do we stand now?
I'll just touch on one thing as I answer this. The other thing that gets constantly cited as a criticism of the averages is 2022. Well, that's just bulls**t. The averages were right in 22. Well, yeah, on average. And I won't stand for the bulls**t.
If you average across the state-level errors, then sure. But there's a set of Senate races where there was a right-leaning bias in the average, and it came from pollsters that were signaling that they were doing things differently to try to boost the voice of low-social-trust, or more Republican-leaning, voters by some weird method to back into that group. And they generated results that were more Republican-leaning in an election when the other polls were good. Like, fighting the last battle can get you in a tricky position. Yeah, those pollsters were fighting the last battle, and the averages weren't really able to handle that. Now, the way that our average works this year combats this by considering...
a national house effect for all the pollsters. If you're releasing reliably Republican surveys in Pennsylvania and Michigan and Georgia and Arizona, you're going to get a stronger house effect this year than the previous methodology would have given you. We've really tried to account for that. Now, again, as you raise, if all of the polls are biased because of the way people are answering phones or filling out their online forms, there's nothing we can do about that. We are at the mercy of good, high-quality public opinion data in America, and we can try our best to
discern the trends and the biased firms and the more historically accurate firms or whatever. But at the end of the day, if all the polls are 10 points off, the average is going to be somewhere between 5 and 10 points off. But to get back to your question, I do think there are a couple of things to be worried about. I think there's a problem with the polls over-representing engaged Americans, which can typically be solved with weighting, but likely cannot be this year. As political divisions play out on the news, that has a residual effect on the types of people answering polls: the people who really pay attention to the news, and not necessarily the wider public. So when I was talking earlier about this group of Democrats who are sort of protest-responding to the polling about Joe Biden, that would, in that sense,
cause lower numbers for Biden than how they would actually vote. But I'm also worried about overall representativeness among the types of people who don't answer polls and don't typically turn out for elections, and those tend to be the same types of people.
So that's similar to the type of bias we saw in 2020, but it would be a little more exaggerated this time, where polls are not representing the group of Americans who aren't volunteering or aren't constantly posting about politics. And if those groups are more impacted by inflation, or just generally unhappy about the direction of things or involvement in foreign wars, then the polls would overestimate support for Democrats and underestimate support for Republicans. I think we need to do some more work here. And again, our picture will become clearer before we get to the election. But of course, the caveat that you raise is we never really know what the direction of the polling error is going to be. So our forecast will be relatively agnostic to that.
But, of course, the stories we tell about the reliability of the polling are kind of where we add a lot when we engage in conversations like this. Final question. If you could have one more piece of data, what would it be?
I'd really like every poll in America to release the variables that they weight on. So in 2016, we had this big bias concentrated among pollsters who weren't making sure there were enough non-college-educated voters, especially white voters, but all of them, in their samples.
Knowing that ahead of time and indexing on it, accounting for it in your average, like putting more weight on the pollsters that were weighting by more variables, or looking for systematic differences between polls with different weighting schemes, empirically really helps the way your model works. And we can do some crude backtesting for this, and it will be a part of our forecasting model where we take into account a group of pollsters that we think are doing things better
above and beyond how we rate them. And that has helped in out-of-sample elections in the past. But we don't have that granular a level of data at the poll level, because lots of pollsters just don't tell us how they're doing their polls. I'm going to do a bonus here.
Lots of pollsters today are mixing methods. They're doing a poll over the phone, and they're doing a poll online, and they're doing a poll on text, and they're squashing it together and calling it one poll. That's not really the way you want to go about this. You want to look for systematic differences between your methods, the types of people you're getting in different modes, especially among the subgroups, and then take those
considerations into account in some sort of model, in your weighting scheme, and account for it. And that could lead to a lot of error this year, too. Should we go back to door knocking? Ooh, awesome! Do you have $10 million? Pollsters just take a random walk through a neighborhood and knock on doors? Let's do it. With that, Elliot, we're going to say goodbye today.
I'll remind folks that they can view these averages at FiveThirtyEight.com. And also, if you have any questions about these averages or the election in general, podcasts at FiveThirtyEight.com or Twitter. But for now, thank you, Elliot. Hey, thanks, Galen.
My name is Galen Druk. Our producers are Shane McKeon and Cameron Chertavian, and our intern is Jayla Everett. Jesse DiMartino is on video editing. You can get in touch, like I mentioned, by emailing us at podcast at 538.com. You can also, of course, tweet us with any questions or comments. If you're a fan of the show, leave us a rating or review in the Apple Podcast Store or tell someone about us. Thanks for listening, and we will see you soon.