New York Times, June 20, 2015
Headline: What’s the Matter With Polling?

Byline: CLIFF ZUKIN

OVER the past two years, election polling has had some spectacular disasters. Several organizations tracking the 2014 midterm elections did not catch the Republican wave that led to strong majorities in both houses; polls in Israel badly underestimated Prime Minister Benjamin Netanyahu’s strength, and pollsters in Britain predicted a close election only to see the Conservatives win easily. What’s going on here? How much can we trust the polls as we head toward the 2016 elections?

Election polling is in near crisis, and we pollsters know it. Two trends are driving the increasing unreliability of election and other polling in the United States: the growth of cellphones and the decline in people willing to answer surveys. Coupled, they have made high-quality research much more expensive to do, so there is less of it. This has opened the door for less scientifically based, less well-tested techniques. To top it off, a perennial election polling problem, how to identify “likely voters,” has become even thornier.

Few innovations in our history have spread as quickly as the cellphone. About 10 years ago, opinion researchers began taking seriously the threat that the advent of cellphones posed to our established practice of polling people by calling landline phone numbers generated at random. At that time, the National Health Interview Survey, a high-quality government survey conducted through in-home interviews, estimated that about 6 percent of the public used only cellphones. The N.H.I.S. estimate for the first half of 2014 found that this had grown to 43 percent, with another 17 percent “mostly” using cellphones. In other words, a landline-only sample conducted for the 2014 elections would miss about three-fifths of the American public, almost three times as many as it would have missed in 2008.

Since cellphones generally have separate exchanges from landlines, statisticians have solved the problem of finding them for our samples by using what we call “dual sampling frames” — separate random samples of cell and landline exchanges. The problem is that the 1991 Telephone Consumer Protection Act has been interpreted by the Federal Communications Commission to prohibit the calling of cellphones through automatic dialers, in which calls are passed to live interviewers only after a person picks up the phone. To complete a 1,000-person survey, it’s not unusual to have to dial more than 20,000 random numbers, most of which do not go to actual working telephone numbers. Dialing manually for cellphones takes a great deal of paid interviewer time, and pollsters also compensate cellphone respondents with as much as $10 for their lost minutes.
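To make that arithmetic concrete, here is a minimal back-of-the-envelope sketch. The working-number and completion rates in it are illustrative assumptions, chosen only so that the result reproduces the roughly 20,000 dials per 1,000 completed interviews described above.

```python
# Back-of-the-envelope arithmetic for manual cellphone dialing.
# The working-number and completion rates below are illustrative assumptions,
# chosen so the result matches the 20,000-dials-per-1,000-interviews figure above.

def dials_needed(completes: int, working_rate: float, completion_rate: float) -> float:
    """Expected dials required to reach a target number of completed interviews."""
    return completes / (working_rate * completion_rate)

# Assume roughly 1 in 4 random cell numbers is a working phone, and roughly
# 1 in 5 working numbers eventually yields a completed interview.
print(dials_needed(1_000, working_rate=0.25, completion_rate=0.20))  # 20000.0
```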

THE best survey organizations, like the Pew Research Center, complete about two of the more expensive cellphone interviews for every one on a landline. For many organizations, this is a budget buster that leads to compromises in sampling and interviewing.

The second unsettling trend is the rapidly declining response rate. When I first started doing telephone surveys in New Jersey in the late 1970s, we considered an 80 percent response rate acceptable, and even then we worried whether the 20 percent we missed were different in attitudes and behaviors from the 80 percent we got. Enter answering machines and other technologies. By 1997, Pew’s response rate was 36 percent, and the decline has accelerated. By 2014 the response rate had fallen to 8 percent. As Nate Silver of fivethirtyeight.com recently observed, “The problem is simple but daunting. The foundation of opinion research has historically been the ability to draw a random sample of the population. That’s become much harder to do.”

This decline is worrisome for two reasons. First, of course, is representativeness. For reasons no one fully understands, well-done probability samples seem to have retained their representative character despite the meager response rate. We know this because we can compare the results we get from our surveys to government gold-standard benchmarks like the census’ American Community Survey, where participation is mandated. Even so, Robert M. Groves, the provost of Georgetown and a former director of the Census Bureau, cautions, “The risk of failures of surveys to reflect the facts increases with falling response rates. The risk is not always realized, but with the very low response rates now common, we should expect more failed predictions based on surveys.”

The low response rate has also had a significant impact on the cost. Survey organizations have to pay interviewers to complete between 700 and 1,000 cellphone interviews at a response rate of 8 percent, which means multiple callbacks to numbers that don’t answer and screening out respondents who turn out to be ineligible, such as those under 18. The result is tens of thousands of calls dialed by hand, where not that long ago automatic dialers could work through an all-landline sample. Mark Schulman, a co-founder and research chief at Abt SRBI, estimates that interviewing costs in 2016 will be more than twice what they were in 2008.

And news budgets have shrunk for the media organizations that underwrite much of this research.

The new economics have driven many election pollsters to the Internet, where expenses are a fraction of what it costs to do a good telephone sample. However, there are major problems with Internet polls. First is what pollsters call “coverage error.” Not everybody is reachable online; Pew estimates that 87 percent of American adults are Internet users.

But Internet use falls with age, while the likelihood of voting rises with it, which makes coverage error an especially severe problem in predicting elections. While all but 3 percent of those ages 18 to 29 use the Internet, they made up just 13 percent of the 2014 electorate, according to the exit poll conducted by Edison Research. Some 40 percent of those 65 and older do not use the Internet, but they made up 22 percent of voters.
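Taken at face value, those figures already show where an online-only sample runs into trouble. The short sketch below multiplies each group’s share of the 2014 electorate by its offline rate to estimate how many voters such a sample cannot reach; it covers only the two age groups cited above, so it is illustrative rather than complete.

```python
# Rough coverage-error arithmetic using the figures cited above.
# Only the two age groups mentioned are included, so this is a partial illustration.

groups = {
    # age group: (share of the 2014 electorate, share of that group NOT online)
    "18 to 29": (0.13, 0.03),
    "65 and older": (0.22, 0.40),
}

for name, (voter_share, offline_rate) in groups.items():
    missed = voter_share * offline_rate
    print(f"{name}: an online-only sample cannot reach about {missed:.1%} of all voters")

# Output: roughly 0.4% for ages 18 to 29 and roughly 8.8% for 65 and older;
# the unreachable voters are concentrated in the older, high-turnout group.
```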

A much bigger issue is that we simply have not yet figured out how to draw a representative sample of Internet users. Statisticians make a primary distinction between two types of samples. Probability samples are based on everyone’s having a known chance of being included in the sample. This is what allows us to use mathematical theorems to confidently generalize from our sample back to the larger population, to calculate the odds of our sample’s being an accurate picture of the public and to quantify a margin of error.
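For readers who want the formula behind that margin of error, the textbook calculation for a simple random sample is sketched below. It is the standard statistics-course formula, shown for illustration, not a computation drawn from any particular poll.

```python
# The textbook 95% margin of error for a proportion estimated from a simple
# random sample: moe = z * sqrt(p * (1 - p) / n). It applies to probability
# samples, which is the point the paragraph above makes.

import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """95% margin of error for an estimated proportion p from a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

print(f"{margin_of_error(1_000):.1%}")  # about 3.1%, the familiar "plus or minus 3 points"
```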

Almost all online election polling is done with nonprobability samples. These are largely unproven methodologically, and as a task force of the American Association for Public Opinion Research has pointed out, it is impossible to calculate a margin of error on such surveys. What they have going for them is that they are very inexpensive to do, and this has attracted a number of new survey firms to the game. We saw a lot more of them in the midterm congressional election in 2014, in Israel and in Britain, where they were heavily relied on. We will see still more of them in 2016.

The other big problem with election polling, though not a new one, is that survey respondents overstate their likelihood of voting. It is not uncommon for 60 percent to report that they definitely plan to vote in an election in which only 40 percent will actually turn out. Pollsters have to guess, in effect, who will actually vote, and organizations construct “likely voter” scales from respondents’ answers to maybe half a dozen questions, including how interested they are in the election, how much they care who wins, their past voting history and their reported likelihood of voting in this particular election. Unfortunately, research shows there is no single magic-bullet question or set of questions to correctly predict who will vote, leaving different polling organizations with different models of who will turn out.
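To give a sense of what such a scale looks like, here is a deliberately simplified, hypothetical example. The questions, point values and cutoff are invented for illustration; real models, such as the Gallup-style cutoff approach, vary from one organization to the next.

```python
# A hypothetical, deliberately simplified "likely voter" index of the kind described
# above. The questions, scoring and cutoff are invented for illustration; real models
# differ from one polling organization to the next.

def likely_voter_score(respondent: dict) -> int:
    """Score a respondent from 0 to 4 on self-reported engagement items."""
    items = [
        "follows_election_closely",
        "cares_who_wins",
        "voted_in_last_election",
        "says_will_definitely_vote",
    ]
    return sum(1 for item in items if respondent.get(item, False))

def is_likely_voter(respondent: dict, cutoff: int = 3) -> bool:
    # Respondents at or above the cutoff stay in the "likely voter" subsample;
    # in practice the cutoff is tuned so the subsample matches an expected turnout rate.
    return likely_voter_score(respondent) >= cutoff

sample = [
    {"follows_election_closely": True, "cares_who_wins": True,
     "voted_in_last_election": True, "says_will_definitely_vote": True},
    {"follows_election_closely": False, "cares_who_wins": True,
     "voted_in_last_election": False, "says_will_definitely_vote": True},
]
print([is_likely_voter(r) for r in sample])  # [True, False]
```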

This has become a bigger problem lately. Scott Keeter, a former colleague of mine who is now the director of survey research at Pew, told me that “as coverage has shrunk and nonresponse has grown, forecasting who will turn out has become more difficult, especially in sub-presidential elections. So accuracy in polling slowly shifts from science to art.”

The problem here, of course, is that actual turnout is unknown until the election is over. An overestimation of turnout is likely to be one of the reasons the 2014 polling underestimated Republican strength. Turnout in that midterm election was the lowest since World War II; fewer than 40 percent of eligible voters cast ballots. Since Democrats are on average less well educated, less affluent and less likely to vote than Republicans, a low-turnout electorate skews Republican: the occasional voters who stay home are disproportionately Democratic. And of course we don’t know what to expect for the general election in 2016.

So what’s the solution for election polling? There isn’t one. Our old paradigm has broken down, and we haven’t figured out how to replace it. Political polling has gotten less accurate as a result, and it’s not going to be fixed in time for 2016. We’ll have to go through a period of experimentation to see what works, and how to better hit a moving target.

Those paying close attention to the 2016 election should exercise caution as they read the polls. Because of the high cost, the difficulty of locating the small number of voters who will actually turn out in primaries, and the increasing reliance on nonprobability Internet polls, you are likely to see a lot of conflicting numbers. To make matters still worse, the cellphone problem is more acute in the states than at the national level, because area codes and exchanges often no longer respect state or congressional boundaries. Some polling organizations will move to sampling from voter lists, which will miss recently registered voters and the campaigns’ efforts to mobilize them.

We are less sure how to conduct good survey research now than we were four years ago, and much less sure than eight years ago. And don’t look for too much help in what the polling aggregation sites may be offering. They, too, have been further off the mark of late. It’s not their fault. They are only as good as the raw material they have to work with.

In short, polls and pollsters are going to be less reliable. We may not even know when we’re off base. What this means for 2016 is anybody’s guess.