Throughout the election cycle, astute campaign watchers have fixated on a polling phenomenon known as social desirability bias. Some answers to questions are more socially desirable than others (e.g., I am not a racist vs. I am a racist, I recycle vs. I do not recycle, and I do not support Donald Trump vs. I support Donald Trump). The concern is that people will tell pollsters the socially desirable answer (e.g., I am not a racist or I recycle or I do not support Donald Trump), but then do the socially undesirable action (e.g., be a racist or not recycle or vote for Donald Trump). Could social desirability bias be a factor in Hillary Clinton’s large lead over Donald Trump in the polls? It probably does not exist, and even if it does, it is not anywhere close to explaining the massive 8.6 percentage point lead Clinton currently enjoys in the Huffington Post’s Pollster trend.
We have a bunch of live telephone polls and a bunch of online polls, and they told slightly different stories in the Republican primary; presumably people would be more honest with a computer than with a person. And Trump did poll much better online than in live telephone conversations in the primaries. But he actually under-performed his polling when people voted! That means online polling over-estimated his support. Online polling was not correcting a problem of people who would vote for him being too shy to say so; it had more people saying they would vote for Trump than actually voted for him.
Figure 1: Huffington Post’s Pollster Republican Primary trends. (Left) live telephone polls only and (right) online polls only.
In the primaries, Trump may have been over-represented in the online samples, making the story more about sample selection and less about social desirability bias. Polling comes in many variations, but for this article we will focus on two key attributes: the sample (i.e., the population of people polled) and the mode (i.e., the technology used to poll them). Traditional polling focuses on finding random and representative samples of the target population, and it does so using live telephone interviews. A lot of newer polling uses either random or quota samples from opt-in online panels. Note that the sample is heavily tied to the mode: live telephone polls generally use a more random and representative sample than online polls do. So, when live telephone polls differ from online polls, the difference could be driven by either the sample or the mode.
In the general election, this is a non-story: Clinton is up by 7.3 percentage points in live telephone polls and 7.9 percentage points in online polls. The two are virtually the same. Polling may still be way off in 2016; that is always possible. But the lack of any real difference between modes strongly suggests that social desirability bias is not a meaningful factor in any possible polling bias for the 2016 presidential election.
Figure 2: Huffington Post’s Pollster General Election trends. (Left) live telephone polls only and (right) online polls only.
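For readers who want to see the mode comparison as arithmetic, here is a minimal Python sketch using only the two general election leads quoted above. The function name `shy_trump_signal` and the dictionary keys are made up for illustration, and this is a back-of-the-envelope check rather than a statistical test: it ignores sampling error and the sample differences between modes discussed earlier.

```python
# Clinton's lead in percentage points by polling mode, as quoted in this article.
clinton_lead = {"live_phone": 7.3, "online": 7.9}


def shy_trump_signal(leads):
    """Live-phone lead minus online lead, in percentage points.

    If Trump supporters were hiding their preference from live
    interviewers, Clinton's lead should look larger on the phone than
    online, so this value would be clearly positive.
    (Hypothetical helper, named for illustration only.)
    """
    return round(leads["live_phone"] - leads["online"], 1)


signal = shy_trump_signal(clinton_lead)
print(f"Phone lead minus online lead: {signal:+.1f} points")

if signal > 0:
    print("Clinton's lead is larger by phone, the direction a shy-Trump effect would predict.")
else:
    print("Clinton's lead is not larger by phone, so there is no sign of a shy-Trump effect here.")
```

With the numbers above, the difference is small and points the opposite way from what social desirability bias would predict: Clinton's lead is slightly larger online, not smaller.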