Tobias Konitzer and David Rothschild

This is the third piece in our series, Evolution of Polling Samples from RDD to RDE.

In part 1, we describe a polling industry ripe for transformation as Random Digit Dialing (RDD) becomes increasingly tenuous.
In part 2, we discuss the issues with online panels, which are increasingly replacing RDD.
In part 4, we introduce Random Device Engagement (that is what we do!).

Part 3: Assisted Crowdsourcing: Organic but not Random

Assisted crowdsourcing: the polling firm identifies a set of quotas and submits them to a social media platform, along with an ad inviting respondents to take the survey. The social network then serves that ad to a targeted group of users, and the firm surveys the respondents who click through to the survey site. The mode is mainly desktop, though it can be tablet or mobile as well. The sample can have high coverage, but non-response is extremely high, and the sample is highly biased and (for the polling firm) impossible to correct.

Assisted crowdsourcing relies on a social network with massive penetration to supply respondents, which in practice means Facebook. (It can be done on other display or search ad platforms, but Facebook's penetration/coverage and its background demographic data make it effectively one of the few viable options.) Recruitment uses "quota" sampling: the firm recruits a set number of Black females, Black males, college-educated males, and so on, until the sample is demographically representative. The sketch below illustrates the quota mechanics.
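To make those mechanics concrete, here is a minimal sketch in Python (our own illustration, not any firm's actual tooling; the quota cells and counts are hypothetical) of a quota frame being filled as respondents click through from the ad:

```python
from collections import Counter

# Hypothetical quota frame for an N = 1,000 poll: each cell is a
# demographic bucket mapped to the number of completes it must reach.
QUOTAS = {
    ("White", "female", "no college"): 180,
    ("White", "male", "no college"): 170,
    ("White", "female", "college"): 160,
    ("White", "male", "college"): 150,
    ("Black", "female", "any"): 70,
    ("Black", "male", "any"): 60,
    # ... remaining cells bring the total to 1,000
}

fills = Counter()  # completes collected so far, per cell

def accept(cell):
    """Route a click-through to the survey only if its quota cell is still open."""
    if fills[cell] < QUOTAS.get(cell, 0):
        fills[cell] += 1
        return True   # cell open: survey this respondent
    return False      # cell full: screen out
```

Note what the sketch makes visible: the frame only constrains who is counted, not who is reachable; within each cell, whoever clicks first fills the quota.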

There are some advantages here:

  1. Speed and targeting: Because of Facebook's penetration and reach, polling can be done in granular geographies (think state legislative districts) at somewhat lower cost (by our estimates, respondents run about $5 each). A polling firm engaged in assisted crowdsourcing could thus sell a poll of N = 1,000 for roughly $8,000-$10,000, slightly cheaper than traditional polls (and comparable in cost to online panels) and, thanks to Facebook's reach, faster (a back-of-the-envelope calculation follows this list). In summary: good depth, good speed, and relatively good cost.
  2. Organic sample: Facebook is an organic place to gather opinions. Instead of curating "professional survey takers" who answer political polls as a kind of (side-)job, assisted crowdsourcing reaches respondents where they already spend their time. People live on Facebook, get their information on Facebook, and share their thoughts on Facebook; assisted crowdsourcing gathers opinions in that natural environment.
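As a back-of-the-envelope check on the numbers in point 1 (all figures are our own estimates from the text, not anyone's rate card):

```python
# Rough poll economics under the per-respondent estimate above.
cost_per_respondent = 5.00             # our ~$5 estimate per recruited complete
n = 1_000                              # poll size

field_cost = cost_per_respondent * n   # $5,000 to field the sample
price_low, price_high = 8_000, 10_000  # plausible sale price range

# What is left over for overhead, weighting, and analysis:
print(field_cost, price_low - field_cost, price_high - field_cost)
# 5000.0 3000.0 5000.0
```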

There are BIGGER disadvantages:

  1. Quota sampling is bad. Quota sampling has long been shunned by high-quality polls, and for good reason: the 1948 election debacle laid bare its dangers. If respondents are "recruited" to fill demographic buckets, pollsters will recruit whoever in each bucket is easiest to reach. Need 10 non-college-educated Whites? Great, you have interacted with members of that bucket before, so why not simply recruit them again? Re-contacting the same respondents over and over is problematic in itself, but the deeper issue is that reachability within a bucket is likely correlated with political engagement, partisan affiliation, and political knowledge: the very things you are trying to measure. Easy-to-reach respondents in any demographic stratum tend to have abnormally high levels of political engagement and knowledge, producing a sizable bias that cannot easily be corrected (the simulation after this list makes the mechanism concrete).
  2. Quota sampling on social networks is worse. On a social network, the quota sampling problem gets much worse. Facebook's ad delivery algorithms are designed to find the "cheapest" respondent, i.e. the user most likely to click, maximizing click-through rates (see, for example, the discussion of Facebook's targeting algorithms in a recent PNAS letter). Hence, an ad inviting users to take a political survey, especially one with a political cue, gets shown to the users most likely to click on political content: users who declare an ideology in their profile, or who "Like" a lot of political pages. If a firm relying on assisted crowdsourcing targets, say, non-college-educated Whites, the non-college-educated Whites who actually see the ad, precisely because they are likely to click on political content, will exhibit unusually high political engagement. Worse, the characteristics most predictive of that non-representativeness, behavioral signals such as "Likes" of political content, are not available to the polling firm for bias correction. And Facebook's machine learning gets better over time at predicting who will click on a political survey ad, which means (a) the bias grows the longer a firm recruits on Facebook, and (b) the supply of fresh respondents shrinks, in effect creating a panel, with all the attendant concerns about panel fatigue and panel conditioning (a change in underlying attitudes caused by membership in the panel-like structure itself).
  3. Assisted crowdsourcing is at the mercy of social networks. Put simply, any survey tool built on a social network depends on the legal framework surrounding the handful of high-penetration networks (there are really only two or three to speak of). Much like online panels, assisted crowdsourcing lacks agility with technology and adaptability to new audiences. Should a preemptive legislative strike push the network out of the political ad market (or dramatically shift costs or exposure types) – a plausible scenario amid the recent turmoil over the data breach leveraged by the now-defunct right-wing analytics firm Cambridge Analytica – the respondent market, and a methodology fine-tuned to the idiosyncrasies of respondents drawn from that network, could become obsolete in a matter of minutes.
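To see how large the bias in points 1 and 2 can get, here is a minimal simulation (entirely synthetic numbers, chosen only to illustrate the mechanism, not calibrated to any real poll). Within a single demographic bucket, the probability of responding to the ad rises with political engagement, and better ad targeting sharpens that relationship:

```python
import random

random.seed(0)
N_POP, N_SAMPLE = 100_000, 1_000

# Synthetic population within one demographic bucket: 'engagement' is
# the trait the poll is trying to measure (population mean ~0.5).
population = [random.random() for _ in range(N_POP)]

def respond_prob(engagement, targeting_strength):
    # Probability a user sees the ad and clicks. Higher targeting_strength
    # means the platform more aggressively favors likely clickers.
    return engagement ** targeting_strength

def quota_sample_mean(targeting_strength):
    # Fill the quota with whoever responds first; return mean engagement.
    sample = []
    while len(sample) < N_SAMPLE:
        person = random.choice(population)
        if random.random() < respond_prob(person, targeting_strength):
            sample.append(person)
    return sum(sample) / N_SAMPLE

true_mean = sum(population) / N_POP
print(f"population mean engagement:      {true_mean:.3f}")            # ~0.500
print(f"quota sample, naive ad serving:  {quota_sample_mean(1):.3f}") # ~0.667
print(f"quota sample, tuned targeting:   {quota_sample_mean(3):.3f}") # ~0.800
```

The quota is filled either way; what changes is who fills it. And because the variables driving the skew (click behavior, political "Likes") live inside the platform, the polling firm cannot weight the bias away.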

Bottom line: Polling companies relying on assisted crowdsourcing can poll every political race, from presidential elections down to state legislative contests, and that is commendable. But the biases introduced by quota sampling, exacerbated by the fine-tuned targeting algorithms of social networks, mean severe and uncorrectable sample bias, and with it serious polling error. And the nature and extent of the respondent supply depend entirely on a legal framework over which polling firms have no influence.