Get To The Point - Knowledge Extraction

Doug Rivers

About

Professor of political science at Stanford, senior fellow at Hoover Institution, polling expert, affiliated with YouGov

Claims by Doug Rivers (20)

Telephone polling arose in the 1970s and replaced in-person area probability sampling; door-to-door interviewing had been the dominant method from the 1930s until then.

factual | political science

Roughly 80-90% of polling errors come from sample skews rather than from sampling (random) error.

factual | political science

Random digit dialing became the dominant phone-polling method because roughly 30% of the population had unlisted numbers, making list-based sampling infeasible. It worked well for 20-30 years but broke down in the last 10-15 years as marketing calls, cell phones, and distrust drove cooperation rates from ~70% down to ~20% or less, hurting accuracy.

causal | political science

Pollsters typically select 10 to 20 times as many phone numbers as the number of completed interviews they want, meaning the people who actually respond are a small, non-random slice that skews toward women, the more educated, and higher earners, so weighting and adjustment are required.

factual | political science
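
To make the weighting step concrete, here is a minimal sketch of simple post-stratification on one variable: each respondent gets a weight equal to their group's population share divided by its sample share. The group labels, the sample split, and the population shares are hypothetical numbers chosen for illustration, and the function name is not from the source.

```python
from collections import Counter

def poststratification_weights(sample_groups, population_shares):
    """Return one weight per respondent: population share / sample share."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]

# Hypothetical sample that over-represents college graduates.
sample_groups = ["college"] * 60 + ["no_college"] * 40
# Hypothetical population shares (an assumption, not a real census figure).
population_shares = {"college": 0.35, "no_college": 0.65}

weights = poststratification_weights(sample_groups, population_shares)
print(round(weights[0], 3), round(weights[-1], 3))  # down-weight vs. up-weight
```

After weighting, the weights sum to the sample size and the weighted group shares match the assumed population shares. This only corrects for the variable used; skews on unmeasured characteristics remain.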

A truly random sample of about 1,000 people gives a margin of error of about plus or minus 3% with 95% confidence regardless of total population size, so a tiny fraction (one in 300,000 Americans) suffices; quadrupling to ~4,000 tightens it to about 1% at 99% confidence.

factual | science
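
The figures in this claim can be checked against the textbook normal-approximation formula for a proportion under simple random sampling, z * sqrt(p(1-p)/n), evaluated at the worst case p = 0.5. The sketch below assumes that formula; the function name is illustrative. Note that the population size does not appear anywhere in it. Under this formula, quadrupling the sample halves the margin, giving about 1.5 points at 95% confidence and about 2 points at 99% for 4,000 respondents, so the 4,000-person figure in the claim is best read as a rough one.

```python
import math

# Standard normal quantiles for two-sided confidence levels.
Z = {0.95: 1.96, 0.99: 2.576}

def margin_of_error(n, confidence=0.95, p=0.5):
    """Half-width of the normal-approximation interval for a proportion."""
    return Z[confidence] * math.sqrt(p * (1 - p) / n)

for n in (1000, 4000):
    for level in (0.95, 0.99):
        moe = 100 * margin_of_error(n, level)
        print(f"n={n}, {int(level * 100)}% confidence: +/- {moe:.1f} points")
```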

The clean statistical results (law of large numbers and central limit theorem) only hold for perfectly executed random sampling; in practice sampling plans are rarely executed perfectly because you never get near 100% cooperation, and the non-responders differ systematically from responders in unobservable ways, producing skews rather than mere random noise.

causal | science

Weighting underrepresented groups by large factors (5-10x for some groups) drastically inflates variability: with a weight of 10 in a sample of 1,000, a single miscoded respondent shifts the whole sample by 1%, so a single error can destroy a nominal 3% margin of error.

causal | science
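
The arithmetic behind the 1% figure can be sketched directly. The example assumes a hypothetical sample of 1,000 whose weights sum to 1,000, with one respondent carrying a weight of 10; flipping that one answer moves the weighted estimate by 10/1,000, or one percentage point. All data here are made up for illustration.

```python
def weighted_share(responses, weights):
    """Weighted proportion of 1s in a list of 0/1 responses."""
    return sum(r * w for r, w in zip(responses, weights)) / sum(weights)

n = 1000
# One respondent has weight 10; the other 999 share the remaining 990.
weights = [10.0] + [990.0 / 999] * (n - 1)
# Hypothetical answers: the heavily weighted respondent says 0.
responses = [0] + [1] * 500 + [0] * 499

before = weighted_share(responses, weights)
responses[0] = 1  # the single miscoded respondent
after = weighted_share(responses, weights)

shift = after - before
print(f"shift from one miscoded respondent: {100 * shift:.2f} points")
```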

Weighting creates two distinct problems: it roughly doubles the real variability compared to a simple random sample, and it leaves residual skews (unobserved or uncorrected biases) that do NOT shrink as sample size grows, so conventional sampling-error estimates tell you how the procedure varies sample-to-sample but not how systematically wrong the procedure is.

causal | science
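
The two problems in the claim above can be illustrated with a small sketch. It rests on two assumptions that are not from the source: Kish's approximation for the design effect of unequal weights, n * sum(w^2) / (sum w)^2, and a reading of "doubles the variability" as a design effect of 2 on the variance. Total error is then modelled as the square root of bias squared plus variance, with a hypothetical residual bias of 3 points.

```python
import math

def kish_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum w)^2 (1.0 for equal weights)."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2

def rmse(n, bias, deff=2.0, p=0.5):
    """Root mean squared error: residual bias plus weighted sampling variance."""
    variance = deff * p * (1 - p) / n
    return math.sqrt(bias ** 2 + variance)

# Hypothetical residual skew of 3 points; only the variance term shrinks with n.
for n in (1000, 4000, 16000, 64000):
    print(f"n={n}: total error ~ {100 * rmse(n, bias=0.03):.2f} points")
```

As n grows, the printed total error approaches the 3-point bias and stops improving, which is the sense in which a larger sample cannot fix a skewed procedure.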

In the 2008 New Hampshire primary, which Hillary Clinton won, none of the roughly 30 pre-election polls showed her ahead of Obama, indicating a systematic problem affecting all polls rather than random chance.

factual | political science

The New Hampshire 2008 polling miss likely came from over-representation of voters with college and graduate degrees (who favored Obama and were over-represented by 2-3x but under-corrected) and from unreliable self-reports of voting intention, rather than from racism (the Bradley effect).

causal | political science

Self-reported voting intention is unreliable: about 90% of people say they will vote in a primary but actual turnout is much lower (US presidential turnout has hovered in the mid-50% range, congressional under 40%), and likely-voter screens cannot fully fix this because people answer based on what they think they should say.

causal | political science

Of about 1,300 polls Rivers examined from the 2008 presidential primaries, fewer than 50 reported anything other than the margin of error for a simple random sample with no weighting—meaning the reported margins of error are systematically misleading, which Rivers calls scandalous.

factual | political science

Typical media polls correct only for age, race, and gender (sometimes education, rarely income) using weighting methods about 60 years old, well behind modern statistical theory, partly from a long-standing belief that random digit dialing was good enough.

factual | political science

Final pre-election polls tend to be too similar to each other relative to their known sampling variability, suggesting pollsters herd—choosing among multiple defensible weighting schemes the one that places them in the pack rather than the one that yields a divergent answer—though Rivers attributes this to risk-aversion and the art of weighting rather than dishonesty.

causal | political science
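
One simple way to see what "too similar relative to sampling variability" means is to compare the observed spread of final polls with the spread that sampling error alone would produce. The sketch below is an illustrative diagnostic, not a method attributed to Rivers; the poll figures and sample sizes are hypothetical, and it assumes simple random sampling with no weighting.

```python
import math
import statistics

def spread_ratio(shares, sample_sizes):
    """Observed SD across polls divided by the SD expected from sampling alone."""
    p_bar = statistics.mean(shares)
    expected_var = statistics.mean(p_bar * (1 - p_bar) / n for n in sample_sizes)
    return statistics.stdev(shares) / math.sqrt(expected_var)

# Hypothetical final polls: one candidate's share in five polls of 600 each.
shares = [0.51, 0.50, 0.51, 0.50, 0.51]
sample_sizes = [600] * 5

ratio = spread_ratio(shares, sample_sizes)
print(f"observed / expected spread: {ratio:.2f}")
```

A ratio near 1 is what independent polls would show; a ratio well below 1, as in this made-up example, means the polls agree more closely than chance predicts, which is consistent with herding.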

There would seem to be an incentive for a pollster to trust divergent weights and look like a genius (as the lone correct caller of an upset), but in practice small subsamples (e.g. a NH poll with ~100 Republicans) make divergent results look like noise that gets discounted, which is why having 30 polls all wrong in New Hampshire was an extraordinary wake-up call.

causal | political science

The fraction of people with internet access at home, school, or work now probably exceeds the fraction with landline access once cell-phone-only households and those with no phone are excluded, making internet-based polling a viable alternative to telephone polling.

factual | technology

Massive consumer and voter databases that did not exist 25 years ago now provide detailed information (income, home value, etc.) on most people, allowing pollsters to draw a randomly selected target sample from a voter list and then match closely-similar respondents from a large opt-in internet panel, creating a sample that mimics a random sample across many dimensions and removes skews that simple demographic weighting cannot.

causal | political science
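
The matching step described above can be sketched as a nearest-neighbour search: for each person in the random target sample, pick the closest still-unused panelist on the available covariates. This is a minimal illustration under stated assumptions, not Rivers's actual procedure: it uses greedy matching without replacement, plain Euclidean distance, and hypothetical covariates already scaled to the 0-1 range.

```python
import math

def match_sample(targets, panel):
    """Greedy nearest-neighbour matching without replacement.

    targets, panel: lists of equal-length numeric covariate tuples.
    Returns one panel index per target; panel must be at least as large.
    """
    available = list(range(len(panel)))
    matched = []
    for t in targets:
        best = min(available, key=lambda i: math.dist(t, panel[i]))
        available.remove(best)  # each panelist is used at most once
        matched.append(best)
    return matched

# Hypothetical scaled covariates, e.g. (income rank, home-value rank).
targets = [(0.2, 0.8), (0.9, 0.1)]
panel = [(0.85, 0.15), (0.5, 0.5), (0.25, 0.75), (0.1, 0.9)]

matched = match_sample(targets, panel)
print(matched)  # indices of the chosen panelists
```

The matched panelists stand in for the target sample, so the resulting sample resembles the random draw on every covariate used in the distance, not just on the few demographics a weighting scheme adjusts for.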

In 2006, the matched internet-panel method produced election forecasts whose average error was substantially less than the average reported telephone poll, mainly because it removed biases better (not because sampling variability was smaller), and Rivers argues telephone polling could be improved by similarly substituting a respondent who resembles the missed person rather than drawing another number from the same skewed population.

causal | political science

Interactive voice response (robocall) polls made up roughly 80-90% of polling done in 2008 because newspapers cut back on expensive live-interviewer polling. Despite low response rates and not knowing who is answering, their record is not bad, largely because IVR organizations pay closer attention to weighting than traditional phone pollsters do.

causal | political science

Some organizations (Associated Press, New York Times) refuse to report polls using nonprobability sampling—respondents selected without known probabilities of selection—but Rivers argues they are in denial because their own low-response telephone polls also lack known selection probabilities.

factual | media
