To make accurate models you need data. Lots of data. Building a model of the American electorate – whether for polling, advertising buys, or for Get Out the Vote efforts – is an exercise in Big Data with huge reams of the stuff being processed by very smart people. As I mentioned in the comments in my last piece, the polling in this election has not “herded” so much as clumped. The preferred MSM narrative has Hilly ahead by quite a bit and is based on a set of polls which are aggregated at fivethirtyeight.com. The “outliers” to this consensus position – USC Dornsife/Los Angeles Times “Daybreak”, IBD/TIPP and Rasmussen – have the race tied or Trump a bit ahead.
Plenty of ink has been spilt trying to explain the clumping as an artefact of the differing methodologies used by the different polling operations: the different samples, different weightings, different measurement techniques. I suspect there is useful information to be gleaned from this sort of comparison but not enough to actually explain what is causing the clumping. For that you need to look at a larger picture.
If you are building a model you have to make some basic assumptions, and the most important of these is about what relationship the present has to the past. Put another way, one question you have to ask is how closely the electorate you are looking at right now resembles the electorate of, in this case, 2012 and 2008. What’s the same, what’s changed?
Eight years ago America was in the throes of what turned out to be a huge economic crisis. It was offered the chance to elect its first black President. The word Millennial was just entering the lexicon. The iPhone had arrived the year before. So had Netflix as a streaming service.
A few numbers
Traditional media was hanging on in 2008. Newspapers still had readers, advertisers and staff. But that changed a lot in the eight intervening years. The three big TV networks saw their viewership decline. In fact, overall television watching dropped.
Along with declining “reach”, mainstream media also saw trust in media drop to new lows in the last eight years. Only 32% of people surveyed by Gallup in September 2016 said they had a great deal or fair amount of trust and confidence in mass media, as compared to 43% in 2008. (Actually, only 7% said they had a great deal of trust.)
The share of people of “prime working age” in work in the US – a measure which discounts things like retirement and immigration – was 78.8% in September 2008 and very nearly the same, 78.0%, in September 2016. But during that period it dipped to 75.0% in the aftermath of the 2008 crash.
In the second quarter of 2016 homeownership fell to 62.9%, down from 68.1% in 2008.
And, one more number: there are now more Millennials than Boomers.
The raw material for modelling is the same, those numbers and hundreds of other time series: so how can you have the variance implied by the poll clumping?
If the data was just the data there should be very little variation. But, in fact, each of the data sets I’m citing, and many, many others, represents actual human experience. If you owned a house in 2008 and lost it in the housing crisis, you have a particular sort of experience. If you had a job in 2008, lost it in 2010 and have only recently re-entered the labour force, you have had a particular sort of experience. If you are a Millennial rather than a Boomer, your lived experience is very, very different. As a Millennial the job you lost in 2009 may have been your first and only job. The job the Boomer lost may be the very last job he’ll ever have.
Polls tend to work by adjusting their samples to reflect demographics and an estimate of a given demographic’s propensity to actually vote. On a toy model basis, you can think of it as a layer cake with each layer representing an age cohort. So, for example, if you look at younger voters 18-29 you might find that 90% of them support Hilly and 10% Trump. If there are 100 of these voters in your sample of 500 a simple projection would suggest 90 votes for Hilly, 10 for Trump. The problem is that it is difficult to know how many of those younger voters will actually go out and vote. As a rule of thumb the older you are the more likely you are to vote so now you have to estimate voting propensity.
There are two ways to get a sense of voting propensity: ask the people in your sample or look at the behaviour of people the same age but in the last couple of elections.
And now the landscape begins to shift. In 2008, nearly 50% of voters aged 18-29 voted. In 2012, 40% voted. In both elections, the youth vote was heavily pro-Obama. If you were designing a poll at this point, what sort of weighting would make sense for youth voters? Making that call will change the landscape your poll will reflect. If you want your poll to tilt Hilly you can believe that the prospect of the first woman President of the United States will be as motivating as Obama was and assign a voting propensity of 40-50%; alternatively, if you don’t see many signs of Hillary catching fire among younger voters, you can set the propensity number at 30% and create a tie or a slight Trump lead.
(The results of this are even more dramatic if you look at the black vote and turnout. In 2008 black turnout was 69.1%, and in 2012 it was 67.4%, with Obama taking well over 90%. Will the nice white lady achieve anything like these numbers?)
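The layer-cake arithmetic above can be sketched in a few lines of Python. This is only a toy, not how any actual pollster works. The youth layer uses the numbers from the example (100 of 500 in the sample, splitting 90/10); the figures for the other 400 respondents (a 44% Hilly share and 65% turnout) are invented purely so the effect of the youth propensity call is visible, and the function names are hypothetical.

```python
def projected_votes(cohorts):
    """Return (votes_a, votes_b) summed over all cohorts.

    Each cohort is a tuple: (sample_size, share_for_a, turnout_propensity).
    Candidate A is Hilly, candidate B is Trump in this toy.
    """
    votes_a = 0.0
    votes_b = 0.0
    for size, share_a, turnout in cohorts:
        voters = size * turnout              # how many in this layer actually vote
        votes_a += voters * share_a
        votes_b += voters * (1.0 - share_a)
    return votes_a, votes_b


def margin(cohorts):
    """Candidate A's lead as a fraction of all projected votes (negative = B leads)."""
    votes_a, votes_b = projected_votes(cohorts)
    return (votes_a - votes_b) / (votes_a + votes_b)


def sample_with_youth_turnout(youth_turnout):
    """Build the two-layer toy sample for a given youth voting propensity."""
    youth = (100, 0.90, youth_turnout)   # from the example: 100 of 500, 90% Hilly
    everyone_else = (400, 0.44, 0.65)    # ASSUMPTION: made-up share and turnout
    return [youth, everyone_else]


if __name__ == "__main__":
    # The same sample, three different calls on youth propensity.
    for propensity in (0.50, 0.40, 0.30):
        lead = margin(sample_with_youth_turnout(propensity))
        print(f"youth propensity {propensity:.0%}: Hilly margin {lead:+.1%}")
```

Nothing in the sample changes between the three runs except the pollster's guess about who shows up, and with these made-up numbers that guess alone is enough to move the result from a Hilly lead to a Trump lead.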
On the other side of the ledger, the turnout of the less educated has been low for the last two elections: 52% in 2008 and a little less than 50% in 2012. There is room for improvement. Now, as any educated person will tell you, often at length, Trump draws a lot of support in the less educated cohorts. But that support is easily discounted because these people (the deplorables and their ilk) barely show up to vote.
Build your model on the basis that lower education people’s participation in 2016 will be similar to 2008 and 2012 and you will produce a result in line with the 538.com consensus view. But if you think that the tens of thousands of people who show up for Trump’s rallies might just show up to vote, you will have a model tending towards the LA Times view of things.
Pick Your Landscape
If you, like me, cannot stand Hillary and think she belongs in prison, you are going to tend towards a view of the landscape in which the black vote collapses and the idiocracy figures out how the calendar works and shows up in all their bumpkin splendour. If you think Trump is a giant orange racist/groper/fascist, the Millennials will all serenely leave the coffee houses where they serve in honour of their women’s studies degrees and student debt and nobly vote for Hilly despite really wanting Bernie. Black people will embrace Hilly and give Obama a great send off by voting for the nice white lady.
What will determine the actual political landscape is who actually shows up to vote on November 8th. The danger which the 538.com consensus poses to Hillary is that her own not terrifically enthusiastic supporters may assume the election is in the bag and binge watch Orange is the New Black, on Netflix, on their smartphones. Because, after Trump’s measured performance in last night’s debate, the wind has gone out of the “literally Hitler” sails. Voting against Trump is no longer quite like hiding Anne Frank in your attic.
For Trump the last three weeks of campaigning are all about getting his people, his deplorables, to believe, against all past experience, that their votes matter. His rallies, his Tweet fights, his advertising all have to pound home the message that ordinary people’s votes matter.
And then, of course, there are those “events” which are beyond the candidates’ and the pollsters’ control.
Gallup Poll, Oct 26, 1980, Two Weeks Before Election:
REAGAN 39%, CARTER 47%