
Data wins the day: How HPC turned the tide for Trump

Speed read
  • Trump's victory surprised many, but not the scientists at Cambridge Analytica
  • 13 terabytes of data and computer modeling proved decisive in the election upset
  • Trump's nostalgic ethos cloaked his high-tech modus operandi

Two weeks before election day, it appeared that Donald Trump was stuck.

Polling websites, ranging from the New York Times’ Upshot model to statistician Nate Silver’s FiveThirtyEight, showed Trump with a low chance of winning the election.

Yet one data organization saw a bigger picture. UK-based Cambridge Analytica (CA) culled comprehensive datasets and built custom computing clusters to model a far different electorate than the polling sites did – models that showed Trump a path to victory.

CA, with senior Trump strategist Steve Bannon on its board of directors, was hired by the Trump campaign shortly after Trump won the Republican nomination.

Democratic prism. This interactive map offers a 3D view of voter preference while maintaining regional significance without map distortion. This view associates county height with county vote total. Courtesy Max Galka and Mark Kearney.

“What we found is that one of the strongest signals was an urban/rural split,” says David Wilkinson, lead data scientist at CA.

“Our models showed that if you had higher turnout among rural areas, and lower turnout among urban areas, particularly in some ethnic minorities or some higher incomes, then you saw some very different changes in the election.”

Source material

CA built modeling software based on four types of data: voter records, commercial data, campaign data, and weekly surveys. The company used online and telephone polling to survey thousands of people every week in all 50 states. It eventually focused on about 17 battleground states that would be critical to winning the election.

Through this multi-tiered approach, CA scientists were able to model the electoral sentiments of roughly 100 million people, tracking their constantly shifting preferences between Trump and Hillary Clinton.

Trump's support, CA found, differed from that of a typical Republican electorate. Republicans in general prefer American-made products, but for Trump voters this was especially important. CA scientists found that a preference for American-made cars, in particular, was a strong predictor of who supported Trump.


The public polls used by most campaigns can weight only one or two characteristics at a time – age or gender, for instance. When these polls are combined with other sources of data, they can provide a more realistic picture for political campaigns.

“When you plug those results [from public polls] into other sorts of data, when you have commercial data available to you, when you have other political sorts of data, and when you match those responses to a database of voters, you can use a lot more information. You can see which features are the most effective at weighting, and you can get a much more accurate picture,” says Wilkinson.
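To illustrate the kind of weighting Wilkinson describes, here is a minimal Python sketch of post-stratification, a standard survey technique. It is not CA's actual method, which the article does not detail. The function names, the field names, and every number below are hypothetical, invented purely for illustration.

```python
from collections import defaultdict


def poststratify(respondents, population_shares, keys):
    """Estimate support for candidate A, weighting on the given characteristics.

    respondents: list of dicts with demographic fields and a 'supports_a' bool.
    population_shares: maps a tuple of key values (one cell) to that cell's
        share of the target population.
    keys: names of the characteristics to weight on, e.g. ("area", "age").
    """
    counts = defaultdict(int)
    support = defaultdict(int)
    for r in respondents:
        cell = tuple(r[k] for k in keys)
        counts[cell] += 1
        support[cell] += int(r["supports_a"])

    estimate = 0.0
    covered = 0.0
    for cell, share in population_shares.items():
        if counts.get(cell, 0) == 0:
            continue  # no respondents in this cell: skip it and renormalize
        estimate += share * support[cell] / counts[cell]
        covered += share
    if covered == 0:
        raise ValueError("no respondents fall in any population cell")
    return estimate / covered


def make(area, age, n, n_support):
    """Build n made-up respondents, the first n_support of whom back A."""
    return [{"area": area, "age": age, "supports_a": i < n_support}
            for i in range(n)]


# Invented sample that over-represents urban respondents.
sample = (make("urban", "young", 4, 1) + make("urban", "old", 4, 1)
          + make("rural", "young", 1, 0) + make("rural", "old", 1, 1))

raw = sum(r["supports_a"] for r in sample) / len(sample)

# Weighting on one characteristic (assumed population shares).
by_area = poststratify(sample, {("urban",): 0.6, ("rural",): 0.4},
                       keys=("area",))

# Weighting on two characteristics at once (assumed population shares).
by_area_age = poststratify(
    sample,
    {("urban", "young"): 0.3, ("urban", "old"): 0.3,
     ("rural", "young"): 0.1, ("rural", "old"): 0.3},
    keys=("area", "age"),
)

print(f"raw={raw:.2f} by_area={by_area:.2f} by_area_age={by_area_age:.2f}")
```

In this toy sample the estimate moves as more characteristics enter the weighting, which is the effect Wilkinson attributes to matching poll responses against richer voter data.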

CA also analyzed early voting returns from rural areas to see if they matched the firm’s modeling of a more active rural electorate. Wilkinson’s team saw the rising early voter turnout in the Rust Belt – and a historical correlation between early turnout and final election results – and alerted the Trump campaign to the shift that was occurring.

Science for the win

The finding that early voting turnout was high in the Rust Belt, backed by models that weighed more factors than those of other polling organizations, led Trump to return to states that hadn’t voted Republican in a presidential election since the 1980s.

“Re-calculating voter turnout and reweighting our models showed us the scenario in which Trump could win,” says Wilkinson. “So we presented this to the campaign, and I think they really took it to heart. That's when the campaign revisited areas like Michigan and Wisconsin and areas that really surprised people that Trump would even bother campaigning in.”
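The reweighting Wilkinson describes can be sketched as a simple turnout scenario calculation. This is only an illustration under stated assumptions: the two-group urban/rural split echoes the article, but the function name `projected_share`, the voter counts, support levels, and turnout rates are all hypothetical and do not come from CA's models.

```python
def projected_share(groups, turnout):
    """Return a candidate's share of all votes cast.

    groups: maps group name to {'voters': eligible voters,
                                'support': fraction backing the candidate}.
    turnout: maps group name to the fraction of eligible voters who vote.
    """
    votes_for = 0.0
    votes_cast = 0.0
    for name, g in groups.items():
        cast = g["voters"] * turnout[name]
        votes_cast += cast
        votes_for += cast * g["support"]
    return votes_for / votes_cast


# Hypothetical state with an urban/rural split in candidate support.
groups = {
    "urban": {"voters": 600_000, "support": 0.40},
    "rural": {"voters": 400_000, "support": 0.62},
}

# Assumed turnout as a conventional model might expect it.
baseline = {"urban": 0.60, "rural": 0.50}
# Assumed alternative: lower urban turnout, higher rural turnout.
scenario = {"urban": 0.50, "rural": 0.75}

base_share = projected_share(groups, baseline)
scen_share = projected_share(groups, scenario)
print(f"baseline: {base_share:.3f}  scenario: {scen_share:.3f}")
```

With these made-up inputs, support within each group never changes; only who turns out does, and that alone moves the candidate from behind to ahead.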

Blown into proportion. The map on the left shows voter preference across the US. The cartogram map on the right represents the nation in proportion to its population. Courtesy Mark Newman (http://www-personal.umich.edu/~mejn/election/2016/).

CA’s analysis relied on custom high-performance computing clusters – upwards of 560 processing cores and over 130 TB of data storage.

The total data analyzed during the campaign approached 13 TB, and the analysis was made possible by a data cloud accessed through Amazon Web Services.

While winning campaigns depend on more than data, data can clearly be a critically important factor – one that can put a candidate over the top in a close race. For all Trump’s populist, anti-intellectual appeals, he ultimately relied on computational power and scientific analysis to secure victory. This acceptance of science is a heartening, if ironic, signal from the incoming president.

“Data can certainly help make sure that the messages are getting to the right people, and make sure the strategy is being focused in the right way,” notes Wilkinson. “But it is a candidate that makes the decisions and wins the election, not the data.”


Copyright © 2017 Science Node ™

