By using Big Data, FiveThirtyEight has made stunningly accurate predictions that other pollsters didn’t see coming, a result that suggests Big Data is not only useful in statistical analysis, but essential.

Journalists and analysts often predict elections using polls. Despite conducting extensive research, polling regularly and considering past trends, traditional approaches to election prediction have an Achilles’ heel: the data they use is limited.

Nate Silver sought to improve predictive journalism by founding FiveThirtyEight, an opinion polling analysis and prediction website, and in doing so helped popularize data journalism. By using Big Data and statistical analysis, the site has managed to make stunningly accurate predictions that other pollsters didn’t see coming.

Predicting Elections

FiveThirtyEight started gaining attention almost eight years ago by correctly predicting the outcome of the historic 2008 US presidential election. The site cemented its reputation for remarkably accurate statistical analysis when it correctly predicted the outcome of the 2012 election as well. Its 2012 prediction was notable because most polls, pundits and politicians argued that the race was too close to call, while FiveThirtyEight forecast that Obama would be reelected by a comfortable margin.

How FiveThirtyEight Makes Stunningly Accurate Predictions

Big Data is defined by complex data sets that require specialized tools for analysis.
The accuracy of some FiveThirtyEight forecasts is certainly remarkable, but how does the site work? Its approach to statistical analysis is characterized by a strong receptiveness to new technologies and new programming languages. FiveThirtyEight’s Quantitative Editor, Andrew Flowers, explains that the site uses R, a relatively new, open-source programming language for statistical analysis, to produce its complex charts and detailed analysis.

Compared to traditional analysis, which draws mainly on a limited number of polls, FiveThirtyEight takes a much more comprehensive approach when selecting sources for its statistical analysis. What sets the site apart is that it draws upon multiple data sources and also factors in current trends.
Typically, the site uses hundreds of surveys conducted at the State level. But it also incorporates other, seemingly irrelevant variables that are likely to sway polling, like historical data, economic performance, population and demographic trends, registration records and other Big Data.
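
To make the idea of combining many polls with other variables more concrete, here is a minimal Python sketch. It is not FiveThirtyEight’s actual model: the `Poll` fields, the weighting by sample size and recency, the 14-day half-life, and the `poll_share` blend with a single "fundamentals" estimate are all illustrative assumptions, and the numbers in the usage example are made up.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    margin: float      # candidate A's lead in percentage points (hypothetical field)
    sample_size: int   # number of respondents
    days_old: int      # days since the poll was conducted


def poll_weight(poll, half_life_days=14.0):
    """Assumed weighting: larger samples and fresher polls count for more."""
    recency = 0.5 ** (poll.days_old / half_life_days)
    return (poll.sample_size ** 0.5) * recency


def weighted_poll_average(polls):
    """Weighted mean of poll margins using poll_weight."""
    weights = [poll_weight(p) for p in polls]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one poll with a positive weight is required")
    return sum(w * p.margin for w, p in zip(weights, polls)) / total


def blended_forecast(polls, fundamentals_margin, poll_share=0.8):
    """Blend the poll average with an estimate from non-poll data.

    fundamentals_margin stands in for whatever economic, demographic or
    historical variables suggest; how it is computed is outside this sketch.
    """
    poll_avg = weighted_poll_average(polls)
    return poll_share * poll_avg + (1.0 - poll_share) * fundamentals_margin


if __name__ == "__main__":
    # Made-up polls, for illustration only.
    example_polls = [
        Poll(margin=3.5, sample_size=900, days_old=2),
        Poll(margin=1.0, sample_size=600, days_old=10),
        Poll(margin=4.2, sample_size=1200, days_old=20),
    ]
    print(round(blended_forecast(example_polls, fundamentals_margin=2.0), 2))
```

The design choice worth noticing is that no single poll decides the outcome: each source contributes in proportion to how much it is trusted, and the non-poll data pulls the estimate toward what the broader context suggests.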

As a result, this comprehensive approach has shed light on particular political tendencies in certain States. For example, FiveThirtyEight’s analysis of registration and residency data has shown that political trends in Texas are likely to influence decisions in neighboring Oklahoma.

Big Data and the Future of Prediction

Outside of politics and elections, the comprehensive approach to statistical analysis pioneered by FiveThirtyEight will have massive implications for a variety of industries.
In some developing countries, for example, microlenders are using a FiveThirtyEight-style approach to assess customers’ suitability for small loans. Lenders feed Big Data like the length of a customer’s calls, the identity of the call recipients, the caller’s friends on social media, and hundreds of other data points into an algorithm that generates a type of credit score.
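
As a rough illustration of how many small data points can be turned into a single score, the sketch below uses a simple logistic model. Everything in it is hypothetical: the feature names, the weights, the zero bias and the 300 to 850 score range are assumptions chosen for the example, not any lender’s real system, which would learn its weights from repayment data.

```python
import math

# Hypothetical feature weights; a real system would fit these to data.
WEIGHTS = {
    "avg_call_minutes": 0.1,
    "distinct_contacts": 0.02,
    "social_connections": 0.005,
}
BIAS = 0.0  # assumed baseline log-odds of repayment


def repayment_probability(features):
    """Logistic model: weighted sum of features squashed into (0, 1).

    Missing features are treated as 0.0 in this sketch.
    """
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def credit_score(features, low=300, high=850):
    """Map the repayment probability onto an assumed score range."""
    return round(low + (high - low) * repayment_probability(features))


if __name__ == "__main__":
    # Made-up applicant, for illustration only.
    applicant = {
        "avg_call_minutes": 5.0,
        "distinct_contacts": 40.0,
        "social_connections": 200.0,
    }
    print(credit_score(applicant))
```

With only three features this is a toy, but the structure is the same when there are hundreds of inputs: each data point nudges the score up or down by a small amount.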

Of course, using multiple data sources is the standard today, but back in 2008, it was still novel. Silver recognized that the more information you have to analyze, the more accurate a prediction will be. As more and more data are created and become available for analysis, FiveThirtyEight’s adaptable and comprehensive approach to prediction using Big Data will not only help give us more accurate visions of the future, but will also act as a mirror for emerging trends.

