
Quantifying opinion

Speed read
  • Social media provides valuable data about political beliefs
  • Machine learning can help sort and categorize these opinions 
  • We may one day be able to use an automated tool to see which politicians think most like us 

Trying to nail down a politician’s beliefs is a bit like figuring out what’s wrong with a broken toilet. It requires hard work and dedication, and it is impossible to do without being a little grossed out.

No matter how you look at it, politicians lie. Most of us know that, but simply don’t have the time or energy to dig through their record and find the truth. 

<strong>Politicians say</strong> lots of things when they are campaigning. But how do we know what they really believe?

But what if there were a tool that helped you choose which politician to vote for based solely on how well their beliefs align with the issues that matter to you?

Dr. Srijith Rajamohan, a computational scientist at Virginia Tech, thinks this could be within reach. In fact, he’s working on a deep-learning based interactive visualization tool to understand and plot political ideologies based on Twitter activity. 

“Is there a way to extract and understand people’s ideologies from the things that they say?” asks Rajamohan. “I turned to natural language understanding to see if we can take text from social media, run it through a deep learning model, and find some way to quantify it.”

Someday soon, a tool like Rajamohan’s could have a huge impact on how people understand political ideologies. And if we’re lucky, it could make voting for the right candidate a lot easier.  

Cleaning up the data

The end goal of this work was to construct a visualization tool that could help identify political ideology. However, as many important endeavors do, this project began with an intriguing conversation.

“It all started over coffee when Alana Romanella and I were discussing white supremacy and hate speech,” says Rajamohan. “The conversation evolved, and we started talking about different political groups.” 

<strong>Attention weights for model interpretability.</strong> Emphasized words (darker boxes) have a larger contribution to the classification outcome, informing the user what words are relevant from the network’s perspective. Courtesy Rajamohan, et al.

Eventually, Romanella and Rajamohan decided they could use deep learning to better understand political ideology by investigating social media posts. They decided to focus on Twitter, as it is a data-rich environment with a free application programming interface (API). After collecting data for four months, the team had roughly 3 million tweets to work with.

“We pulled the tweets based on certain hashtags provided by our in-house political scientist.”

But, as Rajamohan explains, this approach has some drawbacks. “A particular hashtag can be used by people from widely varying beliefs and backgrounds, so that’s not necessarily going to tell you that they belong to a particular group or they have a certain ideology.”

For example, say you’re trying to figure out how groups of people feel about the Black Lives Matter movement. You can’t simply assume anyone tweeting out the #BlackLivesMatter hashtag is a sympathizer to the cause, as members of white supremacist groups might also use this hashtag in a derogatory context.

This kind of ambiguous information is called dirty data, and it can be a big problem in machine learning. It can prevent scientists from carrying out any meaningful analysis of a given dataset, and it is the most common issue facing data science workers.

For this project, Rajamohan decided to move to a weakly supervised form of machine learning to understand intent. Contextual embeddings helped mitigate noise in the data: plots generated from the neural network guided a human researcher to the incorrectly labeled records.
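One way to picture how a plot of embeddings can point a reviewer toward bad records is a nearest-neighbor check: a record whose label disagrees with most of its neighbors in embedding space is worth a second look. The sketch below is only an illustration of that idea, not the team's actual procedure. The function name `flag_suspect_labels`, the toy 2-D coordinates, and the labels are all hypothetical.

```python
import math
from collections import Counter

def flag_suspect_labels(embeddings, labels, k=3):
    """Return indices whose label differs from the majority label
    of their k nearest neighbours (Euclidean distance)."""
    suspects = []
    for i, point in enumerate(embeddings):
        # Distances from point i to every other point, nearest first.
        neighbours = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(embeddings)
            if j != i
        )[:k]
        votes = Counter(labels[j] for _, j in neighbours)
        majority_label, _ = votes.most_common(1)[0]
        if majority_label != labels[i]:
            suspects.append(i)
    return suspects

# Hypothetical 2-D embeddings (made-up numbers, not data from the study).
embeddings = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),  # first cluster
    (1.0, 1.0), (0.9, 1.1), (1.1, 0.9),                # second cluster
]
# Record 3 sits in the first cluster but carries the other cluster's label.
labels = ["liberal", "liberal", "liberal", "conservative",
          "conservative", "conservative", "conservative"]

suspects = flag_suspect_labels(embeddings, labels, k=3)
print(suspects)  # indices a human reviewer might inspect first
```

In this toy setup the check only nominates candidates; a person still decides whether the label is actually wrong, which matches the human-in-the-loop role described above.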

<strong>Assessing political affiliation.</strong> Affiliation is projected along the orientation of the cluster with liberal ideology represented at the bottom left and conservative at the top right. This type of projection allowed researchers to identify some errors. Courtesy Rajamohan, et al.

Once they had cleaner data, Rajamohan and his colleagues were able to visualize these belief structures. Although they experimented with various visualization techniques such as t-SNE, Isomap, and PCA, multidimensional scaling (MDS) turned out to be the most efficient.

This projection places liberal ideologies at the bottom left and conservative opinions at the top right. Although other techniques such as t-SNE separate the data more effectively, MDS is better at identifying incorrect labels in the corpus.
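For readers unfamiliar with MDS, the core idea is to place points in two dimensions so that the distances between them match the distances in the original high-dimensional space as closely as possible. The article does not say which MDS variant or library the team used, so the sketch below assumes classical (Torgerson) MDS implemented with NumPy, and the 4-D "embeddings" are made-up numbers chosen so the result is easy to verify.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(points, n_components=2):
    """Classical MDS: double-centre the squared distances, then keep
    the eigenvectors with the largest eigenvalues."""
    d2 = pairwise_distances(points) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues caused by floating-point error.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Hypothetical 4-D embeddings that happen to lie in a 2-D plane.
embeddings = [[0, 0, 0, 0], [3, 0, 0, 0], [0, 4, 0, 0], [3, 4, 0, 0]]
coords = classical_mds(embeddings)

# The 2-D layout should reproduce the original pairwise distances.
print(np.allclose(pairwise_distances(coords), pairwise_distances(embeddings)))
```

Because MDS tries to keep distances faithful rather than exaggerate cluster boundaries, a mislabeled record tends to stay visibly close to its true neighbors, which is one plausible reason it would help with spotting label errors.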

Quantifying an opinion

To make this whole process simpler, Rajamohan focused less on specific political ideologies in favor of plotting a person’s political affiliation based on their relationship to an important public figure. For instance, your opinion of Elizabeth Warren or Donald Trump reveals a lot about your own beliefs. 

“In an ideal world, I would have a tool like this before voting,” says Rajamohan.

“I would take all of the beliefs and opinions a politician ever expressed and project it on a screen. I would then take my own beliefs and opinions and put it on the screen and look who I’m closest to.”
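Once everyone's opinions are reduced to points on a screen, "who I'm closest to" becomes a simple distance lookup. The sketch below assumes each person is already summarized as a single 2-D point, which is a strong simplification. The names, coordinates, and the `closest_figure` helper are hypothetical placeholders, not output from Rajamohan's tool.

```python
import math

def closest_figure(my_position, figure_positions):
    """Return the name whose point is nearest to my_position."""
    return min(
        figure_positions,
        key=lambda name: math.dist(my_position, figure_positions[name]),
    )

# Hypothetical projected positions (placeholder names and numbers).
figure_positions = {
    "Politician A": (-0.8, -0.6),
    "Politician B": (0.7, 0.9),
    "Politician C": (0.1, -0.2),
}
my_position = (0.2, -0.1)

match = closest_figure(my_position, figure_positions)
print(match)
```

Euclidean distance is used here only because it matches what a viewer sees on a 2-D plot; a real tool might compare the underlying high-dimensional embeddings instead.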

While this tool has a long way to go before it becomes something the public can rely on to pick a candidate, simply pursuing this endeavor keeps Rajamohan interested. 

“Being able to understand intent is hard,” says Rajamohan. “You have an entity, and it’s easy to say someone feels positively or negatively about something, but how do you quantify or extract someone’s intent? That is a really ill-defined concept, so if we can use deep learning or AI to extract them – I think that’s a pretty neat concept to explore.”


Copyright © 2019 Science Node ™
