Quant Finance

Sentiment Classification and Opinion Mining Using News Wires and Micro Blogs (Twitter)

Course Background 

Topics to be covered:

  • Aspect-based Sentiment Analysis
  • Multi-Dimensional Sentiment Analysis
  • Extracting User-Level Sentiments with Approval Relations 

Programme

9:00 REGISTRATION AND COFFEE 

Natural Language Processing challenges in analysing Social Media messages

Stephen Pulman, Professor of Computational Linguistics, Oxford University/TheySay Analytics

Although it is by no means a solved problem, we are getting reasonably good at the syntactic and semantic analysis of well-behaved text of the type found in news feeds or other traditional media. But the informal and rapidly evolving language styles found on social media such as Twitter or Facebook cause problems for our usual analysis techniques, and accuracy levels are typically much lower for such texts. In this talk I will describe some of these challenging linguistic phenomena and outline some attempts to overcome the difficulties they pose for automated linguistic analysis.

Text and Network Analysis for Sentiment Mining

Enza Messina, Professor, Department of Informatics Systems & Communication (DISCo) – University of Milano-Bicocca, Italy, and Federico Alberto Pozzi, Analytical Consultant, SAS (formerly Researcher, University of Milano-Bicocca, Italy)

In this talk we show how social relationships can be exploited to improve user-level sentiment analysis of microblogs, overcoming a limitation of state-of-the-art methods, which generally treat posts as independent data. Early approaches exploited friendship relations, but since two friends can hold different opinions about the same topic, friendship may be an inappropriate measure of sentiment similarity. We show how combining post content and approval relations may lead to significant improvements in polarity classification at both the post and the user level.

Ensemble Learning for Sentiment Analysis

Enza Messina, Professor, Department of Informatics Systems & Communication (DISCo) – University of Milano-Bicocca, Italy

Polarity classification is one of the most relevant tasks for analysing the sentiment of the huge amount of textual data on the Web. Most existing approaches select a single best classification model, but this does not take into account the inherent complexity of natural language, particularly when dealing with user-generated content. This talk presents an ensemble learning paradigm that reduces the noise sensitivity related to language ambiguity and therefore provides a more accurate prediction of polarity.

Identifying Types of Sentiment Spikes that Have Significant Predictive Power

Tomaso Aste, Head of the Financial Computing & Analytics Group and Director of the MSc in Financial Risk Management & Olga Kolchyna, PhD Researcher, University College London & Tharsis Souza, PhD Researcher, University College London

We study the power of Twitter sentiment to predict consumer sales by analysing sales for 50 companies and over 100 million tweets mentioning those companies, along with their sentiment. We developed a robust method for identifying and clustering bursts in sales and Twitter sentiment series based on their shape. We find that bursts in the Twitter sentiment time series can be clearly separated into four categories. For each category we calculate the number of sales events that are preceded by Twitter volume bursts. We find that predicting sales from unclustered Twitter spikes is no better than random guessing; however, clustering the Twitter sentiment bursts reveals two classes of spikes that have significant (p-value < 0.01) predictive power.

SAS® Text Analytics and Sentiment Analysis

Federico Alberto Pozzi and Marco Zavarini (SAS Institute)

In this talk we will present the SAS® solutions for Text Analytics and Sentiment Analysis, including a demo using SAS® Visual Analytics. The demo covers two case studies based on real data from a well-known customer in the banking sector. The first concerns customer care, with data retrieved from the customer's official Facebook page; the second concerns brand analysis of textual information from different online sources. In particular, we will see how to detect hot topics and analyse the overall sentiment and the sentiment taxonomy. Finally, the SAS® infrastructure for high-performance analytics will be discussed.

Three overview presentations (details TBC): 

Text to sentiment classification process: the approaches taken and features offered to clients 

Presentation 1: RavenPack

Presentation 2: Bloomberg

Presentation 3: Thomson Reuters 

Panel with Discussion and Q&A

Led by Tomaso Aste, University College London 

Close of Workshop 

17:00 CLOSE