Using Twitter to Monitor Collective Mood

Large scale analysis of social media content allows for real time discovery of macro-scale patterns in public opinion and sentiment. In this paper we analyse a collection of 484 million tweets generated by more than 9.8 million users from the United Kingdom over the past 31 months, a period marked by economic downturn and some social tensions. Our findings, besides corroborating our choice of method for the detection of public mood, also present intriguing patterns that can be explained in terms of events and social changes. On the one hand, the time series we obtain show that periodic events such as Christmas and Halloween evoke similar mood patterns every year. On the other hand, we see that a significant increase in negative mood indicators coincide with the announcement of the cuts to public spending by the government, and that this effect is still lasting. We also detect events such as the riots of summer 2011, as well as a possible calming effect coinciding with the run up to the royal wedding.

ANIMATION: http://mediapatterns.enm.bris.ac.uk/mood

REFERENCE: Thomas Lansdall-Welfare, Vasileios Lampos and Nello Cristianini: Effects of the Recession on Public Mood in the UK. Accepted for publication in the International Workshop on Social Media Applications in News and Entertainment (SMANE), 2012.

Monitoring Social Media to Detect Possible Hazards

— —
Note that an improved version of this article has been published in Natural Hazards Observer, Volume XXXVI, Number 4, pp. 7-9, March 2012.
— —

Vasileios Lampos and Nello Cristianini
Intelligent Systems Laboratory
University of Bristol

Abstract. Real time monitoring of environmental and social conditions is an important part of developing early warning of natural hazards such as epidemics and floods. Rather than relying on dedicated infrastructure, such as sensor networks, it is possible to gather valuable information by monitoring public communications from people on the ground. A rich source of raw data is provided by social media, such as Blogs, Twitter or Facebook. In this study we describe two experiments based on the use of Twitter content in the UK, showing that it is possible to detect a flu epidemic, and to assess the levels of rainfall, by analysing text data. These measurements can in turn be used as inputs of more complex systems, for example for the prediction of floods, or disease propagation.

Introduction

The fast expansion of the social web that is currently under way means that large numbers of people can publish their thoughts at no cost. Current estimates put the number of Facebook users at 800 million and of Twitter active users at 100 million [1, 2]. The result is a massive stream of digital text that has attracted the attention of marketers [3], politicians [4] and social scientists [5]. By analysing the stream of communications in an unmediated way, without relying on questionnaires or interviews, many scientists are having direct access to people’s opinions and observations for the first time. Perhaps equally important they have access – although indirectly – to situations on the ground that affect the web users, such as for example extreme weather conditions, as long as these are mentioned in the messages being published.

The analysis of social media content is a statistical game, as there is no guarantee that a specific user will describe the weather state in her current location when we need it. But by gathering a large amount of messages from a given location, and by monitoring the right keywords and expressions, it is possible to obtain indirect statistical evidence in favour of a given weather state. In this article we describe two experiments that we have conducted by using Twitter content in the United Kingdom, showing that it can be used to infer the levels of rainfall or of influenza-like-illness (ILI) in a given location, with significant accuracy. The enabling technology behind this study is Statistical Learning Theory, a branch of Artificial Intelligence concerned with the automatic detection of statistical patterns in data.

The use of Twitter data is particularly convenient because its users can only exchange very short messages that are often geo-located, and because this data is freely available via an API [6]. Furthermore the use of this data does not raise the serious privacy concerns that would be raised by the analysis – say – of email or SMS messages, as this is all data that the users have willingly made public.

We believe that the kind of signal that we can extract from that textual stream can be of interest in its own right, and be a valuable input to more complex modelling software, aimed at the prediction of epidemics or floods, as well as other hazards.

What is Intelligence? Modelling And Designing Cognitive Behaviour

The lecture is available at: http://videolectures.net/snnsymposium2010_cristianini_wii/

While the question in the title has remained unanswered for thousands of years, it is perhaps easier to address the apparently similar question: “What is intelligence for?” We take a pragmatic approach to intelligent behavior, and we examine systems that can pursue goals in their environment, using information gathered from it in order to make useful decisions, autonomously and robustly. We review the fundamental aspects of their behavior, methods to model it and architectures to realize it. The discussion will cover both natural and artificial systems, ranging from single cells to software agents.

On Science Automation and Patterns in Media Content

(Notes for my keynote in CPM 2011) – Download the article in PDF format

The strong trend towards the automation of many aspects of scientific enquiry and scholarship has started to affect also the social sciences and even the humanities. Several recent articles have demonstrated the application of pattern analysis techniques to the discovery of non-trivial relations in various datasets that have relevance for social and human sciences, and some have even heralded the advent of “Computational Social Sciences” and “Culturomics”. In this review article I survey the results obtained over the past 5 years at the Intelligent Systems Laboratory in Bristol, in the area of automating the analysis of news media content. This endeavor, which we approach by combining pattern recognition, data mining and language technologies, is traditionally a part of the social sciences, and is normally performed by human researchers on small sets of data. The analysis of news content is of crucial importance due to the central role that the global news system plays in shaping public opinion, markets and culture. It is today possible to access freely online a large part of global news, and to devise automated methods for large scale constant monitoring of patterns in content. The results presented in this survey show how the automatic analysis of millions of documents in dozens of different languages can detect non-trivial macro-patterns that could not be observed at a smaller scale, and how the social sciences can benefit from closer interaction with the pattern analysis, artificial intelligence and text mining research communities.