The ability of ordinary people to express and exchange their opinions and feelings has increased beyond all expectations in the past ten years of internet expansion and availability. For the military and national security agencies this has brought both opportunities and challenges. The opportunities lie in the ready availability of information about discontent and about oppositional movements and initiatives. Recent urban disturbances have illustrated the key role played by social networks in the fast-moving events of Summer 2011. The challenges have escalated due to the sheer number of sources of social interaction and public communication media. This research addresses some of these issues in a bold initiative to combine well-established, carefully considered science with the increasingly familiar tools of Web 2.0.
Four of the most popular channels for the public exchange of ideas will be selectively monitored: email; social networks, such as Facebook; microblogs, such as Twitter; and comments on newspaper editorials and high-profile stories. Texts of this kind are relatively sparse, often ungrammatical, informal and very different from the texts that classical NLP deals with. For this reason, a new NLP pipeline was judged necessary to handle sparse texts of a highly informal nature. Figure 1, below, shows the NLP pipeline which the team has developed and evaluated.
Figure 1 – NLP Pipeline for Sparse Text processing (with the EMOTIVE Emotions Ontology Matching Module Highlighted)
Sensitive words and phrases which may be of concern to the military and national security agencies, especially emotionally charged ones, are extracted by extending a Natural Language Processing technique already developed for email by the Principal Investigator. The team has developed an ontology (a rule-based linguistic database) through which the extracted words and phrases can be semantically filtered and restricted to a manageable set of agreed terms. The ontology is trained to recognise the words and phrases, make semantic links between them and deliver one or more accepted descriptors to the analysts. EMOTIVE monitors the traffic of sensitive words and phrases filtered through the ontology when applied to specific incidents, individuals and groups. Increased activity is indicated by frequency of occurrence or by severity. Both can be presented in a concept cloud, which uses the size of words as a metaphor for frequency, and hence importance, and uses colour coding to indicate the strength of emotion attached to each term. The fine-grained, explicit emotions that EMOTIVE is capable of extracting from sparse, informal messages are Anger, Disgust, Fear, Happiness, Sadness and Surprise (Ekman’s six basic emotions), plus Shame and Confusion. This is considerably more specific than much of the existing work in sentiment analysis, where most techniques focus on overall document polarity (negative, neutral, positive scales) or bundle emotions together with states and opinions, rather than focusing on real, basic emotions.
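As a rough illustration of the idea described above, the following minimal Python sketch matches message tokens against a small emotion lexicon and tallies, per emotion, the frequency and the strongest intensity seen, which are the two quantities a concept cloud would map to word size and colour. It is not the EMOTIVE ontology or pipeline: the lexicon entries, the 1 to 3 strength scale, and the function names are all hypothetical and chosen only for this example.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: term -> (emotion, strength on an assumed 1-3 scale).
# These entries are illustrative only and are NOT taken from the EMOTIVE ontology.
LEXICON = {
    "furious": ("Anger", 3),
    "annoyed": ("Anger", 1),
    "scared": ("Fear", 2),
    "terrified": ("Fear", 3),
    "happy": ("Happiness", 2),
    "ashamed": ("Shame", 2),
    "confused": ("Confusion", 1),
}


def tokenize(message):
    """Lower-case a message and split it into simple word tokens."""
    return re.findall(r"[a-z']+", message.lower())


def emotion_counts(messages):
    """Return (frequency per emotion, strongest strength seen per emotion)."""
    counts = Counter()
    strongest = {}
    for message in messages:
        for token in tokenize(message):
            if token in LEXICON:
                emotion, strength = LEXICON[token]
                counts[emotion] += 1
                strongest[emotion] = max(strength, strongest.get(emotion, 0))
    return counts, strongest


# Made-up sample messages in an informal, sparse style.
messages = [
    "so FURIOUS right now",
    "im scared and confused",
    "Terrified... and furious",
]
counts, strongest = emotion_counts(messages)

# Frequency would drive word size in a concept cloud; strength would drive colour.
for emotion, n in counts.most_common():
    print(emotion, n, strongest[emotion])
```

A plain dictionary lookup is used here only to keep the sketch short. A real ontology would also need to handle multi-word phrases, negation, intensifiers and the semantic links between terms that the text describes.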
The final feature of EMOTIVE is a geo interface that points to the location of the emotionally charged traffic. Since only a small number of messages actually contain accurate geo-location metadata, the project explores several techniques for locating messages, in order to help identify sensitive hot spots of communication and activity. Outputs from the system, consisting of effectively presented new knowledge, will enable defence and national security agencies both to predict and to monitor selected events as they develop, and will assist in the formulation of policy. Figure 2, below, illustrates the three primary elements of the EMOTIVE monitoring system, including the visualisation element.
Figure 2 – EMOTIVE Social-media Stream Monitoring Vision
It can be argued that the general public will be direct beneficiaries of this research, in that the defence and national security agencies who act as guardians of public safety and order will be further equipped by this tool to identify and evaluate potentially harmful events, and ultimately to safeguard the public from them. Defence and national security agencies will already be experienced at monitoring these data sources, but this tool adds an extra filter of analysis: it will work in almost real time, will amalgamate data from several sources if desired and will provide harmonised output.
The EMOTIVE project is funded by the EPSRC and DSTL and conducted at Loughborough University, within the Centre for Information Management in the School of Business and Economics (formerly the Information Science department). The researchers working on the project are Prof. Tom Jackson (Principal Investigator), Dr. Ann O’Brien (Co-Investigator), Dr. Martin Sykora (Research Associate), and Dr. Suzanne Elayan (Research Associate).