Developing a System for the Automated Coding of Protest Event Data

Posted: 17 Apr 2014

Alex Hanna

affiliation not provided to SSRN

Date Written: April 15, 2014

Abstract

Scholars and policy makers recognize the need for better and timelier data about contentious collective action, both the peaceful protests that are understood as part of democracy and the violent events that threaten it. News media provide the only consistent source of information available outside government intelligence agencies and are thus the focus of all scholarly efforts to improve collective action data. Human coding of news sources is time-consuming, so it can never be timely, and it is necessarily limited to a small number of sources, a short time interval, or a limited set of protest “issues” as captured by particular keywords. There have been a number of attempts to address this need through machine coding of electronic versions of news media, but the approaches tried so far remain less than optimal. The goal of this paper is to outline the steps needed to build, test, and validate an open-source system for coding protest events from any electronically available news source, using advances from natural language processing and machine learning. Such a system should increase the speed and reduce the labor costs of identifying and coding collective actions in news sources, thereby making protest data timelier and reducing the biases that come from relying on too few news sources. The system will also be open, available for replication, and extendable by future social movement researchers and by social and computational scientists.

Keywords: social movements, event data, quantitative methods, machine learning, natural language processing

Suggested Citation

Hanna, Alex, Developing a System for the Automated Coding of Protest Event Data (April 15, 2014). Available at SSRN: https://ssrn.com/abstract=2425232 or http://dx.doi.org/10.2139/ssrn.2425232

