Teargas, Water Cannons and Twitter: A case study on detecting protest repression events in Turkey 2013

Fatma Elsafoury


Published in the Third International Workshop on Narrative Extraction from Texts, held in conjunction with the 42nd European Conference on Information Retrieval

Abstract: Since the Arab Spring in 2011, protests have been spreading around the world for different reasons, and they are often met with violent repression. Studying protest repression requires appropriate datasets. Existing datasets like GDELT focus mainly on events reported in news media. However, news media reports have issues including censorship and coverage bias. Recently, social scientists have started using Machine Learning (ML) to detect political events, but it is costly and time-consuming to hand-label data for training ML models. This paper proposes using ML and crowdsourcing to detect protest repression events from Twitter. Our case study is the Turkish Gezi Park protest in 2013. Our results show that Twitter is a reliable source that reflects events on the ground as soon as they happen. Moreover, training conventional ML models on crowdsourced labelled data gave good results, with AUC scores of 0.896 for detecting protest events and 0.8189 for detecting repression events.
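
For readers unfamiliar with the AUC metric reported above, the following is a minimal sketch of how it can be computed for a binary classifier. It is not the paper's code: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only to illustrate the definition of AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC for binary labels (1 = positive, 0 = negative).

    Computed as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = tweet labelled as a protest event, 0 = not.
labels = [1, 0, 1, 0, 1, 0]
# Hypothetical classifier scores for the same six tweets.
scores = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]

auc = roc_auc(labels, scores)
print(round(auc, 3))  # 8 of 9 pairs are ranked correctly, so this prints 0.889
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a rough sense of scale for the scores quoted in the abstract. The pairwise loop is quadratic in the number of examples, so for large datasets a rank-based implementation from an established library would be the more practical choice.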