Title: Better, Faster, Stronger: Using Machine Learning to Analyse South African Police-recorded Protest Data
Author: M. Bekker
Journal: South African Review of Sociology, volume 67 1, pages 4 - 23
Publication date: 2022-01-01
Publication type: Journal Article
DOI: https://doi.org/10.1080/21528586.2021.1982762
Impact factor: 0.5 (JCR Q4, Sociology)
Open access: no
Citations: 2
Abstract
Protest Event Analysis (PEA) has long been an important tool for the quantitative analysis of protests, and its potential power has only increased with the rise of Machine Learning technologies and the ubiquity of big data. PEA coders also hold an advantage over contemporary Natural Language Processing innovations: they can be customised to incorporate locally appropriate terms and vernaculars, expressed as personalised ontologies. There is therefore a need for a standard process for deploying machine learning tools that can draw on the local. This paper introduces such a tool, which innovates in the numeration of abstract indicators. “Machine Learning Protest Event Analysis Keyword Enumerated Recoding” is a protocol that enables PEA coders to read and classify large “event databases”, incorporating local terms and abstract indicators into the analysis. When this protocol was applied to 150,000 records in a police-recorded database of crowd events in South Africa, protest events could be individually rated by levels of “tumult”, a feat hitherto inhibited by conventional PEA methods. Innovations in estimating crowd sizes, together with an updated view of post-apartheid protest showing that protests tend to be more common but less prone to violence than previous theories concluded, speak to the potential of this protocol to unearth novel insights from even bigger data sets.
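
The abstract does not spell out the mechanics of the protocol. Purely as an illustration of what keyword-enumerated recoding with a locally customised ontology could look like, the Python sketch below assigns each free-text record a numeric "tumult" score by summing the weights of the ontology terms it contains. The ontology terms, their weights, the function name `tumult_score` and the sample records are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical ontology: local term -> numeric weight.
# Terms and weights are invented for illustration, not taken from the paper.
TUMULT_ONTOLOGY = {
    "toyi-toyi": 1,
    "barricade": 2,
    "burning tyres": 3,
    "stone throwing": 3,
}


def tumult_score(note, ontology=TUMULT_ONTOLOGY):
    """Sum the weights of every ontology term found in a free-text note."""
    text = note.lower()
    score = 0
    for term, weight in ontology.items():
        # Match the term as a whole word or phrase, case-insensitively.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text):
            score += weight
    return score


# Invented sample records standing in for police-recorded event notes.
records = [
    "Crowd gathered peacefully and handed over memorandum",
    "Participants toyi-toyi outside municipal offices",
    "Road closed with barricade and burning tyres",
]
scores = [tumult_score(r) for r in records]
print(scores)
```

Keeping the ontology as plain data is the point of the sketch: a coder could add or reweight local terms without touching the scoring logic. The real protocol may well differ.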