Title: A Deep Decision Forests Model for Hate Speech Detection
Author: M. Ndenga
Journal: Jordanian Journal of Computers and Information Technology
Publication date: 2023-01-01
DOI: 10.5455/jjcit.71-1667394363
URL: https://doi.org/10.5455/jjcit.71-1667394363
Citations: 0
Abstract
Detecting and controlling the propagation of hate speech over social media platforms is a challenge. The problem is exacerbated by the extremely fast flow, readily available audience, and relative permanence of information on social media. The objective of this research is to propose a model that can detect political hate speech propagated through social media platforms in Kenya. Using Twitter textual data and Keras TensorFlow Decision Forests (TF-DF), three models were developed: Gradient Boosted Trees with Universal Sentence Embeddings (USE), Gradient Boosted Trees, and Random Forest. The Gradient Boosted Trees with USE model performed best, with an accuracy of 98.86%, recall of 0.9587, precision of 0.9831, and AUC of 0.9984. Therefore, this model can be utilized for detecting hate speech on social media platforms.
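For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and AUC are conventionally computed for a binary classifier. It is not the paper's evaluation code. The labels, scores, the 0.5 decision threshold, and the function names are all hypothetical and chosen only for illustration; 1 is assumed to mean "hate speech".

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Compute accuracy, precision, and recall from hard predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: true labels and model scores for eight tweets.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]

# Assumed decision threshold of 0.5 to turn scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, and recall depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds.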