Mohsan Ali, Ali Muhammad, Muhammad Asad, Makhdoom Sajawal, C. Alexopoulos, Y. Charalabidis
Proceedings of the 26th Pan-Hellenic Conference on Informatics. Published 2022-11-25. DOI: 10.1145/3575879.3576011 (https://doi.org/10.1145/3575879.3576011)
Towards Perso-Arabic Urdu Language Hate Detection Using Machine Learning: A Comparative Study Based on a Large Dataset and Time-Complexity
The number of social media users grows daily, with certain networking sites reporting hundreds of millions of monthly active users. For any administrative institution, manually moderating user content is challenging. Content on the web is written in hundreds of languages, and Urdu is among the most widely used languages in the world. We propose a fast method for detecting hateful content in Urdu using machine learning models. To make the investigation viable on a balanced data set, we combined an open data set with manually created instances. In our experimental set-up, a support vector machine detected Urdu hate content with 81.87% accuracy. We compared the training and testing times of the various methods, and used training time, testing time, and accuracy together to select the best model for Urdu hate detection on social media sites. Additionally, we demonstrate k-fold and stratified folding via indexing to provide a better understanding of folding in machine learning. Finally, we compare our findings with previously published work in the field of Urdu hate detection.
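The abstract mentions demonstrating k-fold and stratified folding via indexing. The sketch below illustrates that general idea and is not the authors' code: the function names, the toy labels (1 = hate, 0 = neutral), and the choice of three folds are all assumptions made for illustration. It uses only the Python standard library and shows why stratification matters when the labels are stored in sorted order.

```python
from collections import defaultdict


def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous test folds."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def stratified_k_fold_indices(labels, k):
    """Deal the indices of each class round-robin across k test folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        for pos, idx in enumerate(by_class[label]):
            folds[pos % k].append(idx)
    return [sorted(fold) for fold in folds]


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding each fold out once."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical toy labels, sorted by class so the difference is visible.
labels = [0] * 6 + [1] * 6

plain = k_fold_indices(len(labels), 3)
strat = stratified_k_fold_indices(labels, 3)

for name, folds in (("k-fold", plain), ("stratified", strat)):
    for fold in folds:
        hate = sum(labels[i] for i in fold)
        print(f"{name:10s} test={fold} hate={hate} neutral={len(fold) - hate}")
```

With these sorted toy labels, plain k-fold produces test folds that contain only one class, while the stratified variant keeps two samples of each class in every fold. In practice a library routine with shuffling would normally replace these hand-written helpers.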