Hate Speech Identification in Text Written in Indonesian with Recurrent Neural Network

2019 International Conference on Advanced Computer Science and Information Systems (ICACSIS) | Pub Date: 2019-10-01 | DOI: 10.1109/ICACSIS47736.2019.8979959

Erryan Sazany, I. Budi


Citations: 10



Hate Speech Identification in Text Written in Indonesian with Recurrent Neural Network

Several studies have succeeded in automatically identifying hate speech in text using machine learning and deep learning approaches. However, it remains unclear how well a deep learning-based model adapts when it is tested on text data from a different domain. To address this issue, this research proposes deep learning-based methods that use variants of the Recurrent Neural Network to identify hate speech in texts sourced from Twitter; the trained models are then used to predict labels for another set of text data sourced from Facebook and Twitter. The experiment measures the difference in model performance between the training phase and the testing phase. The results show that the proposed methods outperform the machine learning-based methods in both phases: in the training phase, the GRU algorithm achieves an F1-score of 85.37%, and in the testing phase, the LSTM algorithm achieves an F1-score of 76.30%. In terms of the adaptability of model performance, the proposed methods give results comparable to those of the baseline method.
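The evaluation idea in the abstract, comparing F1-scores between the training phase and the testing phase, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not say whether the reported F1-score is computed for the positive class or averaged across classes, so the sketch uses the positive (hate speech) class, and the helper names `f1_score` and `adaptability_gap` as well as the toy labels are hypothetical and not taken from the paper.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 for the positive class: harmonic mean of precision and recall.

    Assumption: the positive class (1) stands for "hate speech".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives means precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def adaptability_gap(f1_train_phase: float, f1_test_phase: float) -> float:
    """Drop in F1 when the model moves from its own domain to the other data set."""
    return f1_train_phase - f1_test_phase


# Toy labels only (hypothetical, not data from the paper).
in_domain_true = [1, 0, 1, 1, 0, 0, 1, 0]
in_domain_pred = [1, 0, 1, 1, 0, 1, 1, 0]
cross_domain_true = [1, 0, 1, 1, 0, 0, 1, 0]
cross_domain_pred = [1, 0, 0, 1, 1, 1, 0, 0]

f1_in = f1_score(in_domain_true, in_domain_pred)
f1_cross = f1_score(cross_domain_true, cross_domain_pred)
gap = adaptability_gap(f1_in, f1_cross)

print(f"in-domain F1:    {f1_in:.4f}")
print(f"cross-domain F1: {f1_cross:.4f}")
print(f"gap:             {gap:.4f}")
```

A smaller gap would indicate a model that adapts better to the other data set. Computing the same gap for a baseline model gives the kind of adaptability comparison the abstract describes.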

