Hate Speech Identification in Text Written in Indonesian with Recurrent Neural Network

Erryan Sazany, I. Budi
DOI: 10.1109/ICACSIS47736.2019.8979959 (https://doi.org/10.1109/ICACSIS47736.2019.8979959)
Venue: 2019 International Conference on Advanced Computer Science and Information Systems (ICACSIS)
Publication date: 2019-10-01
Citations: 10

Abstract

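Several studies have succeeded in automatically identifying hate speech in text using machine learning and deep learning approaches. However, it remains unclear how well a deep learning-based model adapts when it is tested on a different set of text data from a different domain. To address this issue, this research proposes deep learning-based methods that use several variants of the Recurrent Neural Network to identify hate speech in texts sourced from Twitter; the trained models are then used to predict labels for another set of text data sourced from Facebook and Twitter. The experiment measures the difference in model performance between the training phase and the testing phase. The results show that the proposed methods outperform the machine learning-based methods in both phases: in the training phase, with the GRU algorithm at an 85.37% F1-score, and in the testing phase, with the LSTM algorithm at a 76.30% F1-score. In terms of the adaptability of model performance, the proposed methods give results comparable to the baseline method.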
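The evaluation idea in the abstract, scoring a model by F1 on its source data and again on data from another domain, can be illustrated with a minimal sketch. The label lists, the function names `f1_score` and `performance_gap`, and the convention that label 1 means "hate speech" are all assumptions made for illustration; none of them come from the paper, and the toy numbers are not the paper's results.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class, computed from raw label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def performance_gap(source_f1, target_f1):
    """Drop in F1 when moving from source-domain to target-domain data."""
    return source_f1 - target_f1


# Hypothetical toy labels (1 = hate speech, 0 = not hate speech).
source_true = [1, 1, 1, 0, 0, 0]
source_pred = [1, 1, 1, 0, 0, 1]
target_true = [1, 1, 1, 0, 0, 0]
target_pred = [1, 1, 0, 0, 1, 1]

source_f1 = f1_score(source_true, source_pred)
target_f1 = f1_score(target_true, target_pred)
gap = performance_gap(source_f1, target_f1)
print(f"source F1={source_f1:.4f}, target F1={target_f1:.4f}, gap={gap:.4f}")
```

A smaller gap for the same model suggests better adaptability. Note that the two figures quoted in the abstract (85.37% and 76.30%) belong to different models (GRU and LSTM), so their difference is not a per-model gap.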