Detection of Indonesian Hate Speech in the Comments Column of Indonesian Artists' Instagram Using the RoBERTa Method

Adhe Akram Azhari, Yuliant Sibaroni, Sri Suryani Prasetiyowati
DOI: 10.29100/jipi.v8i3.3898
URL: https://doi.org/10.29100/jipi.v8i3.3898
Journal: JIPI Jurnal IPA dan Pembelajaran IPA
Published: 2023-08-30
Publication type: Journal Article
Citations: 0

Abstract

This study uses RoBERTa to detect hate speech in comments on Instagram posts. RoBERTa was chosen because it classifies English text with high accuracy compared to other models and may therefore have good potential for Indonesian, the language used in this research. Two test scenarios are compared: full preprocessing, which consists of cleansing, case folding, normalization, tokenization, and stemming, and non-full preprocessing, which applies every stage except stemming. In the experiments, non-full preprocessing achieves a higher average accuracy than full preprocessing, at 85.09%. This shows that RoBERTa predicts comments well without full preprocessing.
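
The two preprocessing scenarios can be illustrated with a minimal Python sketch. It is not the authors' implementation: the slang map `SLANG`, the suffix list `SUFFIXES`, the helper names, and the sample comment are all hypothetical, and `toy_stem` is a crude suffix stripper standing in for a real Indonesian stemmer. The sketch only shows that the two scenarios differ in whether the final stemming stage runs; the RoBERTa classifier itself is not shown.

```python
import re

# Hypothetical slang-to-standard map; the paper's normalization dictionary is not given.
SLANG = {"gk": "tidak", "bgt": "banget", "yg": "yang"}

# Toy suffix list standing in for a real Indonesian stemmer (an assumption, not the authors' stemmer).
SUFFIXES = ("nya", "kan", "lah", "an", "i")


def cleanse(text):
    """Cleansing: drop URLs, @mentions, hashtags, then any non-letter characters."""
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)
    return re.sub(r"[^A-Za-z\s]", " ", text)


def normalize(text):
    """Normalization: replace known slang words with their standard forms."""
    return " ".join(SLANG.get(word, word) for word in text.split())


def toy_stem(token):
    """Strip the first matching suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text, full=True):
    """Run the stages in the order listed in the abstract; stem only when full=True."""
    text = cleanse(text)        # cleansing
    text = text.lower()         # case folding
    text = normalize(text)      # normalization
    tokens = text.split()       # tokenization (whitespace split)
    if full:
        tokens = [toy_stem(t) for t in tokens]  # stemming
    return tokens


comment = "Jelek bgt suaranya @artis!!! https://example.com"
print(preprocess(comment, full=True))   # full preprocessing
print(preprocess(comment, full=False))  # non-full preprocessing (no stemming)
```

With this sample comment, the two outputs differ only in the last token: the non-full scenario keeps the suffixed form `suaranya`, while the full scenario reduces it to `suara`.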