CROA-based feature selection with BERT model for detecting the offensive speech in Twitter data

R. J. Anandhi, V. S. A. Devi, B. S. K. Devi, Balasubramanian Prabhu kavin, Gan Hong Seng
Journal of Autonomous Intelligence. Published 2024-01-03. DOI: 10.32629/jai.v7i3.1122 (https://doi.org/10.32629/jai.v7i3.1122)
Citations: 0

Abstract

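Online hate speech has flourished on social networking sites as mobile computing and web access have become widespread. Extensive research has shown that online exposure to hate speech has real-world effects on marginalized communities, and methods for identifying hate speech automatically have attracted significant attention. Hate speech can affect any demographic, although some populations are more vulnerable than others. Relying solely on progressive learning is insufficient for automatic hate speech identification, because training a model requires large amounts of labelled data. Inaccurate statistics on hate speech and preconceived notions have long been the biggest obstacles in hate speech research. This research addresses these needs by combining a transfer-learning approach based on BERT (Bidirectional Encoder Representations from Transformers) with a coral reef optimization-based approach (CROA). The coral reefs optimization method, used here as a feature selection (FC) strategy, mimics how corals settle on and develop a reef. Each candidate solution to the problem is treated as a coral trying to establish itself on the reef. The candidates are refined at each stage by applying the specialized operators of the coral reefs optimization algorithm, and the best solution is selected at the end. We also use a fine-tuning method based on transfer learning to assess BERT's ability to recognize hostile contexts in social media communications. The paper evaluates the proposed approach on Twitter datasets tagged for racist, sexist, homophobic, or otherwise offensive content. The results show that our strategy achieves 5%–10% higher precision and recall than other approaches.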
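
The abstract describes coral reefs optimization only in outline, so the following is a minimal sketch of how such a search over feature subsets might look, not the authors' implementation. It assumes each coral is a binary mask over features, and it uses one-point crossover for broadcast spawning, a single bit flip for brooding, and a simple depredation step. The function name `cro_select`, all parameter values, and the `toy_fitness` scoring function are hypothetical; in practice the fitness would more plausibly be the validation score of a classifier trained on the selected features.

```python
import random


def cro_select(fitness, n_features, reef_size=20, occupied=0.6,
               broadcast_frac=0.8, depredation_frac=0.1,
               attempts=3, generations=40, seed=0):
    """Search for a binary feature mask (n_features >= 2) that maximises `fitness`."""
    rng = random.Random(seed)

    def random_mask():
        return tuple(rng.randint(0, 1) for _ in range(n_features))

    # Initial reef: some slots hold a coral (a feature mask), the rest are empty.
    reef = [random_mask() if rng.random() < occupied else None
            for _ in range(reef_size)]
    if all(coral is None for coral in reef):
        reef[0] = random_mask()
    best = max((c for c in reef if c is not None), key=fitness)

    for _ in range(generations):
        corals = [c for c in reef if c is not None]
        rng.shuffle(corals)
        n_broadcast = int(len(corals) * broadcast_frac)
        spawners, brooders = corals[:n_broadcast], corals[n_broadcast:]

        larvae = []
        # Broadcast spawning: pairs of corals produce a larva by one-point crossover.
        for a, b in zip(spawners[::2], spawners[1::2]):
            cut = rng.randrange(1, n_features)
            larvae.append(a[:cut] + b[cut:])
        # Brooding: each remaining coral produces a larva by flipping one bit.
        for c in brooders:
            i = rng.randrange(n_features)
            larvae.append(c[:i] + (1 - c[i],) + c[i + 1:])

        # Settlement: a larva tries a few random slots and settles if the
        # slot is empty or holds a weaker coral; otherwise it is discarded.
        for larva in larvae:
            for _ in range(attempts):
                slot = rng.randrange(reef_size)
                if reef[slot] is None or fitness(larva) > fitness(reef[slot]):
                    reef[slot] = larva
                    break

        # Depredation: free the slots of the weakest corals.
        ranked = sorted((i for i, c in enumerate(reef) if c is not None),
                        key=lambda i: fitness(reef[i]))
        for i in ranked[:int(len(ranked) * depredation_frac)]:
            reef[i] = None

        best = max([best] + [c for c in reef if c is not None], key=fitness)

    return best, fitness(best)


# Hypothetical toy problem: features 0, 3 and 5 are "useful"; every selected
# feature costs a small penalty, so compact masks score higher.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


mask, score = cro_select(toy_fitness, n_features=8)
print(mask, score)
```

Because a larva only displaces a weaker coral and depredation only removes the weakest ones, the best fitness on the reef never decreases from one generation to the next. The selected mask would then decide which input features are passed on to the downstream classifier.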