Classification and keyword extraction of online harassment text in Thai social network

Siranuch Hemtanon, Ketsara Phetkrachang, Wachira Yangyuen
Journal: Bulletin of Electrical Engineering and Informatics
Publication date: 2023-12-01
Publication type: Journal Article
DOI: 10.11591/eei.v12i6.5939 (https://doi.org/10.11591/eei.v12i6.5939)
Citations: 0

Classification and keyword extraction of online harassment text in Thai social network
Online harassment in social network services (SNS) is a form of cyberbullying that needs to be addressed and requires preventive measures. In this paper, we develop a method for detecting cyberbullying in Thai-language harassment posts on the Facebook SNS. We collect public posts and ask experts to label each post as positive or negative according to whether it constitutes harassment. The annotated data are used to train a binary classifier that takes words in the centre as features to predict malicious intent to insult and threaten other users. The information gain scores obtained while generating the prediction model are ranked, and the 20 words with the highest scores are taken as significant words involving online harassment. In experiments, the detection performance reached an F1 score of 0.78 on average. Analysis of the results indicated that the word-surface approach detects insulting posts reasonably well, but some posts that use metaphor to tone down the malicious intent may go undetected because the harmful semantic intent is hidden behind the word form. The top 20 significant words for bullying showed that bullying posts involved body-shaming and lower social status.
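The abstract does not spell out how the information gain scores are computed, so the following is only a minimal sketch under stated assumptions: it uses the standard entropy-based definition of information gain over a binary "word occurs in the post" feature, and ranks the vocabulary by that score. The function names (`entropy`, `information_gain`, `top_k_words`) and the toy English tokens are hypothetical placeholders, not the authors' code or data, and Thai word segmentation is assumed to have happened beforehand.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a label distribution given as class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, word):
    """Information gain of the binary feature 'word occurs in the post'.

    docs   -- list of token sets, one per post (assumed already segmented)
    labels -- list of class labels aligned with docs (1 = harassment, 0 = not)
    """
    base = entropy(list(Counter(labels).values()))
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without_word = [y for d, y in zip(docs, labels) if word not in d]
    remainder = 0.0
    for subset in (with_word, without_word):
        if subset:  # skip an empty split to avoid dividing by zero
            weight = len(subset) / len(labels)
            remainder += weight * entropy(list(Counter(subset).values()))
    return base - remainder


def top_k_words(docs, labels, k=20):
    """Return the k words with the highest information gain as (word, score) pairs."""
    vocab = set().union(*docs)
    scored = [(information_gain(docs, labels, w), w) for w in vocab]
    # Highest score first; ties are broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(w, ig) for ig, w in scored[:k]]


# Toy, made-up posts (English placeholders, not data from the paper).
docs = [
    {"you", "are", "ugly"},
    {"ugly", "and", "poor"},
    {"nice", "photo"},
    {"have", "a", "nice", "day"},
]
labels = [1, 1, 0, 0]
ranking = top_k_words(docs, labels, k=3)
```

In this toy set, a word that appears only in posts of one class and in every post of that class separates the labels perfectly and therefore scores highest. Note that a high score only says the word is informative about the label, not which class it indicates; the sketch would need the per-class counts to tell the two apart.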
Source journal
Bulletin of Electrical Engineering and Informatics (Computer Science: Computer Science (miscellaneous))
CiteScore: 3.60
Self-citation rate: 0.00%
Articles published: 0
About the journal: Bulletin of Electrical Engineering and Informatics publishes original papers in the field of electrical, computer and informatics engineering, covering, but not limited to, the following scope: Computer Science, Computer Engineering and Informatics [...] Electronics [...] Electrical and Power Engineering [...] Telecommunication and Information Technology [...] Instrumentation and Control Engineering [...]