An Implementation of Large Scale Hate Speech Detection System for Streaming Social Media Data

Long-An Doan, Phuong-Thao Nguyen, Thi-Oanh Phan, Trong-Hop Do
Published in: 2022 IEEE International Conference on Communication, Networks and Satellite (COMNETSAT), 2022-11-03
DOI: 10.1109/COMNETSAT56033.2022.9994299 (https://doi.org/10.1109/COMNETSAT56033.2022.9994299)
Citations: 0

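Abstract

The ubiquity of online social media has both positive and negative consequences for society. Alongside its benefits, social media can cause serious problems through hateful and offensive content. Detecting and removing such toxic content with machine learning is a major research topic in the field of social networks. Two challenges in this area are the sheer volume of social media data and the need to process that data in real time. In this paper, we develop a system that detects hate speech in Vietnamese YouTube comments using machine learning and big data technology. Streaming data from YouTube is processed in real time using Kafka, Spark, and machine learning, and a dashboard built with Streamlit displays the results.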
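The abstract describes a pipeline in which streamed comments are ingested, classified in real time, and summarized on a dashboard. The sketch below illustrates only that data flow, using the Python standard library. It is not the authors' implementation: an in-memory list stands in for the Kafka topic, a simple micro-batching loop stands in for Spark, and `placeholder_classifier`, `PLACEHOLDER_TERMS`, and the `flagged`/`clean` labels are hypothetical stand-ins for a trained model and its label set, neither of which is specified in the abstract.

```python
from collections import Counter
from itertools import islice
from typing import Callable, Iterable, Iterator, List

# Hypothetical placeholder lexicon (assumption); a real system would use a trained model.
PLACEHOLDER_TERMS = {"badword", "slur"}


def placeholder_classifier(comment: str) -> str:
    """Toy stand-in for a hate speech model: flags comments containing a listed term."""
    tokens = comment.lower().split()
    return "flagged" if any(t in PLACEHOLDER_TERMS for t in tokens) else "clean"


def micro_batches(stream: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Group an incoming stream into small batches, loosely mimicking micro-batch processing."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def run_pipeline(
    stream: Iterable[str],
    classify: Callable[[str], str],
    batch_size: int = 2,
) -> Counter:
    """Classify each batch and keep running label counts that a dashboard could display."""
    totals: Counter = Counter()
    for batch in micro_batches(stream, batch_size):
        totals.update(classify(comment) for comment in batch)
    return totals


# In-memory stand-in for comments arriving from a message queue (assumption).
comments = [
    "great video",
    "you are a badword",
    "thanks for sharing",
    "SLUR everywhere",
    "nice",
]
totals = run_pipeline(comments, placeholder_classifier)
print(dict(totals))
```

In a deployment like the one the abstract outlines, the list would be replaced by a message-queue consumer, the batching loop by a stream processing engine, and the running counts by the data source of the dashboard.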