A BERT-based Hate Speech Classifier from Transcribed Online Short-Form Videos

Rommel Hernandez Urbano Jr., Jeffrey Uy Ajero, Angelic Legaspi Angeles, Maria Nikki Hacar Quintos, Joseph Marvin Regalado Imperial, Ramon Llabanes Rodriguez
DOI: 10.1145/3485768.3485806 (https://doi.org/10.1145/3485768.3485806)
Published in: 2021 5th International Conference on E-Society, E-Education and E-Technology
Publication date: 2021-08-21
Citations: 2

Abstract

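With the rise of human-centric technologies such as social media platforms, the amount of hate continues to grow in proportion to the increasing number of users worldwide. TikTok is one of the most-used social media platforms because it lets users express themselves by creating and sharing short-form videos on any topic. It has also become a platform for political discourse and mudslinging, as users can freely express opinions and indirectly debate with strangers online. In this study, we propose the use of BERT, a complex bidirectional transformer-based model, for automatic hate speech detection on speech transcribed from Tagalog TikTok videos. Our experiments show that a BERT-based hate speech classifier achieves an F1 score of 61%. We also extended the experiments to other algorithms, including LSTM, Naïve Bayes, and Decision Tree, and found that traditional methods such as a simple Bernoulli Naïve Bayes approach remain on par with the BERT model.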
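To make the two quantities named in the abstract concrete, the following is a minimal sketch of a Bernoulli Naïve Bayes text classifier and an F1 computation in plain Python. It is not the authors' implementation: the class name `TinyBernoulliNB`, the helper `f1_score`, the add-one smoothing, and the placeholder tokens in the toy data are all assumptions made for illustration, and the toy sentences are not drawn from the paper's Tagalog dataset.

```python
import math


def f1_score(y_true, y_pred, positive=1):
    """F1 is the harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


class TinyBernoulliNB:
    """Hypothetical from-scratch Bernoulli Naive Bayes over word presence."""

    def fit(self, docs, labels):
        self.vocab = sorted({w for d in docs for w in d.split()})
        self.classes = sorted(set(labels))
        self.log_prior = {}
        self.word_prob = {}
        for c in self.classes:
            class_docs = [set(d.split()) for d, y in zip(docs, labels) if y == c]
            self.log_prior[c] = math.log(len(class_docs) / len(docs))
            # Add-one smoothing (an assumption) keeps probabilities in (0, 1).
            self.word_prob[c] = {
                w: (sum(w in d for d in class_docs) + 1) / (len(class_docs) + 2)
                for w in self.vocab
            }
        return self

    def predict(self, docs):
        predictions = []
        for d in docs:
            present = set(d.split())
            scores = {}
            for c in self.classes:
                score = self.log_prior[c]
                # Bernoulli model: absent words contribute evidence too.
                for w in self.vocab:
                    p = self.word_prob[c][w]
                    score += math.log(p if w in present else 1.0 - p)
                scores[c] = score
            predictions.append(max(scores, key=scores.get))
        return predictions


# Toy transcripts with placeholder tokens; label 1 = hate, 0 = not hate.
train_docs = ["insult_a insult_b you", "insult_a go away",
              "hello friend", "nice video friend"]
train_labels = [1, 1, 0, 0]
test_docs = ["insult_a you", "nice friend"]
test_labels = [1, 0]

model = TinyBernoulliNB().fit(train_docs, train_labels)
predictions = model.predict(test_docs)
print(predictions, f1_score(test_labels, predictions))
```

A score on four invented training sentences says nothing about real performance; the sketch only shows how a word-presence model of this kind is trained and how an F1 figure such as the reported 61% is derived from true positives, false positives, and false negatives.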