Automation in Social Networking Comments With the Help of Robust fastText and CNN

2019 1st International Conference on Innovations in Information and Communication Technology (ICIICT) Pub Date : 2019-04-01 DOI:10.1109/ICIICT1.2019.8741503

S. Mestry, Hargun Singh, Roshan Chauhan, V. Bisht, Kaushik Tiwari


Citations: 11

Abstract

Social networking and online conversation platforms give us the power to share our views and ideas. However, many people now take these platforms for granted and see them as an opportunity to harass and target others. This leads to cyber-attacks and cyber-bullying, which cause traumatic experiences and, in extreme cases, suicide attempts. Manually identifying and classifying such comments is a long, tiresome, and unreliable process. To address this challenge, we have developed a deep learning system that identifies such negative content on online discussion platforms and classifies it under the appropriate labels. Our proposed model applies a text-based Convolutional Neural Network (CNN) to word embeddings produced with the fastText technique. fastText has shown more efficient and more accurate results than the Word2Vec and GloVe models. Our model aims to improve the detection of different types of toxicity and thereby the social media experience. It classifies comments into six classes: Toxic, Severe Toxic, Obscene, Threat, Insult, and Identity-hate. Multi-label classification lets us provide an automated solution to the toxic comment problem.
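The multi-label decision described in the abstract can be illustrated with a small sketch. The abstract does not specify the output layer, so this assumes one independent sigmoid per label and a threshold of 0.5; the logits are hypothetical placeholders for the output of a trained network, and `predict_labels` is an illustrative name rather than anything from the paper.

```python
import math

# The six classes named in the abstract.
LABELS = ["Toxic", "Severe Toxic", "Obscene", "Threat", "Insult", "Identity-hate"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability, avoiding overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(logits, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold.

    Each label is scored independently (assumption: sigmoid outputs, not
    softmax), so a comment can receive zero, one, or several labels.
    """
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Hypothetical logits standing in for a trained model's output layer.
example_logits = [2.0, -1.5, 0.8, -3.0, 1.2, -0.4]
print(predict_labels(example_logits))  # ['Toxic', 'Obscene', 'Insult']
```

Because the six probabilities are not forced to sum to 1, a single comment can be flagged as both Toxic and Insult, which is what distinguishes multi-label from ordinary multi-class classification.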
