User Feedback Severity Level Identification and Classification through Deeper Analysis of Text

2023 4th International Conference on Computing, Mathematics and Engineering Technologies (iCoMET) Pub Date: 2023-03-17 DOI: 10.1109/iCoMET57998.2023.10099177

Muhammad Umair, Syed Aun Irtaza, Shahid Salim


Citations: 0

Abstract



User Feedback Severity Level Identification and Classification through Deeper Analysis of Text

The world today is increasingly digitalized, and social media has become an important channel for accessing consumer feedback. Classification of social media comments is attracting growing attention worldwide. Unfortunately, this line of work has not yet delivered better accuracy for toxic comments. On social media platforms, hateful and abusive language has a detrimental effect on users' mental health and on the participation of people from diverse backgrounds. Automatic methods for detecting foul language most commonly rely on datasets with categorical labels, yet the level of offensiveness varies from comment to comment. In NLP, the usual approach is binary classification (a comment is either offensive or not), which leaves continuous classification aside. With continuous classification, one can identify the severity level of a comment, set a threshold, and, using deep learning and modeling techniques, identify the severity level directly by taking context into account. A review of the related literature shows that the identification of toxicity in user comments can be improved by pre-processing, such as deleting null values and anomalies to refine the dataset, and by applying different algorithmic techniques to make the features more valuable. This research analyzes user comment datasets and studies the toxicity of user comments with different machine learning approaches. First, pre-processing removes punctuation, stop words, null entries, and duplicates to eliminate anomalies. Next, methods such as a count vectorizer and bag of words are applied to extract features. Finally, the MCPL algorithm is applied to these datasets to predict results. Applying the MCPL model to the user comments dataset yielded an accuracy of 88.5%.
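The pre-processing and feature-extraction steps named in the abstract can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the stop-word list, the sample comments, the function names, and the severity cut-offs are all assumptions made for the example, and the MCPL model itself is not reproduced because the abstract does not define it.

```python
import string
from collections import Counter

# Hypothetical, tiny stop-word list; the abstract does not say which list was used.
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "this", "and", "or", "to", "of"}


def preprocess(comments):
    """Drop null entries and duplicates, strip punctuation, remove stop words."""
    table = str.maketrans("", "", string.punctuation)
    seen = set()
    cleaned = []
    for comment in comments:
        if comment is None or not comment.strip():
            continue  # null or empty entry
        words = comment.lower().translate(table).split()
        tokens = [w for w in words if w not in STOP_WORDS]
        key = " ".join(tokens)
        if not key or key in seen:
            continue  # nothing left, or a duplicate after normalization
        seen.add(key)
        cleaned.append(tokens)
    return cleaned


def bag_of_words(token_lists):
    """Build a sorted vocabulary and one count vector per comment."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    index = {t: i for i, t in enumerate(vocab)}
    vectors = []
    for tokens in token_lists:
        row = [0] * len(vocab)
        for token, count in Counter(tokens).items():
            row[index[token]] = count
        vectors.append(row)
    return vocab, vectors


def severity_level(score, thresholds=(0.33, 0.66)):
    """Map a continuous score in [0, 1] to a level; the cut-offs are assumed."""
    low, high = thresholds
    if score < low:
        return "low"
    if score < high:
        return "medium"
    return "high"


# Made-up sample data containing a null entry, an empty string, and a duplicate.
comments = ["You are an idiot!", None, "you are an IDIOT", "", "Great post, thanks."]
tokens = preprocess(comments)
vocab, vectors = bag_of_words(tokens)
```

In practice the count vectors would be passed to a trained classifier or regressor, and `severity_level` would be applied to its continuous output; the thresholds would need to be tuned on labeled data rather than fixed as they are here.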

