An automated learning model for sentiment analysis and data classification of Twitter data using balanced CA-SVM

C. Cyril, J. Beulah, Neelakandan Subramani, Prakash Mohan, A. Harshavardhan, D. Sivabalaselvamani
Journal: Concurrent Engineering, 31 1, pp. 386 - 395
Published: 2021-07-20
Publication type: Journal Article
DOI: 10.1177/1063293X211031485 (https://doi.org/10.1177/1063293X211031485)
Citations: 39

Abstract

Modern society spends much of each day on social media. Web users spend most of their time there and share many details with their friends, and the information obtained from these exchanges has been used in several applications. Sentiment analysis has been applied to Twitter data sets to identify the emotion of a user, and different problems can be solved on that basis. First, the data from the Twitter database is preprocessed: tokenization, stemming, stop-word removal, and number removal are performed. The proposed automated learning model, based on CA-SVM sentiment analysis, reads the Twitter data set, which is then processed to extract features that yield a set of terms. Using these terms, the tweets are clustered with TGS-K means clustering, which measures Euclidean distance over features such as the semantic sentiment score (SSS), gazetteer and symbolic sentiment support (GSSS), and topical sentiment score (TSS). The method then classifies each tweet with a support vector machine (CA-SVM) according to a support value computed from the above two measures. The results are validated using k-fold cross-validation. Classification is then performed with the Balanced CA-SVM (Deep Learning Modified Neural Network). The results are evaluated and compared with existing work. The proposed model achieved 92.48% accuracy and a 92.05% sentiment score in comparison with existing work.
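
The generic steps named in the abstract (preprocessing, Euclidean-distance cluster assignment, and k-fold splitting) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word list, the suffix-stripping stemmer, and all function names are hypothetical, and the abstract gives no formulas for SSS, GSSS, or TSS, so the three-component feature vectors are treated as already computed. The CA-SVM classifier itself is not reproduced.

```python
import math
import re

# Hypothetical stop-word list; the abstract does not say which list was used.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of",
              "and", "in", "on", "it", "this", "i"}


def crude_stem(token):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    """Tokenize, drop numbers and stop words, then stem."""
    # Keeping only runs of letters removes digits and punctuation.
    tokens = re.findall(r"[a-z]+", tweet.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def assign_clusters(points, centroids):
    """One k-means assignment step: nearest-centroid index for each point."""
    return [
        min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
        for p in points
    ]


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


if __name__ == "__main__":
    print(preprocess("I loved the 3 new phones in 2021"))
    # Placeholder (SSS, GSSS, TSS) vectors; the values are invented for illustration.
    features = [(0.9, 0.8, 0.7), (0.1, 0.2, 0.1)]
    print(assign_clusters(features, [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]))
    print(list(k_fold_indices(5, 2)))
```

In a full pipeline, each training fold would be used to fit the classifier and the matching test fold to score it, with the per-fold scores averaged.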