Heuristic-Assisted BERT for Twitter Sentiment Analysis

Gokul Yenduri, R. RajakumarBoothalingam, K. Praghash, D. Binu
Journal: Int. J. Comput. Intell. Appl.
Volume: 20 1
Publication date: 2021-09-01
Publication type: Journal Article
DOI: 10.1142/s1469026821500152
URL: https://doi.org/10.1142/s1469026821500152
Citations: 8

Abstract

The identification of opinions and sentiments from tweets is termed "Twitter Sentiment Analysis (TSA)". The main task of TSA is to determine the sentiment, or polarity, of a tweet and then classify it as negative or positive. Several methods have been introduced for TSA; however, it remains challenging because of slang words, modern accents, grammatical and spelling mistakes, and other issues that existing techniques could not solve. This work develops a novel customized BERT-oriented sentiment classification that comprises two main phases: (1) pre-processing and tokenization, and (2) classification based on a "Customized Bidirectional Encoder Representations from Transformers (BERT)". First, the gathered raw tweets are pre-processed by stop-word removal, stemming, and blank-space removal. Pre-processing yields the semantic words, from which the meaningful words (tokens) are extracted in the tokenization phase. These extracted tokens are then classified by the optimized BERT, whose biases and weights are tuned by Particle-Assisted Circle Updating Position (PA-CUP). Moreover, the maximal sequence length of the BERT encoder is updated using standard PA-CUP. Finally, a performance analysis is carried out to substantiate the improvement achieved by the proposed model.
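The pre-processing steps named in the abstract (stop-word removal, stemming, and blank-space removal) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the suffix-stripping rule, and the function names `stem` and `preprocess` are all assumptions made for illustration, and the naive stemmer stands in for a real one such as Porter's. The BERT classifier and PA-CUP tuning are not shown.

```python
from __future__ import annotations

import re

# Hypothetical, deliberately tiny stop-word list (an assumption, not from the paper).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "to", "of", "and", "in", "it", "this", "i"}

# Naive suffix stripping as a stand-in for a real stemmer (an assumption).
SUFFIXES = ("ing", "ed", "ly", "s")


def stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(tweet: str) -> list[str]:
    """Collapse blank space, lowercase, drop stop words, and stem what remains."""
    # Blank-space removal: collapse runs of whitespace and trim both ends.
    text = re.sub(r"\s+", " ", tweet).strip().lower()
    # Split into simple word tokens; punctuation is discarded.
    words = re.findall(r"[a-z0-9']+", text)
    # Stop-word removal followed by stemming.
    return [stem(w) for w in words if w not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("  I   loved   the new   phone, it is amazing  "))
```

In a fuller pipeline, the resulting word list would be passed to a BERT tokenizer, with the maximal sequence length treated as a tunable parameter, as the abstract describes.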