基于多模式情感词典的数字媒体短文本情感分析框架设计

IF 0.9 Q4 COMPUTER SCIENCE, SOFTWARE ENGINEERING Scalable Computing-Practice and Experience Pub Date : 2023-09-10 DOI:10.12694/scpe.v24i3.2371

Shuqin Lin

{"title":"基于多模式情感词典的数字媒体短文本情感分析框架设计","authors":"Shuqin Lin","doi":"10.12694/scpe.v24i3.2371","DOIUrl":null,"url":null,"abstract":"Along the continuous advancement of the network and the rise of digital media, the amount of data produced by the exponential explosion. And how to use these data to provide personalized services for users is one of the current research focuses. To address the issue of insufficient coverage in the current sentiment lexicon and the difficulty of constructing sentiment lexicon in specific fields, this study proposes a multi-modal emotional thesaurus. Semi-supervised learning is used to solve the problem of insufficient coverage of emotional thesaurus, and a semi-supervised classification algorithm is realized by using a large number of unlabeled sample data combined with a small number of labeled sample data. Optimized learning is used to solve the problem of difficult construction of emotional thesaurus in specific fields, the corresponding specific emotional thesaurus is constructed by adaptive adjustment of emotional word score, and finally the improved emotional thesaurus is used to build a digital media short text sentiment analysis framework. For testing, the NLPCC dataset was used in this study, Experiments show that the framework constructed in this study requires 87 iterations, a Recall value of 0.912, a F1 value of 0.753, and an average accuracy of 83.39%, all of which are better than the sentiment analysis framework without the use of multi-pattern sentiment lexicon. In the simulation experiment, the recognition accuracy reached 85.88%, which was 16.85%, 11.57% and 6.72% higher than the test scenarios using a single emotion thesaurus selected in this study. The above results show that the digital media short-text sentiment analysis framework built in this research based on multi-pattern sentiment lexicon can carry out short-text sentiment analysis more accurately and efficiently, so as to accurately analyze users’ needs and provide customized services precisely.","PeriodicalId":43791,"journal":{"name":"Scalable Computing-Practice and Experience","volume":"77 1","pages":"0"},"PeriodicalIF":0.9000,"publicationDate":"2023-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Design of Sentiment Analysis Framework of Digital Media Short Text Based on Multi-pattern Sentiment Lexicon\",\"authors\":\"Shuqin Lin\",\"doi\":\"10.12694/scpe.v24i3.2371\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Along the continuous advancement of the network and the rise of digital media, the amount of data produced by the exponential explosion. And how to use these data to provide personalized services for users is one of the current research focuses. To address the issue of insufficient coverage in the current sentiment lexicon and the difficulty of constructing sentiment lexicon in specific fields, this study proposes a multi-modal emotional thesaurus. Semi-supervised learning is used to solve the problem of insufficient coverage of emotional thesaurus, and a semi-supervised classification algorithm is realized by using a large number of unlabeled sample data combined with a small number of labeled sample data. Optimized learning is used to solve the problem of difficult construction of emotional thesaurus in specific fields, the corresponding specific emotional thesaurus is constructed by adaptive adjustment of emotional word score, and finally the improved emotional thesaurus is used to build a digital media short text sentiment analysis framework. For testing, the NLPCC dataset was used in this study, Experiments show that the framework constructed in this study requires 87 iterations, a Recall value of 0.912, a F1 value of 0.753, and an average accuracy of 83.39%, all of which are better than the sentiment analysis framework without the use of multi-pattern sentiment lexicon. In the simulation experiment, the recognition accuracy reached 85.88%, which was 16.85%, 11.57% and 6.72% higher than the test scenarios using a single emotion thesaurus selected in this study. The above results show that the digital media short-text sentiment analysis framework built in this research based on multi-pattern sentiment lexicon can carry out short-text sentiment analysis more accurately and efficiently, so as to accurately analyze users’ needs and provide customized services precisely.\",\"PeriodicalId\":43791,\"journal\":{\"name\":\"Scalable Computing-Practice and Experience\",\"volume\":\"77 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.9000,\"publicationDate\":\"2023-09-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Scalable Computing-Practice and Experience\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.12694/scpe.v24i3.2371\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scalable Computing-Practice and Experience","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.12694/scpe.v24i3.2371","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}

引用次数: 0

摘要

随着网络的不断推进和数字媒体的兴起，产生的数据量呈指数级爆炸。而如何利用这些数据为用户提供个性化服务是当前的研究热点之一。针对当前情感词典覆盖范围不足和特定领域情感词典构建困难的问题，本研究提出了一个多模态情感词典。采用半监督学习来解决情感词库覆盖不足的问题，将大量未标记的样本数据与少量标记的样本数据相结合，实现半监督分类算法。利用优化学习解决特定领域情感词库构建困难的问题，通过自适应调整情感词分构建相应的特定情感词库，最后利用改进后的情感词库构建数字媒体短文本情感分析框架。实验表明，本文构建的框架迭代次数为87次，Recall值为0.912,F1值为0.753，平均准确率为83.39%，均优于未使用多模式情感词典的情感分析框架。在模拟实验中，识别准确率达到85.88%，分别比本研究选择的单一情感词库测试场景提高了16.85%、11.57%和6.72%。以上结果表明，本研究构建的基于多模式情感词汇的数字媒体短文本情感分析框架能够更加准确高效地进行短文本情感分析，从而准确分析用户需求，精准提供定制化服务。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Design of Sentiment Analysis Framework of Digital Media Short Text Based on Multi-pattern Sentiment Lexicon

Along the continuous advancement of the network and the rise of digital media, the amount of data produced by the exponential explosion. And how to use these data to provide personalized services for users is one of the current research focuses. To address the issue of insufficient coverage in the current sentiment lexicon and the difficulty of constructing sentiment lexicon in specific fields, this study proposes a multi-modal emotional thesaurus. Semi-supervised learning is used to solve the problem of insufficient coverage of emotional thesaurus, and a semi-supervised classification algorithm is realized by using a large number of unlabeled sample data combined with a small number of labeled sample data. Optimized learning is used to solve the problem of difficult construction of emotional thesaurus in specific fields, the corresponding specific emotional thesaurus is constructed by adaptive adjustment of emotional word score, and finally the improved emotional thesaurus is used to build a digital media short text sentiment analysis framework. For testing, the NLPCC dataset was used in this study, Experiments show that the framework constructed in this study requires 87 iterations, a Recall value of 0.912, a F1 value of 0.753, and an average accuracy of 83.39%, all of which are better than the sentiment analysis framework without the use of multi-pattern sentiment lexicon. In the simulation experiment, the recognition accuracy reached 85.88%, which was 16.85%, 11.57% and 6.72% higher than the test scenarios using a single emotion thesaurus selected in this study. The above results show that the digital media short-text sentiment analysis framework built in this research based on multi-pattern sentiment lexicon can carry out short-text sentiment analysis more accurately and efficiently, so as to accurately analyze users’ needs and provide customized services precisely.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Scalable Computing-Practice and Experience COMPUTER SCIENCE, SOFTWARE ENGINEERING-

CiteScore

2.00

自引率

0.00%

发文量

期刊介绍： The area of scalable computing has matured and reached a point where new issues and trends require a professional forum. SCPE will provide this avenue by publishing original refereed papers that address the present as well as the future of parallel and distributed computing. The journal will focus on algorithm development, implementation and execution on real-world parallel architectures, and application of parallel and distributed computing to the solution of real-life problems.