Enhanced analysis of large-scale news text data using the bidirectional-Kmeans-LSTM-CNN model

IF 3.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · PeerJ Computer Science · Pub Date: 2024-08-01 · DOI: 10.7717/peerj-cs.2213
Qingxiang Zeng
{"title":"Enhanced analysis of large-scale news text data using the bidirectional-Kmeans-LSTM-CNN model","authors":"Qingxiang Zeng","doi":"10.7717/peerj-cs.2213","DOIUrl":null,"url":null,"abstract":"Traditional methods may be inefficient when processing large-scale data in the field of text mining, often struggling to identify and cluster relevant information accurately and efficiently. Additionally, capturing nuanced sentiment and emotional context within news text is challenging with conventional techniques. To address these issues, this article introduces an improved bidirectional-Kmeans-long short-term memory network-convolutional neural network (BiK-LSTM-CNN) model that incorporates emotional semantic analysis for high-dimensional news text visual extraction and media hotspot mining. The BiK-LSTM-CNN model comprises four modules: news text preprocessing, news text clustering, sentiment semantic analysis, and the BiK-LSTM-CNN model itself. By combining these components, the model effectively identifies common features within the input data, clusters similar news articles, and accurately analyzes the emotional semantics of the text. This comprehensive approach enhances both the accuracy and efficiency of visual extraction and hotspot mining. Experimental results demonstrate that compared to models such as Transformer, AdvLSTM, and NewRNN, BiK-LSTM-CNN achieves improvements in macro accuracy by 0.50%, 0.91%, and 1.34%, respectively. Similarly, macro recall rates increase by 0.51%, 1.24%, and 1.26%, while macro F1 scores improve by 0.52%, 1.23%, and 1.92%. Additionally, the BiK-LSTM-CNN model shows significant improvements in time efficiency, further establishing its potential as a more effective approach for processing and analyzing large-scale text data","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":null,"pages":null},"PeriodicalIF":3.5000,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.2213","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Traditional methods may be inefficient when processing large-scale data in the field of text mining, often struggling to identify and cluster relevant information accurately and efficiently. Additionally, capturing nuanced sentiment and emotional context within news text is challenging with conventional techniques. To address these issues, this article introduces an improved bidirectional-Kmeans-long short-term memory network-convolutional neural network (BiK-LSTM-CNN) model that incorporates emotional semantic analysis for high-dimensional news text visual extraction and media hotspot mining. The BiK-LSTM-CNN model comprises four modules: news text preprocessing, news text clustering, sentiment semantic analysis, and the BiK-LSTM-CNN model itself. By combining these components, the model effectively identifies common features within the input data, clusters similar news articles, and accurately analyzes the emotional semantics of the text. This comprehensive approach enhances both the accuracy and efficiency of visual extraction and hotspot mining. Experimental results demonstrate that compared to models such as Transformer, AdvLSTM, and NewRNN, BiK-LSTM-CNN achieves improvements in macro accuracy by 0.50%, 0.91%, and 1.34%, respectively. Similarly, macro recall rates increase by 0.51%, 1.24%, and 1.26%, while macro F1 scores improve by 0.52%, 1.23%, and 1.92%. Additionally, the BiK-LSTM-CNN model shows significant improvements in time efficiency, further establishing its potential as a more effective approach for processing and analyzing large-scale text data.
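The abstract only names the model's components, not their exact wiring. As a rough illustration of how a bidirectional LSTM followed by a convolutional layer can be combined for text classification, here is a minimal PyTorch sketch; the class name, layer sizes, and three-class output are assumptions made for illustration and are not taken from the paper.

```python
import torch
import torch.nn as nn

class BiLSTMCNN(nn.Module):
    """Hypothetical sketch: BiLSTM for context, 1-D CNN for local features."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=64,
                 num_filters=64, kernel_size=3, num_classes=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Bidirectional LSTM captures forward and backward context per token.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # 1-D convolution over the BiLSTM outputs extracts local n-gram features.
        self.conv = nn.Conv1d(2 * hidden_dim, num_filters, kernel_size, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, token_ids):              # (batch, seq_len)
        x = self.embedding(token_ids)          # (batch, seq_len, embed_dim)
        x, _ = self.bilstm(x)                  # (batch, seq_len, 2*hidden_dim)
        x = x.transpose(1, 2)                  # (batch, 2*hidden_dim, seq_len)
        x = torch.relu(self.conv(x))           # (batch, num_filters, seq_len)
        x = self.pool(x).squeeze(-1)           # (batch, num_filters)
        return self.fc(x)                      # (batch, num_classes)

# Example usage with dummy token IDs (vocabulary size is an assumption).
model = BiLSTMCNN(vocab_size=20000)
dummy_batch = torch.randint(1, 20000, (4, 50))   # 4 sequences of length 50
logits = model(dummy_batch)                      # shape: (4, 3)
```

The K-means clustering and sentiment-analysis modules described in the abstract would sit upstream of such a classifier; their details are not specified here.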
Source Journal

PeerJ Computer Science (Computer Science - General Computer Science)

CiteScore: 6.10
Self-citation rate: 5.30%
Articles published: 332
Review time: 10 weeks

Journal description: PeerJ Computer Science is an open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.