Distilroberta2gnn：基于方面的情感分析的新型混合深度学习方法

IF 3.5 4区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE PeerJ Computer Science Pub Date : 2024-08-16 DOI:10.7717/peerj-cs.2267

Aseel Alhadlaq, Alaa Altheneyan

{"title":"Distilroberta2gnn：基于方面的情感分析的新型混合深度学习方法","authors":"Aseel Alhadlaq, Alaa Altheneyan","doi":"10.7717/peerj-cs.2267","DOIUrl":null,"url":null,"abstract":"In the field of natural language processing (NLP), aspect-based sentiment analysis (ABSA) is crucial for extracting insights from complex human sentiments towards specific text aspects. Despite significant progress, the field still faces challenges such as accurately interpreting subtle language nuances and the scarcity of high-quality, domain-specific annotated datasets. This study introduces the Distil- RoBERTa2GNN model, an innovative hybrid approach that combines the DistilRoBERTa pre-trained model’s feature extraction capabilities with the dynamic sentiment classification abilities of graph neural networks (GNN). Our comprehensive, four-phase data preprocessing strategy is designed to enrich model training with domain-specific, high-quality data. In this study, we analyze four publicly available benchmark datasets: Rest14, Rest15, Rest16-EN, and Rest16-ESP, to rigorously evaluate the effectiveness of our novel DistilRoBERTa2GNN model in ABSA. For the Rest14 dataset, our model achieved an F1 score of 77.98%, precision of 78.12%, and recall of 79.41%. The Rest15 dataset shows that our model achieves an F1 score of 76.86%, precision of 80.70%, and recall of 79.37%. For the Rest16-EN dataset, our model reached an F1 score of 84.96%, precision of 82.77%, and recall of 87.28%. For Rest16-ESP (Spanish dataset), our model achieved an F1 score of 74.87%, with a precision of 73.11% and a recall of 76.80%. These metrics highlight our model’s competitive edge over different baseline models used in ABSA studies. This study addresses critical ABSA challenges and sets a new benchmark for sentiment analysis research, guiding future efforts toward enhancing model adaptability and performance across diverse datasets.","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"168 1","pages":""},"PeriodicalIF":3.5000,"publicationDate":"2024-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Distilroberta2gnn: a new hybrid deep learning approach for aspect-based sentiment analysis\",\"authors\":\"Aseel Alhadlaq, Alaa Altheneyan\",\"doi\":\"10.7717/peerj-cs.2267\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the field of natural language processing (NLP), aspect-based sentiment analysis (ABSA) is crucial for extracting insights from complex human sentiments towards specific text aspects. Despite significant progress, the field still faces challenges such as accurately interpreting subtle language nuances and the scarcity of high-quality, domain-specific annotated datasets. This study introduces the Distil- RoBERTa2GNN model, an innovative hybrid approach that combines the DistilRoBERTa pre-trained model’s feature extraction capabilities with the dynamic sentiment classification abilities of graph neural networks (GNN). Our comprehensive, four-phase data preprocessing strategy is designed to enrich model training with domain-specific, high-quality data. In this study, we analyze four publicly available benchmark datasets: Rest14, Rest15, Rest16-EN, and Rest16-ESP, to rigorously evaluate the effectiveness of our novel DistilRoBERTa2GNN model in ABSA. For the Rest14 dataset, our model achieved an F1 score of 77.98%, precision of 78.12%, and recall of 79.41%. The Rest15 dataset shows that our model achieves an F1 score of 76.86%, precision of 80.70%, and recall of 79.37%. For the Rest16-EN dataset, our model reached an F1 score of 84.96%, precision of 82.77%, and recall of 87.28%. For Rest16-ESP (Spanish dataset), our model achieved an F1 score of 74.87%, with a precision of 73.11% and a recall of 76.80%. These metrics highlight our model’s competitive edge over different baseline models used in ABSA studies. This study addresses critical ABSA challenges and sets a new benchmark for sentiment analysis research, guiding future efforts toward enhancing model adaptability and performance across diverse datasets.\",\"PeriodicalId\":54224,\"journal\":{\"name\":\"PeerJ Computer Science\",\"volume\":\"168 1\",\"pages\":\"\"},\"PeriodicalIF\":3.5000,\"publicationDate\":\"2024-08-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"PeerJ Computer Science\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.7717/peerj-cs.2267\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.2267","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

在自然语言处理（NLP）领域，基于方面的情感分析（ABSA）对于从人类对特定文本方面的复杂情感中提取洞察力至关重要。尽管取得了重大进展，但该领域仍面临一些挑战，如准确解释微妙的语言细微差别，以及缺乏高质量、特定领域的注释数据集。本研究介绍了 Distil- RoBERTa2GNN 模型，这是一种创新的混合方法，将 DistilRoBERTa 预训练模型的特征提取能力与图神经网络（GNN）的动态情感分类能力相结合。我们全面的四阶段数据预处理策略旨在利用特定领域的高质量数据丰富模型训练。在本研究中，我们分析了四个公开的基准数据集：Rest14、Rest15、Rest16-EN 和 Rest16-ESP，以严格评估我们的新型 DistilRoBERTa2GNN 模型在 ABSA 中的有效性。在 Rest14 数据集中，我们的模型获得了 77.98% 的 F1 分数、78.12% 的精确度和 79.41% 的召回率。在 Rest15 数据集上，我们的模型获得了 76.86% 的 F1 分数、80.70% 的精确度和 79.37% 的召回率。对于 Rest16-EN 数据集，我们模型的 F1 得分为 84.96%，精确度为 82.77%，召回率为 87.28%。对于 Rest16-ESP（西班牙数据集），我们模型的 F1 得分为 74.87%，精确度为 73.11%，召回率为 76.80%。这些指标凸显了我们的模型与 ABSA 研究中使用的不同基线模型相比所具有的竞争优势。这项研究解决了 ABSA 面临的关键挑战，为情感分析研究树立了一个新的标杆，为今后在不同数据集上提高模型的适应性和性能提供了指导。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Distilroberta2gnn: a new hybrid deep learning approach for aspect-based sentiment analysis

In the field of natural language processing (NLP), aspect-based sentiment analysis (ABSA) is crucial for extracting insights from complex human sentiments towards specific text aspects. Despite significant progress, the field still faces challenges such as accurately interpreting subtle language nuances and the scarcity of high-quality, domain-specific annotated datasets. This study introduces the Distil- RoBERTa2GNN model, an innovative hybrid approach that combines the DistilRoBERTa pre-trained model’s feature extraction capabilities with the dynamic sentiment classification abilities of graph neural networks (GNN). Our comprehensive, four-phase data preprocessing strategy is designed to enrich model training with domain-specific, high-quality data. In this study, we analyze four publicly available benchmark datasets: Rest14, Rest15, Rest16-EN, and Rest16-ESP, to rigorously evaluate the effectiveness of our novel DistilRoBERTa2GNN model in ABSA. For the Rest14 dataset, our model achieved an F1 score of 77.98%, precision of 78.12%, and recall of 79.41%. The Rest15 dataset shows that our model achieves an F1 score of 76.86%, precision of 80.70%, and recall of 79.37%. For the Rest16-EN dataset, our model reached an F1 score of 84.96%, precision of 82.77%, and recall of 87.28%. For Rest16-ESP (Spanish dataset), our model achieved an F1 score of 74.87%, with a precision of 73.11% and a recall of 76.80%. These metrics highlight our model’s competitive edge over different baseline models used in ABSA studies. This study addresses critical ABSA challenges and sets a new benchmark for sentiment analysis research, guiding future efforts toward enhancing model adaptability and performance across diverse datasets.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

PeerJ Computer Science Computer Science-General Computer Science

CiteScore

6.10

自引率

5.30%

发文量

332

审稿时长

10 weeks

期刊介绍： PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.