An Empirical Evaluation of Adapting Hybrid Parameters for CNN-based Sentiment Analysis

Pertanika Journal of Science and Technology | IF 0.6 | Q3 (Multidisciplinary Sciences) | Pub Date: 2024-04-01 | DOI: 10.47836/pjst.32.3.05
Mohammed Maree, Mujahed Eleyat, Shatha Rabayah
Citations: 0

Abstract

Sentiment analysis aims to understand human emotions and perceptions through various machine-learning pipelines. However, feature engineering and inherent semantic gap constraints often hinder conventional machine learning techniques and limit their accuracy. To address these challenges, newer neural network models have been proposed that automate the feature learning process and enrich the learned features with contextual word embeddings to identify their semantic orientations. This article analyzes the influence of different factors on the accuracy of sentiment classification predictions by employing Feedforward and Convolutional Neural Networks. To assess the performance of these neural network models, we use four diverse real-world datasets: 50,000 movie reviews from IMDB, 10,662 sentences from LightSide Movie_Reviews, 300 public movie reviews, and 1,600,000 tweets extracted from Sentiment140. We experimentally investigate the impact of using GloVe word embeddings to enrich the feature vectors extracted from sentiment sentences. Findings indicate that using higher-dimensional GloVe word embeddings increases sentiment classification accuracy. In particular, the CNN with a larger feature map, a smaller filter window, and the ReLU activation function in the convolutional layer achieved an accuracy of 90.56% on the IMDB dataset, compared with 80.73% on the Sentiment140 dataset and 77.64% on the 300 sentiment sentences dataset. However, with large-size sentiment sentences (LightSide's Movie Reviews) and the same parameters, an accuracy of only 64.44% was achieved.
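The three CNN settings the abstract varies (feature map size, filter window, activation function) can be illustrated with a minimal forward pass. The sketch below is not the authors' implementation: it uses only the Python standard library, random vectors stand in for pretrained GloVe embeddings, "feature map size" is read here as the number of convolutional filters (an assumption), and the names `EMBED_DIM`, `NUM_FILTERS`, and `WINDOW` and their values are hypothetical.

```python
import random


def relu(x):
    """ReLU activation: pass positive values, clamp the rest to zero."""
    return x if x > 0.0 else 0.0


def embed(tokens, table, dim):
    """Look up each token's vector; unknown tokens map to a zero vector."""
    return [table.get(t, [0.0] * dim) for t in tokens]


def conv_relu_maxpool(sentence, filters, biases):
    """Slide each filter over the sentence, apply ReLU, then max-over-time pooling.

    sentence: list of n word vectors, each of length dim
    filters:  list of filters; each filter is `window` weight vectors of length dim
    Returns one pooled feature per filter.
    """
    pooled = []
    for weights, bias in zip(filters, biases):
        window = len(weights)
        activations = []
        for start in range(len(sentence) - window + 1):
            total = bias
            for offset in range(window):
                word_vec = sentence[start + offset]
                total += sum(w * x for w, x in zip(weights[offset], word_vec))
            activations.append(relu(total))
        # A sentence shorter than the window yields no activations; pool to 0.0.
        pooled.append(max(activations) if activations else 0.0)
    return pooled


def make_filters(num_filters, window, dim, rng):
    """Randomly initialised filters (untrained; for shape illustration only)."""
    return [
        [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(window)]
        for _ in range(num_filters)
    ]


rng = random.Random(0)
EMBED_DIM = 8     # hypothetical; stands in for the GloVe embedding dimension
NUM_FILTERS = 16  # hypothetical; the "feature map" setting, read as filter count
WINDOW = 2        # hypothetical; a small filter window (words per filter)

tokens = "the film was surprisingly good".split()
# Random vectors are placeholders for pretrained GloVe vectors (assumption).
table = {t: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for t in tokens}
sentence = embed(tokens, table, EMBED_DIM)
filters = make_filters(NUM_FILTERS, WINDOW, EMBED_DIM, rng)
features = conv_relu_maxpool(sentence, filters, [0.0] * NUM_FILTERS)
print(len(features))  # one pooled feature per filter
```

In a full classifier the pooled feature vector would feed a dense output layer and the filters would be trained. Raising `EMBED_DIM` or `NUM_FILTERS`, or lowering `WINDOW`, corresponds to the directions the abstract reports as helpful on the IMDB data.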
Source journal
Pertanika Journal of Science and Technology (Multidisciplinary Sciences)
CiteScore: 1.50
Self-citation rate: 16.70%
Number of publications: 178
Journal description: Pertanika Journal of Science and Technology aims to provide a forum for high-quality research in science and engineering. Areas relevant to the scope of the journal include: bioinformatics, bioscience, biotechnology and bio-molecular sciences, chemistry, computer science, ecology, engineering, engineering design, environmental control and management, mathematics and statistics, medicine and health sciences, nanotechnology, physics, safety and emergency management, and related fields of study.