Title: An Empirical Evaluation of Adapting Hybrid Parameters for CNN-based Sentiment Analysis
Authors: Mohammed Maree, Mujahed Eleyat, Shatha Rabayah
Journal: Pertanika Journal of Science and Technology (impact factor 0.6; JCR Q3, Multidisciplinary Sciences)
Publication date: 2024-04-01
Publication type: Journal Article
DOI: 10.47836/pjst.32.3.05
URL: https://doi.org/10.47836/pjst.32.3.05
Citation count: 0
Abstract
Sentiment analysis aims to understand human emotions and perceptions through various machine-learning pipelines. However, conventional machine learning techniques are often hindered by feature engineering and inherent semantic gap constraints, which limit their accuracy. To address these challenges, newer neural network models have been proposed that automate the feature learning process and enrich the learned features with contextual word embeddings to identify their semantic orientations. This article analyzes the influence of different factors on the accuracy of sentiment classification predictions using Feedforward and Convolutional Neural Networks. To assess the performance of these models, we use four diverse real-world datasets: 50,000 movie reviews from IMDB, 10,662 sentences from LightSide Movie_Reviews, 300 public movie reviews, and 1,600,000 tweets extracted from Sentiment140. We experimentally investigate the impact of using GloVe word embeddings to enrich the feature vectors extracted from sentiment sentences. Findings indicate that using higher-dimensional GloVe word embeddings increases sentiment classification accuracy. In particular, a CNN with a larger feature map, a smaller filter window, and the ReLU activation function in the convolutional layer reached an accuracy of 90.56% on the IMDB dataset, compared with 80.73% on the Sentiment140 dataset and 77.64% on the 300 sentiment sentences dataset. However, with large-size sentiment sentences (LightSide's Movie Reviews) and the same parameters, the accuracy was only 64.44%.
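The parameters named in the abstract (feature map size, filter window, and ReLU activation) can be illustrated with a minimal sketch of a single convolutional layer over an embedded sentence, followed by max-over-time pooling. This is not the authors' implementation: the function names, the toy vocabulary, and the randomly generated vectors standing in for GloVe embeddings are all assumptions made for illustration, and the pooling step is a common choice for text CNNs rather than something the abstract specifies.

```python
import random


def relu(x):
    """ReLU activation: negative values are clipped to zero."""
    return x if x > 0.0 else 0.0


def conv1d_max_pool(embedded, filters, bias):
    """Slide each filter over consecutive word vectors, apply ReLU,
    and keep the maximum activation per filter.

    embedded: list of word vectors, one per token
    filters:  list of filters, each a list of `window` weight vectors
    bias:     one bias value per filter
    """
    window = len(filters[0])  # filter window size, in words
    pooled = []
    for weights, b in zip(filters, bias):
        activations = []
        for start in range(len(embedded) - window + 1):
            total = b
            for offset in range(window):
                word_vec = embedded[start + offset]
                total += sum(w * x for w, x in zip(weights[offset], word_vec))
            activations.append(relu(total))
        pooled.append(max(activations))
    return pooled


def make_filters(num_filters, window, dim, rng):
    """Hypothetical helper: randomly initialised filter weights."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(dim)]
             for _ in range(window)]
            for _ in range(num_filters)]


rng = random.Random(0)
EMBED_DIM = 8    # assumption: toy size; real GloVe vectors are much larger
NUM_FILTERS = 4  # number of feature maps produced by the layer
WINDOW = 2       # filter window: how many adjacent words each filter sees

# Assumption: random vectors stand in for pretrained GloVe embeddings.
vocab = ["the", "movie", "was", "great", "boring"]
toy_embeddings = {w: [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
                  for w in vocab}

sentence = ["the", "movie", "was", "great"]
embedded = [toy_embeddings[w] for w in sentence]
filters = make_filters(NUM_FILTERS, WINDOW, EMBED_DIM, rng)
bias = [0.0] * NUM_FILTERS

features = conv1d_max_pool(embedded, filters, bias)
print(len(features))  # one pooled feature per filter
```

In this sketch, increasing `NUM_FILTERS` lengthens the pooled feature vector, decreasing `WINDOW` makes each filter respond to shorter word sequences, and increasing `EMBED_DIM` corresponds to using higher-dimensional embeddings. A classifier layer, which the sketch omits, would take `features` as its input.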
Journal description:
Pertanika Journal of Science and Technology aims to provide a forum for high-quality research in science and engineering. Areas relevant to the scope of the journal include: bioinformatics, bioscience, biotechnology and bio-molecular sciences, chemistry, computer science, ecology, engineering, engineering design, environmental control and management, mathematics and statistics, medicine and health sciences, nanotechnology, physics, safety and emergency management, and related fields of study.