A Comparative Analysis of Feature Selection Algorithms in Cross Domain
Sentiment Classification
Lipika Goel, Sonam Gupta, Avdhesh Gupta, Neha Nandal, Siddhi Nath Ranjan, Pradeep Gupta
Recent Advances in Computer Science and Communications, 2024-02-02
https://doi.org/10.2174/0126662558276889240125062857
Cross-domain Sentiment Classification (CDSC) is a well-researched field in
sentiment analysis. The biggest challenge in CDSC arises from differences between domains and
their features, which degrade model performance when source-domain features are used
to predict sentiment in the target domain. To address this challenge, several feature selection
methods can be employed to identify the most relevant features for training and testing in
CDSC.
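As a rough illustration of how a statistical test can rank features for CDSC, the sketch below scores each term on labelled source-domain reviews and then keeps only the top-ranked terms when representing a target-domain review. It is a minimal sketch under stated assumptions: the abstract does not name the tests used, so the chi-square statistic is an assumed example, and the toy reviews and the helper names `chi2_scores` and `select_top_k` are hypothetical.

```python
from collections import Counter


def chi2_scores(docs, labels):
    """Chi-square statistic of each term's presence against a binary label.

    docs: list of token lists; labels: list of 0/1 sentiment labels.
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    present_pos, present_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        for term in set(tokens):  # presence/absence, not raw counts
            (present_pos if y == 1 else present_neg)[term] += 1

    scores = {}
    for term in set(present_pos) | set(present_neg):
        a = present_pos[term]   # positive docs containing the term
        b = present_neg[term]   # negative docs containing the term
        c = n_pos - a           # positive docs without the term
        d = n_neg - b           # negative docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical source-domain reviews (1 = positive, 0 = negative).
source_docs = [
    "great picture great sound".split(),
    "great battery".split(),
    "poor battery poor screen".split(),
    "poor sound".split(),
]
source_labels = [1, 1, 0, 0]

scores = chi2_scores(source_docs, source_labels)
selected = select_top_k(scores, 2)

# A hypothetical target-domain review, reduced to the selected features.
target_doc = "poor blade great handle".split()
target_features = [t for t in target_doc if t in selected]
print(selected, target_features)
```

In this toy setting, domain-specific words such as "battery" occur equally often in both classes and score zero, while sentiment-bearing words that also appear in the target domain are retained.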
The primary objective of this study is to perform a comparative analysis of different
feature selection methods on various CDSC tasks. Statistical test-based feature
selection methods have been implemented with 18 classifiers for the CDSC task. The impact
of these feature selection methods has been compared on Amazon product reviews, specifically
those in the DVD, Electronics, Kitchen, and TV domains. A total of 12 × 18 experiments
(12 source-target domain pairs, 18 classifiers) were conducted for each feature selection
method by varying the source and target domain pairs from the Amazon product reviews dataset
and the classifier. The performance evaluation measures are accuracy and F-score.
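The experiment count and the two evaluation measures can be sketched as follows. This is only an illustration: the classifiers are represented by index because the abstract does not name them, the F-score is assumed to be the balanced F1 on the positive class, and the label vectors are made-up values.

```python
from itertools import permutations

DOMAINS = ["DVD", "Electronics", "Kitchen", "TV"]
N_CLASSIFIERS = 18  # count taken from the abstract; classifiers are not named there

# Ordered (source, target) pairs with distinct domains: 4 * 3 = 12.
pairs = list(permutations(DOMAINS, 2))

# One run per (source, target, classifier) for a given feature selection method.
runs = [(src, tgt, clf) for (src, tgt) in pairs for clf in range(N_CLASSIFIERS)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_score(y_true, y_pred, positive=1):
    """F1 on the positive class (assumed; the abstract only says 'f-score')."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for a single run, purely for illustration.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
acc = accuracy(y_true, y_pred)
f1 = f_score(y_true, y_pred)
print(len(pairs), len(runs), acc, f1)
```

Treating the pairs as ordered matters here, since training on Electronics and testing on Kitchen is a different task from the reverse.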
From the experiments, it has been inferred that good performance on the CDSC task depends
on various factors, from the right domain selection to the right feature selection
method. We conclude that Electronics is the best training dataset, as it gives more precise
results when testing on any of the other domains selected for our study.
Cross-domain sentiment analysis is a dynamic and interdisciplinary field that offers
valuable insights for understanding how sentiment varies across different domains.