{"title":"Online reviews sentiment analysis applying mutual information","authors":"Zuhui Wang, Wei Jiang","doi":"10.1109/FSKD.2012.6233865","journal":{"name":"International Conference on Fuzzy Systems and Knowledge Discovery"},"publicationDate":"2012-05-29","publicationTypes":"Journal Article","citationCount":"2","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/FSKD.2012.6233865"}
Online reviews sentiment analysis applying mutual information
Extracting complex features is essential to the performance of sentiment analysis on online reviews. Beyond conventional bag-of-words features, regular collocation features play an increasingly important role, because their structured expression strongly influences sentiment orientation. This paper proposes applying the mutual information method to mine complex features from online reviews, extending feature extraction from conventional bags of words to regular collocations. With the extracted collocation features as model inputs, experiments on online hotel review data show that the proposed extraction method improves the performance of the Naive Bayes model by 1.36% and that of the Maximum Entropy model by 0.92%. The imbalance between positive and negative reviews, however, distorts the results: features of the majority class mask those of the minority class, and the extreme sentiment of the minority class introduces noise into the dataset. To address this imbalance problem and the associated parameter estimation problem, a λ feature-filtering strategy and Good-Turing smoothing are adopted to further improve the performance of the sentiment analysis model.
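The abstract does not give the exact mutual information formula or the collocation patterns used, so the following is only a minimal sketch of the general idea. It assumes that collocations are adjacent word pairs and that they are scored with pointwise mutual information (PMI), log2(P(x, y) / (P(x) P(y))). The function name `pmi_collocations` and the `min_count` threshold are hypothetical, introduced for illustration only.

```python
import math
from collections import Counter


def pmi_collocations(docs, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Assumption: a "collocation" is an adjacent word pair, and PMI stands in
    for the paper's mutual information measure, which the abstract does not
    spell out.

    docs: list of tokenized reviews (each a list of word strings).
    min_count: hypothetical frequency cutoff; rarer pairs are skipped
               because PMI is unreliable on very low counts.
    Returns a list of ((w1, w2), score) sorted by descending score.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in docs:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi          # joint probability of the pair
        p_x = unigrams[w1] / n_uni   # marginal probability of the first word
        p_y = unigrams[w2] / n_uni   # marginal probability of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In such a setup, the highest-scoring pairs (for example a negation followed by an adjective) would be added to the bag-of-words features before training a classifier such as Naive Bayes. How the paper selects and weights collocations may differ from this sketch.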