Naive Bayes Classifier Optimization on Sentiment Analysis of Hotel Reviews
S. Khomsah. Jurnal Penelitian Pos dan Informatika, published 2020-12-28. https://doi.org/10.17933/jppi.2020.100206
Feature extraction plays an important role in sentiment analysis, especially for text data. The Naive Bayes Classifier (NBC) performs well on low-dimensional feature spaces, but its accuracy is not optimal. To obtain a better machine learning model, information gain methods, evolutionary algorithms, and swarm intelligence algorithms are applied. This study evaluates how well Particle Swarm Optimization (PSO) optimizes the Naive Bayes Classifier. Words are vectorized using TF-IDF. To obtain high PSO performance, the PSO-NBC model is tested with several parameter settings: the number of particles (k = 3), the number of iterations, the inertia weight, the individual intelligence coefficient (c1 = 1), and the social intelligence coefficient (c2 = 2). The inertia weight is calculated as w = 0.5 + Rand([-1, 1]). The results show that PSO can handle the problem space of text-based sentiment analysis. PSO optimizes the accuracy of Naive Bayes to values of 89% to 91.76%. PSO performance depends on the parameters used, especially the number of particles, the number of iterations, and the inertia weight. A larger number of particles combined with a higher inertia weight can increase accuracy. Optimal accuracy is reached with 20 to 30 particles.
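
The abstract gives the PSO coefficients but not the full update procedure, so the following is only a minimal sketch of a standard PSO loop using the quoted values (c1 = 1, c2 = 2, w = 0.5 + Rand([-1, 1])). Several parts are assumptions for illustration and not taken from the paper: the function names, the reading of a particle position as a vector of per-feature weights clamped to [0, 1], the choice to redraw w once per iteration, and above all `toy_fitness`, which stands in for the validation accuracy of a Naive Bayes model trained on TF-IDF features.

```python
import random


def inertia_weight(rng):
    # Formulation quoted in the abstract: w = 0.5 + Rand([-1, 1]).
    return 0.5 + rng.uniform(-1.0, 1.0)


def toy_fitness(position):
    # Hypothetical stand-in for "accuracy of Naive Bayes with these
    # feature weights": highest (0.0) when every weight equals 0.7.
    return -sum((x - 0.7) ** 2 for x in position)


def pso_maximize(fitness, dim, n_particles=20, n_iters=30,
                 c1=1.0, c2=2.0, seed=0):
    """Standard global-best PSO that maximizes `fitness` over [0, 1]^dim."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        w = inertia_weight(rng)  # assumption: redrawn once per iteration
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])   # individual term
                             + c2 * r2 * (gbest[d] - pos[i][d]))     # social term
                # Assumption: positions are clamped to the [0, 1] box.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


if __name__ == "__main__":
    best, best_fit = pso_maximize(toy_fitness, dim=5)
    print("best weights:", [round(x, 3) for x in best])
    print("best fitness:", best_fit)
```

To apply the sketch to the setting described in the abstract, `toy_fitness` would be replaced by a function that trains and scores a Naive Bayes classifier for a given position. The swarm size and iteration count are exposed as arguments because the abstract identifies them, together with the inertia weight, as the parameters that most affect the result.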