Mobile application review classification for the Indonesian language using machine learning approach
Yudo Ekanata, I. Budi
2018 4th International Conference on Computer and Technology Applications (ICCTA), 2018-05-01
DOI: 10.1109/CATA.2018.8398667
A mobile app can accumulate thousands of user reviews, so app developers need considerable time to sort through them and find information relevant to further development. This study therefore aims to classify mobile application user reviews automatically using a machine learning approach. The features extracted from each review are unigrams, bigrams, star rating, review length, and the ratio of the number of words with positive and negative sentiment. For classification, we used Naïve Bayes, Support Vector Machine, Logistic Regression, and Decision Tree. The experimental results show that Logistic Regression gives the best F-measure, 85%, when unigrams are combined with sentence length and sentiment score. Unigrams proved to be the most important feature, since adding sentence length and sentiment score increased the F-measure by only about 1%. Bigrams and star rating had a negative impact on classifier performance.
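To make the feature set named in the abstract concrete, the following is a minimal sketch of how such features might be extracted from a single review. It is not the authors' implementation: the function names, the feature key names, the tiny Indonesian word lists, and the add-one smoothing in the sentiment ratio are all assumptions made for illustration, since the abstract does not specify the lexicon or the exact definition of the ratio.

```python
import re
from collections import Counter

# Hypothetical toy lexicons (assumption): the study's actual sentiment
# lexicon is not given in the abstract.
POSITIVE_WORDS = {"bagus", "mantap", "suka"}
NEGATIVE_WORDS = {"jelek", "lambat", "error"}


def tokenize(text):
    """Lowercase the review and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_features(review, star_rating):
    """Build a feature dictionary for one review.

    Includes unigram and bigram counts, review length in tokens,
    the star rating, and a positive/negative sentiment word ratio.
    """
    tokens = tokenize(review)
    features = Counter()

    # Unigram counts, prefixed so they cannot collide with other keys.
    for token in tokens:
        features["uni=" + token] += 1

    # Bigram counts over adjacent token pairs.
    for first, second in zip(tokens, tokens[1:]):
        features["bi=" + first + "_" + second] += 1

    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)

    features["length"] = len(tokens)
    features["star_rating"] = star_rating
    # Assumed definition: positive-to-negative count ratio with add-one
    # smoothing so that reviews without negative words do not divide by zero.
    features["sentiment_ratio"] = (positive + 1) / (negative + 1)
    return dict(features)


if __name__ == "__main__":
    example = extract_features("Aplikasi bagus tapi lambat dan sering error", 3)
    for name, value in sorted(example.items()):
        print(name, value)
```

A dictionary of this shape could then be turned into a numeric vector (for example with scikit-learn's `DictVectorizer`) and passed to any of the four classifiers the abstract lists. Dropping the `bi=` and `star_rating` entries would correspond to the best-performing feature combination reported above.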