Text Preprocessing Impact for Sentiment Classification in Product Review
Murahartawaty Arief, Mustafa Bin Matt Deris
2021 Sixth International Conference on Informatics and Computing (ICIC), published 2021-11-03
DOI: 10.1109/ICIC54025.2021.9632884 (https://doi.org/10.1109/ICIC54025.2021.9632884)
Citation count: 0
Abstract
During the Covid-19 pandemic, e-commerce platforms have accumulated large volumes of product review data in real time. Businesses need rating and review systems to surface their consumers' feelings about products and services quickly, and to use that data to strengthen their competitive strategies. Amazon is one platform that provides a vast quantity of product review data. Unfortunately, product review data are typically unstructured and hard to manage. This experimental study therefore examines the impact of text preprocessing on sentiment classification of unstructured product review data, using three classifiers: Decision Tree, Naïve Bayes, and Support Vector Machine (SVM). The SVM achieved the best model performance, with an accuracy of 88.13%, while the Naïve Bayes classifier had the shortest execution time. Furthermore, the experimental results suggest that using TF-IDF for feature extraction may significantly improve classification accuracy. Overall, the results indicate that a good text preprocessing sequence is critical to the classifier's prediction performance on unstructured product review data.
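
The abstract does not list the individual preprocessing steps or the exact TF-IDF variant used. As a minimal sketch, assuming a simple sequence (lowercasing, removing non-letter characters, whitespace tokenization, stop-word removal) and the textbook weighting tf × log(N / df), the following Python shows how raw review text could be turned into weighted features. The stop-word list, the function names `preprocess` and `tf_idf`, and the sample reviews are illustrative assumptions, not taken from the study.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; not taken from the paper.
STOP_WORDS = {"the", "a", "an", "is", "it", "this", "and", "i", "was"}


def preprocess(text):
    """Lowercase, replace non-letters with spaces, tokenize, drop stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes tf = count / len(doc) and idf = log(N / df), one common
    textbook variant; libraries often add smoothing and normalization.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up sample reviews for illustration only.
reviews = [
    "This phone is GREAT, great battery!",
    "The battery was terrible.",
    "Great value and fast shipping",
]
tokens = [preprocess(r) for r in reviews]
vectors = tf_idf(tokens)
```

In this toy corpus, a term that appears in only one review (such as "terrible") receives a higher weight than a term shared across reviews (such as "battery"), which is the property that makes TF-IDF features useful as classifier input. Changing the order or content of the preprocessing steps changes the token lists, and therefore the features, which is consistent with the abstract's point that the preprocessing sequence matters.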