Eka Miranda, Veronica Gabriella, Sriyanda Afrida Wahyudi, Jennifer Chai
Text Classification for Analysing Indonesian People's Opinion Sentiment for Covid-19 Vaccination
Jurnal Sistem Informasi, published 2023-05-31. DOI: 10.32520/stmsi.v12i2.2759
Abstract
This study applies text mining to analyse the sentiment of Indonesian public opinion on COVID-19 vaccination expressed on Twitter, using two text classification techniques: Support Vector Machine (SVM) and Random Forest. The workflow consists of crawling Twitter data from September 2021 to October 2021; data cleansing; translating the text into English; preprocessing with NLTK, performed both with and without lemmatization; sentiment analysis using TextBlob; splitting the data into training and testing sets with the hold-out method at ratios of 70:30 and 80:20; hyperparameter tuning with GridSearchCV; text classification with SVM and Random Forest; and evaluating the classification results by calculating accuracy, precision, recall, and F-measure from the confusion matrix. The results show that Random Forest consistently achieves higher accuracy than SVM, with a highest accuracy of 90.59%, and that most of the sentiments are neutral towards the COVID-19 vaccination program.
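The evaluation step named in the abstract (accuracy, precision, recall, and F-measure derived from a confusion matrix) can be sketched in plain Python. This is only an illustration under stated assumptions: the three class names, the toy labels, and the function names `confusion_matrix`, `accuracy`, and `per_class_metrics` are hypothetical and are not taken from the study, and the abstract does not say how per-class scores were averaged, so the sketch reports them per class.

```python
# Hypothetical class names; the abstract only says most sentiments were neutral.
LABELS = ("negative", "neutral", "positive")


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return matrix[i][j]: samples whose true label is i and predicted label is j."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix, labels=LABELS):
    """Compute precision, recall, and F-measure (F1) for each class."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fp = sum(row[i] for row in matrix) - tp   # predicted as i, but wrong
        fn = sum(matrix[i]) - tp                  # truly i, but missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denominator = precision + recall
        f_measure = 2 * precision * recall / denominator if denominator else 0.0
        metrics[label] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
        }
    return metrics


# Toy labels for illustration only; these are NOT data from the study.
y_true = ["neutral", "neutral", "neutral", "positive", "positive",
          "negative", "negative", "neutral", "positive", "neutral"]
y_pred = ["neutral", "neutral", "positive", "positive", "neutral",
          "negative", "neutral", "neutral", "positive", "neutral"]

matrix = confusion_matrix(y_true, y_pred)
scores = per_class_metrics(matrix)

print("accuracy:", round(accuracy(matrix), 4))
for label in LABELS:
    print(label, {name: round(value, 4) for name, value in scores[label].items()})
```

In a workflow like the one described, `y_true` would hold the held-out test labels (30% or 20% of the data) and `y_pred` the predictions of the tuned SVM or Random Forest model. Reporting per-class scores alongside accuracy is useful when one class dominates, since accuracy alone can look high even if minority classes are poorly recognised.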