Authors: Wilia Satria, Mardhani Riasetiawan
Journal: IJCCS Indonesian Journal of Computing and Cybernetics Systems
Published: 2023-10-31
DOI: 10.22146/ijccs.82548 (https://doi.org/10.22146/ijccs.82548)
ESSAY ANSWER CLASSIFICATION WITH SMOTE RANDOM FOREST AND ADABOOST IN AUTOMATED ESSAY SCORING
Automated essay scoring (AES) is used to evaluate and assess student essays written in response to given questions. However, automatic assessment by the system is difficult because of typing errors (typos), the use of regional languages, and incorrect punctuation. These errors make the assessment less consistent and less accurate. Analysis of the dataset also shows an imbalance between the number of right and wrong answers, so a technique is needed to overcome the data imbalance. Based on the literature, the Random Forest and AdaBoost classification algorithms can be used to improve the consistency of classification accuracy, and the SMOTE method can be used to overcome the data imbalance. The Random Forest method with SMOTE achieves an F1 measure of 99%, which indicates that the hybrid method can overcome the problem of imbalanced datasets in the AES setting. The AdaBoost model with SMOTE also reaches an F1 measure of 99% on the entire dataset, the highest value reported. The structure of the dataset also affects the performance of the model. The best model obtained in this study is the Random Forest model with SMOTE.
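The two ideas the abstract relies on, SMOTE oversampling of the minority class and the F1 measure, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smote` and `f1_score`, the parameter `k`, and the toy feature vectors are all assumptions made for illustration, and the classifiers themselves are left out. In practice one would more likely use the `SMOTE` class from imbalanced-learn together with `RandomForestClassifier` or `AdaBoostClassifier` from scikit-learn.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Simplified SMOTE: build n_new synthetic points, each interpolated
    between a minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy, made-up feature vectors: 6 "right" answers versus 2 "wrong" answers.
majority = [[0.9, 0.8], [1.0, 0.7], [0.8, 0.9],
            [0.95, 0.85], [0.85, 0.75], [0.9, 0.9]]
minority = [[0.1, 0.2], [0.3, 0.1]]

# Oversample the minority class until both classes have the same size.
balanced_minority = minority + smote(minority, n_new=len(majority) - len(minority))
```

Oversampling should be applied to the training split only, so that synthetic points never leak into the data used to compute the F1 measure.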