Mohamed Bellaj, Ahmed Ben Dahmane, Said Boudra, Mohammed Lamarti Sefian
International Journal of Online and Biomedical Engineering (iJOE), published 2024-02-27. DOI: 10.3991/ijoe.v20i03.46287 (https://doi.org/10.3991/ijoe.v20i03.46287)
Educational Data Mining: Employing Machine Learning Techniques and Hyperparameter Optimization to Improve Students’ Academic Performance
Educational data mining (EDM) is a specialized field within data mining that extracts insights from academic data at the high school and university levels. A common practice in EDM is predicting students' grades in order to identify at-risk individuals and make academic tasks more efficient. This knowledge benefits students, parents, and institutions alike, because early detection enables interventions that improve student performance. The literature presents various prediction strategies, each with its own advantages and disadvantages. This study evaluates the methods, tools, and applications of machine learning (ML) and data mining (DM) in education. The main goal is to improve the accuracy of predicting academic achievement by employing eight widely recognized ML algorithms, including naïve Bayes (NB), k-nearest neighbors (KNN), support vector machine (SVM), random forest (RF), logistic regression (LR), extreme gradient boosting (XGBoost), and an ensemble voting classifier (EVC). The study focuses on improving data quality by removing noisy instances. Performance is evaluated with accuracy, precision, recall, and F-measure. Incorporating cross-validation and hyperparameter tuning improves classification accuracy. The resulting ML models outperform other ensemble approaches, providing a valuable tool for predicting student performance and helping educators make proactive decisions through timely alerts.
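The ensemble voting and the four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes hard (majority) voting, which the abstract does not specify, and a binary at-risk label. The function names `majority_vote` and `binary_metrics`, the labels, and the three models' predictions are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, return the label most models predicted."""
    voted = []
    for sample_preds in zip(*predictions_per_model):
        # most_common(1) returns [(label, count)] for the most frequent label.
        voted.append(Counter(sample_preds).most_common(1)[0][0])
    return voted


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Hypothetical ground truth: 1 = at-risk student, 0 = not at risk.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical predictions from three already-trained models.
model_a = [1, 0, 1, 0, 0, 1, 1, 0]
model_b = [1, 0, 0, 1, 0, 1, 1, 0]
model_c = [0, 0, 1, 1, 1, 0, 1, 0]

y_vote = majority_vote([model_a, model_b, model_c])
scores = binary_metrics(y_true, y_vote)
print(y_vote)
print(scores)
```

On this made-up data each individual model misclassifies two of the eight students, while the majority vote misclassifies only one. This shows why a voting ensemble can help, but it says nothing about the results reported in the study.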