EARLY IDENTIFICATION OF ACADEMIC FAILURE ON HIGHER EDUCATION: PREDICTING STUDENTS’ PERFORMANCE USING AI
Fidel Cacheda, Manuel F. López-Vizcaíno, Diego Fernández, V. Carneiro
Education and new developments, published 2023-06-23
DOI: https://doi.org/10.36315/2023v1end047
Abstract
In this work we focus on the early identification of academic failure in higher education as a means of allowing educators to intervene early and help at-risk students achieve academic success. For this purpose, we build a dataset of more than one thousand students and their grades, collected over four years from a Computer Networks course in a Computer Science degree at a Spanish university. From the dataset we extract features corresponding to the laboratory and quiz assignments set during the course, intended to represent each student's effort and accomplishment. A preliminary analysis of the dataset shows a potential relation between the scores achieved throughout the course and the final exam mark. The aim is to predict whether a student will pass the final exam using only information extracted from the laboratory and quiz assignments. To this end, we define a data mining classification task following a supervised learning approach, in which a selection of well-known machine learning algorithms is evaluated under a 10-fold cross-validation scheme to assess the performance and robustness of the models. Our results show that, using Random Forest, we can correctly predict in more than 91% of cases whether a student will pass the final exam, achieving an F1-score of 0.916. Moreover, a feature importance analysis shows that laboratory assignment features contribute more to the learned model than quiz assignment features.
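The evaluation protocol named in the abstract (k-fold cross-validation scored with F1) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption for illustration: the data are synthetic, the helper names (`f1_score`, `k_fold_indices`, `fit_threshold`, `cross_validate`) are hypothetical, and a single-feature threshold rule stands in for the classifier. It is not the authors' Random Forest pipeline and it does not reproduce their results.

```python
import random
from statistics import mean


def f1_score(y_true, y_pred):
    """F1 = 2 * precision * recall / (precision + recall) for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined, report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the indices once, then yield (train, test) index lists for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def fit_threshold(scores, labels):
    """Placeholder model (assumption): choose the score cut-off with the best training F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: f1_score(labels, [int(s >= c) for s in scores]))


def cross_validate(scores, labels, k=10):
    """Fit on k-1 folds, score on the held-out fold, and return the mean F1."""
    fold_f1 = []
    for train, test in k_fold_indices(len(scores), k):
        cutoff = fit_threshold([scores[i] for i in train], [labels[i] for i in train])
        preds = [int(scores[i] >= cutoff) for i in test]
        fold_f1.append(f1_score([labels[i] for i in test], preds))
    return mean(fold_f1)


# Synthetic stand-in data: one coursework score per student (0-10) and a noisy pass label.
rng = random.Random(42)
scores = [rng.uniform(0, 10) for _ in range(200)]
labels = [int(s + rng.gauss(0, 1.5) >= 5) for s in scores]

print(f"mean F1 over 10 folds: {cross_validate(scores, labels):.3f}")
```

Averaging the per-fold F1 is what lets such a scheme speak to robustness as well as performance: each student is held out exactly once, so the score never reflects data the model was fitted on. In practice a library classifier would replace `fit_threshold`, and stratified folds may be preferable when pass and fail counts are unbalanced.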