Early Identification of Academic Failure on Higher Education: Predicting Students' Performance Using AI

Education and new developments. Pub Date: 2023-06-23. DOI: 10.36315/2023v1end047

Fidel Cacheda, Manuel F. López-Vizcaíno, Diego Fernández, V. Carneiro


Citations: 0


Abstract

This work focuses on the early identification of academic failure in higher education as a means of enabling educators to intervene early and help at-risk students achieve academic success. For this purpose, we define a dataset of more than one thousand students, with their respective grades, collected over four years from a Computer Networks course in a Computer Science degree at a Spanish university. From the dataset we extract features corresponding to the laboratory and quiz assignments set during the course, which are intended to represent each student's effort and accomplishment. A preliminary analysis of the dataset shows a potential relation between the scores achieved throughout the course and the final exam mark. The aim is to predict whether a student will pass the final exam using only information extracted from the laboratory and quiz assignments. To this end, we define a data mining classification task following a supervised learning approach, in which a selection of well-known machine learning algorithms is evaluated with a 10-fold cross-validation scheme to assess the performance and robustness of the models. Our results show that, using Random Forest, we can correctly predict in more than 91% of cases whether a student will pass the final exam, achieving an F1-score of 0.916. Moreover, a feature importance analysis shows that laboratory assignment features contribute more to the learning model than quiz assignment features.
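The evaluation scheme described in the abstract (a pass/fail classifier scored by accuracy and F1 under 10-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic, the feature names `lab` and `quiz` and the 0.7/0.3 weighting are assumptions, and a simple threshold rule stands in for Random Forest so the example has no dependencies outside the Python standard library.

```python
import random


def make_synthetic_students(n=1000, seed=0):
    """Hypothetical data: (lab_avg, quiz_avg, passed) per student. Not the paper's dataset."""
    rng = random.Random(seed)
    students = []
    for _ in range(n):
        lab = rng.uniform(0, 10)
        quiz = rng.uniform(0, 10)
        # Assumed ground truth: labs weigh more than quizzes, plus Gaussian noise.
        passed = int(0.7 * lab + 0.3 * quiz + rng.gauss(0, 1) >= 5.0)
        students.append((lab, quiz, passed))
    return students


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predict(lab, quiz, threshold):
    """Stand-in classifier: predict 'pass' when the mean coursework score reaches the threshold."""
    return int((lab + quiz) / 2 >= threshold)


def fit_threshold(train):
    """Pick the cut-off (0.0 to 10.0 in steps of 0.5) with the best training accuracy."""
    best_t, best_acc = 0.0, -1.0
    for step in range(21):
        t = step / 2
        hits = sum(1 for lab, quiz, y in train if predict(lab, quiz, t) == y)
        acc = hits / len(train)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def f1_score(y_true, y_pred):
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cross_validate(students, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average accuracy and F1."""
    folds = k_fold_indices(len(students), k, seed)
    accs, f1s = [], []
    for i, test_idx in enumerate(folds):
        train = [students[j] for m, fold in enumerate(folds) if m != i for j in fold]
        test = [students[j] for j in test_idx]
        t = fit_threshold(train)
        y_true = [y for _, _, y in test]
        y_pred = [predict(lab, quiz, t) for lab, quiz, _ in test]
        accs.append(sum(1 for a, b in zip(y_true, y_pred) if a == b) / len(test))
        f1s.append(f1_score(y_true, y_pred))
    return sum(accs) / k, sum(f1s) / k


if __name__ == "__main__":
    accuracy, f1 = cross_validate(make_synthetic_students())
    print(f"mean accuracy: {accuracy:.3f}, mean F1: {f1:.3f}")
```

Because the data and the classifier are both stand-ins, the printed numbers say nothing about the reported 91% accuracy or the 0.916 F1-score; the sketch only shows how each student is held out exactly once and how the per-fold scores are averaged.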
