Academic Performance Modelling with Machine Learning Based on Cognitive and Non-Cognitive Features

Applied Computer Systems (IF 0.5, Q4, COMPUTER SCIENCE, THEORY & METHODS) · Pub Date: 2021-12-01 · DOI: 10.2478/acss-2021-0015
Bridgitte Owusu-Boadu, Isaac Kofi Nti, O. Nyarko-Boateng, J. Aning, Victoria Boafo
Applied Computer Systems, vol. 66 1, pp. 122-131. Publication type: Journal Article.
Citations: 3

Abstract

Students' academic performance is essential for academic progression at all levels of education. However, the many cognitive and non-cognitive factors that influence it make it difficult for academic authorities to extract hidden knowledge from educational data with conventional analytical tools. Educational Data Mining (EDM) therefore relies on computational techniques to simplify planning and to identify students who might be at risk of failing or dropping out of school because of their academic performance, which helps address student retention. This paper studies several cognitive and non-cognitive factors (academic, demographic, social and behavioural) and their effect on student academic performance using machine learning algorithms. Heterogeneous lazy and eager machine learning classifiers, namely Decision Tree (DT), K-Nearest-Neighbour (KNN), Artificial Neural Network (ANN), Logistic Regression (LR), Random Forest (RF), AdaBoost and Support Vector Machine (SVM), were trained using k-fold (k = 10) and leave-one-out cross-validation. Their predictive performance was evaluated with well-known metrics: Area Under the Curve (AUC), F1 score, Precision, Accuracy, Kappa, Matthews correlation coefficient (MCC) and Recall. The results show that Student Absence Days (SAD) is the most significant predictor of students' academic performance. In terms of prediction accuracy and AUC, RF (Acc = 0.771, AUC = 0.903), LR (Acc = 0.779, AUC = 0.90) and ANN (Acc = 0.760, AUC = 0.895) outperformed all other algorithms (KNN (Acc = 0.638, AUC = 0.826), SVM (Acc = 0.727, AUC = 0.80), DT (Acc = 0.733, AUC = 0.876) and AdaBoost (Acc = 0.748, AUC = 0.808)), making them more suitable for predicting students' academic performance.
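
The evaluation protocol described in the abstract (seven classifiers, cross-validation, several metrics) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: it assumes scikit-learn, a binary pass/fail target, synthetic data from `make_classification` in place of the study's dataset, default hyperparameters, and stratified folds. The helper names `build_models`, `evaluate` and `compare_classifiers` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             matthews_corrcoef, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_models(seed=0):
    """The seven classifier families named in the abstract, with default
    settings (an assumption; the abstract gives no hyperparameters)."""
    return {
        "DT": DecisionTreeClassifier(random_state=seed),
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "ANN": make_pipeline(StandardScaler(),
                             MLPClassifier(max_iter=500, random_state=seed)),
        "LR": make_pipeline(StandardScaler(),
                            LogisticRegression(max_iter=1000)),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "AdaBoost": AdaBoostClassifier(random_state=seed),
        "SVM": make_pipeline(StandardScaler(),
                             SVC(probability=True, random_state=seed)),
    }


def evaluate(model, X, y, cv):
    """Pool the out-of-fold predictions and score them once.
    Assumes a binary target encoded as 0/1."""
    proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")
    classes = np.unique(y)  # column order of predict_proba
    pred = classes[np.argmax(proba, axis=1)]
    return {
        "accuracy": accuracy_score(y, pred),
        "auc": roc_auc_score(y, proba[:, 1]),
        "f1": f1_score(y, pred),
        "precision": precision_score(y, pred, zero_division=0),
        "recall": recall_score(y, pred),
        "kappa": cohen_kappa_score(y, pred),
        "mcc": matthews_corrcoef(y, pred),
    }


def compare_classifiers(X, y, n_splits=10, seed=0):
    """Run every model through the same k-fold split (k = 10 by default)."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return {name: evaluate(model, X, y, cv)
            for name, model in build_models(seed).items()}


if __name__ == "__main__":
    # Synthetic stand-in data; the study's real features are not used here.
    X, y = make_classification(n_samples=300, n_features=10,
                               n_informative=5, random_state=0)
    for name, scores in compare_classifiers(X, y).items():
        print(name, {k: round(v, 3) for k, v in scores.items()})
```

Because `evaluate` scores pooled out-of-fold predictions, passing `sklearn.model_selection.LeaveOneOut()` as `cv` gives a leave-one-out variant without further changes. Scores obtained on the synthetic data are not comparable to the figures reported in the abstract.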
Source journal: Applied Computer Systems (COMPUTER SCIENCE, THEORY & METHODS)
Self-citation rate: 10.00%
Articles published: 9
Review time: 30 weeks