Predictive Modeling of Student Dropout in MOOCs and Self-Regulated Learning

Computers (IF 2.6, Q2, Computer Science, Interdisciplinary Applications) Pub Date: 2023-09-27 DOI: 10.3390/computers12100194
Georgios Psathas, Theano K. Chatzidaki, Stavros N. Demetriadis
Citations: 1

Abstract

The primary objective of this study is to examine the factors that contribute to the early prediction of dropout from Massive Open Online Courses (MOOCs), in order to identify and support at-risk students. We use MOOC data of a specific duration, with a guided study pace. The dataset exhibits class imbalance, so we apply oversampling techniques to balance the data and support unbiased prediction. We examine the predictive performance of five classic machine learning (ML) classification algorithms under four different oversampling techniques and various evaluation metrics. Additionally, we explore the influence of self-reported self-regulated learning (SRL) data provided by students, together with various other prominent MOOC features, as potential indicators for early-stage dropout prediction. The research questions focus on (1) the performance of the classic ML classification models, measured with various evaluation metrics, before and after different oversampling methods are applied; (2) which self-reported data may constitute crucial predictors of dropout propensity; and (3) the effect of the SRL factor on dropout prediction performance. The main conclusions are: (1) prominent predictors, including employment status, frequency of chat tool usage, prior subject-related experience, gender, education, and willingness to participate, achieve high to excellent recall, particularly when specific combinations of algorithms and oversampling methods are applied; (2) the self-reported SRL factor, combined with easily provided, self-reported features, performed well as a predictor in terms of recall when the logistic regression (LR) and support vector machine (SVM) algorithms were employed; and (3) it is crucial to test diverse ML algorithms and oversampling methods in predictive modeling.
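The abstract does not name the four oversampling techniques or describe the evaluation pipeline, so the following is only a minimal sketch of the general idea, assuming plain random oversampling (duplicating minority-class rows) and recall on the dropout class. The function names, the toy data, and the convention that label 1 means "dropout" are all hypothetical choices made for illustration, not the authors' method.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Rows of this class in the original (not yet resampled) data.
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def recall(y_true, y_pred, positive=1):
    """Share of actual positives that were predicted positive: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy data: one feature per student, label 1 = dropout (assumed).
X_train = [[0], [1], [2], [3], [4], [5]]
y_train = [0, 0, 0, 0, 1, 1]

X_bal, y_bal = random_oversample(X_train, y_train)
print(Counter(y_bal))  # both classes now have 4 rows

# Recall for some hypothetical predictions on held-out labels.
print(recall([1, 1, 0, 0], [1, 0, 0, 1]))
```

In a setup like this, oversampling is normally applied to the training split only, so that evaluation metrics such as recall are computed on data with the original class distribution.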
Source journal: Computers (Computer Science, Interdisciplinary Applications)
CiteScore: 5.40
Self-citation rate: 3.60%
Articles published: 153
Review time: 11 weeks