Interpretability Analysis of Academic Achievement Prediction Based on Machine Learning

Jie Yang, Hong Wang
Published in: 2021 11th International Conference on Information Technology in Medicine and Education (ITME), pp. 475-479
Publication date: 2021-11-01
DOI: 10.1109/ITME53901.2021.00101 (https://doi.org/10.1109/ITME53901.2021.00101)

Abstract

In recent years, advances in artificial intelligence and information technology have ushered in the era of big data, and education-related data has grown substantially in both volume and variety. To help educators improve the quality of education and teaching, a growing number of researchers have begun mining educational data. This paper applies several machine learning algorithms to student performance data, models that data to predict students' performance, and offers suggestions that teachers can use to improve it. The main contributions are as follows. First, the original data is preprocessed to remove dirty and missing records. Second, a variety of machine learning algorithms are used to model students' academic performance; comparing the models' prediction accuracy, recall, and F1 score identifies the Gradient Boosting Decision Tree Classifier as the best single model. Third, the three best-performing models are combined as base models in a new Stacking ensemble method that achieves better results. Finally, the paper analyzes the interpretability of the Gradient Boosting Decision Tree Classifier, evaluates the importance of the different features, and concludes that “Visited resources”, “Raised hand”, “Student Absence Days”, and “Viewing announcements” are the most important factors affecting students' performance. The model achieves strong predictive performance and good interpretability.
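
The model-comparison step in the abstract (ranking candidate classifiers by accuracy, recall, and F1 score, then keeping the three best as base models for Stacking) can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' code: the abstract does not say how the metrics were averaged or how ties were broken, so macro averaging and the F1-then-accuracy ranking rule are assumptions, and the model names, class labels, and predictions below are hypothetical.

```python
from collections import Counter


def classification_scores(y_true, y_pred):
    """Return accuracy, macro-averaged recall, and macro-averaged F1.

    Macro averaging is an assumption; the abstract does not state
    which averaging scheme the paper used.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    recalls, f1s = [], []
    for label in labels:
        support = tp[label] + fn[label]
        predicted = tp[label] + fp[label]
        recall = tp[label] / support if support else 0.0
        precision = tp[label] / predicted if predicted else 0.0
        denom = precision + recall
        f1s.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)

    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "recall": sum(recalls) / len(labels),
        "f1": sum(f1s) / len(labels),
    }


def rank_models(y_true, predictions, top_k=3):
    """Score each model and return the names of the top_k models.

    Ranking by F1 first and accuracy second is a hypothetical rule
    chosen for this sketch.
    """
    scores = {
        name: classification_scores(y_true, y_pred)
        for name, y_pred in predictions.items()
    }
    ranked = sorted(
        scores,
        key=lambda name: (scores[name]["f1"], scores[name]["accuracy"]),
        reverse=True,
    )
    return ranked[:top_k], scores


if __name__ == "__main__":
    # Hypothetical held-out labels and predictions from four hypothetical models.
    y_true = ["low", "mid", "high", "mid", "low", "high"]
    predictions = {
        "gbdt": ["low", "mid", "high", "mid", "low", "mid"],
        "random_forest": ["low", "mid", "high", "low", "low", "mid"],
        "logistic_regression": ["mid", "mid", "high", "low", "low", "mid"],
        "knn": ["mid", "mid", "mid", "low", "low", "mid"],
    }
    best_three, all_scores = rank_models(y_true, predictions, top_k=3)
    for name in best_three:
        print(name, all_scores[name])
```

The names returned by `rank_models` would then be the candidates passed to a stacking ensemble as base models; the stacking and feature-importance steps themselves are outside the scope of this sketch.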