The effect and contribution of e-book logs to model creation for predicting students’ academic performance
F. Zhao, Etsuko Kumamoto, Chengjiu Yin
2021 International Conference on Advanced Learning Technologies (ICALT), 2021-07-01
https://doi.org/10.1109/ICALT52272.2021.00063
E-book logs reflect students' learning status and have been widely used in learning analytics, especially for predicting academic performance. However, the best prediction model cannot be identified without first determining how e-book logs contribute to a model's prediction performance and to its creation process. To this end, this study used scikit-learn, a free-software machine learning library, to analyze the learning performance of 234 participants from learning behavior logs collected by an e-book system. Six prediction models were created: Decision Tree, Random Forests, XGBoost, Logistic Regression, Support Vector Machines, and K-nearest Neighbors. The contribution of e-book logs to each model was then measured with three feature importance methods: impurity-based feature importance, coefficient-based feature importance, and permutation feature importance. Based on the statistical results, Decision Tree and Random Forests had the best prediction performance among the six models, with prediction performance scores ranging from 0.7 to 0.8. In addition, four data features, Prev, Highlight, Maker, and Next, were found to have the greatest impact on model creation.
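Of the three feature importance methods named in the abstract, permutation feature importance is the only one that applies to all six model types, so it may help to see the idea in isolation. The following is a minimal from-scratch sketch, not the authors' code and not the scikit-learn API the study used. It assumes synthetic data, a hand-written K-nearest Neighbors classifier, and an invented labeling rule; the column name `Noise` and all function names are hypothetical, and only `Prev`, `Next`, and `Highlight` are borrowed from the abstract as labels.

```python
import random

# Column labels: Prev, Next, Highlight come from the abstract; "Noise" is hypothetical.
FEATURES = ["Prev", "Next", "Highlight", "Noise"]


def make_synthetic_logs(n, rng):
    """Generate n synthetic rows. ASSUMPTION: the label depends only on
    Prev and Highlight; this rule is invented purely for illustration."""
    X, y = [], []
    for _ in range(n):
        row = [rng.random() for _ in FEATURES]
        X.append(row)
        y.append(1 if row[0] + row[2] > 1.0 else 0)
    return X, y


def knn_predict(train_X, train_y, row, k=5):
    """Majority vote among the k training rows closest to `row`."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, t)), label)
        for t, label in zip(train_X, train_y)
    )
    votes = sum(label for _, label in dists[:k])
    return 1 if votes * 2 > k else 0


def accuracy(train_X, train_y, X, y):
    """Fraction of rows in X whose predicted label matches y."""
    hits = sum(knn_predict(train_X, train_y, r) == t for r, t in zip(X, y))
    return hits / len(y)


def permutation_importance(train_X, train_y, X, y, rng, n_repeats=5):
    """Mean drop in accuracy when one feature column of X is shuffled.
    A larger drop means the model relies more on that feature."""
    baseline = accuracy(train_X, train_y, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in X]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, column)]
            drops.append(baseline - accuracy(train_X, train_y, shuffled, y))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


rng = random.Random(42)
train_X, train_y = make_synthetic_logs(200, rng)
test_X, test_y = make_synthetic_logs(100, rng)
baseline, importances = permutation_importance(train_X, train_y, test_X, test_y, rng)
print(f"baseline accuracy: {baseline:.2f}")
for name, score in zip(FEATURES, importances):
    print(f"{name:>10}: {score:+.3f}")
```

Because the synthetic label is built from `Prev` and `Highlight` alone, shuffling either of those columns should reduce accuracy noticeably, while shuffling `Next` or `Noise` should change it very little. Impurity-based and coefficient-based importance, by contrast, are read directly from a fitted tree or linear model and are therefore tied to specific model families.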