Authors: Máté Baranyi, Marcell Nagy, Roland Molontay
Published in: Proceedings of the 21st Annual Conference on Information Technology Education, 2020-10-07
DOI: https://doi.org/10.1145/3368308.3415382
Interpretable Deep Learning for University Dropout Prediction
The early identification of college students at risk of dropping out is of great interest worldwide, since leaving higher education early is associated with considerable personal and social costs. In Hungary, especially in STEM undergraduate programs, the dropout rate is particularly high, much higher than the EU average. In this work, we use advanced machine learning models such as deep neural networks and gradient boosted trees to predict the final academic performance of students at the Budapest University of Technology and Economics. The dropout prediction is based on data that are available at the time of enrollment. In addition to making predictions, we interpret our machine learning models with state-of-the-art interpretable machine learning techniques such as permutation importance and SHAP values. The best-performing deep learning model achieves an accuracy of 72.4% and an AUC of 0.771, slightly outperforming XGBoost, the cutting-edge benchmark model for tabular data.
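To make the evaluation and interpretation terms in the abstract concrete, the following is a minimal pure-Python sketch of AUC and permutation importance. It is not the authors' pipeline: the data are synthetic, the `predict` function is a hypothetical fixed scorer standing in for a trained deep network or XGBoost model, and the helper names `auc` and `permutation_importance` are chosen here for illustration only.

```python
import random

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in AUC when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = auc(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - auc(y, [predict(row) for row in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances

# Synthetic stand-in data (NOT the paper's data):
# feature 0 drives the label, feature 1 is pure noise.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(300)]
y = [1 if row[0] + rng.gauss(0, 0.5) > 0 else 0 for row in X]

def predict(row):
    # Hypothetical fixed scorer standing in for a trained model; uses only feature 0.
    return row[0]

baseline, importances = permutation_importance(predict, X, y)
print(f"baseline AUC: {baseline:.3f}")
print("importances:", [round(v, 3) for v in importances])
```

Shuffling a feature the scorer relies on breaks its link to the label and lowers the AUC, while shuffling a feature the scorer ignores leaves the AUC unchanged. That difference is what permutation importance reports. SHAP values, also mentioned in the abstract, attribute individual predictions to features and are not covered by this sketch.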