{"title":"可解释的模型不会影响预测大学成功的准确性和公平性","authors":"C. Kung, Renzhe Yu","doi":"10.1145/3386527.3406755","DOIUrl":null,"url":null,"abstract":"The presence of \"big data\" in higher education has led to the increasing popularity of predictive analytics for guiding various stakeholders on appropriate actions to support student success. In developing such applications, model selection is a central issue. As such, this study presents a comprehensive examination of five commonly used machine learning models in student success prediction. Using administrative and learning management system (LMS) data for nearly 2,000 college students at a public university, we employ the models to predict short-term and long-term academic success. Beyond the tradeoff between model interpretability and accuracy, we also focus on the fairness of these models with regard to different student populations. Our findings suggest that more interpretable models such as logistic regression do not necessarily compromise predictive accuracy. Also, they lead to no more, if not less, prediction bias against disadvantaged student groups than complicated models. Moreover, prediction biases against certain groups persist even in the fairest model. These results thus recommend using simpler algorithms in conjunction with human evaluation in instructional and institutional applications of student success prediction when valid student features are in place.","PeriodicalId":20608,"journal":{"name":"Proceedings of the Seventh ACM Conference on Learning @ Scale","volume":"6 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2020-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Interpretable Models Do Not Compromise Accuracy or Fairness in Predicting College Success\",\"authors\":\"C. Kung, Renzhe Yu\",\"doi\":\"10.1145/3386527.3406755\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The presence of \\\"big data\\\" in higher education has led to the increasing popularity of predictive analytics for guiding various stakeholders on appropriate actions to support student success. In developing such applications, model selection is a central issue. As such, this study presents a comprehensive examination of five commonly used machine learning models in student success prediction. Using administrative and learning management system (LMS) data for nearly 2,000 college students at a public university, we employ the models to predict short-term and long-term academic success. Beyond the tradeoff between model interpretability and accuracy, we also focus on the fairness of these models with regard to different student populations. Our findings suggest that more interpretable models such as logistic regression do not necessarily compromise predictive accuracy. Also, they lead to no more, if not less, prediction bias against disadvantaged student groups than complicated models. Moreover, prediction biases against certain groups persist even in the fairest model. These results thus recommend using simpler algorithms in conjunction with human evaluation in instructional and institutional applications of student success prediction when valid student features are in place.\",\"PeriodicalId\":20608,\"journal\":{\"name\":\"Proceedings of the Seventh ACM Conference on Learning @ Scale\",\"volume\":\"6 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-08-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Seventh ACM Conference on Learning @ Scale\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3386527.3406755\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Seventh ACM Conference on Learning @ Scale","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3386527.3406755","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Interpretable Models Do Not Compromise Accuracy or Fairness in Predicting College Success
The presence of "big data" in higher education has led to the increasing popularity of predictive analytics for guiding various stakeholders on appropriate actions to support student success. In developing such applications, model selection is a central issue. As such, this study presents a comprehensive examination of five commonly used machine learning models in student success prediction. Using administrative and learning management system (LMS) data for nearly 2,000 college students at a public university, we employ the models to predict short-term and long-term academic success. Beyond the tradeoff between model interpretability and accuracy, we also focus on the fairness of these models with regard to different student populations. Our findings suggest that more interpretable models such as logistic regression do not necessarily compromise predictive accuracy. Also, they lead to no more, if not less, prediction bias against disadvantaged student groups than complicated models. Moreover, prediction biases against certain groups persist even in the fairest model. These results thus recommend using simpler algorithms in conjunction with human evaluation in instructional and institutional applications of student success prediction when valid student features are in place.