{"title":"Applying Data Mining Techniques to Predict Student Dropout: A Case Study","authors":"B. Pérez, C. Castellanos, D. Correal","doi":"10.1109/COLCACI.2018.8484847","DOIUrl":null,"url":null,"abstract":"The prevention of students dropping out is considered very important in many educational institutions. In this paper we describe the results of an educational data analytics case study focused on detection of dropout of System Engineering (SE) undergraduate students after 7 years of enrollment in a Colombian university. Original data is extended and enriched using a feature engineering process. Our experimental results showed that simple algorithms achieve reliable levels of accuracy to identify predictors of dropout. Decision Trees, Logistic Regression and Na¨ıve Bayes results were compared in order to propose the best option. Also, Watson Analytics is evaluated to establish the usability of the service for a non expert user. Main results are presented in order to decrease the dropout rate by identifying potential causes. In addition, we present some findings related to data quality to improve the students data collection process.","PeriodicalId":138992,"journal":{"name":"2018 IEEE 1st Colombian Conference on Applications in Computational Intelligence (ColCACI)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"30","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 1st Colombian Conference on Applications in Computational Intelligence (ColCACI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/COLCACI.2018.8484847","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 30
Abstract
The prevention of students dropping out is considered very important in many educational institutions. In this paper we describe the results of an educational data analytics case study focused on detection of dropout of System Engineering (SE) undergraduate students after 7 years of enrollment in a Colombian university. Original data is extended and enriched using a feature engineering process. Our experimental results showed that simple algorithms achieve reliable levels of accuracy to identify predictors of dropout. Decision Trees, Logistic Regression and Na¨ıve Bayes results were compared in order to propose the best option. Also, Watson Analytics is evaluated to establish the usability of the service for a non expert user. Main results are presented in order to decrease the dropout rate by identifying potential causes. In addition, we present some findings related to data quality to improve the students data collection process.