Which noncognitive features provide more information about reading performance? A data-mining approach to big educational data
O. Aricak, Hakan Guldal, Irfan Erdogan
Journal of Pacific Rim Psychology, 2023. DOI: 10.1177/18344909231164025 (https://doi.org/10.1177/18344909231164025)
Abstract
The purpose of this study is to discover which noncognitive variables provide the most information about reading performance. To answer this question, data-mining methods based on information gain, decision trees, and random forests were used. The participants were 606,627 15-year-old students (49.8% female) from 78 countries or economies, 37 of which are OECD members. Reading performance and plausible values of reading from PISA 2018 were analyzed together with data from the Student, ICT Familiarity, Financial Literacy, Educational Career, Well-Being, and Parent Questionnaires. When 108 features were analyzed as independent variables, SES (home possessions, cultural possessions, and ICT resources at home), metacognitive skills (assessing credibility and summarizing), and liking/enjoying reading emerged as the major variables predicting reading performance. The path analysis revealed that these variables explain 53.3% of the variability in reading performance. Notably, the decision tree model estimated reading performance with 74.61% accuracy.
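To make the notion of information gain concrete, the following is a minimal sketch of how a categorical feature can be scored by how much it reduces uncertainty about a class label. It is not the authors' pipeline: the data are made-up toy values (not PISA 2018), and the names `enjoys_reading`, `feature_b`, and `reading_level` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a categorical feature."""
    total = len(labels)
    # Group the labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each group.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy, invented data (NOT taken from PISA); all names are hypothetical.
reading_level = ["high", "high", "high", "low", "low", "low", "high", "low"]
features = {
    # Perfectly separates the two reading levels in this toy sample.
    "enjoys_reading": ["yes", "yes", "yes", "no", "no", "no", "yes", "no"],
    # Only weakly related to the reading level in this toy sample.
    "feature_b": ["a", "b", "a", "b", "a", "b", "a", "b"],
}

# Score every feature, then rank from most to least informative.
gains = {name: information_gain(values, reading_level)
         for name, values in features.items()}
ranking = sorted(gains, key=gains.get, reverse=True)

for name in ranking:
    print(f"{name}: {gains[name]:.4f} bits")
```

Ranking features this way is one common first step before fitting tree-based models, since decision trees can use the same entropy-based criterion to choose their splits.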