Pub Date: 2026-03-01 | Epub Date: 2026-01-28 | DOI: 10.1016/j.stueduc.2026.101575
Bora Ly, Savoeun Sorn, Romny Ly, Sokhom Ma, Bunhorn Doeur
Distributed leadership (DL) has emerged as a collaborative governance model in education, yet its relevance in higher education within developing contexts remains underexplored. This study examines the associations between DL and five outcomes in Cambodian universities: academic performance (AP), student retention (SR), faculty performance (FP), faculty satisfaction (FS), and institutional development (ID). Data were collected from 342 faculty members across public and private institutions and analyzed using Partial Least Squares Structural Equation Modeling (PLS-SEM). Results show that DL is positively associated with all outcomes and that FS partially mediates these relationships. DL practices emphasizing shared decision-making and collaboration enhance faculty engagement, thereby contributing to academic and institutional improvement. By centering faculty perspectives in a non-Western higher education context, the study addresses a significant research gap and underscores the value of inclusive leadership. Findings suggest that DL can support institutional growth and academic quality in systems marked by hierarchical governance.
Distributed leadership in higher education: Evidence from Cambodia on academic and institutional outcomes. Studies in Educational Evaluation, vol. 88, Article 101575.
Pub Date: 2026-03-01 | Epub Date: 2026-02-11 | DOI: 10.1016/j.stueduc.2026.101583
Funda Örnek, Ernest Afari
PISA-based research has often concluded that students’ ICT use is weakly, inconsistently, or negatively related to academic achievement, reinforcing the assumption that technology investments deliver limited learning benefits. This study challenges that conclusion by arguing that such findings stem from treating ICT engagement as a homogeneous and linear exposure rather than as a differentiated, psychologically mediated learning condition. Grounded in social cognitive, cognitive load, and engagement theories, we conceptualize ICT effects as conditional on both the mode and intensity of use and students’ digital self-efficacy. Analyzing PISA 2022 data from Türkiye, an emerging-economy context with large-scale ICT reforms, we show that ICT engagement exhibits systematically different relationships with mathematics and science achievement. Academically oriented ICT use outside the classroom is positively associated with achievement, whereas leisure-oriented ICT use is negatively associated, demonstrating that increased ICT exposure alone does not improve learning. Moreover, these relationships are non-linear. Moderate, purposeful ICT use is beneficial, but higher levels yield diminishing or adverse returns, offering a theoretical explanation for the mixed results reported in prior PISA studies. Crucially, digital self-efficacy mediates key ICT–achievement pathways, indicating that ICT contributes to learning primarily by strengthening students’ confidence and capacity to use digital tools strategically. By integrating differentiated ICT use, non-linear effects, and psychological mediation within a single explanatory framework, this study advances PISA-based ICT research beyond access- and usage-frequency models. The findings reframe ICT not as a general educational input but as a pedagogically contingent resource whose effectiveness depends on learner readiness and instructional purpose, with direct implications for digital education policy in expanding systems.
Evaluating the effects of ICT engagement on science and mathematics achievement: Evidence from PISA 2022 using structural equation modeling. Studies in Educational Evaluation, vol. 88, Article 101583.
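The non-linear pattern described in the abstract above (moderate use helps, heavy use yields diminishing or adverse returns) is often illustrated with a quadratic term. The sketch below is not the authors' structural equation model. It is a minimal, assumption-laden illustration that fits y = b0 + b1·x + b2·x² by ordinary least squares on made-up data and reports the usage level at which the fitted curve peaks. The function names and the data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a·x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic(xs, ys):
    """Ordinary least squares for y = b0 + b1*x + b2*x**2 via the normal equations."""
    powers = [sum(x ** k for x in xs) for k in range(5)]
    xty = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    xtx = [[powers[i + j] for j in range(3)] for i in range(3)]
    return solve(xtx, xty)


def turning_point(b1, b2):
    """Value of x where the fitted parabola peaks (b2 < 0) or bottoms out (b2 > 0)."""
    return -b1 / (2 * b2)


# Hypothetical data (not from PISA): usage intensity 0..6 and an exact inverted U.
usage = [0, 1, 2, 3, 4, 5, 6]
score = [500 + 12 * x - 2 * x ** 2 for x in usage]

b0, b1, b2 = fit_quadratic(usage, score)
peak = turning_point(b1, b2)
print(f"b1={b1:.2f}, b2={b2:.2f}, peak at usage={peak:.2f}")
```

A negative b2 together with a turning point inside the observed range is the usual signature of an inverted-U relationship. With real survey data, sampling weights and plausible values would also need to be handled, which this sketch leaves out.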
Gender gaps in mathematics achievement vary significantly across countries and over time, with particularly large gaps disadvantaging girls in Latin America and the Caribbean (LAC). This study examines changes in gender disparities in mathematics achievement among 15-year-olds across 10 LAC countries between 2006 and 2022, using PISA-OECD data. Employing a repeated cross-sectional design and fixed-effects models, we analyze trends and contextual moderators of gender gaps, including socioeconomic status (SES) and rurality. Results show that while boys continue to outperform girls in most LAC countries, the average gender gap narrowed over time in the region, mainly through improvements in girls’ performance led by Brazil and Colombia. Also, gender disparities are generally stable across socioeconomic quintiles and larger in urban areas, while mid-SES and rural settings exhibit smaller but still significant gaps. These findings highlight the complex intersection of gender, SES, and geography in shaping educational inequalities in the region.
Lorena Ortega, Matías Montero, Álvaro Romero, Catalina Canals, Alejandra Mizala. Pathways to gender equity in Latin America and the Caribbean: A comparative analysis of trends in mathematics gender gaps using PISA data. Studies in Educational Evaluation, vol. 88, Article 101579. Pub Date: 2026-03-01 | DOI: 10.1016/j.stueduc.2026.101579
Pub Date: 2026-03-01 | Epub Date: 2026-01-23 | DOI: 10.1016/j.stueduc.2026.101564
Fangfang Zhao, Ying Fang, Jinliang Qin
This study provides an exploratory examination of early care and education (ECE) quality in Eastern China from an interaction-dominant perspective. Using network analysis, the study maps the overall structure of ECE quality, capturing interdependence among dimensions both within the network and across classroom- and child-level sub-networks. Data were collected from 266 children across 16 classrooms in 6 childcare centers in Eastern China. Results indicated that the quality network was highly stable and showed strong validity. The classroom-level Program Structure emerged as the key dimension across the entire network, while the child-level Space and Furnishings dimension acted as a bridge between classroom- and child-level sub-networks. These findings provide preliminary support for the applicability of network analysis in the study of ECE quality and reveal structural features of quality that can guide future research with larger, culturally diverse, and geographically varied samples.
An exploratory network analysis of early care and education quality. Studies in Educational Evaluation, vol. 88, Article 101564.
Pub Date: 2026-03-01 | Epub Date: 2026-01-19 | DOI: 10.1016/j.stueduc.2026.101559
Anubha Rohatgi, Ove E. Hatlevik
In today's digital society, it is important for young people to develop a metacognitive understanding of how to react and respond to messages from strangers. This study explores the relationship between reading achievement, online reading activities, metacognitive skills, and socio-economic backgrounds. Analysis of PISA 2018 Nordic samples indicates a positive statistical association between metacognitive credibility-evaluation skills and reading achievement. Reading achievement is also associated with student background variables. Across the Nordic countries, the tested models explain between 17% and 29% of the variation in students’ metacognitive credibility-evaluation skills within schools. However, not all students benefit equally, as school-based digital skills instruction does not consistently support online metacognition. Addressing these disparities is key to supporting inclusive digital learning and enhancing students’ ability to assess information credibility.
The interrelationships between reading achievement, online reading activities, and metacognitive skills: Findings from PISA data. Studies in Educational Evaluation, vol. 88, Article 101559.
Pub Date: 2026-03-01 | Epub Date: 2025-12-10 | DOI: 10.1016/j.stueduc.2025.101553
Zhifei Li, Shuaixing Xu, Xin Zhao, Hongbo Wen
The evaluation of learning efficiency is important for educational research. Previous studies have not defined the concept adequately, which makes their results hard to compare. Defining learning efficiency and improving assessment methods are therefore crucial. This study examined the double reference point (DRP) method for assessing learning efficiency using PISA 2018 data. First, we constructed a comprehensive index of learning efficiency. Second, we established the validity of the DRP-based learning efficiency index by conducting group difference tests on key indicators, comparing the results with traditional learning efficiency metrics, and examining its relationship with achievement motivation. Third, we verified the robustness of the index by altering the indicator weights, reference levels, and synthesis parameters. The results indicate that the DRP method is suitable for evaluating learning efficiency and offers support for improving student efficiency precisely and scientifically.
Development and implementation of learning efficiency evaluation model: Using the double reference point method. Studies in Educational Evaluation, vol. 88, Article 101553.
Pub Date: 2026-03-01 | Epub Date: 2025-12-02 | DOI: 10.1016/j.stueduc.2025.101539
Zitong Huang, Yiyao Yang, Yasemin Gulbahar
Mathematics performance remains a critical indicator of educational equity and academic preparedness in an increasingly data-driven world. This study examined factors affecting math achievement in New York City from 2013 to 2023, analyzing 3,395,816 student records from Grades 3–8 using NYC OpenData. Clustering analysis revealed two distinct performance trajectories, with higher-scoring clusters linked to periods of targeted intervention and policy reform, while declines corresponded with structural disadvantage and post-pandemic disruptions. Outlier analysis showed substantial shifts in performance levels following grading policy changes, underscoring the impact of systemic reforms. Results revealed persistent achievement gaps of 1.8–2.4 standard deviations across socioeconomic status, disability status, and English Language Learner groups. Ridge regression identified English proficiency, disability status, and economic disadvantage as the strongest predictors of math achievement. These findings highlight the need for long-term, context-sensitive educational strategies that address persistent inequities through both instructional and systemic interventions.
Understanding the interconnected drivers of mathematics test performance: a longitudinal study. Studies in Educational Evaluation, vol. 88, Article 101539.
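Achievement gaps "in standard deviations", as reported in the abstract above, are commonly computed as a standardized mean difference. The abstract does not say which formula the authors used, so the sketch below assumes the pooled-standard-deviation form (Cohen's d) and uses made-up scores. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_gap(group_a, group_b):
    """Mean difference between two groups in pooled-standard-deviation units (Cohen's d)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scale scores (not NYC OpenData values).
not_disadvantaged = [310, 320, 330]
disadvantaged = [290, 300, 310]

gap = standardized_gap(not_disadvantaged, disadvantaged)
print(f"gap = {gap:.2f} SD")
```

Here both groups have a sample variance of 100, so the pooled standard deviation is 10 and the 20-point mean difference corresponds to a gap of 2.0 SD.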
Pub Date: 2026-03-01 | Epub Date: 2025-12-17 | DOI: 10.1016/j.stueduc.2025.101554
Elpis Grammatikopoulou
The aim of this study was to investigate and compare how two reading assessments measured in Grade 4 predict later school grades. Specifically, the study explored how paper-based and digital reading scores were related to school grades in Grades 6 and 9. Data from the international assessment PIRLS 2016 and its digital counterpart ePIRLS 2016, along with Swedish register data were used. Regression results speak to the lasting effects of early reading achievement on later achievement in various subjects. Notably, digital test scores showed stronger associations with later grades, particularly in Grade 6. Although moderate in strength, the predictive validity of digital test scores was higher for subjects like English and mathematics, compared to paper-based test scores. These findings lend some credibility to the ongoing transition to digital assessment, indicating that digital reading measures may capture skills more relevant to later academic success.
The predictive validity of PIRLS and ePIRLS on later academic achievement: Insights from Sweden. Studies in Educational Evaluation, vol. 88, Article 101554.
Pub Date: 2026-03-01 | Epub Date: 2025-12-10 | DOI: 10.1016/j.stueduc.2025.101552
Johan Braeken, Anne-Catherine Lehre, Kseniia Marcq
International large-scale assessments (ILSA) in education generate a lot of interest and debate among researchers and stakeholders alike. In this study, we take a deep dive into a provocative claim that “Nordic students cannot do fractions” based on an item-specific result in the Trends in International Mathematics and Science Study (TIMSS) 2011. We used item response data from five cycles of the TIMSS mathematics achievement test, spanning 2003 to 2019, across 12 countries representing diverse geographical regions (Africa, Europe, East Asia, and North America) and proficiency levels (from lower-performers to higher-performers). The data include about 54,000 students per cycle and a total of about 800 mathematics items. With these data, we investigated the empirical basis and generalizability of the stated claim using an explanatory item response theory (IRT) modeling approach. Our study underscores the dangers of miscommunication, oversimplification, and overgeneralization of ILSA results, yet also highlights the hidden potential of deeper item-level analyses for educational researchers and stakeholders to generate critical inquiries into educational practice and our understanding of the subject domain.
Pearls and perils when interpreting item-specific results from international large-scale educational assessments: The case of the “Nordic” fractions. Studies in Educational Evaluation, vol. 88, Article 101552.
Pub Date: 2026-03-01 | Epub Date: 2026-02-07 | DOI: 10.1016/j.stueduc.2026.101570
Nathalie Vigna, Giuliana Parente, Moris Triventi
This study examines whether standardization increases academic performance and reduces achievement inequalities in primary education. We analyze standardization from a multidimensional perspective, considering educational inputs, such as the national curriculum, and outputs, such as central exams.
We use data from the Progress in International Reading Literacy Study (PIRLS) for 2006, 2011, and 2016, covering 43 countries and nearly 700,000 fourth-grade students. For each country and year, we construct a novel index of curriculum standardization and examine the presence of central exams. Using an innovative two-step multilevel meta-regression strategy, we assess the role of standardization in shaping average reading performance and multiple measures of test score inequality.
Results reveal substantial cross-national differences in standardization practices but no consistent relationship between curricular or assessment standardization and reading proficiency or inequality. Although within-country temporal variation is limited, findings remain robust across specifications, challenging the assumption that standardization alone promotes educational equity or effectiveness in primary education.
Curriculum standardization, central exams and achievement inequalities in primary education: A cross-national longitudinal study. Studies in Educational Evaluation, vol. 88, Article 101570.