Pub Date: 2022-03-01 | DOI: 10.1080/13803611.2022.2041876
Michael W Crossley
The global expansion of higher education has been increasingly evident since the 1980s, driven partly by the emergence of the knowledge economy and inspiring the internationalisation of all dimensions of the sector worldwide. Western, neoliberal models, values, and assumptions have dominated this process, leading to the policy transfer of funding regimes favouring marketisation, the rise of the private sector, increased reliance upon student fees and governance processes seeking intergovernmental and intragovernmental coordination, and the harmonisation of higher education systems, standards, and qualifications frameworks. Within Europe this led to the signing of the Bologna Declaration in 1999 and the creation of the European Higher Education Area: “to facilitate student and staff mobility, to make higher education more inclusive and accessible, and to make higher education in Europe more attractive and competitive worldwide” (https://ec.europa.eu/education/policies/higher-education/bologna-process-and-european-higher-education-area_en). Beyond Europe and North America, the influence of these and related developments can be seen in the rapid growth and internationalisation of higher education in contexts as diverse as Africa, Latin America, the Indian subcontinent, Southeast Asia, Oceania, and Mainland China. In the latter case, the dramatic growth of the home sector has been combined with exponential increases in the numbers of Chinese students pursuing higher education abroad, most notably in English-speaking countries such as the United States, the United Kingdom, and Australia. While the internationalisation of higher education has also been intensified by the impact of competitive global league tables and university rankings influencing status, reputations, enrolments, and the flow of research funds, such processes and new “governance mechanisms” have been increasingly challenged and problematised. This is especially significant in the work of comparative and international researchers who have long called for greater context sensitivity and the critical interrogation of policy flows in all sectors of education.
Article: “Policy transfer, context sensitivity, and epistemic justice: commentary and overview” (Educational Research and Evaluation, 27(1), 357–363)
Pub Date: 2022-02-27 | DOI: 10.1080/13803611.2022.2041875
Sathya Chea, W. Lo
ABSTRACT Building employability skills in Cambodia is emphasised in the development of higher education within the globalisation context. Given that English proficiency has increasingly become an important credential in the more internationalised Cambodian labour market, this paper examines the development and revision of an English language education programme at a renowned university in Phnom Penh to exemplify how an emphasis on employability skills development emerges with the global trends in curriculum in Cambodian higher education. It also discusses how revisions are made in response to the changing national context and how corresponding changes in pedagogical strategy appear at the institutional and classroom levels. This discussion documents and analyses the interplay between curriculum development and changes at global, national, and institutional levels in Cambodia – a fast-growing country that actively engages with globalisation and regionalisation.
Article: “International connectivity and employability in Cambodian higher education: a case study of developing employability skills in English language education” (Educational Research and Evaluation, 27(1), 335–356)
Pub Date: 2022-02-17 | DOI: 10.1080/13803611.2021.2022307
Thomas Perry, B. See
The last two decades have seen many developments in education research, including the growth of robust studies testing education programmes and policies using experimental designs (Hedges, 2018), such as randomised controlled trials (RCTs). RCTs have been a focus for replication study in education, with researchers seeking to replicate similar programmes under trial conditions. These replications have had varying results and have raised questions about why some results have replicated successfully and others have not. Results from recent Education Endowment Foundation effectiveness trials are good examples of this. A number of education programmes have shown beneficial effects on young people’s learning outcomes in efficacy trials, but no effects in larger-scale effectiveness trials. Examples of these programmes include Philosophy for Children (Gorard et al., 2018), Switch-on (Reading Recovery) (Gorard et al., 2014), and Accelerated Reader (Gorard et al., 2015). Some may conclude that one of these evaluations must be wrong. It is important to realise that in all of these examples, the contexts and the fidelity of implementation differed. In the Philosophy for Children effectiveness trial, 53% of the schools did not implement the intervention as intended (Lord et al., 2021). This is the nature of effectiveness trials, where the programme is delivered in real-life conditions, whereas in efficacy trials the delivery would be closely monitored and controlled, and with a smaller sample. Similarly, with the Switch-on evaluation, although schools delivered the required number of sessions, they modified the content and the delivery format of the intervention (Patel et al., 2017). There were also important differences between the efficacy and effectiveness trials. The efficacy trial was conducted with first-year secondary school children, whereas the effectiveness trial was with primary school children. The tests used also differed in the two evaluations. In the efficacy trial, reading was measured using the GL New Group Reading Test (Gorard et al., 2014), but in the effectiveness trial the test used was the Hodder Group Reading Test (Patel et al., 2017). What these two examples suggest is that variations in the context and target population for the study, and variations in the measures and experimental conditions, can have an appreciable effect on the result. These examples also highlight the point that adherence to the fundamental principles of the original programme is essential for effective replication. Without this, it is difficult to know whether unsuccessful replication is because the programme does not work, or because it does not work with a certain population or under certain conditions. It is therefore worthwhile replicating these studies while maintaining high fidelity to the intervention and at the same time varying the population and instruments used, as suggested by Wiliam (2022). Related to these efforts are questions about the role of science in education.
Article: “Replication study in education” (Educational Research and Evaluation, 27(1), 1–7)
Pub Date: 2022-02-17 | DOI: 10.1080/13803611.2021.2022318
Noemí Suárez Monzón, Vanessa Gómez Suárez, Diego Gudberto Lara Paredes
ABSTRACT Previous studies have identified a positive relationship between students’ perceptions of student evaluations of teaching (SET) and the grades that students provide in SET, controlling for other bias factors. The research by Spooren and Christiaens in 2017 at the University of Antwerp supported this finding. In this study, the methodology used by Spooren and Christiaens was replicated at the Technological Indoamerica University in Ecuador, in a close conceptual replication. In the replicated study, 967 undergraduate participants answered the questionnaires used by the original authors. The replication study sample was very similar in size, seniority, and gender to the original study but not in academic disciplines studied. Most of the students agreed that the evaluation was relevant and could improve teaching practices. Results show a statistically significant but small positive relation between perceptions of SET and SET scores (0.20 for the Belgian university and 0.27 for the Ecuadorian university).
Article: “Is my opinion important in evaluating lecturers? Students’ perceptions of student evaluations of teaching (SET) and their relationship to SET scores” (Educational Research and Evaluation, 27(1), 117–140)
Pub Date: 2022-02-07 | DOI: 10.1080/13803611.2021.2022316
P. Starkey, Alice E. Klein, Ben Clarke, S. Baker, Jaime Thomas
ABSTRACT A socioeconomic status (SES)-related achievement gap in mathematics emerges prior to school entry, and increases in elementary school. This gap makes implementation of demanding mathematics standards (e.g., the Common Core State Standards) an ongoing challenge. Early educational intervention is a strategy for addressing this challenge. A randomised controlled trial was conducted in public American preschools to (1) replicate the efficacy of an intervention, Pre-K Mathematics, for low-SES children, and (2) test the combined impact of this intervention and a Common-Core-aligned kindergarten intervention, Early Learning in Mathematics. Forty-one clusters of pre-kindergarten and kindergarten classrooms, containing a sample of 389 low-SES children from an agricultural region, were randomly assigned to treatment and control conditions. The original impact findings were replicated: Child mathematics outcomes in pre-kindergarten were positive and significant. Gains were maintained in kindergarten. Thus, the gap can be reduced and gains maintained by sustained early intervention.
Article: “Effects of early mathematics intervention for low-SES pre-kindergarten and kindergarten students: a replication study” (Educational Research and Evaluation, 27(1), 61–82)
Pub Date: 2022-02-01 | DOI: 10.1080/13803611.2021.2022320
D. Foung, Lucas Kohnke
ABSTRACT Replication studies are uncommon in education, and replications of validation studies are rarer. This study aimed to replicate, reproduce, and expand the study by Jellicoe and Forsythe published in 2019 that validated the Feedback in Learning Scale. We followed the original procedures, conducting a full validation process. We found only an 87% agreement between our model parameters and those of the original study. The differences were derived from the number of factors retained and the fit indices of alternative models. Fuller details of the methods used in the original study would have helped us to better ensure replicability. We also suggest that feedback in higher education (the context for our study) might be more effective if it were less personal and more task-related than workplace feedback (the context from which the Feedback in Learning Scale was derived).
Article: “The development and validation of the Feedback in Learning Scale (FLS): a replication study” (Educational Research and Evaluation, 27(1), 164–187)
Pub Date: 2022-01-31 | DOI: 10.1080/13803611.2021.2022314
K. Morrison
ABSTRACT Conceptual replications have received increased coverage in the educational research agenda. This article argues for clarity in, and justification of, the definition, scope, and boundaries of a conceptual replication and what it can and cannot do. It argues for clear justifications when changing components from those of the original study. The article raises issues concerning internal validity and construct validity which arise from the elision of replication with applicability and generalisability in a conceptual replication, and questions how far the “concept” needs, and can obtain, greater separation from context. It indicates limits to the power of conceptual replications to falsify and verify the original study, and argues for greater specificity, precision, accuracy, and attention to contexts, conditions, and causality and their influence on outcomes. Implications are drawn for preparing, planning, conducting, analysing, judging, and reporting “fair” conceptual replications in education, identifying 10 “rules” for a fair conceptual replication.
Article: “Conceptual replications, research, and the ‘what works’ agenda in education” (Educational Research and Evaluation, 27(1), 35–60)
Pub Date: 2022-01-31 | DOI: 10.1080/13803611.2021.2022319
C. Bokhove
ABSTRACT An article by Kim et al. from 2014 examined individual- and school-level variables affecting the information and communication technology (ICT) literacy level of Korean elementary school students, finding differential gender effects. In this secondary data replication, we used data from the 2018 International Computer and Information Literacy Study, focusing on data from Korea as main replication. As many characteristics of the study as possible, such as variables and analytical strategy, were modelled in the analysis. Additional analyses included 13 countries and jurisdictions, varied centring techniques for variables, and missing data treatment. The replication and analyses were pre-registered via the Open Science Framework. The main analysis did not replicate the main gender finding. However, it was also clear that, despite care taken in a rigorous replication, analytical variability still plays a large role in replications of findings, and with secondary datasets. We discuss the implications of this for secondary data replications.
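The varied “centring techniques” referred to in this abstract can be illustrated with a minimal sketch (the school names and scores below are invented for illustration; a real multilevel analysis would be fitted with dedicated software):

```python
import statistics

# Toy clustered data: ICT literacy scores for pupils nested in schools
# (hypothetical values, not from the ICILS dataset).
data = {
    "school_a": [52.0, 55.0, 61.0],
    "school_b": [70.0, 74.0, 78.0],
}

all_scores = [s for scores in data.values() for s in scores]
grand_mean = statistics.mean(all_scores)

# Grand-mean centring: subtract the overall mean from every score,
# preserving between-school differences in the centred variable.
grand_centred = {
    school: [s - grand_mean for s in scores] for school, scores in data.items()
}

# Group-mean centring: subtract each school's own mean instead,
# removing between-school differences entirely.
group_centred = {
    school: [s - statistics.mean(scores) for s in scores]
    for school, scores in data.items()
}
```

Which centring is appropriate depends on whether within-cluster or between-cluster effects are of interest, which is one reason such seemingly minor choices contribute to the analytical variability the study documents.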
Article: “The role of analytical variability in secondary data replications: a replication of Kim et al. (2014)” (Educational Research and Evaluation, 27(1), 141–163)
Pub Date: 2022-01-31 | DOI: 10.1080/13803611.2021.2022309
D. Wiliam
For anyone who understands the logic of null-hypothesis significance testing, the so-called “replication crisis” in the behavioural sciences (Bryan et al., 2021) would not have come as much of a surprise. Since the pioneering work of Carlo Bonferroni (1935) – and subsequent work in the 1950s by Henry Scheffé (1953), John Tukey (1953/1994), and Olive Jean Dunn (1961) – statisticians have repeatedly pointed out the logically obvious fact that the probability of making a Type I error (mistakenly rejecting the null hypothesis) increases when multiple comparisons are made. And yet, studies in leading psychology and education journals commonly present dozens if not hundreds of comparisons of means, correlations, or other statistics, and then go on to claim that any statistic that has a probability of less than 0.05 is “significant”. However, as Gelman and Loken (2013) point out, even when researchers do not engage in such “fishing expeditions”, if decisions about the analysis are made after the data are collected – “hypothesizing after results are known” or “HARKing” (Kerr, 1998) – then the probability of Type 1 errors is increased. At each stage in the analysis, the researcher is presented with many choices – what Gelman and Loken call “the garden of forking paths” after a short story by Argentinian author Jorge Luis (Borges, 1941/1964) – that can profoundly influence the results obtained. Some of these, such as cleaning data, or eliminating outliers, seem innocent, but nevertheless, because these decisions are taken after the results are seen, they are inconsistent with the assumptions of nullhypothesis significance testing. Other, more egregious, examples include outcome switching, collecting additional data, or changing the analytical approach when the desired level of statistical significance is not reached. 
A good example of how these issues play out in practice is provided by Bokhove (2022) in his replication of a study on gender differences in computer literacy, where he found that different, reasonable, analytical choices lead to very different conclusions.
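The arithmetic behind the multiple-comparisons point above is simple: across m independent tests each run at level α, the probability of at least one Type I error (the family-wise error rate) is 1 − (1 − α)^m. A minimal sketch of this calculation, with the classical Bonferroni correction mentioned in the editorial (function names here are illustrative, not from any cited source):

```python
# Family-wise error rate (FWER) for m independent tests at level alpha:
# P(at least one Type I error) = 1 - (1 - alpha)**m

def fwer(m, alpha=0.05):
    """Probability of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(m, alpha=0.05):
    """Bonferroni-corrected per-test level that caps the FWER at alpha."""
    return alpha / m

# With 20 comparisons at the conventional 0.05 level, the chance of at
# least one spurious "significant" result is already about 64%:
print(f"FWER for 20 tests at alpha=0.05: {fwer(20):.3f}")
print(f"Bonferroni per-test level:       {bonferroni_alpha(20)}")
```

For the dozens or hundreds of comparisons the editorial describes, the uncorrected family-wise error rate approaches 1, which is why a scattering of p < 0.05 results in such studies carries little evidential weight.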
{"title":"How should educational research respond to the replication “crisis” in the social sciences? Reflections on the papers in the Special Issue","authors":"D. Wiliam","doi":"10.1080/13803611.2021.2022309","DOIUrl":"https://doi.org/10.1080/13803611.2021.2022309","url":null,"abstract":"For anyone who understands the logic of null-hypothesis significance testing, the so-called “replication crisis” in the behavioural sciences (Bryan et al., 2021) would not have come as much of a surprise. Since the pioneering work of Carlo Bonferroni (1935) – and subsequent work in the 1950s by Henry Scheffé (1953), John Tukey (1953/1994), and Olive Jean Dunn (1961) – statisticians have repeatedly pointed out the logically obvious fact that the probability of making a Type I error (mistakenly rejecting the null hypothesis) increases when multiple comparisons are made. And yet, studies in leading psychology and education journals commonly present dozens if not hundreds of comparisons of means, correlations, or other statistics, and then go on to claim that any statistic that has a probability of less than 0.05 is “significant”. However, as Gelman and Loken (2013) point out, even when researchers do not engage in such “fishing expeditions”, if decisions about the analysis are made after the data are collected – “hypothesizing after results are known” or “HARKing” (Kerr, 1998) – then the probability of Type I errors is increased. At each stage in the analysis, the researcher is presented with many choices – what Gelman and Loken call “the garden of forking paths” after a short story by Argentinian author Jorge Luis Borges (1941/1964) – that can profoundly influence the results obtained. Some of these, such as cleaning data or eliminating outliers, seem innocent, but nevertheless, because these decisions are taken after the results are seen, they are inconsistent with the assumptions of null-hypothesis significance testing. 
Other, more egregious, examples include outcome switching, collecting additional data, or changing the analytical approach when the desired level of statistical significance is not reached. A good example of how these issues play out in practice is provided by Bokhove (2022) in his replication of a study on gender differences in computer literacy, where he found that different, reasonable, analytical choices lead to very different conclusions.","PeriodicalId":47025,"journal":{"name":"Educational Research and Evaluation","volume":"27 1","pages":"208 - 214"},"PeriodicalIF":1.4,"publicationDate":"2022-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43997167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-01-31DOI: 10.1080/13803611.2021.2022308
John F. Brown
ABSTRACT This paper discusses adapting Churches’ approach to large-scale teacher/researcher conceptual replications of major “science of learning” findings, in order to increase teachers’ engagement with empirical research and to build research networks for gathering data on the science of learning. The project demonstrated the feasibility of teacher-led randomised controlled trials for conceptually replicating the effects of cognitive science findings on learning, as specified by researchers. It also indicated high levels of interest among teachers in applying more of the science of learning in their practice. The approach gave teachers freedom to design interventions, choose research methods, and measure outcomes, even though such freedom is in tension with scientific research that relies on constraining sources of variation. This paper discusses how a balance can be struck between the objectives of teachers and researchers engaged in replicating cognitive science findings, and how teacher engagement in conceptual replication research can be promoted.
{"title":"Replication studies: an essay in praise of ground-up conceptual replications in the science of learning","authors":"John F. Brown","doi":"10.1080/13803611.2021.2022308","DOIUrl":"https://doi.org/10.1080/13803611.2021.2022308","url":null,"abstract":"ABSTRACT This paper discusses adapting Churches’ approach to large-scale teacher/researcher conceptual replications of major “science of learning” findings, to increase teachers’ engagement with empirical research on, and building research networks for, gathering data on the science of learning. The project here demonstrated the feasibility of teacher-led randomised controlled trials for conceptually replicating the effects of cognitive science on learning, as specified by researchers. It also indicated high levels of interest by teachers in applying more science of learning in their practice. The approach gave freedom to teachers to design interventions, choose research methods, and measure outcomes, even though such freedom would be in tension with some scientific research which relies on constraining the sources of variation. This paper discusses how a balance can be struck between the objectives of teachers and researchers engaged in replicating cognitive science findings, and promoting teacher engagement in conceptual replication research.","PeriodicalId":47025,"journal":{"name":"Educational Research and Evaluation","volume":"27 1","pages":"188 - 207"},"PeriodicalIF":1.4,"publicationDate":"2022-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47534372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}