{"title":"教育中的复制研究","authors":"Thomas Perry, B. See","doi":"10.1080/13803611.2021.2022307","DOIUrl":null,"url":null,"abstract":"The last 2 decades have seen many developments in education research, including the growth of robust studies testing education programmes and policies using experimental designs (Hedges, 2018), such as randomised controlled trials (RCTs). RCTs have been a focus for replication study in education, with researchers seeking to replicate similar programmes under trial conditions. These replications have had varying results and have raised questions about why some results have successfully replicated and others have not. Results from recent Education Endowment Foundaton effectiveness trials are good examples of this. A number of education programmes have shown beneficial effects on young people’s learning outcomes in efficacy trials, but no effects in larger scale effectiveness trials. Examples of these programmes include Philosophy for Children (Gorard et al., 2018), Switch-on (Reading Recovery) (Gorard et al., 2014), and Accelerated Reader (Gorard et al., 2015). Some may conclude that one of these evaluations must be wrong. It is important to realise that in all of these examples, the contexts and the fidelity of implementation differed. In the Philosophy for Children effectiveness trial, 53% of the schools did not implement the intervention as intended (Lord et al., 2021). This is the nature of effectiveness trials, where the programme is delivered in real-life conditions, whereas in efficacy trials the delivery would be closely monitored and controlled, and with a smaller sample. Similarly, with the Switch-on evaluation, although schools delivered the required number of sessions, they modified the content and the delivery format of the intervention (Patel et al., 2017). There were also important differences between the efficacy and effectiveness trials. 
The efficacy trial was conducted with 1st-year secondary school children, whereas the effectiveness trial was with primary school children. The tests used also differed in the two evaluations. In the efficacy trial, Reading was measured using the GL New Group Reading Test (Gorard et al., 2014), but in the effectiveness trial the test used was the Hodder Group Reading Test (Patel et al., 2017). What these two examples suggest is that variations in the context and target population for the study and variations in the measures and experimental conditions can have an appreciable effect on the result. These examples also highlight the point that adherence to the fundamental principles of the original programme is essential for effective replication. Without this, it is difficult to know whether unsuccessful replication is because the programme does not work, or that it does not work with a certain population or under certain conditions. It is therefore worthwhile replicating these studies while maintaining high fidelity to the intervention and at the same time varying the population and instruments used as suggested by Wiliam (2022). Related to these efforts are questions about the role of science in education and the form it should take. RCTs represent a rigorous method of investigation and are often considered the gold standard in scientific research. There are, however, caveats associated with them and ongoing debates about their benefits and limitations (Connolly","PeriodicalId":47025,"journal":{"name":"Educational Research and Evaluation","volume":"27 1","pages":"1 - 7"},"PeriodicalIF":2.3000,"publicationDate":"2022-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Replication study in education\",\"authors\":\"Thomas Perry, B. 
See\",\"doi\":\"10.1080/13803611.2021.2022307\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The last 2 decades have seen many developments in education research, including the growth of robust studies testing education programmes and policies using experimental designs (Hedges, 2018), such as randomised controlled trials (RCTs). RCTs have been a focus for replication study in education, with researchers seeking to replicate similar programmes under trial conditions. These replications have had varying results and have raised questions about why some results have successfully replicated and others have not. Results from recent Education Endowment Foundaton effectiveness trials are good examples of this. A number of education programmes have shown beneficial effects on young people’s learning outcomes in efficacy trials, but no effects in larger scale effectiveness trials. Examples of these programmes include Philosophy for Children (Gorard et al., 2018), Switch-on (Reading Recovery) (Gorard et al., 2014), and Accelerated Reader (Gorard et al., 2015). Some may conclude that one of these evaluations must be wrong. It is important to realise that in all of these examples, the contexts and the fidelity of implementation differed. In the Philosophy for Children effectiveness trial, 53% of the schools did not implement the intervention as intended (Lord et al., 2021). This is the nature of effectiveness trials, where the programme is delivered in real-life conditions, whereas in efficacy trials the delivery would be closely monitored and controlled, and with a smaller sample. Similarly, with the Switch-on evaluation, although schools delivered the required number of sessions, they modified the content and the delivery format of the intervention (Patel et al., 2017). There were also important differences between the efficacy and effectiveness trials. 
The efficacy trial was conducted with 1st-year secondary school children, whereas the effectiveness trial was with primary school children. The tests used also differed in the two evaluations. In the efficacy trial, Reading was measured using the GL New Group Reading Test (Gorard et al., 2014), but in the effectiveness trial the test used was the Hodder Group Reading Test (Patel et al., 2017). What these two examples suggest is that variations in the context and target population for the study and variations in the measures and experimental conditions can have an appreciable effect on the result. These examples also highlight the point that adherence to the fundamental principles of the original programme is essential for effective replication. Without this, it is difficult to know whether unsuccessful replication is because the programme does not work, or that it does not work with a certain population or under certain conditions. It is therefore worthwhile replicating these studies while maintaining high fidelity to the intervention and at the same time varying the population and instruments used as suggested by Wiliam (2022). Related to these efforts are questions about the role of science in education and the form it should take. RCTs represent a rigorous method of investigation and are often considered the gold standard in scientific research. 
There are, however, caveats associated with them and ongoing debates about their benefits and limitations (Connolly\",\"PeriodicalId\":47025,\"journal\":{\"name\":\"Educational Research and Evaluation\",\"volume\":\"27 1\",\"pages\":\"1 - 7\"},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2022-02-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Educational Research and Evaluation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1080/13803611.2021.2022307\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"EDUCATION & EDUCATIONAL RESEARCH\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Educational Research and Evaluation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1080/13803611.2021.2022307","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
Abstract:

The last two decades have seen many developments in education research, including the growth of robust studies testing education programmes and policies using experimental designs (Hedges, 2018), such as randomised controlled trials (RCTs). RCTs have been a focus of replication studies in education, with researchers seeking to replicate similar programmes under trial conditions. These replications have produced varying results and have raised questions about why some results have replicated successfully and others have not. Results from recent Education Endowment Foundation effectiveness trials are good examples of this. A number of education programmes have shown beneficial effects on young people’s learning outcomes in efficacy trials, but no effects in larger-scale effectiveness trials. Examples include Philosophy for Children (Gorard et al., 2018), Switch-on (Reading Recovery) (Gorard et al., 2014), and Accelerated Reader (Gorard et al., 2015). Some may conclude that one of these evaluations must be wrong.

It is important to realise that in all of these examples, the contexts and the fidelity of implementation differed. In the Philosophy for Children effectiveness trial, 53% of the schools did not implement the intervention as intended (Lord et al., 2021). This is the nature of effectiveness trials, where the programme is delivered in real-life conditions, whereas in efficacy trials the delivery would be closely monitored and controlled, and with a smaller sample. Similarly, with the Switch-on evaluation, although schools delivered the required number of sessions, they modified the content and the delivery format of the intervention (Patel et al., 2017). There were also important differences between the efficacy and effectiveness trials. The efficacy trial was conducted with first-year secondary school children, whereas the effectiveness trial was with primary school children. The tests used also differed in the two evaluations: in the efficacy trial, reading was measured using the GL New Group Reading Test (Gorard et al., 2014), whereas in the effectiveness trial the test used was the Hodder Group Reading Test (Patel et al., 2017).

What these two examples suggest is that variations in the context and target population of a study, and variations in the measures and experimental conditions, can have an appreciable effect on the result. They also highlight that adherence to the fundamental principles of the original programme is essential for effective replication. Without this, it is difficult to know whether unsuccessful replication is because the programme does not work at all, or because it does not work with a certain population or under certain conditions. It is therefore worthwhile replicating these studies while maintaining high fidelity to the intervention and at the same time varying the population and instruments used, as suggested by Wiliam (2022). Related to these efforts are questions about the role of science in education and the form it should take. RCTs represent a rigorous method of investigation and are often considered the gold standard in scientific research. There are, however, caveats associated with them and ongoing debates about their benefits and limitations (Connolly
Journal introduction:
International, comparative and multidisciplinary in scope, Educational Research and Evaluation (ERE) publishes original, peer-reviewed academic articles dealing with research on issues of worldwide relevance in educational practice. The aim of the journal is to increase understanding of learning in pre-primary, primary, high school, college, university and adult education, and to contribute to the improvement of educational processes and outcomes. The journal seeks to promote cross-national and international comparative educational research by publishing findings relevant to the scholarly community, as well as to practitioners and others interested in education. The scope of the journal is deliberately broad in terms of both topics covered and disciplinary perspective.