Providing Validity Evidence to Improve the Assessment of English Language Learners. CRESST Report 738. August 2008. doi:10.1037/e643102011-001
M. Wolf, J. Herman, Jinok S. Kim, J. Abedi, Seth Leon, Noelle C. Griffin, Patina L. Bachman, Sandy Chang, Tim Farnsworth, Hyekyung Jung, J. Nollner, H. Shin
This research project addresses the validity of assessments used to measure the performance of English language learners (ELLs), such as those mandated by the No Child Left Behind Act of 2001 (NCLB, 2002). The goals of the research are to help educators understand and improve ELL performance by investigating the validity of their current assessments, and to provide states with much-needed guidance for improving the validity of their English language proficiency (ELP) and academic achievement assessments for ELL students. The research has three phases. In the first phase, the researchers analyze existing data and documents to understand the nature and validity of states' current practices and their priority needs. This first phase is exploratory in that the researchers identify key validity issues by examining the existing data and formulate research areas where further investigation is needed in the second phase. In the second phase, the researchers will deepen their analysis of the areas identified from Phase I findings. In the third phase, the researchers will develop specific guidelines on which states may base their ELL assessment policy and practice. The present report focuses on the researchers' Phase I research activities and results. The report also discusses preliminary implications and recommendations for improving ELL assessment systems.
1 We would like to thank Lyle Bachman, Alison Bailey, Frances Butler, Diane August, and Guillermo Solano-Flores for their valuable comments on earlier drafts of this report. We are also very grateful to our three participating states for their willingness to share their data and for their support of our work.
{"title":"Providing Validity Evidence to Improve the Assessment of English Language Learners. CRESST Report 738.","authors":"M. Wolf, J. Herman, Jinok S. Kim, J. Abedi, Seth Leon, Noelle C. Griffin, Patina L. Bachman, Sandy Chang, Tim Farnsworth, Hyekyung Jung, J. Nollner, H. Shin","doi":"10.1037/e643102011-001","DOIUrl":"https://doi.org/10.1037/e643102011-001","url":null,"abstract":"This research project addresses the validity of assessments used to measure the performance of English language learners (ELLs), such as those mandated by the No Child Left Behind Act of 2001 (NCLB, 2002). The goals of the research are to help educators understand and improve ELL performance by investigating the validity of their current assessments, and to provide states with much needed guidance to improve the validity of their English language proficiency (ELP) and academic achievement assessments for ELL students. The research has three phases. In the first phase, the researchers analyze existing data and documents to understand the nature and validity of states’ current practices and their priority needs. This first phase is exploratory in that the researchers identify key validity issues by examining the existing data and formulate research areas where further investigation is needed for the second phase. In the second phase of the research, the researchers will deepen their analysis of the areas identified from Phase I findings. In the third phase of the research, the researchers will develop specific guidelines on which states may base their ELL assessment policy and practice. The present report focuses on the researchers’ Phase I research activities and results. The report also discusses preliminary implications and recommendations for improving ELL assessment systems. 1 We would like to thank Lyle Bachman, Alison Bailey, Frances Butler, Diane August, and Guillermo SolanoFlores for their valuable comments on earlier drafts of this report. We are also very grateful to our three participating states for their willingness to share their data and support of our work.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"49 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2008-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75331741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recommendations for Assessing English Language Learners: English Language Proficiency Measures and Accommodation Uses. Recommendations Report (Part 3 of 3). CRESST Report 737. July 2008. doi:10.1037/e643112011-001
M. Wolf, J. Herman, Lyle F. Bachman, A. Bailey, Noelle C. Griffin
The No Child Left Behind Act of 2001 (NCLB, 2002) has had a great impact on states' policies in assessing English language learner (ELL) students. The legislation requires states to develop or adopt sound assessments in order to validly measure ELL students' English language proficiency, as well as their content knowledge and skills. While states have moved rapidly to meet these requirements, they face challenges in validating their current assessment and accountability systems for ELL students, partly due to a lack of resources. Considering the significant role of assessment in guiding decisions about organizations and individuals, validity is a paramount concern. In light of this, we reviewed the current literature and policy regarding ELL assessment in order to inform practitioners of the key issues to consider in their validation process. Drawing on our review of literature and practice, we developed a set of guidelines and recommendations for practitioners to use as a resource to improve their ELL assessment systems. The present report is the last component of the series, providing recommendations for state policy and practice in assessing ELL students. It also discusses areas for future research and development.
Introduction and Background
English language learners (ELLs) are the fastest growing subgroup in the nation. Over the 10-year period between the 1994–1995 and 2004–2005 school years, ELL enrollment grew by over 60%, while total K–12 enrollment grew by just over 2% (Office of English Language Acquisition [OELA], n.d.). The growth rate is even more striking in some states. For instance, North Carolina and Nevada reported ELL population growth rates of 500% and 200%, respectively, over the same 10-year period (Batalova, Fix, & Murray, 2005, as cited in Short & Fitzsimmons, 2007). Not only is the ELL population growing in size, but it is also becoming more diverse. Over 400 different languages are reported among these students, and schooling experience varies depending on the students'
1 We would like to thank the following for their valuable comments and suggestions on earlier drafts of this report: Jamal Abedi, Diane August, Margaret Malone, Robert J. Mislevy, Charlene Rivera, Lourdes Rovira, Robert Rueda, Guillermo Solano-Flores, and Lynn Shafer Willner. Our sincere thanks also go to Jenny Kao, Patina L. Bachman, and Sandy M. Chang for their useful suggestions and invaluable research assistance.
{"title":"Recommendations for Assessing English Language Learners: English Language Proficiency Measures and Accommodation Uses. Recommendations Report (Part 3 of 3). CRESST Report 737.","authors":"M. Wolf, J. Herman, Lyle F. Bachman, A. Bailey, Noelle C. Griffin","doi":"10.1037/e643112011-001","DOIUrl":"https://doi.org/10.1037/e643112011-001","url":null,"abstract":"The No Child Left Behind Act of 2001 (NCLB, 2002) has had a great impact on states’ policies in assessing English language learner (ELL) students. The legislation requires states to develop or adopt sound assessments in order to validly measure the ELL students’ English language proficiency, as well as content knowledge and skills. While states have moved rapidly to meet these requirements, they face challenges to validate their current assessment and accountability systems for ELL students, partly due to the lack of resources. Considering the significant role of assessment in guiding decisions about organizations and individuals, validity is a paramount concern. In light of this, we reviewed the current literature and policy regarding ELL assessment in order to inform practitioners of the key issues to consider in their validation process. Drawn from our review of literature and practice, we developed a set of guidelines and recommendations for practitioners to use as a resource to improve their ELL assessment systems. The present report is the last component of the series, providing recommendations for state policy and practice in assessing ELL students. It also discusses areas for future research and development. Introduction and Background English language learners (ELLs) are the fastest growing subgroup in the nation. Over a 10-year period between the 1994–1995 and 2004–2005 school years, the enrollment of ELL students grew over 60%, while the total K–12 growth was just over 2% (Office of English Language Acquisition [OELA], n.d.). The increased rate is more astounding for some states. For instance, North Carolina and Nevada have reported their ELL population growth rate as 500% and 200% respectively for the past 10-year period (Batlova, Fix, & Murray, 2005, as cited in Short & Fitzsimmons, 2007). Not only is the size of the ELL population is growing, but the diversity of these students is becoming more extensive. Over 400 different languages are reported among these students; schooling experience is varied depending on the students’ 1 We would like to thank the following for their valuable comments and suggestions on earlier drafts of this report: Jamal Abedi, Diane August, Margaret Malone, Robert J. Mislevy, Charlene Rivera, Lourdes Rovira, Robert Rueda, Guillermo Solano-Flores, and Lynn Shafer Willner. Our sincere thanks also go to Jenny Kao, Patina L. Bachman, and Sandy M. Chang for their useful suggestions and invaluable research assistance.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2008-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83351697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Templates and Objects in Authoring Problem-Solving Assessments. CRESST Report 735. May 2008. doi:10.4324/9781315096773-16
Terry P. Vendlinski, E. Baker, D. Niemi
{"title":"Templates and Objects in Authoring Problem-Solving Assessments. CRESST Report 735.","authors":"Terry P. Vendlinski, E. Baker, D. Niemi","doi":"10.4324/9781315096773-16","DOIUrl":"https://doi.org/10.4324/9781315096773-16","url":null,"abstract":"","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2008-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81988123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Issues in Assessing English Language Learners: English Language Proficiency Measures and Accommodation Uses. Literature Review (Part 1 of 3). CRESST Report 731. January 2008. doi:10.1037/e643592011-001
M. Wolf, Jenny C. Kao, J. Herman, Lyle F. Bachman, A. Bailey, Patina L. Bachman, Tim Farnsworth, Sandy Chang
The No Child Left Behind (NCLB) Act has had a great impact on states' policies in assessing English language learner (ELL) students. The legislation requires states to develop or adopt sound assessments in order to validly measure ELL students' English language proficiency (ELP), as well as their content knowledge and skills. Although states have moved rapidly to meet these requirements, they face challenges in validating their current assessment and accountability systems for ELL students, partly due to a lack of resources. Considering the significant role of assessments in guiding decisions about organizations and individuals, it is of paramount importance to establish a valid assessment system. In light of this, we reviewed the current literature and policy regarding ELL assessment in order to inform practitioners of the key issues to consider in their validation processes. Drawing on our review of literature and practice, we developed a set of guidelines and recommendations for practitioners to use as a resource to improve their ELL assessment systems. We have compiled a series of three reports. The present report is the first component of the series, containing pertinent literature related to assessing ELL students. The areas reviewed include validity theory, the construct of ELP assessments, and the effects of accommodations in the assessment of ELL students' content knowledge.
{"title":"Issues in Assessing English Language Learners: English Language Proficiency Measures and Accommodation Uses. Literature Review (Part 1 of 3). CRESST Report 731.","authors":"M. Wolf, Jenny C. Kao, J. Herman, Lyle F. Bachman, A. Bailey, Patina L. Bachman, Tim Farnsworth, Sandy Chang","doi":"10.1037/e643592011-001","DOIUrl":"https://doi.org/10.1037/e643592011-001","url":null,"abstract":"The No Child Left Behind (NCLB) Act has made a great impact on states’ policies in assessing English language learner (ELL) students. The legislation requires states to develop or adopt sound assessments in order to validly measure the ELL students’ English language proficiency (ELP), as well as content knowledge and skills. Although states have moved rapidly to meet these requirements, they face challenges to validate their current assessment and accountability systems for ELL students, partly due to the lack of resources. Considering the significant role of assessments in guiding decisions about organizations and individuals, it is of paramount importance to establish a valid assessment system. In light of this, we reviewed the current literature and policy regarding ELL assessment in order to inform practitioners of the key issues to consider in their validation processes. Drawn from our review of literature and practice, we developed a set of guidelines and recommendations for practitioners to use as a resource to improve their ELL assessment systems. We have compiled a series of three reports. The present report is the first component of the series, containing pertinent literature related to assessing ELL students. The areas being reviewed include validity theory, the construct of ELP assessments, and the effects of accommodations in the assessment of ELL students’ content knowledge.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"29 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2008-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80074697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Creating Accurate Science Benchmark Assessments to Inform Instruction. CSE Technical Report 730. October 2007. doi:10.1037/e643602011-001
Terry P. Vendlinski, Sam O. Nagashima, J. Herman
Current educational policy highlights the important role that assessment can play in improving education. State standards and the assessments that are aligned with them establish targets for learning and promote school accountability for helping all students succeed; at the same time, feedback from assessment results is expected to provide districts, schools, and teachers with important information for guiding instructional planning and decision making. Yet even as No Child Left Behind (NCLB) and its requirements for adequate yearly progress put unprecedented emphasis on state tests, educators have discovered that annual state tests are too little and too late to guide teaching and learning. Recognizing the need for more frequent assessments to support student learning, many districts and schools have turned to benchmark testing—periodic assessments through which districts can monitor students' progress, and schools and teachers can refine curriculum and teaching—to help students succeed. We report in this document a collaborative effort of teachers, district administrators, professional developers, and assessment researchers to develop benchmark assessments for elementary school science. In the sections which follow we provide the rationale for our work and its research question, describe our collaborative assessment development process and its results, and present conclusions.
{"title":"Creating Accurate Science Benchmark Assessments to Inform Instruction. CSE Technical Report 730.","authors":"Terry P. Vendlinski, Sam O. Nagashima, J. Herman","doi":"10.1037/e643602011-001","DOIUrl":"https://doi.org/10.1037/e643602011-001","url":null,"abstract":"Current educational policy highlights the important role that assessment can play in improving education. State standards and the assessments that are aligned with them establish targets for learning and promote school accountability for helping all students succeed; at the same time, feedback from assessment results is expected to provide districts, schools, and teachers with important information for guiding instructional planning and decision making. Yet even as No Child Left Behind (NCLB) and its requirements for adequate yearly progress put unprecedented emphasis on state tests, educators have discovered that annual state tests are too little and too late to guide teaching and learning. Recognizing the need for more frequent assessments to support student learning, many districts and schools have turned to benchmark testing—periodic assessments through which districts can monitor students’ progress, and schools and teachers can refine curriculum and teaching—to help students succeed. We report in this document a collaborative effort of teachers, district administrators, professional developers, and assessment researchers to develop benchmark assessments for elementary school science. In the sections which follow we provide the rationale for our work and its research question, describe our collaborative assessment development process and its results, and present conclusions.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"107 9‐12","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91418813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eliciting Student Thinking in Elementary School Mathematics Classrooms. CRESST Report 725. August 2007. doi:10.1037/e643702011-001
M. Franke, N. Webb, Angela G. Chan, Dan Battey, Marsha Ing, Deanna Freund, Tondra De
{"title":"Eliciting Student Thinking in Elementary School Mathematics Classrooms. CRESST Report 725.","authors":"M. Franke, N. Webb, Angela G. Chan, Dan Battey, Marsha Ing, Deanna Freund, Tondra De","doi":"10.1037/e643702011-001","DOIUrl":"https://doi.org/10.1037/e643702011-001","url":null,"abstract":"","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84694907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Examining the Generalizability of Direct Writing Assessment Tasks. CSE Technical Report 718. June 2007. doi:10.1037/e643812011-001
Eva Chen, D. Niemi, Jia Wang, Haiwen Wang, J. Mirocha
This study investigated the generalizability of performance across a small number of high-quality assessment tasks and the validity of measuring student writing ability with a limited number of essay tasks. More specifically, the research team explored how well writing prompts could measure students' general writing ability and whether performance on one writing task could be generalized to other, similar writing tasks. A total of four writing prompts were used in the study, three of them literature-based and one based on a short story. A total of 397 students participated in the study, and each student was randomly assigned to complete two of the four tasks. The research team found that three to five essays were required to make a reliable judgment of student writing performance.
Examining the Generalizability of Direct Writing Assessment Tasks
Performance assessment can serve to measure important and complex learning outcomes (Resnick & Resnick, 1989), provide a more direct measurement of student ability (Frederiksen, 1984; Glaser, 1991; Guthrie, 1984), and help guide improvement in instructional practices (Baron, 1991; Bennett, 1993). Of the various types of performance assessment, direct tests of writing ability have gained the widest acceptance in state and national assessment programs (Afflerbach, 1985; Applebee, Langer, Jenkins, Mullis, & Foertsch, 1990; Applebee, Langer, & Mullis, 1995). Advocates of direct writing assessment point out that students need more exposure to writing in the form of instruction and more frequent examinations (Breland, 1983). However, there are problems associated with using essays to measure students' writing abilities, such as the objectivity of ratings and the generalizability of scores across raters and tasks (Crehan, 1997).
Previous generalizability studies of direct writing assessment
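The "three to five essays" finding can be illustrated with a standard generalizability-theory decision study. The sketch below assumes a fully crossed persons-by-tasks (p x t) design with illustrative variance components; the symbols and numbers are assumptions for exposition, not values reported in the study. The generalizability coefficient for a score averaged over $n_t$ essay tasks is

$$E\rho^{2}(n_t) \;=\; \frac{\sigma^{2}_{p}}{\sigma^{2}_{p} + \sigma^{2}_{pt,e}/n_t},$$

where $\sigma^{2}_{p}$ is universe-score (person) variance and $\sigma^{2}_{pt,e}$ is the person-by-task interaction confounded with residual error. With hypothetical components $\sigma^{2}_{p} = 0.50$ and $\sigma^{2}_{pt,e} = 0.50$, one essay yields $E\rho^{2} = 0.50$, three essays yield $0.50/(0.50 + 0.167) \approx 0.75$, and five essays yield $0.50/(0.50 + 0.10) \approx 0.83$, crossing a conventional 0.80 threshold somewhere between three and five tasks. This is the kind of decision-study calculation that underlies recommendations like the one reported above.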
{"title":"Examining the Generalizability of Direct Writing Assessment Tasks. CSE Technical Report 718.","authors":"Eva Chen, D. Niemi, Jia Wang, Haiwen Wang, J. Mirocha","doi":"10.1037/e643812011-001","DOIUrl":"https://doi.org/10.1037/e643812011-001","url":null,"abstract":"This study investigated the level of generalizability across a few high quality assessment tasks and the validity of measuring student writing ability using a limited number of essay tasks. More specifically, the research team explored how well writing prompts could measure student general writing ability and if student performance from one writing task could be generalized to other similar writing tasks. A total of four writing prompts were used in the study, with three tasks being literature-based and one task based on a short story. A total of 397 students participated in the study and each student was randomly assigned to complete two of the four tasks. The research team found that three to five essays were required to evaluate and make a reliable judgment of student writing performance. Examining the Generalizability of Direct Writing Assessment Tasks Performance assessment can serve to measure important and complex learning outcomes (Resnick & Resnick, 1989), provide a more direct measurement of student ability (Frederiksen, 1984; Glaser, 1991; Guthrie, 1984), and help guide improvement in instructional practices (Baron, 1991; Bennett, 1993). Of the various types of performance assessment, direct tests of writing ability have experienced the most acceptance in state and national assessment programs (Afflebach, 1985; Applebee, Langer, Jenkins, Mullins & Foertsch, 1990; Applebee, Langer, & Mullis, 1995). Advocates of direct writing assessment point out that students need more exposure to writing in the form of instruction and more frequent examinations (Breland, 1983). However, there are problems associated with using essays to measure students’ writing abilities, like objectivity of ratings, generalizability of scores across raters and tasks (Crehan, 1997). Previous generalizability studies of direct writing assessment","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"24 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85164680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Developing Academic English Language Proficiency Prototypes for 5th Grade Reading: Psychometric and Linguistic Profiles of Tasks. An Extended Executive Summary. CSE Report 720. June 2007. doi:10.1037/e643792011-001
A. Bailey, Becky H. Huang, H. Shin, Tim Farnsworth, Frances A. Butler
{"title":"Developing Academic English Language Proficiency Prototypes for 5th Grade Reading: Psychometric and Linguistic Profiles of Tasks. An Extended Executive Summary. CSE Report 720.","authors":"A. Bailey, Becky H. Huang, H. Shin, Tim Farnsworth, Frances A. Butler","doi":"10.1037/e643792011-001","DOIUrl":"https://doi.org/10.1037/e643792011-001","url":null,"abstract":"","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"10 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84338137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
School Improvement under Test-Driven Accountability: A Comparison of High- and Low-Performing Middle Schools in California. CSE Report 717. May 2007. doi:10.1037/e643832011-001
H. Mintrop, Tina Trujillo
Based on in-depth data from nine demographically similar schools, the study asks five questions that address key aspects of the improvement process and speak to the consequential validity of accountability indicators: Do schools that differ widely according to system performance criteria also differ in the quality of the educational experience they provide to students? Are schools that have posted high growth on the state's performance index more effective organizationally? Do high-performing schools respond more productively to the messages of their state accountability system? Do high- and low-performing schools exhibit different approaches to organizational learning and teacher professionalism? Is district instructional management in an aligned state accountability system related to performance? We report our findings in three results papers (Mintrop & Trujillo, 2007a, 2007b; Trujillo & Mintrop, 2007) and this technical report. In a nutshell, the results papers show that, across the nine case-study schools, the one positive performance outlier did indeed differ in quality of teaching, organizational effectiveness, response to accountability, and patterns of organizational learning. Across the other eight schools, however, the patterns blurred. We conclude that, save for performance differences at the extreme positive and negative margins, relationships between system-designated performance levels and improvement processes on the ground are uncertain and far from solid. The papers try to elucidate why this may be so. This final technical report summarizes the major components of the study design and methodology, including case selection, instrumentation, data collection, and data analysis techniques. We describe the context of the study as well as descriptive data on our cases and procedures. School improvement is an intricate business. Whether a school succeeds in improving depends on a host of factors, both internal and external to the organization. The motivation and capacity of the workforce, the
1 The three results papers are entitled Accountability Urgency, Organizational Learning, and Educational Outcomes: A Comparative Analysis of California Middle Schools; The Practical Relevance of Accountability Systems for School Improvement: A Descriptive Analysis of California Schools; and Centralized Instructional Management: District Control, Organizational Culture, and School Performance.
{"title":"School Improvement under Test-Driven Accountability: A Comparison of High- and Low-Performing Middle Schools in California. CSE Report 717.","authors":"H. Mintrop, Tina Trujillo","doi":"10.1037/e643832011-001","DOIUrl":"https://doi.org/10.1037/e643832011-001","url":null,"abstract":"Based on in-depth data from nine demographically similar schools, the study asks five questions in regard to key aspects of the improvement process and that speak to the consequential validity of accountability indicators: Do schools that differ widely according to system performance criteria also differ on the quality of the educational experience they provide to students? Are schools that have posted high growth on the state’s performance index more effective organizationally? Do high-performing schools respond more productively to the messages of their state accountability system? Do highand low-performing schools exhibit different approaches to organizational learning and teacher professionalism? Is district instructional management in an aligned state accountability system related to performance? We report our findings in three results papers1 (Mintrop & Trujillo, 2007a, 2007b; Trujillo & Mintrop, 2007) and this technical report. The results papers, in a nutshell, show that, across the nine case study schools, one positive performance outlier differed indeed in the quality of teaching, organizational effectiveness, response to accountability, and patterns of organizational learning. Across the other eight schools, however, the patterns blurred. We conclude that, save for performance differences on the extreme positive and negative margins, relationships between system-designated performance levels and improvement processes on the ground are uncertain and far from solid. The papers try to elucidate why this may be so. This final technical report summarizes the major components of the study design and methodology, including case selection, instrumentation, data collection, and data analysis techniques. We describe the context of the study as well as descriptive data on our cases and procedures. School improvement is an intricate business. Whether a school succeeds in improving is dependent on a host of factors. Factors come into play that are internal and external to the organization. The motivation and capacity of the workforce, the 1 The three reports are entitled Accountability Urgency, Organizational Learning, and Educational Outcomes: A Comparative Analysis of California Middle Schools; The Practical Relevance of Accountability Systems for School Improvement: A Descriptive Analysis of California Schools; and Centralized Instructional Management: District Control, Organizational Culture, and School Performance.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"112 2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91024258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Changes in the Black-White Test Score Gap in the Elementary School Grades. CSE Report 715. April 2007. doi:10.1037/e643902011-001
D. Koretz, Y. Kim
In a pair of recent studies, Fryer and Levitt (2004a, 2004b) analyzed the Early Childhood Longitudinal Study – Kindergarten Cohort (ECLS-K) to explore the characteristics of the Black-White test score gap in young children. They found that the gap grew markedly between kindergarten and the third grade and that they could predict the gap from measured characteristics in kindergarten but not in the third grade. In addition, they found that the widening of the gap was differential across areas of knowledge and skill, with Blacks falling behind in all areas other than the most basic. They raised the possibility that Blacks and Whites may not be on "parallel trajectories" and that Blacks, as they go through school, may never master some skills mastered by Whites. This study re-analyzes the ECLS-K data to address this last question. We find that the scores used by Fryer and Levitt (proficiency probability scores, or PPS) do not support the hypothesis of differential growth of the gap. The patterns they found reflect the nonlinear relationships between overall proficiency, θ, and the PPS variables, as well as ceiling effects in the PPS distributions. Moreover, θ is a sufficient statistic for the PPS variables; therefore, the PPS variables merely re-express the overall mean difference between groups and contain no information about qualitative differences in performance between Black and White students at similar levels of θ. We therefore carried out differential item functioning (DIF) analyses of all items in all rounds of the ECLS-K through grade 5 (Round 6), excluding only the fall-of-grade-1 round (which had a very small sample) and subsamples in which there were too few Black students for reasonable analysis. We found no relevant patterns in the distribution of the DIF statistics, or in the characteristics of the items showing DIF, that would support the notion of differential divergence, other than in kindergarten and the first grade, where DIF favoring Blacks tended to appear on items tapping simple skills taught outside of school (e.g., number recognition), while DIF disfavoring Blacks tended to appear on material taught more in school (e.g., arithmetic). However, there were exceptions to this. Moreover, because of its construction and reporting, the ECLS-K data were not ideal for addressing this question.
1 Young-Suk Kim is currently at the Florida Center for Reading Research (FCRR) and the Department of Childhood Education, Reading, and Disability Services, College of Education, Florida State University.
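The central nonlinearity argument can be made concrete with a minimal sketch. Suppose, purely for illustration (the logistic form and the parameters below are assumptions, not the ECLS-K scoring model), that each proficiency probability score is a function of overall proficiency θ with a skill-specific threshold $b_k$:

$$\mathrm{PPS}_k(\theta) \;=\; \frac{1}{1 + e^{-(\theta - b_k)}}.$$

Hold the group difference in θ fixed at 0.5, with group means at θ = 0 and θ = -0.5. For a basic skill with $b_k = -2$, the group PPS values are about 0.88 and 0.82, a gap of roughly 0.06 that is compressed by the ceiling; for an advanced skill with $b_k = +1$, they are about 0.27 and 0.18, a gap of roughly 0.09. The same mean difference in θ thus produces apparently "differential" gaps across skill areas, which is the pattern the reanalysis attributes to the nonlinear PPS metric and to ceiling effects rather than to qualitative differences in what Black and White students have mastered.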
{"title":"Changes in the Black-White Test score Gap in the Elementary School Grades. CSE Report 715.","authors":"D. Koretz, Y. Kim","doi":"10.1037/e643902011-001","DOIUrl":"https://doi.org/10.1037/e643902011-001","url":null,"abstract":"In a pair of recent studies, Fryer and Levitt (2004a, 2004b) analyzed the Early Childhood Longitudinal Study – Kindergarten Cohort (ECLS-K) to explore the characteristics of the Black-White test score gap in young children. They found that the gap grew markedly between kindergarten and the third grade and that they could predict the gap from measured characteristics in kindergarten but not in the third grade. In addition, they found that the widening of the gap was differential across areas of knowledge and skill, with Blacks falling behind in all areas other than the most basic. They raised the possibility that Black and Whites may not be on “parallel trajectories” and that Blacks, as they go through school, may never master some skills mastered by Whites. This study re-analyzes the ECLS-K data to address this last question. We find that the scores used by Fryer and Levitt (proficiency probability scores, or PPS) do not support the hypothesis of differential growth of the gap. The patterns they found reflect the nonlinear relationships between overall proficiency, θ , and the PPS variables, as well as ceiling effects in the PPS distributions. Moreover, θ is a sufficient statistic for the PPS variables, and therefore, PPS variables merely re-express the overall mean difference between groups and contain no information about qualitative differences in performance between Black and White students at similar levels of θ . We therefore carried out differential item functioning (DIF) analyses of all items in all rounds of the ECLS-K through grade 5 (Round 6), excluding only the fall of grade 1 (which was a very small sample) and subsamples in which there were too few Black students for reasonable analysis. We found no relevant patterns in the distribution of the DIF statistics or in the characteristics of the items showing DIF that support the notion of differential divergence, other than in kindergarten and the first grade, where DIF favoring Blacks tended to be on items tapping simple skills taught outside of school (e.g., number recognition), while DIF disfavoring Blacks tended to be on material taught more in school (e.g., arithmetic). However, there were exceptions to this. Moreover, because of its construction and reporting, the ECLS-K data were not ideal for addressing this 1Young-Suk Kim is currently at the Florida Center for Reading Research (FCRR) and Department of Childhood Education, Reading, and Disability Services, College of Education, Florida State University","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"89 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85838700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}