Causal Inference in Multilevel Settings in Which Selection Processes Vary across Schools. CSE Technical Report 708.
Junyeop Kim, Michael H. Seltzer. February 2007. DOI: 10.1037/e644002011-001

In this report we focus on the use of propensity score methodology in multisite studies of the effects of educational programs and practices in which both treatment and control conditions are enacted within each of the schools in a sample and assignment to treatment is not random. A key challenge in applying propensity score methodology in such settings is that the process by which students end up in treatment or control conditions may differ substantially from school to school. To help capture differences in selection processes across schools, and to achieve balance on key covariates between treatment and control students within each school, we propose estimating propensity scores with multilevel logistic regression models in which both intercepts and slopes vary across schools. Through analyses of data from the Early Academic Outreach Program (EAOP), we compare the performance of this approach with other possible strategies for estimating propensity scores (e.g., single-level logistic regression models; multilevel logistic regression models with random intercepts but fixed slopes). Furthermore, we draw attention to how failure to achieve balance within each school can result in misleading inferences about the extent to which a treatment's effect varies across schools, and about factors (e.g., differences in implementation across schools) that might dampen or magnify those effects.
{"title":"Causal Inference in Multilevel Settings in Which Selection Processes Vary across Schools. CSE Technical Report 708.","authors":"Junyeop Kim, Michael H. Seltzer","doi":"10.1037/e644002011-001","DOIUrl":"https://doi.org/10.1037/e644002011-001","url":null,"abstract":"In this report we focus on the use of propensity score methodology in multisite studies of the effects of educational programs and practices in which both treatment and control conditions are enacted within each of the schools in a sample, and the assignment to treatment is not random. A key challenge in applying propensity score methodology in such settings is that the process by which students wind up in treatment or control conditions may differ substantially from school to school. To help capture differences in selection processes across schools, and achieve balance on key covariates between treatment and control students in each school, we propose the use of multilevel logistic regression models for propensity score estimation in which intercepts and slopes are treated as varying across schools. Through analyses of the data from the Early Academic Outreach Program (EAOP), we compare the performance of this approach with other possible strategies for estimating propensity scores (e.g., single-level logistic regression models; multilevel logistic regression models with intercepts treated as random and slopes treated as fixed). Furthermore, we draw attention to how the failure to achieve balance within each school can result in misleading inferences concerning the extent to which the effect of a treatment varies across schools, and concerning factors (e.g., differences in implementation across schools) that might dampen or magnify the effects of a treatment.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"80 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84230394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Afterschool Hours: Examining the Relationship between Afterschool Staff-Based Social Capital and Student Engagement in LA's BEST. CSE Technical Report 712.
D. Huang, Allison Coordt, Deborah La Torre, Seth Leon, Judy N. Miyoshi, P. A. Pérez, C. Peterson. January 2007. DOI: 10.1037/e643962011-001
The relationship between afterschool staff and students plays an important role in encouraging students to stay in school. The primary goal of this study was to examine the connection between perceptions of staff-student relationships and the educational values, future aspirations, and engagement of students in LA's BEST. To this end, we developed a set of research questions to examine the association between strong staff-student relationships—characterized by mutual trust, bonding, and support—and student variables such as academic engagement and future aspirations. To address these evaluation questions, staff and student surveys were piloted and developed by the National Center for Research on Evaluation, Standards, and Student Testing (CRESST) and widely administered to both afterschool staff and students. Descriptive statistics were computed for the survey data, and HLM analyses and structural equation models were fitted to examine the relationships among the variables. Afterschool programs have become much more than childcare providers for working parents or safe havens within violent communities. They have blossomed into powerful learning centers with lasting and far-reaching effects on students. These programs possess an asset that gives them the ability and opportunity to influence students to develop a belief system that will ultimately shape their academic and social futures: that asset is social capital. Executive summary: Afterschool programs offer an important avenue for enhancing educational opportunities. Federal, state, and local educational authorities increasingly see them as environments in which to improve attitudes toward school, achievement, and academic …
{"title":"The Afterschool Hours: Examining the Relationship between Afterschool Staff-Based Social Capital and Student Engagement in LA's BEST. CSE Technical Report 712.","authors":"D. Huang, Allison Coordt, Deborah La Torre, Seth Leon, Judy N. Miyoshi, P. A. Pérez, C. Peterson","doi":"10.1037/e643962011-001","DOIUrl":"https://doi.org/10.1037/e643962011-001","url":null,"abstract":"The relationship between afterschool staff and students is very important for encouraging and promoting longevity in school. The primary goal of this study was to examine the connection between perceptions of staff-student relationships and the educational values, future aspirations, and engagement of LA’s BEST students. To this end, we developed a set of research questions which would help us examine the association between strong staff-student relationships—characterized by mutual trust, bonding, and support—and student variables such as academic engagement and future aspirations. To address these evaluation questions, staff and student surveys were piloted and developed by the National Center for Research on Evaluation, Standards, and Student Testing (CRESST) and widely administered to both afterschool staff and students. Descriptive statistics were computed for the survey data; HLM analyses and structural equation models were fitted to examine the variables. Afterschool programs have become much more than childcare providers for working parents or safe havens within violent communities. They have blossomed into powerful learning centers for students with lasting and far-reaching effects. These programs possess an asset that gives them the ability and opportunity to influence students to develop a belief system that will ultimately impact their academic and social futures—that asset is social capital. Executive Summary Afterschool programs offer an important avenue for enhancing educational opportunities. Federal, state, and local educational authorities increasingly see them as environments to improve attitudes toward school, achievement, and academic","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2007-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73848635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Consequences and Validity of Performance Assessment for English Language Learners: Conceptualizing & Developing Teachers' Expertise in Academic Language. CSE Technical Report 700.
Zenaida Aguirre‐Muñoz, Jae Eun Parks, A. Benner, A. Amabisca, C. Boscardin. September 2006. DOI: 10.1037/e644092011-001
The purpose of this report is to provide the theoretical rationale for the approach to academic language adopted to meet the research goals of the second phase of this project, and to report results from the pilot training program developed to create the conditions under which varying levels of direct instruction in academic language occur. The challenge was to find an approach to academic language instruction that would serve a dual purpose. The first purpose was to build teachers' understanding of the key components of academic language in order to improve their instructional decision making. The second was to give teachers tools for providing ELLs with direct instruction in academic language, thereby supporting their English language development. After a careful review of the literature, we found that the functional linguistic approach to language development best met these goals. We developed training modules on writing instruction based on the functional linguistic approach, as it has the strongest potential for providing the explicit instruction needed to support ELL students' writing development. Overall, teachers responded positively to the functional linguistic approach and were optimistic about its potential for improving ELL writing development. Responses to the pre- and post-institute surveys revealed that teachers felt better prepared to evaluate student writing from a functional linguistic perspective and to develop instructional plans targeting specific learning needs.
{"title":"Consequences and Validity of Performance Assessment for English Language Learners: Conceptualizing & Developing Teachers' Expertise in Academic Language. CSE Technical Report 700.","authors":"Zenaida Aguirre‐Muñoz, Jae Eun Parks, A. Benner, A. Amabisca, C. Boscardin","doi":"10.1037/e644092011-001","DOIUrl":"https://doi.org/10.1037/e644092011-001","url":null,"abstract":"The purpose of this report is to provide the theoretical rationale for the approach to academic language that was adopted to meet the research goals of the second phase of this project as well as to report on the results from the pilot training program that was developed to create the conditions under which varying levels of direct instruction in academic language occurs. The challenge was to find an approach for the instruction of academic language that would serve a dual purpose. The first purpose was aimed at building teachers’ understanding of the key components of academic language to improve their instructional decision-making. The second goal was to provide teachers with tools for providing ELLs with direct instruction on academic language and thereby support their English language development. After careful review of the literature, we found that the functional linguistic approach to language development best met these goals. We developed training modules on writing instruction based on the functional linguistic approach, as it has the strongest potential in providing explicit instruction to support ELL student writing development. Overall, teachers responded positively to the functional linguistic approach and were optimistic about its potential for improving ELL writing development. Responses to the pre-and post institute survey revealed that teachers felt better prepared in evaluating student writing from a functional linguistic perspective as well as in developing instructional plans that targeted specific learning needs.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2006-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73667316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Celebrating 20 Years of Research on Educational Assessment: Proceedings of the 2005 CRESST Conference. CSE Technical Report 698.","authors":"A. Lewis","doi":"10.1037/e644132011-001","DOIUrl":"https://doi.org/10.1037/e644132011-001","url":null,"abstract":"","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"52 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2006-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80290437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Assessment of Domain Knowledge with Online Knowledge Mapping. CSE Technical Report 692.
Gregory K. W. K. Chung, E. Baker, David G. Brill, R. Sinha, F. Saadat, W. L. Bewley. July 2006. DOI: 10.1037/e644222011-001
A critical first step in developing training systems is gathering quality information about a trainee's competency in a skill or knowledge domain. Such information includes an estimate of what the trainee knows prior to training, how much has been learned from training, how well the trainee may perform in future task situations, and whether to recommend remediation to bolster the trainee's knowledge. This paper describes the design, development, testing, and application of a Web-based tool for assessing a trainee's understanding of a content domain in a distributed learning environment. The tool, the CRESST Human Performance Knowledge Mapping Tool (HPKMT), enables trainees to express their understanding of a content area by creating graphical network representations in which links define the relationships between concepts. Knowledge mappers have been used for several years, almost always as an aid for organizing information in support of problem solving or in instructional applications. To use knowledge maps as assessments, there must be a reliable scoring method and evidence for the validity of the scores it produces; to be practical in a distributed learning environment, the scoring should also be automated. The HPKMT provides automated, reliable, and valid scoring, and its functionality and scoring method are built on a base of empirical research. We review and evaluate alternative knowledge map scoring methods and online mapping systems, then describe the overall design approach, functionality, scoring method, usability testing, and authoring capabilities of the CRESST HPKMT. The paper ends with descriptions of applications of the HPKMT to military training, limitations of the system, and next steps.
{"title":"Automated Assessment of Domain Knowledge with Online Knowledge Mapping. CSE Technical Report 692.","authors":"Gregory K. W. K. Chung, E. Baker, David G. Brill, R. Sinha, F. Saadat, W. L. Bewley","doi":"10.1037/e644222011-001","DOIUrl":"https://doi.org/10.1037/e644222011-001","url":null,"abstract":"A critical first step in developing training systems is gathering quality information about a trainee’s competency in a skill or knowledge domain. Such information includes an estimate of what the trainee knows prior to training, how much has been learned from training, how well the trainee may perform in future task situations, and whether to recommend remediation to bolster the trainee’s knowledge. This paper describes the design, development, testing, and application of a Web-based tool designed to assess a trainee’s understanding of a content domain in a distributed learning environment. The tool, called the CRESST Human Performance Knowledge Mapping Tool (HPKMT), enables trainees to express their understanding of a content area by creating graphical, network representations of concepts and links that define the relationships of concepts. Knowledge mappers have been used for several years, almost always as an aid for organizing information in support of problem solving or in instructional applications. To use knowledge maps as assessments there must be a reliable scoring method and there must be evidence for the validity of scores produced by the method. Further, to be practical in a distributed learning environment, the scoring should be automated. The HPKMT provides automated, reliable, and valid scoring, and its functionality and scoring method have been built from a base of empirical research. We review and evaluate alternative knowledge mapping scoring methods and online mapping systems. We then describe the overall design approach, functionality, scoring method, usability testing, and authoring capabilities of the CRESST HPKMT. The paper ends with descriptions of applications of the HPKMT to military training, limitations of the system, and next steps. A critical first step in developing learner-centric systems is gathering quality information about an individual’s competency in a skill or knowledge domain. Such information includes, for example, an estimate of what trainees know prior to training, how much they have learned from training, how well they may perform in a future target situation, or whether to recommend remediation content to bolster the trainees’ knowledge.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2006-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73096529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Language-Minority Students' Cognitive School Readiness and Success in Elementary School. CSE Technical Report 683.
L. T. Rutherford. June 2006. DOI: 10.1037/e644682011-001

A significant amount of research treats students who speak a language other than English at home, or language-minority students, as a single demographic group and compares them to students who speak only English at home. If important disparities in early school experiences among language-minority students have been overlooked, then policies aimed at helping them as they begin formal schooling may fall short, because they will not attend to the needs of specific subpopulations. This paper uses data from the Early Childhood Longitudinal Study-Kindergarten Cohort (ECLS-K) to address this gap in the literature by exploring language-minority students' experiences with grade retention and special education placement, specifically examining variation among language-minority students by race, immigrant status, and socioeconomic status. Findings indicate that language-minority students are no more likely to be retained than their English-only counterparts, and are less likely than their English-only counterparts to be placed in special education. Furthermore, there was no variation among language-minority students by race or immigrant status. These findings and their implications for language-minority students are explored in the conclusion.
{"title":"Language-Minority Students' Cognitive School Readiness and Success in Elementary School. CSE Technical Report 683.","authors":"L. T. Rutherford","doi":"10.1037/e644682011-001","DOIUrl":"https://doi.org/10.1037/e644682011-001","url":null,"abstract":"A significant amount of research treats students who speak a language other than English at home, or language-minority students, as a single demographic group and compares them to students who speak only English at home. If important disparities in early school experiences among language-minority students have been overlooked, then policies aimed at helping them as they begin formal schooling may fall short, as they will not attend to the needs of specific subpopulations. This paper uses data from the Early Childhood Longitudinal Study-Kindergarten Cohort (ECLS-K) to address this gap in the literature by exploring language-minority students’ experiences with grade retention and special education placement and specifically examining variation among language-minority students based on race, immigrant status and socioeconomic status. Findings indicate that language-minority students are no more likely to be retained than their English-only counterparts, while they are less likely than their English-only counterparts to be placed in special education. Furthermore, there was no variation among language-minority students by race or immigrant status. These findings and their implications for language-minority students are explored in the conclusion.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2006-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78870729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Role of Task-Specific Adapted Feedback on a Computer-Based Collaborative Problem-Solving Task. CSE Report 684.
S. Chuang, H. O'Neil. June 2006. DOI: 10.4324/9780203759820-19
Collaborative problem solving and collaborative skills are considered necessary for success in today's world. Collaborative problem solving is defined as problem-solving activity that involves interaction among a group of individuals. Large-scale and small-scale assessment programs increasingly use collaborative group tasks in which students work together to solve problems or to accomplish projects. This study investigates the role of feedback in a computer-based collaborative problem-solving task, extending Hsieh and O'Neil's (2002) computer-based collaborative knowledge mapping study. In their study, groups of students searched a Web environment of information to improve a knowledge map, and various types of feedback were investigated; they found that searching had a negative relationship with group outcome (knowledge map scores). By teaching searching and by providing different types of feedback, the present study explores the effects of students' teamwork and problem-solving processes on their knowledge mapping performance. The effects of two types of feedback (adapted knowledge-of-response feedback and task-specific adapted knowledge-of-response feedback) were also investigated. One hundred twenty college students (60 groups) participated in the main study. Students were randomly assigned either to be a group leader, whose responsibility was to construct the map, or to be a group searcher, whose responsibility was to help the leader construct the map by seeking information and accessing feedback from the Web environment. Results showed that task-specific adapted knowledge-of-response feedback was significantly more beneficial to group outcome than adapted knowledge-of-response feedback. In addition, as predicted, the information-seeking processes (requesting feedback, browsing, searching for information, and searching with Boolean operators) were all significantly related to group outcome for both groups.
{"title":"Role of Task-Specific Adapted Feedback on a Computer-Based Collaborative Problem-Solving Task. CSE Report 684.","authors":"S. Chuang, H. O'Neil","doi":"10.4324/9780203759820-19","DOIUrl":"https://doi.org/10.4324/9780203759820-19","url":null,"abstract":"Collaborative problem solving and collaborative skills are considered necessary skills for success in today's world. Collaborative problem solving is defined as problem solving activities that involve interactions among a group of individuals. Large-scale and small-scale assessment programs increasingly use collaborative group tasks in which students work together to solve problems or to accomplish projects. This study attempts to research the role of feedback on a computer-based collaborative problem solving task by extending Hsieh and O’Neil’s (2002) computer-based collaborative knowledge mapping study. In their study, groups of students searched a web environment of information to improve a knowledge map. Various types of feedback were investigated. They found that searching has a negative relationship with group outcome (knowledge map scores). By teaching searching and by providing different types of feedback, this study explores the effects of students’ teamwork and problem solving processes on students’ knowledge mapping performance. Moreover, the effects of two types of feedback (adapted knowledge of response feedback and task-specific adapted knowledge of response feedback) were also investigated. One hundred and twenty college students (60 groups) participated in the main study. The students were randomly assigned either to be a group leader whose responsibility was to construct the map or to be a group searcher whose responsibility was to help the leader construct the map by seeking information and accessing feedback from the Web environment. Results showed that task-specific adapted knowledge of response feedback was significantly more beneficial to group outcome than adapted knowledge of response feedback. In addition, as predicted, for the problem solving process, information seeking including request of feedback, browsing, searching for information and searching using Boolean operators were all significantly related to group outcome for both groups.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"78 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2006-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91243340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Effects of Misbehaving Common Items on Aggregate Scores and an Application of the Mantel-Haenszel Statistic in Test Equating. CSE Report 688.
M. Michaelides. April 2006. DOI: 10.1037/e644592011-001

Consistent behavior is a desirable characteristic that common items are expected to exhibit when administered to different groups. Findings from the literature have established that items do not always behave consistently; item indices and IRT item parameter estimates for the same items differ when obtained from different administrations. Content effects, such as discrepancies in instructional emphasis, and context effects, such as changes in the presentation, format, and positioning of an item, may result in differential item difficulty for different groups. When common items are differentially difficult for two groups, using them to generate an equating transformation is questionable. The delta-plot method is a simple, graphical procedure that identifies such items by examining their classical test theory difficulty values; after inspection, flagged items are typically dropped to noncommon-item status. Two studies are described in this report. Study 1 investigates the influence of common items that behave inconsistently across two administrations on equated score summaries. Study 2 applies an alternative to the delta-plot method, based on the Mantel-Haenszel statistic, for flagging common items that behave differentially across administrations. The first study examines the effects of retaining versus discarding the common items flagged as outliers by the delta-plot method on equated-score summary statistics. For four statewide assessments administered in two consecutive years under the common-item nonequivalent groups design, the equating functions that transform the Year-2 scale to the Year-1 scale are estimated using four different IRT equating methods (Stocking & Lord, Haebara, mean/sigma, mean/mean) under two IRT models, the three- and the one-parameter logistic models for dichotomous items, with Samejima's (1969) graded response model for polytomous items. The changes in Year-2 equated mean scores, mean gains or declines from Year 1 to Year 2, and proportions above a cut-off point are examined when all the common items are used in the equating process versus when the delta-plot outliers are excluded from the common-item pool. Results under the four equating methods …

Author note: The author thanks Edward Haertel for guidance and for reviewing this report; Measured Progress Inc. for providing data; and John Donoghue, Neil Dorans, Kyoko Ito, Michael Jodoin, Michael Nering, David Rogosa, Richard Shavelson, Wendy Yen, Rebecca Zwick, and seminar participants at CTB/McGraw-Hill and ETS for suggestions. Any errors and omissions are the responsibility of the author. Results from the two studies were presented at the 2003 and 2005 Annual Meetings of the American Educational Research Association.
{"title":"Effects of Misbehaving Common Items on Aggregate Scores and an Application of the Mantel-Haenszel Statistic in Test Equating. CSE Report 688.","authors":"M. Michaelides","doi":"10.1037/e644592011-001","DOIUrl":"https://doi.org/10.1037/e644592011-001","url":null,"abstract":"Consistent behavior is a desirable characteristic that common items are expected to have when administered to different groups. Findings from the literature have established that items do not always behave in consistent ways; item indices and IRT item parameter estimates of the same items differ when obtained from different administrations. Content effects, such as discrepancies in instructional emphasis, and context effects, such as changes in the presentation, format, and positioning of the item, may result in differential item difficulty for different groups. When common items are differentially difficult for two groups, using them to generate an equating transformation is questionable. The delta-plot method is a simple, graphical procedure that identifies such items by examining their classical test theory difficulty values. After inspection, such items are likely to drop to a noncommon-item status. Two studies are described in this report. Study 1 investigates the influence of common items that behave inconsistently across two administrations on equated score summaries. Study 2 applies an alternative to the delta-plot method for flagging common items for differential behavior across administrations. The first study examines the effects of retaining versus discarding the common items flagged as outliers by the delta-plot method on equated score summary statistics. For four statewide assessments that were administered in two consecutive years under the common-item nonequivalent groups design, the equating functions that transform the Year2 to the Year-1 scale are estimated using four different IRT equating methods (Stocking & Lord, Haebara, mean/sigma, mean/mean) under two IRT models—the threeand the oneparameter logistic models for the dichotomous items with Samejima’s (1969) graded response model for polytomous items. The changes in the Year-2 equated mean scores, mean gains or declines from Year 1 to Year 2, and proportions above a cut-off point are examined when all the common items are used in the equating process versus when the delta-plot outliers are excluded from the common-item pool. Results under the four equating methods 1 The author would like to thank Edward Haertel for his thoughtful guidance on this project and for reviewing this report. Thanks also to Measured Progress Inc. for providing data for this project, as well as John Donoghue, Neil Dorans, Kyoko Ito, Michael Jodoin, Michael Nering, David Rogosa, Richard Shavelson, Wendy Yen, Rebecca Zwick and seminar participants at CTB/McGraw Hill and ETS for suggestions. Any errors and omissions are the responsibility of the author. 
Results from the two studies in this report were presented at the 2003 and 2005 Annual Meetings of the American Educational Research Association.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"86 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2006-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84946018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
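Since the delta-plot method is central to both studies, a minimal sketch may help. This is our reconstruction from the standard description of the method, not code from the report; the 1.5-delta threshold and the example p-values are illustrative only:

```python
import numpy as np
from scipy.stats import norm

def delta_plot_outliers(p1, p2, threshold=1.5):
    """Flag common items whose difficulty shifts between administrations.

    p1, p2: classical proportions correct for the same items in two
    administrations. Deltas use the ETS convention
    delta = 13 + 4 * Phi^{-1}(1 - p), so harder items get larger deltas.
    Items farther than `threshold` (in delta units) from the principal
    axis of the scatter are flagged; values near 1.5 are conventional,
    but the cut-off is a judgment call.
    """
    d1 = 13.0 + 4.0 * norm.ppf(1.0 - np.asarray(p1, dtype=float))
    d2 = 13.0 + 4.0 * norm.ppf(1.0 - np.asarray(p2, dtype=float))

    # Principal axis: leading eigenvector of the 2x2 covariance matrix.
    pts = np.column_stack([d1, d2])
    center = pts.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(pts, rowvar=False))
    axis = eigvecs[:, np.argmax(eigvals)]  # direction of the fitted line

    # Perpendicular distance of each point from the principal-axis line.
    rel = pts - center
    along = rel @ axis
    perp = rel - np.outer(along, axis)
    dist = np.linalg.norm(perp, axis=1)
    return np.where(dist > threshold)[0], dist

# Hypothetical example: the sixth item is much harder the second year.
p_year1 = [0.85, 0.78, 0.72, 0.66, 0.60, 0.54, 0.47, 0.40]
p_year2 = [0.84, 0.76, 0.71, 0.64, 0.59, 0.15, 0.46, 0.38]
flagged, distances = delta_plot_outliers(p_year1, p_year2)
print(flagged)  # [5]
```

Because a severe outlier tilts the fitted axis toward itself, operational implementations often refit the line after removing flagged items and iterate until no new items are flagged.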
Exploring Models of School Performance: From Theory to Practice. CSE Report 673.
Kilchan Choi, P. Goldschmidt, Kyo Yamashiro. March 2006. DOI: 10.1037/e644902011-001

Our purpose in this report is to present and discuss competing accountability approaches, or models, designed to systematically indicate how a school's students are performing academically. Within the framework of the current federally mandated accountability legislation, increased interest in models measuring school performance has led educational policymakers to consider several key issues: whether results from different accountability models yield different inferences about a school's performance; what assumptions underlie each of the models; how different models are implemented; and ultimately which model is best suited for a particular context. We address these issues by building a framework for accountability models and then explicitly comparing and contrasting the competing models. To accomplish this, we first examine two distinct pieces of the larger puzzle. With the first piece, we briefly summarize previous research on school performance, in order to ground all of the accountability models and provide some reference for considering how an accountability model might be constructed. With the second piece, we present building blocks for accountability models: (a) important properties of assessments, (b) test metrics, (c) ways of summarizing student achievement, and (d) monitoring achievement growth over time, all of which need to be considered before they are incorporated into an accountability model. Once the foundation and building blocks are in place, we examine the continuum of accountability models, each of which results in a performance indicator. We consider the choice of model as lying on a continuum because accountability models range from simple calculations on one end to complex statistical models on the other. At the upper end of the spectrum is a set of accountability models known as value-added models (VAMs), which we compare separately. We also compare inferences based on one of these VAMs against inferences based on current federally mandated accountability models. Examining competing accountability models and linking them back to the foundations and building blocks leads to both theoretical and practical implications that are central in considering which model is most appropriate for a given (physical and political) context. One fundamental concern is whether the accountability model can accurately capture the academic progress of underprivileged students (e.g., those of low socioeconomic status [SES]) and, by extension, …
{"title":"Exploring Models of School Performance: From Theory to Practice. CSE Report 673.","authors":"Kilchan Choi, P. Goldschmidt, Kyo Yamashiro","doi":"10.1037/e644902011-001","DOIUrl":"https://doi.org/10.1037/e644902011-001","url":null,"abstract":"The findings and opinions expressed in this report are those of the authors and do not necessarily reflect the positions or policies of the Our purpose in this report is to present and discuss competing accountability approaches, or models, designed to systematically indicate how a school's students are performing academically. Within the framework of the current federally mandated accountability legislation, increased interest in models measuring school performance has caused educational policymakers to consider several key issues. These issues include whether results from different accountability models yield different inferences about a school's performance; what assumptions underlie each of the models; how different models are implemented; and ultimately which model is best suited for a particular context. We address these issues by building a framework for accountability models and then explicitly comparing and contrasting these competing models. In order to accomplish this, we first need to examine two distinct pieces of the larger puzzle. With the first piece, we briefly summarize previous research on school performance. This is done in order to ground all of the accountability models and provide some reference for considering how an accountability model might be constructed. With the second piece, we present building blocks for accountability models. These building blocks include a) important properties of assessments, b) test metrics, c) ways of summarizing student achievement, and d) monitoring achievement growth over time; all of which need to be considered before they are incorporated into an accountability model. Once we have the foundation and building blocks in place we can examine the continuum of accountability models, each of which results in a performance indicator. We consider the choice of model as lying on a continuum because accountability models range from simple calculations on the one end to complex statistical models on the other. At the upper end of the spectrum is a set of accountability models known as value-added models (VAM), which we compare separately. We also compare inferences based on one of these VAMs against inferences based on current federally mandated accountability models. 1 Examining competing accountability models and linking them back to the foundations and building blocks leads to both theoretical and practical implications that are central in considering which model is most appropriate for a given (physical and political) context. 
One fundamental concern is whether the accountability model can accurately capture the academic progress of underprivileged students (e.g., low socioeconomic status [SES]) and, by extension, …","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"25 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2006-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82601701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
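To illustrate the continuum the report describes (the formulas below are generic textbook forms in our notation, not the report's own), a status indicator is a simple aggregate, an improvement indicator differences successive cohorts, and a VAM-style indicator comes from a longitudinal growth model:

```latex
% Status: mean achievement of school s in year t (a simple calculation).
\[ \text{Status}_{st} = \frac{1}{n_{st}} \sum_{i=1}^{n_{st}} y_{ist} \]

% Improvement: change in that aggregate across successive cohorts.
\[ \text{Improvement}_{st} = \text{Status}_{st} - \text{Status}_{s,t-1} \]

% A value-added-style growth model: student i's score at occasion t in
% school s depends on an individual intercept and growth rate; the school
% effects u_{0s} and u_{1s} are one notion of the school's "value added."
\[ y_{ist} = \pi_{0is} + \pi_{1is}\,a_{ist} + e_{ist}, \qquad
   \pi_{0is} = \beta_{00} + u_{0s} + r_{0is}, \qquad
   \pi_{1is} = \beta_{10} + u_{1s} + r_{1is} \]
```

Moving along the continuum trades transparency for the ability to separate school contributions from student background, which is exactly the tension the report's comparison of models is meant to expose.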
Preschool Participation and the Cognitive and Social Development of Language-Minority Students. CSE Technical Report 674. UC LMRI Technical Report.
R. Rumberger, Loan Tran. February 2006. DOI: 10.1037/e644812011-001

This study examined participation in preschool and its relationship with the cognitive and social development of language-minority students. Although a large body of research demonstrates the cognitive and social benefits of attending preschool (Barnett, 1995; Gorey, 2001; National Research Council, Committee on Early Childhood Pedagogy, 2000; Vandell, 2004), very little of this research has included language-minority students, or at least those who do not speak English. Either non-English-speaking families are not included in the design of the study, as with the widely cited National Institute of Child Health and Human Development (NICHD) Early Child Care Study, or the studies are based on cognitive and social assessments conducted only in English (e.g., Magnuson, Meyers, Ruhm, & Waldfogel, 2004). Consequently, little is known about participation in and outcomes of preschool for the growing population of language-minority students.
{"title":"Preschool Participation and the Cognitive and Social Development of Language-Minority Students. CSE Technical Report 674. UC LMRI Technical Report.","authors":"R. Rumberger, Loan Tran","doi":"10.1037/e644812011-001","DOIUrl":"https://doi.org/10.1037/e644812011-001","url":null,"abstract":"This study examined participation in preschool and its relationship with the cognitive and social development of language minority students. Although there is a large body of research that demonstrates the cognitive and social benefits of attending preschool (Barnett, 1995; Gorey, 2001; National Research Council, Committee on Early Childhood Pedagogy, 2000; Vandell, 2004), very little of this research has included language minority students, or at least those who do not speak English. Either non-English speaking families are not included in the design of the study, such as with the widely cited National Institute for Child Health and Development (NICHD) Early Child Care Study, or the studies are based on cognitive and social assessments that are only conducted in English (e.g., Magnuson, Meyers, Ruhm, & Waldfogel, 2004). Consequently, little is known about participation in and outcomes of preschool for the growing population of language minority students.","PeriodicalId":19116,"journal":{"name":"National Center for Research on Evaluation, Standards, and Student Testing","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2006-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84211894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}