Measuring Students' Ability to Engage in Scientific Inquiry: A New Instrument to Assess Data Analysis, Explanation, and Argumentation
Pub Date: 2020-01-01 | Epub Date: 2020-05-07 | DOI: 10.1080/10627197.2020.1756253
Kavita L Seeratan, Kevin W McElhaney, Jessica Mislevy, Raymond McGhee, Dylan Conger, Mark C Long
We describe the conceptualization, design, development, validation, and testing of a summative instrument that measures high school students' ability to analyze and evaluate data, construct scientific explanations, and formulate scientific arguments in biology and chemistry disciplinary contexts. Data from 1,405 students were analyzed to evaluate the properties of the instrument. Student measurement separation reliability was 0.71, with items showing satisfactory fit to the Partial Credit Model. The use of the Evidence-Centered Design framework during the design and development process provided a strong foundation for the validity argument, and additional validation evidence was also gathered. The strengths of the instrument lie in its relatively brief administration time and a unique approach that integrates science practice and disciplinary knowledge while simultaneously seeking to decouple their measurement. This research models how to design assessments that align with the National Research Council's framework and informs the design of Next Generation Science Standards-aligned assessments.
Educational Assessment, 25(2), 112–135.
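As a rough illustration of the kind of reliability statistic reported above, the sketch below computes a Rasch-style person separation reliability from ability estimates and their standard errors. The sample, the ability estimates, and the standard-error values are placeholders, and this is not the authors' estimation procedure; it is only a minimal Python rendering of the standard formula (estimated true variance divided by observed variance of the ability estimates).

```python
import numpy as np

# Hypothetical Rasch/PCM person ability estimates and their standard errors
# (placeholders; the article's actual estimates are not reproduced here).
theta = np.random.default_rng(0).normal(0.0, 1.0, size=1405)   # person ability estimates
se = np.full_like(theta, 0.55)                                  # conditional standard errors

observed_var = theta.var(ddof=1)       # variance of the ability estimates
error_var = np.mean(se ** 2)           # mean squared standard error
true_var = observed_var - error_var    # estimated "true" variance

separation_reliability = true_var / observed_var
print(f"person separation reliability ~ {separation_reliability:.2f}")
```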
Examining the Use and Construct Fidelity of Technology-Enhanced Items Employed by K-12 Testing Programs
Pub Date: 2019-09-27 | DOI: 10.1080/10627197.2019.1670055
M. Russell, Sebastian Moncaleano
ABSTRACT Over the past decade, large-scale testing programs have employed technology-enhanced items (TEIs) to improve the fidelity with which an item measures a targeted construct. This paper presents findings from a review of released TEIs employed by large-scale testing programs worldwide. Analyses examine the prevalence with which different types of TEIs are employed and the content areas and grade levels in which they appear. The analyses apply the Technology-Enhanced Item Utility Framework to examine the fidelity with which current TEIs represent targeted constructs. The most common type of TEI employed by testing programs is a drag-and-drop response interaction. Approximately 40% of the TEIs examined provide a high level of construct fidelity, while an approximately equal proportion provide low construct fidelity. Finally, a large portion of drag-and-drop items are of low fidelity, while other TEI types provide moderate or high fidelity.
Educational Assessment, 24, 286–304.
Toward a Teacher Professional Learning Continuum in Assessment for Learning
Pub Date: 2019-09-26 | DOI: 10.1080/10627197.2019.1670056
Christopher DeLuca, Allison E. A. Chapman-Chin, D. Klinger
ABSTRACT Over the past 15 years, assessment for learning (AfL) has emerged as a key area of teacher practice, with policy mandates around the world supporting teachers’ implementation of the underlying components of this pedagogical approach. While procedural and selective implementation of AfL strategies has been observed within research (i.e., implementing the letter of AfL), promoting a spirit of AfL appears far more challenging. There is a critical need to better understand how teachers develop AfL capacity within their practice to effectively cultivate a spirit of AfL in their classrooms. The purpose of this study was to describe a learning continuum for teachers’ implementation of AfL, based on data from 88 teachers. Specifically, interview and observational data were analyzed to describe five developmental stages demarcating shifts in teachers’ conceptual understandings and enacted AfL practices. The resulting learning continuum provides an empirical foundation for responsive teacher education that facilitates teachers’ continued learning toward more meaningful AfL implementation.
Educational Assessment, 24, 267–285.
Measuring Reading Strategy Use
D. Arya, Anthony Clairmont, Daniel Katz, A. Maul
Pub Date: 2019-09-11 | DOI: 10.35542/osf.io/f6vu9
ABSTRACT This study describes the development and validation of a multidimensional measure of preadolescent and adolescent readers’ abilities to apply the reading comprehension strategies necessary for understanding challenging academic texts. The Strategy Use Measure (SUM) was designed to be pedagogically informative for the increasingly multilingual student population in the U.S. in grades 6 through 8. The SUM aims to measure four areas of knowledge and skill that are widely purported to support the use of reading strategies: (a) morphological awareness, (b) knowledge of cognates, (c) the ability to relate micro- and macro-ideas within a text, and (d) the ability to use intra- and inter-sentential context clues to define unfamiliar words. The test was developed through a principled, iterative approach to instrument development, employing Rasch models and qualitative investigations to test hypotheses related to the instrument’s validity. Findings suggest promising evidence for the validity and fairness of this multidimensional measure.
Educational Assessment, 25, 5–30.
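The study fits multidimensional Rasch models; as a heavily simplified, hypothetical illustration only, the sketch below computes first-pass log-odds approximations of item difficulties and person abilities for a dichotomous Rasch-type analysis. The response matrix is simulated, and the method is a back-of-envelope approximation, not the modeling used in the article.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical 0/1 response matrix: 200 students x 20 dichotomous strategy-use items.
responses = (rng.random((200, 20)) < 0.6).astype(int)

# First-pass item difficulties: centered log-odds of failure on each item.
p_item = responses.mean(axis=0).clip(0.01, 0.99)
item_difficulty = np.log((1 - p_item) / p_item)
item_difficulty -= item_difficulty.mean()          # center at 0 logits

# First-pass person abilities: log-odds of each raw score (extremes clipped).
raw = responses.sum(axis=1)
L = responses.shape[1]
p_person = (raw / L).clip(0.5 / L, 1 - 0.5 / L)
person_ability = np.log(p_person / (1 - p_person))

print(item_difficulty.round(2))
print(person_ability[:10].round(2))
```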
Does It Matter if Examinee Motivation Is Measured before or after a Low-Stakes Test? A Moderated Mediation Analysis
Pub Date: 2019-08-26 | DOI: 10.1080/10627197.2019.1645591
Aaron J. Myers, S. Finney
ABSTRACT The indirect effect of perceived test importance on test performance via examinee effort is often modeled using importance and effort scores measured after test completion, which does not align with their theoretical temporal ordering. These retrospectively measured scores may be influenced by examinees’ test performance. To investigate the impact of timing of measurement, college students were randomly assigned to one of three conditions: (a) importance and effort were measured retrospectively, (b) importance and effort were measured retrospectively and importance was measured prospectively, and (c) importance and effort were measured both retrospectively and prospectively. The unstandardized indirect effect was invariant across conditions when modeling prospective and retrospective scores. Priming examinees via prospectively measuring importance and effort did not affect the interrelations among performance and retrospective importance and effort (i.e., the indirect effect was invariant), but it did lead to higher average test performance. Thus, priming may provide a low-cost intervention for increasing test performance.
Educational Assessment, 26, 1–19.
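For readers unfamiliar with the indirect-effect estimate referenced above, the sketch below shows one conventional way to compute an unstandardized indirect effect (from importance through effort to performance) from two least-squares regressions, with a percentile bootstrap interval. The variable names and simulated data are hypothetical stand-ins, and this sketch omits the moderation component of the authors' moderated mediation model.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
# Simulated stand-ins for the constructs (not the study's data).
importance = rng.normal(size=n)
effort = 0.5 * importance + rng.normal(size=n)
performance = 0.4 * effort + 0.1 * importance + rng.normal(size=n)

def ols(y, X):
    """Least-squares coefficients with an intercept column prepended."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def indirect_effect(imp, eff, perf):
    a = ols(eff, imp)[1]                              # importance -> effort
    b = ols(perf, np.column_stack([eff, imp]))[1]     # effort -> performance, controlling importance
    return a * b

est = indirect_effect(importance, effort, performance)

# Percentile bootstrap interval around the unstandardized indirect effect.
boots = []
for _ in range(2000):
    idx = rng.integers(0, n, n)
    boots.append(indirect_effect(importance[idx], effort[idx], performance[idx]))
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"indirect effect = {est:.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
```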
Predictive Validity of SES Measures for Student Achievement
Pub Date: 2019-08-26 | DOI: 10.1080/10627197.2019.1645590
Jihyun Lee, Yang Zhang, L. Stankov
ABSTRACT This study aims to identify which socio-economic status (SES) variables have the best predictive validity for academic achievement, based on the international data sets of the Programme for International Student Assessment (PISA) in 2012, 2009, 2006, and 2003. From among 10 SES measures, two composite variables, the index of economic, social and cultural status (ESCS) and home possessions (HOMEPOS), showed superior predictive power for student achievement. Their pan-cultural correlations with PISA 2012 mathematics achievement were r = .40 and r = .36, respectively. Parental occupation status (r = .33) outperformed all other single measures of SES, including parental education (r = .29). Only two SES variables (i.e., family wealth and home possessions) showed non-linear relationships with academic achievement. We conclude with practical implications and recommendations for using SES measures as predictors of student achievement in educational research and point to the importance of theoretical alignment between SES measures and the particular issues to be addressed.
Educational Assessment, 24, 305–326.
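A minimal sketch of the two kinds of checks described above: zero-order correlations of SES indicators with achievement, and a crude non-linearity check that compares R^2 with and without a squared term. The data frame columns are hypothetical stand-ins, not the PISA variables or the authors' analysis.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 2000
# Simulated placeholders for a few SES indicators and a mathematics score.
df = pd.DataFrame({
    "escs": rng.normal(size=n),
    "homepos": rng.normal(size=n),
    "parent_occ": rng.normal(size=n),
    "parent_educ": rng.normal(size=n),
})
df["math"] = 0.4 * df["escs"] + 0.3 * df["homepos"] + rng.normal(size=n)

# Zero-order correlations of each SES measure with achievement.
print(df.corr()["math"].drop("math").round(2))

# Crude non-linearity check: does adding a squared term raise R^2 noticeably?
def r_squared(y, X):
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

x, y = df["homepos"].to_numpy(), df["math"].to_numpy()
print(f"R^2 linear: {r_squared(y, x):.3f}, "
      f"with squared term: {r_squared(y, np.column_stack([x, x ** 2])):.3f}")
```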
Do Students Rapidly Guess Repeatedly over Time? A Longitudinal Analysis of Student Test Disengagement, Background, and Attitudes
Pub Date: 2019-08-26 | DOI: 10.1080/10627197.2019.1645592
J. Soland, Megan Kuhfeld
ABSTRACT Considerable research has examined the use of rapid guessing measures to identify disengaged item responses. However, little is known about students who rapidly guess over the course of several tests. In this study, we use achievement test data from six administrations over three years to investigate whether rapid guessing is a stable, trait-like behavior or is determined mostly by situational variables. Additionally, we examine whether rapid guessing over the course of several tests is associated with certain psychological and background measures. We find that rapid guessing tends to be more state-like compared to academic achievement scores, which are fairly stable. Further, we show that repeated rapid guessing is strongly associated with students’ academic self-efficacy and self-management scores. These findings have implications for detecting rapid guessing and intervening to reduce its effect on observed achievement test scores.
Educational Assessment, 24, 327–342.
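One common heuristic in the rapid-guessing literature flags a response as a rapid guess when its response time falls below a normative threshold (for example, 10% of the item's median time), then summarizes each student's rate of rapid guessing per administration. The sketch below illustrates that idea on simulated data; the threshold rule, column names, and data are assumptions, not the authors' operationalization.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
# Hypothetical long-format log: one row per student x administration x item,
# with response time in seconds.
log = pd.DataFrame({
    "student": np.repeat(np.arange(300), 6 * 40),
    "admin": np.tile(np.repeat(np.arange(6), 40), 300),
    "item": np.tile(np.arange(40), 300 * 6),
    "rt": rng.gamma(shape=2.0, scale=15.0, size=300 * 6 * 40),
})

# Flag a response as a rapid guess when its time is below 10% of the
# item's median response time (a "normative threshold" heuristic).
threshold = 0.10 * log.groupby("item")["rt"].transform("median")
log["rapid"] = log["rt"] < threshold

# Per-student rapid-guessing rate on each administration.
rates = log.groupby(["student", "admin"])["rapid"].mean().unstack("admin")

# Stability across administrations: low correlations would suggest the
# behavior is more state-like than trait-like.
print(rates.corr().round(2))
```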
Intentional Professional Learning Design: Models, Tools, and the Synergies They Produce Supporting Teacher Growth
Pub Date: 2019-08-01 | DOI: 10.1080/10627197.2020.1766961
V. Mills, C. Harrison
ABSTRACT The need and desire to understand and adopt formative assessment practices remain high on the agenda at all levels of educational systems around the world. To advance teachers’ use of formative assessment, research attention also needs to be paid to (a) understanding the challenges teachers face when asked to utilize formative assessment practices in subject-specific content areas and (b) developing appropriate and sufficiently powerful professional learning designs that can enable change for teachers. To begin addressing these needs, this paper offers a close examination of an intentionally designed professional learning (PL) series that helps middle and high school Algebra I teachers understand the formative assessment process and then track and advance their classroom practice. The professional learning design in this case is based on a collaborative and formative approach to classroom practice and teacher change with high school mathematics teachers. Together, the PL model and tools provide a formative framework that bridges the theory-practice divide, enabling teachers to conceptualize and then plan for, reflect on, and revise the ways in which new formative assessment practices are implemented in their classrooms. Through an analysis of the affordances and constraints of the PL design in practice, this paper provides insights into how discipline-specific professional learning can be better developed and supported throughout the teacher growth process.
Educational Assessment, 25, 331–354.
Validity Evidence Supporting Use of Anchoring Vignettes to Measure Teaching Practice
Pub Date: 2019-05-27 | DOI: 10.1080/10627197.2019.1615374
J. Kaufman, J. Engberg, L. Hamilton, Kun Yuan, H. Hill
ABSTRACT High-quality measures of instructional practice are essential for research and evaluation of innovative instructional policies and programs. However, existing measures have generally proven inadequate because of cost and validity issues. This paper addresses two potential drawbacks of survey self-report measures: variation in teachers’ interpretation of response scales and their interpretation of survey questions. To address these drawbacks, researchers tested the use of “anchoring vignettes” in teacher surveys to capture information about teaching practice, and they gathered validity evidence regarding their use as a tool for adjusting teachers’ survey self-reports about their instructional practices, whether for research purposes or potentially to inform professional development. Data from 65 teachers in grades 4-9 responding to our survey suggested that vignette adjustments were reliable and valid for some instructional practices more than others. For some instructional practices, researchers found significant and high correlations between teachers’ vignette-adjusted survey self-ratings and previous observation ratings of teachers’ instruction, including ratings from several widely used observation rubrics. These results suggest that anchoring vignettes may provide an efficient, cost-effective method for gathering data on teachers’ instruction.
Educational Assessment, 24, 155–188.
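For context, one simple nonparametric adjustment described in the anchoring-vignettes literature recodes a respondent's self-rating relative to that same respondent's ratings of ordered benchmark vignettes. The sketch below illustrates that recoding scheme under assumed inputs; it is not necessarily the adjustment procedure used in this article.

```python
def vignette_adjusted_rating(self_rating, vignette_ratings):
    """Recode a self-rating relative to the respondent's own vignette ratings.

    vignette_ratings must be ordered from the least to the most intensive
    vignette. Returns a value on a 1..(2 * len(vignette_ratings) + 1) scale:
    odd values fall between adjacent vignettes, even values tie a vignette.
    """
    c = 1
    for z in vignette_ratings:
        if self_rating < z:
            return c
        if self_rating == z:
            return c + 1
        c += 2
    return c

# A teacher rates two benchmark vignettes 3 and 5 on the same response scale.
print(vignette_adjusted_rating(4, [3, 5]))   # -> 3: own rating sits between the two anchors
print(vignette_adjusted_rating(2, [3, 5]))   # -> 1: own rating falls below the weaker anchor
```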
Patterns of Solution Behavior across Items in Low-Stakes Assessments
Pub Date: 2019-05-26 | DOI: 10.1080/10627197.2019.1615373
D. Pastor, Thai Q. Ong, S. Strickman
ABSTRACT The trustworthiness of low-stakes assessment results largely depends on examinee effort, which can be measured by the amount of time examinees devote to items using solution behavior (SB) indices. Because SB indices are calculated for each item, they can be used to understand how examinee motivation changes across items within a test. Latent class analysis (LCA) was used with the SB indices from three low-stakes assessments to explore patterns of solution behavior across items. Across tests, the favored models consisted of two classes, with Class 1 characterized by high and consistent solution behavior (>90% of examinees) and Class 2 by lower and less consistent solution behavior (<10% of examinees). Additional analyses provided supportive validity evidence for the two-class solution with notable differences between classes in self-reported effort, test scores, gender composition, and testing context. Although results were generally similar across the three assessments, striking differences were found in the nature of the solution behavior pattern for Class 2 and the ability of item characteristics to explain the pattern. The variability in the results suggests motivational changes across items may be unique to aspects of the testing situation (e.g., content of the assessment) for less motivated examinees.
Educational Assessment, 24, 189–212.
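As an illustration of the kind of model referenced above, the sketch below fits a two-class latent class model to binary solution-behavior indicators with a small EM routine. The data are simulated to mimic a large engaged class and a small disengaged class; this is a didactic sketch, not the software or specification used in the study.

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical 0/1 solution-behavior matrix: 1000 examinees x 25 items,
# with 90% of examinees mostly engaged and 10% much less so.
engaged = rng.random(1000) < 0.9
p_true = np.where(engaged[:, None], 0.97, 0.55)
sb = (rng.random((1000, 25)) < p_true).astype(int)

def two_class_lca(x, n_iter=200, seed=0):
    """EM for a two-class latent class model with conditionally independent binary items."""
    rng = np.random.default_rng(seed)
    n, j = x.shape
    pi = np.array([0.5, 0.5])                  # class proportions
    p = rng.uniform(0.3, 0.7, size=(2, j))     # item-response probabilities per class
    for _ in range(n_iter):
        # E-step: posterior class membership per examinee (log scale for stability).
        log_lik = (x[:, None, :] * np.log(p) + (1 - x[:, None, :]) * np.log(1 - p)).sum(axis=2)
        log_post = np.log(pi) + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: update class proportions and item probabilities.
        pi = post.mean(axis=0)
        p = (post.T @ x) / post.sum(axis=0)[:, None]
        p = p.clip(1e-4, 1 - 1e-4)
    return pi, p, post

pi, p, post = two_class_lca(sb)
print("class proportions:", pi.round(2))
print("mean solution-behavior probability by class:", p.mean(axis=1).round(2))
```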