Multistage Testing in Heterogeneous Populations: Some Design and Implementation Considerations

Applied Psychological Measurement | IF 1.0 | Zone 4, Psychology | JCR Q4, Psychology, Mathematical | Pub Date: 2022-09-01 | DOI: 10.1177/01466216221108123
Leslie Rutkowski, Yuan-Ling Liaw, Dubravka Svetina, David Rutkowski
Citations: 2

Abstract

A central challenge in international large-scale assessments is adequately measuring dozens of highly heterogeneous populations, many of which are low performers. To that end, multistage adaptive testing offers one possibility for better assessment across the achievement continuum. This study examines how several multistage test design and implementation choices affect measurement performance in this setting. To address gaps in the knowledge base, we extend previous research to include multiple linked panels, more appropriate estimates of achievement, and multiple populations of varied proficiency. Using achievement distributions from varied populations and the associated item parameters, we design and execute a simulation study that mimics an established international assessment. We compare several routing schemes and module lengths in terms of item and person parameter recovery. Our findings suggest that multistage testing offers precision advantages, particularly for low-performing populations. Further, equal module lengths (desirable for controlling position effects) and classical routing methods (which lower the technological burden of implementing such a design) produce good results. Finally, probabilistic misrouting offers advantages over merit routing for controlling bias in item and person parameters. Overall, multistage testing shows promise for extending the scope of international assessments. We discuss the importance of our findings for operational work in the international assessment domain.
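
The abstract contrasts merit routing with probabilistic misrouting but does not define either. As a rough illustration only, the sketch below assumes a two-stage design with one "easy" and one "hard" second-stage module, a number-correct cut score, and a fixed misrouting probability. The function names, the cut score of 6, the 10% rate, and the uniform stage-1 scores are all hypothetical and are not taken from the paper.

```python
import random


def merit_route(num_correct, cut_score):
    """Deterministic routing: the stage-1 score alone picks the next module."""
    return "hard" if num_correct >= cut_score else "easy"


def probabilistic_route(num_correct, cut_score, p_misroute, rng):
    """Route by merit, but with probability p_misroute send the
    test taker to the other module instead (assumed two-module design)."""
    merited = merit_route(num_correct, cut_score)
    if rng.random() < p_misroute:
        return "easy" if merited == "hard" else "hard"
    return merited


# Hypothetical stage-1 scores on a 10-item routing module.
rng = random.Random(0)
scores = [rng.randint(0, 10) for _ in range(10_000)]
routed = [
    probabilistic_route(s, cut_score=6, p_misroute=0.1, rng=rng)
    for s in scores
]
misrouted = sum(r != merit_route(s, 6) for s, r in zip(scores, routed))
print(f"share misrouted: {misrouted / len(scores):.3f}")
```

Under these assumptions, misrouting means that some low scorers still see the harder module and some high scorers still see the easier one, so every module collects responses from across the proficiency range. This is one plausible reading of why such a scheme could help control bias in item parameter estimates.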

Source journal: Applied Psychological Measurement
CiteScore: 2.30
Self-citation rate: 8.30%
Articles published: 50
About the journal: Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.