Benjamin Becker, Dries Debeer, Sebastian Weirich, Frank Goldhammer
On the Speed Sensitivity Parameter in the Lognormal Model for Response Times and Implications for High-Stakes Measurement Practice
Applied Psychological Measurement, 45(6), 407-422. Published 2021-09-01 (Epub 2021-06-09). DOI: 10.1177/01466216211008530
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8381695/pdf/10.1177_01466216211008530.pdf
Abstract
In high-stakes testing, multiple test forms are often used and a common time limit is enforced. Test fairness requires that ability estimates not depend on which test form is administered. This requirement may be violated if speededness differs between test forms. This study investigated how ignoring speed sensitivity affects the comparability of test forms with respect to speededness and ability estimation. The lognormal measurement model for response times by van der Linden was compared with its extension by Klein Entink, van der Linden, and Fox, which includes a speed sensitivity parameter. An empirical data example showed that the extended model can fit the data better than the model without speed sensitivity parameters. A simulation showed that test forms with different average speed sensitivity yielded substantially different ability estimates for slow test takers, especially for those with high ability. Therefore, the extended lognormal model for response times is recommended for the calibration of item pools in high-stakes testing situations. Limitations of the proposed approach and further research questions are discussed.
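The role of the speed sensitivity parameter can be illustrated with a minimal sketch. It assumes the extended model is written, as is common, as log T_ij = lambda_i - phi_i * zeta_j + e_ij with e_ij ~ N(0, sigma_i^2), where lambda_i is the time intensity of item i, phi_i its speed sensitivity, and zeta_j the speed of test taker j; fixing phi_i = 1 for all items gives the model without speed sensitivity. The function names, the two hypothetical forms, and all parameter values below are illustrative assumptions and are not taken from the article.

```python
import math
import random


def expected_time(time_intensity, speed, speed_sensitivity=1.0, residual_sd=0.3):
    """Expected response time under the assumed lognormal model.

    log T ~ N(mu, residual_sd^2) with mu = lambda - phi * zeta,
    so E[T] = exp(mu + residual_sd^2 / 2).
    """
    mu = time_intensity - speed_sensitivity * speed
    return math.exp(mu + residual_sd ** 2 / 2.0)


def simulate_time(rng, time_intensity, speed, speed_sensitivity=1.0, residual_sd=0.3):
    """Draw one response time from the assumed lognormal model."""
    mu = time_intensity - speed_sensitivity * speed
    return rng.lognormvariate(mu, residual_sd)


def expected_total_time(items, speed):
    """Sum expected times over items given as (lambda, phi, sigma) tuples."""
    return sum(expected_time(lam, speed, phi, sd) for lam, phi, sd in items)


# Two hypothetical forms: identical time intensities and residual SDs,
# but different average speed sensitivity (illustrative values only).
TIME_INTENSITIES = [3.5, 3.8, 4.0, 4.2]
FORM_LOW = [(lam, 0.5, 0.3) for lam in TIME_INTENSITIES]
FORM_HIGH = [(lam, 1.5, 0.3) for lam in TIME_INTENSITIES]

if __name__ == "__main__":
    rng = random.Random(1)
    for speed in (-1.0, 0.0, 1.0):  # slow, average, fast test taker
        low = expected_total_time(FORM_LOW, speed)
        high = expected_total_time(FORM_HIGH, speed)
        print(f"speed={speed:+.1f}  low-phi form: {low:8.1f}  high-phi form: {high:8.1f}")
    # One simulated response time for a slow test taker on a single item.
    print(simulate_time(rng, 4.0, -1.0, speed_sensitivity=1.5))
```

Under these assumptions the two forms demand the same expected time from a test taker of average speed (zeta = 0), while a slow test taker (zeta < 0) needs more time on the form with higher speed sensitivity. With a common time limit, that form would be more speeded for slow test takers, which is the kind of mechanism through which ability estimates could come to depend on the form.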
About the journal:
Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.