影响分数转换质量的因素是什么?使用部分信用模型计算真分的潜在问题

IF 2.1 3区心理学 Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Educational and Psychological Measurement Pub Date : 2023-12-01 Epub Date: 2023-01-13 DOI:10.1177/00131644221143051

Carolina Fellinghauer, Rudolf Debelak, Carolin Strobl

{"title":"影响分数转换质量的因素是什么?使用部分信用模型计算真分的潜在问题","authors":"Carolina Fellinghauer, Rudolf Debelak, Carolin Strobl","doi":"10.1177/00131644221143051","DOIUrl":null,"url":null,"abstract":"This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. Practical implications of the simulation results are discussed with a focus on scale equating in health-related research settings. The study simulated data for two scales, varying the number of items and the sample sizes. The factor correlation between scales was used to operationalize construct similarity. Targeting of the scales was operationalized through increasing departure from equal difficulty and by varying the dispersion of the item and person parameters in each scale. The results show that low similarity between scales goes along with lower transformation precision. In cases with equal levels of similarity, precision improves in settings where the range of the item parameters is encompassing the person parameters range. With decreasing similarity, score transformation precision benefits more from good targeting. Difficulty shifts up to two logits somewhat increased the estimation bias but without affecting the transformation precision. The observed robustness against difficulty shifts supports the advantage of applying a true-score equating methods over identity equating, which was used as a naive baseline method for comparison. Finally, larger sample size did not improve the transformation precision in this study, longer scales improved only marginally the quality of the equating. The insights from the simulation study are used in a real-data example.","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":null,"pages":null},"PeriodicalIF":2.1000,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638984/pdf/","citationCount":"0","resultStr":"{\"title\":\"What Affects the Quality of Score Transformations? Potential Issues in True-Score Equating Using the Partial Credit Model.\",\"authors\":\"Carolina Fellinghauer, Rudolf Debelak, Carolin Strobl\",\"doi\":\"10.1177/00131644221143051\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. Practical implications of the simulation results are discussed with a focus on scale equating in health-related research settings. The study simulated data for two scales, varying the number of items and the sample sizes. The factor correlation between scales was used to operationalize construct similarity. Targeting of the scales was operationalized through increasing departure from equal difficulty and by varying the dispersion of the item and person parameters in each scale. The results show that low similarity between scales goes along with lower transformation precision. In cases with equal levels of similarity, precision improves in settings where the range of the item parameters is encompassing the person parameters range. With decreasing similarity, score transformation precision benefits more from good targeting. Difficulty shifts up to two logits somewhat increased the estimation bias but without affecting the transformation precision. The observed robustness against difficulty shifts supports the advantage of applying a true-score equating methods over identity equating, which was used as a naive baseline method for comparison. Finally, larger sample size did not improve the transformation precision in this study, longer scales improved only marginally the quality of the equating. The insights from the simulation study are used in a real-data example.\",\"PeriodicalId\":11502,\"journal\":{\"name\":\"Educational and Psychological Measurement\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2023-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638984/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Educational and Psychological Measurement\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://doi.org/10.1177/00131644221143051\",\"RegionNum\":3,\"RegionCategory\":\"心理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2023/1/13 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Educational and Psychological Measurement","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1177/00131644221143051","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/1/13 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}

引用次数: 0

摘要

本模拟研究探讨了在采用部分信用模型和普通人设计的并行校准方法进行等分时，量表的难度和目标的差异以及结构相似度的偏离对分数转换的影响程度。模拟结果的实际意义进行了讨论，重点是规模等同在健康相关的研究设置。该研究模拟了两个尺度的数据，改变了项目的数量和样本量。使用量表间的因子相关性来操作构念相似性。通过增加对同等难度的偏离，以及通过改变每个量表中项目和人参数的分散程度，来实现量表的目标。结果表明，尺度间相似度较低，变换精度较低。在相似程度相等的情况下，在项目参数范围包含人员参数范围的情况下，精度会提高。随着相似度的降低，分数转换精度从良好的定位中获益更多。难度变化到两个对数会增加估计偏差，但不会影响转换精度。观察到的对难度转移的稳健性支持了应用真实分数等同方法优于身份等同方法的优势，身份等同方法被用作比较的朴素基线方法。最后，更大的样本量并没有提高本研究的变换精度，更长的尺度只略微提高了方程的质量。仿真研究的见解被应用于实际数据示例中。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

What Affects the Quality of Score Transformations? Potential Issues in True-Score Equating Using the Partial Credit Model.

This simulation study investigated to what extent departures from construct similarity as well as differences in the difficulty and targeting of scales impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common person design. Practical implications of the simulation results are discussed with a focus on scale equating in health-related research settings. The study simulated data for two scales, varying the number of items and the sample sizes. The factor correlation between scales was used to operationalize construct similarity. Targeting of the scales was operationalized through increasing departure from equal difficulty and by varying the dispersion of the item and person parameters in each scale. The results show that low similarity between scales goes along with lower transformation precision. In cases with equal levels of similarity, precision improves in settings where the range of the item parameters is encompassing the person parameters range. With decreasing similarity, score transformation precision benefits more from good targeting. Difficulty shifts up to two logits somewhat increased the estimation bias but without affecting the transformation precision. The observed robustness against difficulty shifts supports the advantage of applying a true-score equating methods over identity equating, which was used as a naive baseline method for comparison. Finally, larger sample size did not improve the transformation precision in this study, longer scales improved only marginally the quality of the equating. The insights from the simulation study are used in a real-data example.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Educational and Psychological Measurement 医学-数学跨学科应用

CiteScore

5.50

自引率

7.40%

发文量

审稿时长

6-12 weeks

期刊介绍： Educational and Psychological Measurement (EPM) publishes referred scholarly work from all academic disciplines interested in the study of measurement theory, problems, and issues. Theoretical articles address new developments and techniques, and applied articles deal with innovation applications.