Person Misfit and Person Reliability in Rating Scale Measures: The Role of Response Styles

Measurement-Interdisciplinary Research and Perspectives (IF 0.6; JCR Q3, Social Sciences, Interdisciplinary). Pub Date: 2023-07-03. DOI: 10.1080/15366367.2022.2114243

Tongtong Zou, D. Bolt

Pages: 167-180


Abstract

Person misfit and person reliability indices in item response theory (IRT) can play an important role in evaluating the validity of a test or survey instrument at the respondent level. Prior empirical comparisons of these indices have used binary item response data and suggest that the two types of indices return very similar results. In this paper, however, we demonstrate an important applied distinction between these methods when they are applied to polytomously scored rating scale items, namely their differing sensitivities to response style tendencies. Using several empirical datasets, we illustrate settings in which these indices are highly correlated in one case and weakly correlated in two others. In the datasets showing a weak correlation between indices, the primary distinction appears to be due to the effects of response style behavior: respondents whose response styles are less common (e.g., a disproportionate selection of the midpoint response) are found to misfit using Drasgow et al.'s person misfit index, but often show high levels of reliability from a person reliability perspective; just the opposite frequently occurs for respondents who over-select the rating scale extremes. It is suggested that person misfit reporting should be supplemented with an evaluation of person reliability to best understand the validity of measurement at the respondent level when using IRT models with rating scale measures.
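As a rough illustration of what a person misfit index computes, the sketch below assumes the index referred to is the standardized log-likelihood statistic (commonly written l_z) in its polytomous form, which is the statistic usually attributed to Drasgow and colleagues; the abstract does not spell out the formula, so this is an assumption. The function name `lz_polytomous`, the category probabilities, and the two response patterns are all hypothetical and are not taken from the paper. The toy data is not meant to reproduce the response-style findings, which depend on the fitted IRT model.

```python
import numpy as np

def lz_polytomous(probs, responses):
    """Standardized log-likelihood person-fit statistic for polytomous items.

    probs: (n_items, n_categories) array of model-implied category
           probabilities for one respondent, evaluated at that respondent's
           trait estimate (all entries must be strictly positive).
    responses: length n_items integer array of observed categories (0-based).
    """
    probs = np.asarray(probs, dtype=float)
    responses = np.asarray(responses, dtype=int)
    log_p = np.log(probs)
    items = np.arange(probs.shape[0])

    # Observed log-likelihood of the response pattern.
    l0 = log_p[items, responses].sum()

    # Model-expected mean and variance of the log-likelihood, item by item
    # (items are assumed locally independent, so variances add).
    item_mean = (probs * log_p).sum(axis=1)
    item_var = (probs * log_p**2).sum(axis=1) - item_mean**2

    return (l0 - item_mean.sum()) / np.sqrt(item_var.sum())

# Hypothetical probabilities: 10 five-category items, identical for simplicity.
probs = np.tile([0.1, 0.2, 0.4, 0.2, 0.1], (10, 1))

# One pattern that always picks the most probable category, and one that
# always picks a low-probability category.
likely_pattern = np.full(10, 2)
unlikely_pattern = np.array([0, 4] * 5)

print(lz_polytomous(probs, likely_pattern))
print(lz_polytomous(probs, unlikely_pattern))
```

Large negative values indicate a response pattern that is less likely than the model expects, which is how misfit is usually flagged with this kind of statistic. Whether a given response style lands on the negative side depends on the category probabilities the fitted model assigns to that respondent.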


