Quantifying Item Invariance for the Selection of the Least Biased Assessment
W. Holmes Finch, Brian F. French, Maria E. Hernandez Finch
Journal of Applied Measurement, 20(1), 13-26 (2019)
Citations: 0
Abstract
An important aspect of educational and psychological measurement and evaluation of individuals is the selection of scales with appropriate evidence of reliability and validity for inferences and uses of the scores for the population of interest. One aspect of validity is the degree to which a scale fairly assesses the construct(s) of interest for members of different subgroups within the population. Typically, this issue is addressed statistically through assessment of differential item functioning (DIF) of individual items, or differential bundle functioning (DBF) of sets of items. When selecting an assessment to use for a given application (e.g., measuring intelligence), or which form of an assessment to use in a given instance, researchers need to consider the extent to which the scales work equivalently for all members of the population. Little research has examined methods for comparing the amount or magnitude of DIF/DBF present in two assessments when deciding which assessment to use. The current simulation study examines six statistics for this purpose. Results show that a method based on the random effects item response theory model may be optimal for instrument comparisons, particularly when the assessments being compared are not of the same length.
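To make the notion of item-level DIF concrete: one classical statistic for flagging DIF in a dichotomous item is the Mantel-Haenszel common odds ratio, which compares correct-response odds for a reference and a focal group after matching examinees on total score. The abstract does not name the six statistics the study compared, so this is only an illustrative sketch of the general idea, with hypothetical counts; the function and data below are not from the paper.

```python
def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio for one item across score strata.

    Each stratum is a tuple (ref_correct, ref_wrong, focal_correct, focal_wrong)
    of response counts for examinees matched on total test score.
    A value near 1.0 suggests little DIF; values far from 1.0 suggest the item
    favors one group even after controlling for ability.
    """
    numerator = denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty strata
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Hypothetical counts for one item in three score strata.
no_dif_item = [(40, 10, 38, 12), (30, 20, 29, 21), (15, 35, 16, 34)]
dif_item    = [(40, 10, 25, 25), (30, 20, 15, 35), (15, 35,  5, 45)]

print(round(mantel_haenszel_odds_ratio(no_dif_item), 2))  # close to 1.0
print(round(mantel_haenszel_odds_ratio(dif_item), 2))     # well above 1.0
```

Aggregating such item-level indices over all items of each instrument (or over item bundles, for DBF) is one way to compare the overall amount of bias carried by two competing assessments, which is the selection problem this study addresses.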