{"title":"加权差分项目功能(DIF)分析的稳健性:以Mantel-Haenszel DIF统计为例","authors":"Ru Lu, Hongwen Guo, Neil J. Dorans","doi":"10.1002/ets2.12325","DOIUrl":null,"url":null,"abstract":"<p>Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel–Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from measurement invariance (DMI) for two studied groups. Previous research has shown, that DIF and DMI do not necessarily agree with each other. In practice, many operational testing programs use the MH DIF procedure to flag potential DIF items. Recently, weighted DIF statistics has been proposed, where weighted sum scores are used as the matching variable and the weights are the item discrimination parameters. It has been shown theoretically and analytically that, given the item parameters, weighted DIF statistics can close the gap between DIF and DMI. The current study investigates the robustness of using weighted DIF statistics empirically through simulations when item parameters have to be estimated from data.</p>","PeriodicalId":11972,"journal":{"name":"ETS Research Report Series","volume":"2021 1","pages":"1-23"},"PeriodicalIF":0.0000,"publicationDate":"2021-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1002/ets2.12325","citationCount":"1","resultStr":"{\"title\":\"Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel–Haenszel DIF Statistics\",\"authors\":\"Ru Lu, Hongwen Guo, Neil J. Dorans\",\"doi\":\"10.1002/ets2.12325\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel–Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from measurement invariance (DMI) for two studied groups. Previous research has shown, that DIF and DMI do not necessarily agree with each other. In practice, many operational testing programs use the MH DIF procedure to flag potential DIF items. Recently, weighted DIF statistics has been proposed, where weighted sum scores are used as the matching variable and the weights are the item discrimination parameters. It has been shown theoretically and analytically that, given the item parameters, weighted DIF statistics can close the gap between DIF and DMI. The current study investigates the robustness of using weighted DIF statistics empirically through simulations when item parameters have to be estimated from data.</p>\",\"PeriodicalId\":11972,\"journal\":{\"name\":\"ETS Research Report Series\",\"volume\":\"2021 1\",\"pages\":\"1-23\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-08-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1002/ets2.12325\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ETS Research Report Series\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/ets2.12325\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Social Sciences\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ETS Research Report Series","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/ets2.12325","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Social Sciences","Score":null,"Total":0}
Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel–Haenszel DIF Statistics
Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel–Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from measurement invariance (DMI) for two studied groups. Previous research has shown, that DIF and DMI do not necessarily agree with each other. In practice, many operational testing programs use the MH DIF procedure to flag potential DIF items. Recently, weighted DIF statistics has been proposed, where weighted sum scores are used as the matching variable and the weights are the item discrimination parameters. It has been shown theoretically and analytically that, given the item parameters, weighted DIF statistics can close the gap between DIF and DMI. The current study investigates the robustness of using weighted DIF statistics empirically through simulations when item parameters have to be estimated from data.