
Latest Publications in Educational and Psychological Measurement

Conceptualizing Correlated Residuals as Item-Level Method Effects in Confirmatory Factor Analysis
IF 2.7 | Psychology (CAS Zone 3) | Q2 Mathematics, Interdisciplinary Applications | Pub Date: 2023-12-23 | DOI: 10.1177/00131644231218401
Karl Schweizer, A. Gold, Dorothea Krampen, Stefan Troche
Conceptualizing two-variable disturbances preventing good model fit in confirmatory factor analysis as item-level method effects instead of correlated residuals avoids violating the principle that residual variation is unique for each item. The possibility of representing such a disturbance by a method factor of a bifactor measurement model was investigated with respect to model identification. It turned out that a suitable way of realizing the method factor is its integration into a fixed-links, parallel-measurement or tau-equivalent measurement submodel that is part of the bifactor model. A simulation study comparing these submodels revealed similar degrees of efficiency in controlling the influence of two-variable disturbances on model fit. Perfect correspondence characterized the fit results of the model assuming correlated residuals and the fixed-links model, and virtually also the tau-equivalent model.
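The equivalence behind this reconceptualization can be checked numerically: a single residual covariance between two items implies exactly the same model-implied covariance matrix as an orthogonal method factor with fixed, equal loadings on those two items (a fixed-links submodel), once the corresponding share of unique variance is shifted to the method factor. The numpy sketch below uses made-up loadings purely to illustrate that identity; it is not the authors' simulation code.

```python
import numpy as np

# One general factor measured by six items; loadings and uniquenesses are
# illustrative values, not taken from the article.
lam = np.array([0.7, 0.6, 0.8, 0.5, 0.6, 0.7]).reshape(-1, 1)
theta = np.diag(1 - lam.ravel() ** 2)            # unique (residual) variances

# (a) Correlated-residuals parameterization: residual covariance c between
#     items 3 and 4 (indices 2 and 3).
c = 0.15
theta_corr = theta.copy()
theta_corr[2, 3] = theta_corr[3, 2] = c
sigma_corr = lam @ lam.T + theta_corr

# (b) Item-level method effect: an orthogonal method factor with fixed,
#     equal loadings sqrt(c) on the same two items (a fixed-links submodel);
#     the amount c is moved out of the two uniquenesses.
lam_m = np.zeros((6, 1))
lam_m[2, 0] = lam_m[3, 0] = np.sqrt(c)
theta_meth = theta.copy()
theta_meth[2, 2] -= c
theta_meth[3, 3] -= c
sigma_meth = lam @ lam.T + lam_m @ lam_m.T + theta_meth

print(np.allclose(sigma_corr, sigma_meth))       # True: identical implied covariances
```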
Citations: 0
Separation of Traits and Extreme Response Style in IRTree Models: The Role of Mimicry Effects for the Meaningful Interpretation of Estimates
IF 2.7 | Psychology (CAS Zone 3) | Q2 Mathematics, Interdisciplinary Applications | Pub Date: 2023-12-22 | DOI: 10.1177/00131644231213319
Viola Merhof, Caroline M. Böhm, Thorsten Meiser
Item response tree (IRTree) models are a flexible framework to control self-reported trait measurements for response styles. To this end, IRTree models decompose the responses to rating items into sub-decisions, which are assumed to be made on the basis of either the trait being measured or a response style, whereby the effects of such person parameters can be separated from each other. Here we investigate conditions under which the substantive meanings of estimated extreme response style parameters are potentially invalid and do not correspond to the meanings attributed to them, that is, content-unrelated category preferences. Rather, the response style factor may mimic the trait and capture part of the trait-induced variance in item responding, thus impairing the meaningful separation of the person parameters. Such a mimicry effect is manifested in a biased estimation of the covariance of response style and trait, as well as in an overestimation of the response style variance. Both can lead to severely misleading conclusions drawn from IRTree analyses. A series of simulation studies reveals that mimicry effects depend on the distribution of observed responses and that the estimation biases are stronger the more asymmetrically the responses are distributed across the rating scale. It is further demonstrated that extending the commonly used IRTree model with unidimensional sub-decisions by multidimensional parameterizations counteracts mimicry effects and facilitates the meaningful separation of parameters. An empirical example of the Program for International Student Assessment (PISA) background questionnaire illustrates the threat of mimicry effects in real data. The implications of applying IRTree models for empirical research questions are discussed.
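The sub-decision idea can be made concrete with a small coding sketch. A common IRTree scheme for a 5-point item expands each response into three binary pseudo-items: a midpoint node (MRS), a direction node (trait), and an extremity node (ERS), with the latter two undefined when the middle category is chosen. This is the generic scheme only, not necessarily the exact parameterization used in the article.

```python
import numpy as np

def irtree_decompose(x):
    """Decompose a 5-point response (1..5) into three binary pseudo-items:
    m: middle category chosen (1) or not (0)       -> midpoint RS node
    d: direction, agree side (1) vs. disagree (0)  -> trait node (NaN if middle)
    e: extreme category chosen (1) or not (0)      -> extreme RS node (NaN if middle)
    """
    m = 1 if x == 3 else 0
    d = np.nan if x == 3 else int(x > 3)
    e = np.nan if x == 3 else int(x in (1, 5))
    return m, d, e

for x in (1, 2, 3, 4, 5):
    print(x, irtree_decompose(x))
# 1 -> (0, 0, 1)   2 -> (0, 0, 0)   3 -> (1, nan, nan)
# 4 -> (0, 1, 0)   5 -> (0, 1, 1)
```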
Citations: 0
Effects of the Quantity and Magnitude of Cross-Loading and Model Specification on MIRT Item Parameter Recovery
IF 2.7 | Psychology (CAS Zone 3) | Q2 Mathematics, Interdisciplinary Applications | Pub Date: 2023-12-21 | DOI: 10.1177/00131644231210509
Mostafa Hosseinzadeh, Ki Lynn Matlock Cole
In real-world situations, multidimensional data may appear on large-scale tests or psychological surveys. The purpose of this study was to investigate the effects of the quantity and magnitude of cross-loadings and model specification on item parameter recovery in multidimensional Item Response Theory (MIRT) models, especially when the model was misspecified as a simple structure, ignoring the quantity and magnitude of cross-loading. A simulation study that replicated this scenario was designed to manipulate the variables that could potentially influence the precision of item parameter estimation in the MIRT models. Item parameters were estimated using marginal maximum likelihood, utilizing the expectation-maximization algorithms. A compensatory two-parameter logistic-MIRT model with two dimensions and dichotomous item–responses was used to simulate and calibrate the data for each combination of conditions across 500 replications. The results of this study indicated that ignoring the quantity and magnitude of cross-loading and model specification resulted in inaccurate and biased item discrimination parameter estimates. As the quantity and magnitude of cross-loading increased, the root mean square of error and bias estimates of item discrimination worsened.
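For readers who want to see the data-generating setup in miniature, the numpy sketch below simulates dichotomous responses from a compensatory two-dimensional 2PL MIRT model in which two items carry small cross-loadings. Sample size, loadings, and the number and magnitude of cross-loadings are illustrative placeholders, not the conditions manipulated in the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items = 1000, 10

# Two correlated latent dimensions.
theta = rng.multivariate_normal([0, 0], [[1, .3], [.3, 1]], size=n_persons)

# Simple structure: items 0-4 load on dimension 1, items 5-9 on dimension 2 ...
a = np.zeros((n_items, 2))
a[:5, 0] = rng.uniform(0.8, 1.6, 5)
a[5:, 1] = rng.uniform(0.8, 1.6, 5)
# ... plus small cross-loadings on two items (the "quantity/magnitude" idea).
a[1, 1] = 0.4
a[6, 0] = 0.4

d = rng.uniform(-1, 1, n_items)                 # item intercepts

# Compensatory model: P(X = 1) = logistic(theta @ a' + d)
p = 1 / (1 + np.exp(-(theta @ a.T + d)))
x = rng.binomial(1, p)                          # simulated item responses
print(x.shape, x.mean(axis=0).round(2))
```

Fitting a simple-structure model to data generated this way (i.e., forcing the two cross-loadings to zero) corresponds to the misspecification scenario the study evaluates.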
Citations: 0
An Explanatory Multidimensional Random Item Effects Rating Scale Model.
IF 2.7 | Psychology (CAS Zone 3) | Q2 Mathematics, Interdisciplinary Applications | Pub Date: 2023-12-01 | Epub Date: 2022-12-13 | DOI: 10.1177/00131644221140906
Sijia Huang, Jinwen Jevan Luo, Li Cai

Random item effects item response theory (IRT) models, which treat both person and item effects as random, have received much attention for more than a decade. The random item effects approach has several advantages in many practical settings. The present study introduced an explanatory multidimensional random item effects rating scale model. The proposed model was formulated under a novel parameterization of the nominal response model (NRM), and allows for flexible inclusion of person-related and item-related covariates (e.g., person characteristics and item features) to study their impacts on the person and item latent variables. A new variant of the Metropolis-Hastings Robbins-Monro (MH-RM) algorithm designed for latent variable models with crossed random effects was applied to obtain parameter estimates for the proposed model. A preliminary simulation study was conducted to evaluate the performance of the MH-RM algorithm for estimating the proposed model. Results indicated that the model parameters were well recovered. An empirical data set was analyzed to further illustrate the usage of the proposed model.
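As background for the model described above, the standard nominal response model assigns category probabilities P(X = k | θ) ∝ exp(a_k θ + c_k). The sketch below evaluates these baseline probabilities with illustrative slopes and intercepts; it does not implement the article's novel parameterization, the random item effects, or the MH-RM estimator.

```python
import numpy as np

def nrm_probs(theta, a, c):
    """Category probabilities under the standard nominal response model:
    P(X = k | theta) = exp(a_k * theta + c_k) / sum_m exp(a_m * theta + c_m)."""
    z = np.outer(theta, a) + c                 # persons x categories
    z -= z.max(axis=1, keepdims=True)          # numerical stability
    ez = np.exp(z)
    return ez / ez.sum(axis=1, keepdims=True)

theta = np.array([-1.0, 0.0, 1.0])             # illustrative person values
a = np.array([0.0, 0.5, 1.0, 1.5])             # category slopes (illustrative)
c = np.array([0.0, 0.3, 0.2, -0.4])            # category intercepts (illustrative)
print(nrm_probs(theta, a, c).round(3))
```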

Citations: 0
On the Utility of Indirect Methods for Detecting Faking
Psychology (CAS Zone 3) | Q2 Mathematics, Interdisciplinary Applications | Pub Date: 2023-11-13 | DOI: 10.1177/00131644231209520
Philippe Goldammer, Peter Lucas Stöckli, Yannik Andrea Escher, Hubert Annen, Klaus Jonas
Indirect indices for faking detection in questionnaires make use of a respondent’s deviant or unlikely response pattern over the course of the questionnaire to identify them as a faker. Compared with established direct faking indices (i.e., lying and social desirability scales), indirect indices have at least two advantages: First, they cannot be detected by the test taker. Second, their usage does not require changes to the questionnaire. In the last decades, several such indirect indices have been proposed. However, at present, the researcher’s choice between different indirect faking detection indices is guided by relatively little information, especially if conceptually different indices are to be used together. Thus, we examined how well a representative selection of 12 conceptually different indirect indices perform, both individually and jointly, compared with an established direct faking measure or validity scale. We found that, first, the score on the agreement factor of the Likert-type item response process tree model, the proportion of desirable scale endpoint responses, and the covariance index were the best-performing indirect indices. Second, using indirect indices in combination resulted in comparable and in some cases even better detection rates than when using direct faking measures. Third, some effective indirect indices were only minimally correlated with substantive scales and could therefore be used to partial out faking variance from response sets without losing substance. We, therefore, encourage researchers to use indirect indices instead of direct faking measures when they aim to detect faking in their data.
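Of the three best-performing indices, the proportion of desirable scale-endpoint responses is the simplest to compute and can be sketched directly; the IRTree agreement factor and the covariance index require model fitting or index-specific formulas not reproduced here. In the sketch, the assumption that category 5 is the socially desirable pole of every item is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
# Illustrative 5-point responses (200 respondents x 20 items); real scales
# and keying directions will differ from this placeholder data.
x = rng.integers(1, 6, size=(200, 20))

# Indirect index: proportion of desirable scale-endpoint responses per person,
# assuming category 5 is the desirable pole of every item.
p_desirable_endpoint = (x == 5).mean(axis=1)
print(p_desirable_endpoint[:5].round(2))
```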
Citations: 0
Investigating Heterogeneity in Response Strategies: A Mixture Multidimensional IRTree Approach
Psychology (CAS Zone 3) | Q2 Mathematics, Interdisciplinary Applications | Pub Date: 2023-11-09 | DOI: 10.1177/00131644231206765
Ö. Emre C. Alagöz, Thorsten Meiser
To improve the validity of self-report measures, researchers should control for response style (RS) effects, which can be achieved with IRTree models. A traditional IRTree model considers a response as a combination of distinct decision-making processes, where the substantive trait affects the decision on response direction, while decisions about choosing the middle category or extreme categories are largely determined by midpoint RS (MRS) and extreme RS (ERS). One limitation of traditional IRTree models is the assumption that all respondents utilize the same set of RS in their response strategies, whereas it can be assumed that the nature and the strength of RS effects can differ between individuals. To address this limitation, we propose a mixture multidimensional IRTree (MM-IRTree) model that detects heterogeneity in response strategies. The MM-IRTree model comprises four latent classes of respondents, each associated with a different set of RS traits in addition to the substantive trait. More specifically, the class-specific response strategies involve (1) only ERS in the “ERS only” class, (2) only MRS in the “MRS only” class, (3) both ERS and MRS in the “2RS” class, and (4) neither ERS nor MRS in the “0RS” class. In a simulation study, we showed that the MM-IRTree model performed well in recovering model parameters and class memberships, whereas the traditional IRTree approach showed poor performance if the population includes a mixture of response strategies. In an application to empirical data, the MM-IRTree model revealed distinct classes with noticeable class sizes, suggesting that respondents indeed utilize different response strategies.
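A deliberately simplified numeric illustration of the mixture structure is given below: the marginal category probabilities of a 5-point item are a class-weighted sum of IRTree probabilities, with hypothetical node values switching midpoint and extreme response styles on or off per class. In the actual MM-IRTree model the classes differ in which person-level RS traits are present and all parameters are estimated; none of the numbers here come from the article.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def irtree_probs(t, m, e):
    """Category probabilities for a 5-point item from an IRTree with a
    midpoint node (m), a direction/trait node (t), and an extremity node (e)."""
    p_mid, p_agree, p_ext = sigmoid(m), sigmoid(t), sigmoid(e)
    return np.array([
        (1 - p_mid) * (1 - p_agree) * p_ext,        # category 1
        (1 - p_mid) * (1 - p_agree) * (1 - p_ext),  # category 2
        p_mid,                                      # category 3
        (1 - p_mid) * p_agree * (1 - p_ext),        # category 4
        (1 - p_mid) * p_agree * p_ext,              # category 5
    ])

# Hypothetical class-specific node values for one person with trait t = 0.5:
# classes differ in whether midpoint / extreme response styles are "active".
t = 0.5
node_values = {"0RS": (0.0, 0.0), "MRS only": (1.0, 0.0),
               "ERS only": (0.0, 1.0), "2RS": (1.0, 1.0)}
weights = [0.4, 0.2, 0.2, 0.2]                      # class proportions (illustrative)

marginal = sum(w * irtree_probs(t, m, e)
               for w, (m, e) in zip(weights, node_values.values()))
print(marginal.round(3), marginal.sum().round(3))   # mixture probabilities sum to 1
```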
Citations: 0
Comparing RMSEA-Based Indices for Assessing Measurement Invariance in Confirmatory Factor Models
Psychology (CAS Zone 3) | Q2 Mathematics, Interdisciplinary Applications | Pub Date: 2023-11-01 | DOI: 10.1177/00131644231202949
Nataly Beribisky, Gregory R. Hancock
Fit indices are descriptive measures that can help evaluate how well a confirmatory factor analysis (CFA) model fits a researcher’s data. In multigroup models, before between-group comparisons are made, fit indices may be used to evaluate measurement invariance by assessing the degree to which multiple groups’ data are consistent with increasingly constrained nested models. One such fit index is an adaptation of the root mean square error of approximation (RMSEA) called RMSEA_D. This index embeds the chi-square and degree-of-freedom differences into a modified RMSEA formula. The present study comprehensively compared RMSEA_D to ΔRMSEA, the difference between two RMSEA values associated with a comparison of nested models. The comparison consisted of both derivations as well as a population analysis using one-factor CFA models with features common to those found in practical research. The findings demonstrated that for the same model, RMSEA_D will always have increased sensitivity relative to ΔRMSEA with an increasing number of indicator variables. The study also indicated that RMSEA_D had increased ability to detect noninvariance relative to ΔRMSEA in one-factor models. For these reasons, when evaluating measurement invariance, RMSEA_D is recommended instead of ΔRMSEA.
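Under one common formulation, RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))), and RMSEA_D plugs the chi-square and degrees-of-freedom differences of two nested models into the same formula. The sketch below contrasts RMSEA_D with ΔRMSEA for invented fit statistics; the exact multigroup scaling used in the article may differ.

```python
import numpy as np

def rmsea(chisq, df, n):
    """Point estimate of RMSEA (one common formulation)."""
    return np.sqrt(max(chisq - df, 0) / (df * (n - 1)))

def rmsea_d(chisq_1, df_1, chisq_0, df_0, n):
    """RMSEA_D: chi-square and df differences of two nested models plugged
    into the RMSEA formula (model 1 = more constrained model)."""
    d_chisq, d_df = chisq_1 - chisq_0, df_1 - df_0
    return np.sqrt(max(d_chisq - d_df, 0) / (d_df * (n - 1)))

# Invented values for a configural (0) vs. loadings-constrained (1) model.
n = 500
chisq_0, df_0 = 120.0, 48
chisq_1, df_1 = 150.0, 58

delta_rmsea = rmsea(chisq_1, df_1, n) - rmsea(chisq_0, df_0, n)
print(round(rmsea_d(chisq_1, df_1, chisq_0, df_0, n), 4), round(delta_rmsea, 4))
```

With these made-up numbers, RMSEA_D (about .06) reacts far more strongly to the added constraints than ΔRMSEA (about .002), which is the sensitivity difference the article examines.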
Citations: 0
Item Parameter Recovery: Sensitivity to Prior Distribution
Psychology (CAS Zone 3) | Q2 Mathematics, Interdisciplinary Applications | Pub Date: 2023-10-30 | DOI: 10.1177/00131644231203688
Christine E. DeMars, Paulius Satkus
Marginal maximum likelihood, a common estimation method for item response theory models, is not inherently a Bayesian procedure. However, due to estimation difficulties, Bayesian priors are often applied to the likelihood when estimating 3PL models, especially with small samples. Little focus has been placed on choosing the priors for marginal maximum likelihood estimation. In this study, with sample sizes of 1,000 or smaller, estimating without priors often led to extreme, implausible parameter estimates. Applying prior distributions to the c-parameters alleviated the estimation problems with samples of 500 or more; for the samples of 100, priors on both the a-parameters and c-parameters were needed. Estimates were biased when the mode of the prior did not match the true parameter value, but the degree of the bias did not depend on the strength of the prior unless it was extremely informative. The root mean squared error (RMSE) of the a-parameters and b-parameters did not depend greatly on either the mode or the strength of the prior unless it was extremely informative. The RMSE of the c-parameters, like the bias, depended on the mode of the prior for c.
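The prior-as-penalty idea examined here can be sketched as follows: the 3PL log-likelihood for an item is augmented with log-prior terms, for example a lognormal prior on a and a Beta prior on c with mode 0.2. To keep the sketch short it treats person abilities as known (a joint rather than marginal likelihood), and the specific priors, data, and optimizer are illustrative defaults rather than the conditions compared in the study.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

def p_3pl(theta, a, b, c):
    """3PL item response function."""
    return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

def penalized_neg_loglik(params, theta, x,
                         a_prior=stats.lognorm(s=0.5),   # illustrative prior on a
                         c_prior=stats.beta(5, 17)):      # mode 0.2, illustrative prior on c
    """Negative log-likelihood of one item plus log-prior penalties on a and c."""
    a, b, c = params
    if a <= 0 or not (0 < c < 1):
        return np.inf
    p = p_3pl(theta, a, b, c)
    loglik = np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))
    logprior = a_prior.logpdf(a) + c_prior.logpdf(c)
    return -(loglik + logprior)

# Simulated data for one item with a = 1.2, b = 0.3, c = 0.2 and known abilities.
rng = np.random.default_rng(3)
theta = rng.normal(size=1000)
x = rng.binomial(1, p_3pl(theta, 1.2, 0.3, 0.2))

fit = minimize(penalized_neg_loglik, x0=[1.0, 0.0, 0.15], args=(theta, x),
               method="Nelder-Mead")
print(fit.x.round(2))    # penalized estimates of (a, b, c)
```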
Citations: 0
Linear Factor Analytic Thurstonian Forced-Choice Models: Current Status and Issues
Psychology (CAS Zone 3) | Q2 Mathematics, Interdisciplinary Applications | Pub Date: 2023-10-30 | DOI: 10.1177/00131644231205011
Markus T. Jansen, Ralf Schulze
Thurstonian forced-choice modeling is considered to be a powerful new tool to estimate item and person parameters while simultaneously testing the model fit. This assessment approach is associated with the aim of reducing faking and other response tendencies that plague traditional self-report trait assessments. As a result of major recent methodological developments, the estimation of normative trait scores has become possible in addition to the computation of only ipsative scores. This opened up the important possibility of comparisons between individuals with forced-choice assessment procedures. With item response theory (IRT) methods, a multidimensional forced-choice (MFC) format has also been proposed to estimate individual scores. Customarily, items to assess different traits are presented in blocks, often triplets, in applications of the MFC, which is an efficient form of item presentation but also a simplification of the original models. The present study provides a comprehensive review of the present status of Thurstonian forced-choice models and their variants. Critical features of the current models, especially the block models, are identified and discussed. It is concluded that MFC modeling with item blocks is highly problematic and yields biased results. In particular, the often-recommended presentation of blocks with items that are keyed in different directions of a trait proves to be counterproductive considering the goal to reduce response tendencies. The consequences and implications of the highlighted issues are further discussed.
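The block structure under discussion shows up in the recoding step of Thurstonian IRT: each forced-choice block is converted into binary outcomes of all pairwise comparisons among its items, which are then modeled through differences of latent utilities. The sketch below shows only that recoding for a hypothetical triplet; the item labels are placeholders.

```python
from itertools import combinations

def block_to_pairwise(ranking):
    """Code one forced-choice block into binary pairwise outcomes.
    `ranking` lists item labels from most to least preferred, e.g. a triplet
    ranked B > A > C is given as ['B', 'A', 'C'].
    Returns {(i, j): 1 if i preferred over j else 0} for all item pairs."""
    pos = {item: r for r, item in enumerate(ranking)}
    items = sorted(ranking)
    return {(i, j): int(pos[i] < pos[j]) for i, j in combinations(items, 2)}

# A triplet block with items A, B, C, each keyed to a different trait:
print(block_to_pairwise(['B', 'A', 'C']))
# {('A', 'B'): 0, ('A', 'C'): 1, ('B', 'C'): 1}
```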
Citations: 0
Thinking About Sum Scores Yet Again, Maybe the Last Time, We Don’t Know, Oh No . . .: A Comment on
Psychology (CAS Zone 3) | Q2 Mathematics, Interdisciplinary Applications | Pub Date: 2023-10-13 | DOI: 10.1177/00131644231205310
Keith F. Widaman, William Revelle
The relative advantages and disadvantages of sum scores and estimated factor scores are issues of concern for substantive research in psychology. Recently, while championing estimated factor scores over sum scores, McNeish offered a trenchant rejoinder to an article by Widaman and Revelle, which had critiqued an earlier paper by McNeish and Wolf. In the recent contribution, McNeish misrepresented a number of claims by Widaman and Revelle, rendering moot his criticisms of Widaman and Revelle. Notably, McNeish chose to avoid confronting a key strength of sum scores stressed by Widaman and Revelle—the greater comparability of results across studies if sum scores are used. Instead, McNeish pivoted to present a host of simulation studies to identify relative strengths of estimated factor scores. Here, we review our prior claims and, in the process, deflect purported criticisms by McNeish. We discuss briefly issues related to simulated data and empirical data that provide evidence of strengths of each type of score. In doing so, we identified a second strength of sum scores: superior cross-validation of results across independent samples of empirical data, at least for samples of moderate size. We close with consideration of four general issues concerning sum scores and estimated factor scores that highlight the contrasts between positions offered by McNeish and by us, issues of importance when pursuing applied research in our field.
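The practical difference at stake can be illustrated with a one-factor simulation: sum scores simply add the items, whereas regression (Thurstone) factor scores weight them by the inverse model-implied covariance matrix times the loadings. The sketch below uses invented loadings and population values rather than anything from the exchange; under such conditions the two scorings are typically correlated near unity, which is part of why the comparability argument matters.

```python
import numpy as np

rng = np.random.default_rng(11)
n, p = 500, 6

# One-factor population model with illustrative loadings (standardized items).
lam = np.array([0.8, 0.7, 0.6, 0.7, 0.5, 0.6])
eta = rng.normal(size=n)
eps = rng.normal(size=(n, p)) * np.sqrt(1 - lam ** 2)
x = eta[:, None] * lam + eps

# Sum scores: just add the items.
sum_scores = x.sum(axis=1)

# Regression (Thurstone) factor scores, using the population loadings here
# for simplicity; in practice the weights come from the fitted model.
sigma = np.outer(lam, lam) + np.diag(1 - lam ** 2)   # model-implied covariance
w = np.linalg.solve(sigma, lam)                      # weights: Sigma^{-1} * Lambda
factor_scores = x @ w

print(round(np.corrcoef(sum_scores, factor_scores)[0, 1], 3))
```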
Citations: 0