Educational and Psychological Measurement: Latest Articles

The Impact of Measurement Model Misspecification on Coefficient Omega Estimates of Composite Reliability
IF 2.7; Zone 3 (Psychology); Q1 Social Sciences; Pub Date: 2024-02-01; Epub Date: 2023-02-18; DOI: 10.1177/00131644231155804
Stephanie M Bell, R Philip Chalmers, David B Flora

Coefficient omega indices are model-based composite reliability estimates that have become increasingly popular. A coefficient omega index estimates how reliably an observed composite score measures a target construct as represented by a factor in a factor-analysis model; as such, the accuracy of omega estimates is likely to depend on correct model specification. The current paper presents a simulation study to investigate the performance of omega-unidimensional (based on the parameters of a one-factor model) and omega-hierarchical (based on a bifactor model) under correct and incorrect model specification for high and low reliability composites and different scale lengths. Our results show that coefficient omega estimates are unbiased when calculated from the parameter estimates of a properly specified model. However, omega-unidimensional produced positively biased estimates when the population model was characterized by unmodeled error correlations or multidimensionality, whereas omega-hierarchical was only slightly biased when the population model was either a one-factor model with correlated errors or a higher-order model. These biases were higher when population reliability was lower and increased with scale length. Researchers should carefully evaluate the feasibility of a one-factor model before estimating and reporting omega-unidimensional.
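
As a rough illustration of what the two indices compute (this is not the authors' simulation code), the sketch below evaluates omega-unidimensional and omega-hierarchical from already-estimated standardized parameters. It assumes a unit-weighted composite, factor variances fixed to 1, orthogonal factors in the bifactor case, and uncorrelated errors. The function names and all loading values are hypothetical.

```python
def omega_unidimensional(loadings, error_variances):
    """Omega from one-factor model parameters (factor variance fixed to 1)."""
    # Composite variance attributable to the single common factor.
    common = sum(loadings) ** 2
    return common / (common + sum(error_variances))


def omega_hierarchical(general, specific_groups, error_variances):
    """Omega-hierarchical from bifactor parameters with orthogonal factors.

    general: general-factor loadings, one per item.
    specific_groups: list of lists, the loadings on each specific factor.
    """
    general_part = sum(general) ** 2
    specific_part = sum(sum(group) ** 2 for group in specific_groups)
    total = general_part + specific_part + sum(error_variances)
    return general_part / total


# Hypothetical standardized loadings for a four-item scale.
loadings = [0.7, 0.7, 0.7, 0.7]
errors = [1 - l ** 2 for l in loadings]
omega_u = omega_unidimensional(loadings, errors)

# Hypothetical bifactor solution: six items, two specific factors.
general = [0.6] * 6
specific = [[0.4, 0.4, 0.4], [0.4, 0.4, 0.4]]
bifactor_errors = [1 - 0.6 ** 2 - 0.4 ** 2] * 6
omega_h = omega_hierarchical(general, specific, bifactor_errors)

print(round(omega_u, 3), round(omega_h, 3))
```

Both functions take the fitted model at face value: if the population has correlated errors or additional factors that the fitted model omits, the variance decomposition in the denominator is wrong, which is the situation the abstract describes.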

Correcting for Extreme Response Style: Model Choice Matters
IF 2.7; Zone 3 (Psychology); Q1 Social Sciences; Pub Date: 2024-02-01; Epub Date: 2023-02-17; DOI: 10.1177/00131644231155838
Martijn Schoenmakers, Jesper Tijmstra, Jeroen Vermunt, Maria Bolsinova

Extreme response style (ERS), the tendency of participants to select extreme item categories regardless of the item content, has frequently been found to decrease the validity of Likert-type questionnaire results. For this reason, various item response theory (IRT) models have been proposed to model ERS and correct for it. Comparisons of these models are, however, rare in the literature, especially in the context of cross-cultural comparisons, where ERS is even more relevant due to cultural differences between groups. To remedy this issue, the current article examines two frequently used IRT models that can be estimated using standard software: a multidimensional nominal response model (MNRM) and an IRTree model. Studying conceptual differences between these models reveals that they differ substantially in their conceptualization of ERS. These differences result in different category probabilities between the models. To evaluate the impact of these differences in a multigroup context, a simulation study is conducted. Our results show that when the groups differ in their average ERS, the IRTree model and MNRM can drastically differ in their conclusions about the size and presence of differences in the substantive trait between these groups. An empirical example is given and implications for the future use of both models and the conceptualization of ERS are discussed.

Model Specification Searches in Structural Equation Modeling Using Bee Swarm Optimization
IF 2.7; Zone 3 (Psychology); Q1 Social Sciences; Pub Date: 2024-02-01; Epub Date: 2023-03-29; DOI: 10.1177/00131644231160552
Ulrich Schroeders, Florian Scharf, Gabriel Olaru

Metaheuristics are optimization algorithms that efficiently solve a variety of complex combinatorial problems. In psychological research, metaheuristics have been applied in short-scale construction and model specification search. In the present study, we propose a bee swarm optimization (BSO) algorithm to explore the structure underlying a psychological measurement instrument. The algorithm assigns items to an unknown number of nested factors in a confirmatory bifactor model, while simultaneously selecting items for the final scale. To achieve this, the algorithm follows the biological template of bees' foraging behavior: Scout bees explore new food sources, whereas onlooker bees search in the vicinity of previously explored, promising food sources. Analogously, scout bees in BSO introduce major changes to a model specification (e.g., adding or removing a specific factor), whereas onlooker bees only make minor changes (e.g., adding an item to a factor or swapping items between specific factors). Through this division of labor in an artificial bee colony, the algorithm aims to strike a balance between two opposing strategies: diversification (or exploration) versus intensification (or exploitation). We demonstrate the usefulness of the algorithm to find the underlying structure in two empirical data sets (Holzinger-Swineford and short dark triad questionnaire, SDQ3). Furthermore, we illustrate the influence of relevant hyperparameters such as the number of bees in the hive, the percentage of scouts to onlookers, and the number of top solutions to be followed. Finally, useful applications of the new algorithm are discussed, as well as limitations and possible future research opportunities.

Rotation Local Solutions in Multidimensional Item Response Theory Models
IF 2.7; Zone 3 (Psychology); Q1 Social Sciences; Pub Date: 2024-01-23; DOI: 10.1177/00131644231223722
Hoang V. Nguyen, Niels G. Waller
We conducted an extensive Monte Carlo study of factor-rotation local solutions (LS) in multidimensional, two-parameter logistic (M2PL) item response models. In this study, we simulated more than 19,200 data sets that were drawn from 96 model conditions and performed more than 7.6 million rotations to examine the influence of (a) slope parameter sizes, (b) number of indicators per factor (trait), (c) probabilities of cross-loadings, (d) factor correlation sizes, (e) model approximation error, and (f) sample sizes on the local solution rates of the oblimin and (oblique) geomin rotation algorithms. To accommodate these design variables, we extended the standard M2PL model to include correlated major factors and uncorrelated minor factors (to represent model error). Our results showed that both rotation methods converged to LS under some conditions, with geomin producing the highest local solution rates across many models. Our results also showed that, for identical item response patterns, rotation LS can produce different latent trait estimates with different levels of measurement precision (as indexed by the conditional standard error of measurement). Follow-up analyses revealed that when rotation algorithms converge to multiple solutions, quantitative indices of structural fit, such as numerical measures of simple structure, will often misidentify the rotation that is closest in mean-squared error to the factor pattern (or item-slope pattern) of the data-generating model.
Detecting Careless Responding in Multidimensional Forced-Choice Questionnaires
IF 2.7; Zone 3 (Psychology); Q1 Social Sciences; Pub Date: 2024-01-12; DOI: 10.1177/00131644231222420
Rebekka Kupffer, Susanne Frick, Eunike Wetzel
The multidimensional forced-choice (MFC) format is an alternative to rating scales in which participants rank items according to how well the items describe them. Currently, little is known about how to detect careless responding in MFC data. The aim of this study was to adapt a number of indices used for rating scales to the MFC format and additionally develop several new indices that are unique to the MFC format. We applied these indices to a data set from an online survey (N = 1,169) that included a series of personality questionnaires in the MFC format. The correlations among the careless responding indices were somewhat lower than those published for rating scales. Results from a latent profile analysis suggested that the majority of the sample (about 76–84%) did not respond carelessly, although the ones who did were characterized by different levels of careless responding. In a simulation study, we simulated different careless responding patterns and varied the overall proportion of carelessness in the samples. With one exception, the indices worked as intended conceptually. Taken together, the results suggest that careless responding also plays an important role in the MFC format. Recommendations on how it can be addressed are discussed.
Two-Method Measurement Planned Missing Data With Purposefully Selected Samples
IF 2.7; Zone 3 (Psychology); Q1 Social Sciences; Pub Date: 2024-01-05; DOI: 10.1177/00131644231222603
M. Xu, Jessica A. R. Logan
Research designs that include planned missing data are gaining popularity in applied education research. These methods have traditionally relied on introducing missingness into data collections using the missing completely at random (MCAR) mechanism. This study assesses whether planned missingness can also be implemented when data are instead designed to be purposefully missing based on student performance. A research design with purposefully selected missingness would allow researchers to focus all assessment efforts on a target sample, while still maintaining the statistical power of the full sample. This study introduces the method and demonstrates the performance of the purposeful missingness method within the two-method measurement planned missingness design using a Monte Carlo simulation study. Results demonstrate that the purposeful missingness method can recover parameter estimates in models with as much accuracy as the MCAR method, across multiple conditions.
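
To make the contrast between the two missingness mechanisms concrete, the sketch below builds two selection masks for the expensive measure in a two-method design: one chosen completely at random and one chosen from scores on the inexpensive measure. The abstract does not state the selection rule, so selecting the lowest scorers is purely an assumption for illustration, and the function names, sample size, and retained fraction are hypothetical.

```python
import random


def mcar_mask(n, keep_fraction, rng):
    """Mark a random subset of n cases to receive the expensive measure."""
    keep = set(rng.sample(range(n), round(n * keep_fraction)))
    return [i in keep for i in range(n)]


def purposeful_mask(cheap_scores, keep_fraction):
    """Mark the lowest scorers on the cheap measure (assumed rule)."""
    n = len(cheap_scores)
    k = round(n * keep_fraction)
    # Indices ordered from lowest to highest cheap-measure score.
    order = sorted(range(n), key=lambda i: cheap_scores[i])
    keep = set(order[:k])
    return [i in keep for i in range(n)]


rng = random.Random(0)
cheap = [rng.gauss(0, 1) for _ in range(100)]  # simulated cheap-measure scores
mcar = mcar_mask(100, 0.3, rng)
purposeful = purposeful_mask(cheap, 0.3)

# Both designs collect the expensive measure on the same number of cases.
print(sum(mcar), sum(purposeful))
```

In the second mask, missingness depends only on a variable that is observed for everyone, so the expensive measure is missing at random given the cheap scores rather than MCAR.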
Conceptualizing Correlated Residuals as Item-Level Method Effects in Confirmatory Factor Analysis
IF 2.7; Zone 3 (Psychology); Q1 Social Sciences; Pub Date: 2023-12-23; DOI: 10.1177/00131644231218401
Karl Schweizer, A. Gold, Dorothea Krampen, Stefan Troche
Conceptualizing two-variable disturbances preventing good model fit in confirmatory factor analysis as item-level method effects instead of correlated residuals avoids violating the principle that residual variation is unique for each item. The possibility of representing such a disturbance by a method factor of a bifactor measurement model was investigated with respect to model identification. It turned out that a suitable way of realizing the method factor is its integration into a fixed-links, parallel-measurement or tau-equivalent measurement submodel that is part of the bifactor model. A simulation study comparing these submodels revealed similar degrees of efficiency in controlling the influence of two-variable disturbances on model fit. Perfect correspondence characterized the fit results of the model assuming correlated residuals and the fixed-links model, and virtually also the tau-equivalent model.
Separation of Traits and Extreme Response Style in IRTree Models: The Role of Mimicry Effects for the Meaningful Interpretation of Estimates
IF 2.7; Zone 3 (Psychology); Q1 Social Sciences; Pub Date: 2023-12-22; DOI: 10.1177/00131644231213319
Viola Merhof, Caroline M. Böhm, Thorsten Meiser
Item response tree (IRTree) models are a flexible framework to control self-reported trait measurements for response styles. To this end, IRTree models decompose the responses to rating items into sub-decisions, which are assumed to be made on the basis of either the trait being measured or a response style, whereby the effects of such person parameters can be separated from each other. Here we investigate conditions under which the substantive meanings of estimated extreme response style parameters are potentially invalid and do not correspond to the meanings attributed to them, that is, content-unrelated category preferences. Rather, the response style factor may mimic the trait and capture part of the trait-induced variance in item responding, thus impairing the meaningful separation of the person parameters. Such a mimicry effect is manifested in a biased estimation of the covariance of response style and trait, as well as in an overestimation of the response style variance. Both can lead to severely misleading conclusions drawn from IRTree analyses. A series of simulation studies reveals that mimicry effects depend on the distribution of observed responses and that the estimation biases are stronger the more asymmetrically the responses are distributed across the rating scale. It is further demonstrated that extending the commonly used IRTree model with unidimensional sub-decisions by multidimensional parameterizations counteracts mimicry effects and facilitates the meaningful separation of parameters. An empirical example of the Program for International Student Assessment (PISA) background questionnaire illustrates the threat of mimicry effects in real data. The implications of applying IRTree models for empirical research questions are discussed.
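
The decomposition of a rating into sub-decisions can be sketched as a recoding into binary pseudo-items. The abstract does not specify the tree, so the example assumes a 5-point scale and a commonly used three-node structure (midpoint, direction, extremity); the function name is hypothetical.

```python
def irtree_pseudo_items(response):
    """Recode a 1-5 rating into (midpoint, direction, extremity) pseudo-items.

    None marks a sub-decision that is not reached for this response.
    """
    if response not in (1, 2, 3, 4, 5):
        raise ValueError("expected a rating from 1 to 5")
    if response == 3:
        # Midpoint chosen: direction and extremity are never evaluated.
        return (1, None, None)
    direction = 1 if response > 3 else 0        # agree vs. disagree
    extremity = 1 if response in (1, 5) else 0  # extreme vs. moderate category
    return (0, direction, extremity)


for rating in range(1, 6):
    print(rating, irtree_pseudo_items(rating))
```

In a unidimensional parameterization of this kind, the direction pseudo-item would be modeled with the trait alone and the extremity pseudo-item with the response style alone; the multidimensional extension mentioned in the abstract lets more than one person parameter act on a single sub-decision.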
项目反应树(IRTree)模型是一种灵活的框架,用于控制自我报告特质测量的反应风格。为此,IRTree 模型将对评分项目的反应分解为若干子决定,并假定这些子决定是根据所测量的特质或反应风格做出的,这样就可以将这些人的参数的影响彼此分开。在此,我们研究了在哪些条件下,估计的极端反应风格参数的实质含义可能无效,并且与归因于它们的含义(即与内容无关的类别偏好)不一致。相反,反应风格因子可能会模仿特质,并捕捉到项目反应中部分由特质引起的变异,从而损害了人称参数的意义分离。这种模仿效应表现为对反应风格和特质的协方差估计有偏差,以及对反应风格方差估计过高。这两种情况都会严重误导 IRTree 分析得出的结论。一系列模拟研究表明,模仿效应取决于观察到的反应的分布情况,反应在评分量表中的分布越不对称,估计偏差就越大。研究还进一步证明,通过多维参数化扩展常用的 IRTree 模型,使其具有单维子决策,可以抵消模仿效应,并促进参数的有意义分离。以国际学生评估项目(PISA)背景调查问卷为例,说明了真实数据中模仿效应的威胁。本文还讨论了将 IRTree 模型应用于实证研究问题的意义。
Citations: 0
Effects of the Quantity and Magnitude of Cross-Loading and Model Specification on MIRT Item Parameter Recovery
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2023-12-21 DOI: 10.1177/00131644231210509
Mostafa Hosseinzadeh, Ki Lynn Matlock Cole
In real-world situations, multidimensional data may appear on large-scale tests or psychological surveys. The purpose of this study was to investigate the effects of the quantity and magnitude of cross-loadings and of model specification on item parameter recovery in multidimensional item response theory (MIRT) models, especially when the model was misspecified as a simple structure that ignored the cross-loadings. A simulation study replicating this scenario was designed to manipulate the variables that could influence the precision of item parameter estimation in MIRT models. Item parameters were estimated by marginal maximum likelihood using the expectation-maximization algorithm. A compensatory two-parameter logistic MIRT model with two dimensions and dichotomous item responses was used to simulate and calibrate the data for each combination of conditions across 500 replications. The results indicated that ignoring the quantity and magnitude of cross-loadings in the model specification resulted in inaccurate and biased item discrimination parameter estimates. As the quantity and magnitude of cross-loadings increased, the root mean square error and bias of the item discrimination estimates worsened.
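The data-generating step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `simulate_m2pl` and `bias_and_rmse`, the example loadings, and the sample size are all assumptions made here, and the calibration step (marginal maximum likelihood with EM) is not shown.

```python
import numpy as np

def simulate_m2pl(theta, a, d, rng):
    """Simulate dichotomous responses under a compensatory two-dimensional 2PL model.

    theta: (n_persons, n_dims) latent traits
    a:     (n_items, n_dims) discriminations; a nonzero entry on an item's
           secondary dimension acts as a cross-loading
    d:     (n_items,) intercepts
    """
    logits = theta @ a.T + d              # compensatory: dimensions add up in the logit
    prob = 1.0 / (1.0 + np.exp(-logits))  # logistic link
    responses = (rng.random(prob.shape) < prob).astype(int)
    return responses, prob

def bias_and_rmse(estimates, truth):
    """Bias and RMSE per parameter across replications (axis 0 = replication)."""
    err = np.asarray(estimates, dtype=float) - np.asarray(truth, dtype=float)
    return err.mean(axis=0), np.sqrt((err ** 2).mean(axis=0))

# Hypothetical example values: items 2 and 4 carry cross-loadings.
rng = np.random.default_rng(0)
theta = rng.standard_normal((1000, 2))
a = np.array([[1.2, 0.0], [1.0, 0.4], [0.0, 1.5], [0.3, 1.1]])
d = np.array([0.0, -0.5, 0.5, 0.2])
responses, prob = simulate_m2pl(theta, a, d, rng)
```

In a full study, each replication's estimated discriminations would be stacked and passed to `bias_and_rmse` together with the generating values in `a`.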
Citations: 0
What Affects the Quality of Score Transformations? Potential Issues in True-Score Equating Using the Partial Credit Model.
IF 2.7 Zone 3 (Psychology) Q1 Social Sciences Pub Date: 2023-12-01 Epub Date: 2023-01-13 DOI: 10.1177/00131644221143051
Carolina Fellinghauer, Rudolf Debelak, Carolin Strobl

This simulation study investigated to what extent departures from construct similarity, as well as differences in the difficulty and targeting of scales, impact the score transformation when scales are equated by means of concurrent calibration using the partial credit model with a common-person design. Practical implications of the simulation results are discussed with a focus on scale equating in health-related research settings. The study simulated data for two scales, varying the number of items and the sample sizes. The factor correlation between scales was used to operationalize construct similarity. Targeting of the scales was operationalized through increasing departure from equal difficulty and by varying the dispersion of the item and person parameters in each scale. The results show that low similarity between scales goes along with lower transformation precision. In cases with equal levels of similarity, precision improves in settings where the range of the item parameters encompasses the range of the person parameters. With decreasing similarity, score transformation precision benefits more from good targeting. Difficulty shifts of up to two logits somewhat increased the estimation bias but did not affect the transformation precision. The observed robustness against difficulty shifts supports the advantage of applying a true-score equating method over identity equating, which was used as a naive baseline method for comparison. Finally, larger sample sizes did not improve the transformation precision in this study, and longer scales improved the quality of the equating only marginally. The insights from the simulation study are used in a real-data example.
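The idea of true-score equating under the partial credit model can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's procedure: it assumes the step difficulties of both scales are already on a common metric (as concurrent calibration would provide), and the names `pcm_probs`, `true_score`, and `equate_true_score`, as well as the example item parameters, are hypothetical.

```python
import numpy as np

def pcm_probs(theta, deltas):
    """Category probabilities (0..m) for one partial credit item with step difficulties `deltas`."""
    steps = np.concatenate(([0.0], np.cumsum(theta - np.asarray(deltas, dtype=float))))
    steps -= steps.max()            # subtract the maximum for numerical stability
    p = np.exp(steps)
    return p / p.sum()

def true_score(theta, items):
    """Expected sum score on a scale; `items` is a list of step-difficulty lists."""
    return sum(float(np.dot(np.arange(len(d) + 1), pcm_probs(theta, d))) for d in items)

def equate_true_score(x, items_x, items_y, lo=-8.0, hi=8.0, tol=1e-8):
    """Map a true score x on scale X to scale Y.

    Bisection finds the theta whose expected score on X equals x (the expected
    score is increasing in theta), then returns the expected score on Y.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if true_score(mid, items_x) < x:
            lo = mid
        else:
            hi = mid
    return true_score(0.5 * (lo + hi), items_y)

# Hypothetical item parameters: scale Y is shifted one logit harder than scale X.
scale_x = [[-1.0, 1.0], [0.0, 0.5]]
scale_y = [[0.0, 2.0], [1.0, 1.5]]
equated = equate_true_score(2.0, scale_x, scale_y)
```

Identity equating, the naive baseline mentioned in the abstract, would simply return `x` unchanged; comparing it with `equated` shows how a difficulty shift between scales changes the transformed score.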

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638984/pdf/
Citations: 0