Psychometrika: Latest Articles

Recognize the Value of the Sum Score, Psychometrics’ Greatest Accomplishment
IF 3; Zone 2, Psychology; Q2 Mathematics; Pub Date: 2024-04-17; DOI: 10.1007/s11336-024-09964-7
Klaas Sijtsma, Jules L. Ellis, Denny Borsboom

The sum score on a psychological test is, and should continue to be, a tool central in psychometric practice. This position runs counter to several psychometricians’ belief that the sum score represents a pre-scientific conception that must be abandoned from psychometrics in favor of latent variables. First, we reiterate that the sum score stochastically orders the latent variable in a wide variety of much-used item response models. In fact, item response theory provides a mathematically based justification for the ordinal use of the sum score. Second, because discussions about the sum score often involve its reliability and estimation methods as well, we show that, based on very general assumptions, classical test theory provides a family of lower bounds several of which are close to the true reliability under reasonable conditions. Finally, we argue that eventually sum scores derive their value from the degree to which they enable predicting practically relevant events and behaviors. None of our discussion is meant to discredit modern measurement models; they have their own merits unattainable for classical test theory, but the latter model provides impressive contributions to psychometrics based on very few assumptions that seem to have become obscured in the past few decades. Their generality and practical usefulness add to the accomplishments of more recent approaches.
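
The abstract does not say which members of the family of reliability lower bounds it discusses. As a minimal sketch, the block below computes sum scores and coefficient alpha, a widely used classical lower bound, for a small hypothetical data set. The function names and the toy responses are assumptions for illustration only and are not taken from the article.

```python
from statistics import pvariance


def sum_scores(responses):
    """Return the sum score of each respondent (one row per respondent)."""
    return [sum(row) for row in responses]


def coefficient_alpha(responses):
    """Coefficient alpha: a classical lower bound to sum-score reliability."""
    n_items = len(responses[0])
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in responses]) for j in range(n_items)]
    # Variance of the sum score across respondents.
    total_var = pvariance(sum_scores(responses))
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 6 respondents, 4 dichotomously scored items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

scores = sum_scores(data)
alpha = coefficient_alpha(data)
print(scores, round(alpha, 3))
```

Population variances are used throughout; using sample variances for both the items and the total would give the same ratio and therefore the same alpha.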

Citations: 0
Parallel Optimal Calibration of Mixed-Format Items for Achievement Tests
IF 3; Zone 2, Psychology; Q2 Mathematics; Pub Date: 2024-04-15; DOI: 10.1007/s11336-024-09968-3
Frank Miller, Ellinor Fackle-Fornius

When large achievement tests are conducted regularly, items need to be calibrated before being used as operational items in a test. Methods have been developed to optimally assign pretest items to examinees based on their abilities. Most of these methods, however, are intended for situations where examinees arrive sequentially to be assigned to calibration items. In several calibration tests, examinees take the test simultaneously or in parallel. In this article, we develop an optimal calibration design tailored for such parallel test setups. Our objective is both to investigate the efficiency gain of the method and to demonstrate that this method can be implemented in real calibration scenarios. For the latter, we have employed this method to calibrate items for the Swedish national tests in Mathematics. In this case study, as in many real test situations, items are of mixed format and the optimal design method needs to handle that. The method we propose works for mixed-format tests and accounts for varying expected response times. Our investigations show that the proposed method considerably enhances calibration efficiency.

Citations: 0
Learning Bayesian Networks: A Copula Approach for Mixed-Type Data
IF 3; Zone 2, Psychology; Q2 Mathematics; Pub Date: 2024-04-12; DOI: 10.1007/s11336-024-09969-2
Federico Castelletti

Estimating dependence relationships between variables is a crucial issue in many applied domains and in particular psychology. When several variables are entertained, these can be organized into a network which encodes their set of conditional dependence relations. Typically, however, the underlying network structure is completely unknown or can only be partially drawn; accordingly, it should be learned from the available data, a process known as structure learning. In addition, data arising from social and psychological studies are often of different types, as they can include categorical, discrete and continuous measurements. In this paper, we develop a novel Bayesian methodology for structure learning of directed networks which applies to mixed data, i.e., possibly containing continuous, discrete, ordinal and binary variables simultaneously. Whenever available, known dependence structures among variables, represented by paths or edge directions that can be postulated in advance based on the specific problem under consideration, can easily be incorporated into our method. We evaluate the proposed method through extensive simulation studies, in which it shows appreciable performance in comparison with current state-of-the-art alternative methods. Finally, we apply our methodology to well-being data from a social survey promoted by the United Nations, and mental health data collected from a cohort of medical students. R code implementing the proposed methodology is available at https://github.com/FedeCastelletti/bayes_networks_mixed_data.

Citations: 0
Polytomous Effectiveness Indicators in Complex Problem-Solving Tasks and Their Applications in Developing Measurement Model
IF 3; Zone 2, Psychology; Q2 Mathematics; Pub Date: 2024-04-09; DOI: 10.1007/s11336-024-09963-8
Pujue Wang, Hongyun Liu

Recent years have witnessed the emergence of measurement models for analyzing action sequences in computer-based problem-solving interactive tasks. The cutting-edge psychometrics process models require pre-specification of the effectiveness of state transitions, often simplifying them into dichotomous indicators. However, the dichotomous effectiveness becomes impractical when dealing with complex tasks that involve multiple optimal paths and numerous state transitions. Building on the concept of problem-solving, we introduce polytomous indicators to assess the effectiveness of problem states \(d_{s}\) and state-to-state transitions \(\Delta d_{s \rightarrow s'}\). The three-step evaluation method for these two types of indicators is proposed and illustrated across two real problem-solving tasks. We further present a novel psychometrics process model, the sequential response model with polytomous effectiveness indicators (SRM-PEI), which is tailored to encompass a broader range of problem-solving tasks. Monte Carlo simulations indicated that SRM-PEI performed well in the estimation of latent ability and transition tendency parameters across different conditions. Empirical studies conducted on two real tasks supported the better fit of SRM-PEI over previous models such as SRM and SRMM, providing rational and interpretable estimates of latent abilities and transition tendencies through effectiveness indicators. The paper concludes by outlining potential avenues for the further application and enhancement of polytomous effectiveness indicators and SRM-PEI.

Citations: 0
Examining Differential Item Functioning from a Multidimensional IRT Perspective
IF 3; Zone 2, Psychology; Q2 Mathematics; Pub Date: 2024-04-05; DOI: 10.1007/s11336-024-09965-6
Terry A. Ackerman, Ye Ma

Differential item functioning (DIF) is a standard analysis for every testing company. Research has demonstrated that DIF can result when test items measure different ability composites, and the groups being examined for DIF exhibit distinct underlying ability distributions on those composite abilities. In this article, we examine DIF from a two-dimensional multidimensional item response theory (MIRT) perspective. We begin by delving into the compensatory MIRT model, illustrating how items and the composites they measure can be graphically represented. Additionally, we discuss how estimated item parameters can vary based on the underlying latent ability distributions of the examinees. Analytical research highlighting the consequences of ignoring dimensionality and applying unidimensional IRT models, where the two-dimensional latent space is mapped onto a unidimensional one, is reviewed. Next, we investigate three different approaches to understanding DIF from a MIRT standpoint: 1. Analytically Uniform and Nonuniform DIF: When two groups of interest have different two-dimensional ability distributions, a unidimensional model is estimated. 2. Accounting for complete latent ability space: We emphasize the importance of considering the entire latent ability space when using DIF conditional approaches, which leads to the mitigation of DIF effects. 3. Scenario-Based DIF: Even when underlying two-dimensional distributions are identical for two groups, differing problem-solving approaches can still lead to DIF. Modern software programs facilitate routine DIF procedures for comparing response data from two identified groups of interest. The real challenge is to identify why DIF could occur with flagged items. Thus, as a closing challenge, we present four items (Appendix A) from a standardized test and invite readers to identify which group was favored by a DIF analysis.
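
The abstract names the compensatory MIRT model but does not write it out. A minimal sketch, assuming the commonly used two-dimensional logistic form P = 1 / (1 + exp(-(a1*theta1 + a2*theta2 + d))), is given below. The function names and all parameter values are hypothetical and chosen only to show how a deficit on one dimension can be offset by a surplus on the other, and how an item's discrimination vector defines the composite it measures.

```python
import math


def m2pl_probability(theta, a, d):
    """Probability of a correct response under a compensatory logistic MIRT model.

    theta: latent abilities (theta1, theta2)
    a:     item discriminations (a1, a2)
    d:     item intercept
    """
    z = sum(ak * tk for ak, tk in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))


def mdisc(a):
    """Multidimensional discrimination: length of the discrimination vector."""
    return math.sqrt(sum(ak * ak for ak in a))


def direction_angle_deg(a):
    """Angle (degrees) between the item's measurement direction and the theta1 axis."""
    return math.degrees(math.acos(a[0] / mdisc(a)))


# Hypothetical item measuring both dimensions equally.
a_item, d_item = (1.0, 1.0), 0.0

p_origin = m2pl_probability((0.0, 0.0), a_item, d_item)
# High theta1 compensates for low theta2: same probability as at the origin.
p_compensated = m2pl_probability((1.0, -1.0), a_item, d_item)
angle = direction_angle_deg(a_item)

print(p_origin, p_compensated, round(angle, 1))
```

An item with a direction angle near 0 or 90 degrees measures mostly one dimension, while an angle near 45 degrees indicates an equally weighted composite. This is the kind of graphical representation the abstract refers to, under the stated assumptions.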

Citations: 0
Reducing Attenuation Bias in Regression Analyses Involving Rating Scale Data via Psychometric Modeling
IF 3; Zone 2, Psychology; Q2 Mathematics; Pub Date: 2024-04-04; DOI: 10.1007/s11336-024-09967-4
Cees A. W. Glas, Terrence D. Jorgensen, Debby ten Hove

Many studies in fields such as psychology and educational sciences obtain information about attributes of subjects through observational studies, in which raters score subjects using multiple-item rating scales. Error variance due to measurement effects, such as items and raters, attenuates the regression coefficients and lowers the power of (hierarchical) linear models. A modeling procedure is discussed to reduce the attenuation. The procedure consists of (1) an item response theory (IRT) model to map the discrete item responses to a continuous latent scale and (2) a generalizability theory (GT) model to separate the variance in the latent measurement into variance components of interest and nuisance variance components. It will be shown how measurements obtained from this mixture of IRT and GT models can be embedded in (hierarchical) linear models, either as predictor or criterion variables, such that error variance due to nuisance effects is partialled out. Using examples from the field of educational measurement, it is shown how general-purpose software can be used to implement the modeling procedure.
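
The attenuation effect itself can be illustrated with a small simulation. The sketch below is not the authors' IRT and GT procedure. It only shows the classical result that, when a predictor is measured with independent error, the expected regression slope shrinks by a factor equal to the predictor's reliability. All names, the seed, and the variance settings are assumptions for illustration.

```python
import random


def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(2024)  # fixed seed so the run is repeatable
n = 20000
beta = 1.0                 # hypothetical true regression coefficient

# True scores with variance 1; observed scores add error with variance 1.
true_x = [rng.gauss(0.0, 1.0) for _ in range(n)]
observed_x = [t + rng.gauss(0.0, 1.0) for t in true_x]
y = [beta * t + rng.gauss(0.0, 0.5) for t in true_x]

# Reliability = true variance / (true variance + error variance).
reliability = 1.0 / (1.0 + 1.0)

slope_true = ols_slope(true_x, y)          # close to beta
slope_observed = ols_slope(observed_x, y)  # close to beta * reliability

print(round(slope_true, 2), round(slope_observed, 2), reliability)
```

With half of the observed variance being error, the slope on the observed predictor comes out at roughly half of the true coefficient, which is the bias the modeling procedure in the abstract aims to reduce.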

Citations: 0
Book Review.
IF 3; Zone 2, Psychology; Q2 Mathematics; Pub Date: 2024-04-03; DOI: 10.1007/s11336-024-09958-5
F. Bartolucci, F. Pennoni
Citations: 0
Remarks from the New Editor-in-Chief.
IF 3; Zone 2, Psychology; Q2 Mathematics; Pub Date: 2024-04-03; DOI: 10.1007/s11336-024-09970-9
S. Sinharay
Citations: 0
Sociocognitive and Argumentation Perspectives on Psychometric Modeling in Educational Assessment
IF 3; Zone 2, Psychology; Q2 Mathematics; Pub Date: 2024-04-03; DOI: 10.1007/s11336-024-09966-5
Robert J. Mislevy

Rapid advances in psychology and technology open opportunities and present challenges beyond familiar forms of educational assessment and measurement. Viewing assessment through the perspectives of complex adaptive sociocognitive systems and argumentation helps us extend the concepts and methods of educational measurement to new forms of assessment, such as those involving interaction in simulation environments and automated evaluation of performances. I summarize key ideas for doing so and point to the roles of measurement models and their relation to sociocognitive systems and assessment arguments. A game-based learning assessment, SimCityEDU: Pollution Challenge!, is used to illustrate ideas.

Citations: 0
Regularized Variational Estimation for Exploratory Item Factor Analysis.
IF 2.9; Zone 2, Psychology; Q1 Mathematics, Interdisciplinary Applications; Pub Date: 2024-03-01; Epub Date: 2022-07-13; DOI: 10.1007/s11336-022-09874-6
April E Cho, Jiaying Xiao, Chun Wang, Gongjun Xu

Item factor analysis (IFA), also known as Multidimensional Item Response Theory (MIRT), is a general framework for specifying the functional relationship between respondents' multiple latent traits and their responses to assessment items. The key element in MIRT is the relationship between the items and the latent traits, the so-called item factor loading structure. The correct specification of this loading structure is crucial for accurate calibration of item parameters and recovery of individual latent traits. This paper proposes a regularized Gaussian Variational Expectation Maximization (GVEM) algorithm to efficiently infer item factor loading structure directly from data. The main idea is to impose an adaptive \(L_1\)-type penalty on the variational lower bound of the likelihood to shrink certain loadings to 0. This new algorithm takes advantage of the computational efficiency of the GVEM algorithm and is suitable for high-dimensional MIRT applications. Simulation studies show that the proposed method accurately recovers the loading structure and is computationally efficient. The new method is also illustrated using the National Education Longitudinal Study of 1988 (NELS:88) mathematics and science assessment data.
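
How an adaptive L1-type penalty drives small loadings exactly to 0 can be sketched with the generic soft-thresholding operator. This is not the GVEM update from the paper, whose details are not given in the abstract. The adaptive weights 1 / |initial estimate|^gamma, the function names, and the numbers below are all assumptions for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward 0 by threshold; values within the threshold become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def adaptive_l1_shrink(loadings, initial, lam, gamma=1.0):
    """Apply soft-thresholding with a separate penalty weight per loading.

    Each loading is penalized by lam / |initial estimate| ** gamma, so loadings
    that started out small are penalized more heavily (adaptive lasso idea).
    Assumes every initial estimate is nonzero.
    """
    shrunk = []
    for row, init_row in zip(loadings, initial):
        shrunk.append([
            soft_threshold(x, lam / abs(x0) ** gamma)
            for x, x0 in zip(row, init_row)
        ])
    return shrunk


# Hypothetical 2-item, 2-factor loading matrix with two small cross-loadings.
estimates = [[1.2, 0.05],
             [0.1, 0.9]]

sparse = adaptive_l1_shrink(estimates, estimates, lam=0.02)
print(sparse)
```

The large loadings are reduced only slightly while the small cross-loadings are set to exactly 0, which is how a penalty of this type recovers a sparse loading structure.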

Citations: 0