Psychometrika: Latest Articles

A Tutorial on Estimating the Precision of Individual Test Scores for Anyone Constructing and Using Psychological Tests.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-09 DOI: 10.1017/psy.2026.10081
Julius M Pfadt, Dylan Molenaar, Petra Hurks, Klaas Sijtsma
Citations: 0
Regularized Joint Maximum Likelihood Estimation of Latent Space Item Response Models.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-09 DOI: 10.1017/psy.2025.10068
Dylan Molenaar, Minjeong Jeon

In latent space item response models (LSIRMs), subjects and items are embedded in a low-dimensional Euclidean latent space. As such, interactions among persons and/or items can be revealed that are unmodeled in conventional item response theory models. Current estimation approach for LSIRMs is a fully Bayesian procedure with Markov Chain Monte Carlo, which is, while practical, computationally challenging, hampering applied researchers to use the models in a wide range of settings. Therefore, we propose an LSIRM based on two variants of regularized joint maximum likelihood (JML) estimation: penalized JML and constrained JML. Owing to the absence of integrals in the likelihood, the JML methods allow for various models to be fit in limited amount of time. This computational speed facilitates a practical extension of LSIRMs to ordinal data, and the possibility to select the dimensionality of the latent space using cross-validation. In this study, we derive the two JML approaches and address different issues that arise when using maximum likelihood to estimate the LSIRM. We present a simulation study demonstrating acceptable parameter recovery and adequate performance of the cross-validation procedure. In addition, we estimate different binary and ordinal LSIRMs on real datasets pertaining to deductive reasoning and personality. All methods are implemented in R package 'LSMjml' which is available from CRAN.
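
The remark that the JML likelihood contains no integrals can be made concrete with a small sketch. It assumes one common LSIRM form for binary data, logit P(y_ij = 1) = theta_i + beta_j - gamma * ||z_i - w_j||, and uses a simple ridge penalty as a stand-in for the penalized variant; the paper's exact parameterization and penalty may differ, and the function name `penalized_neg_loglik` and its arguments are hypothetical.

```python
import numpy as np

def penalized_neg_loglik(Y, theta, beta, Z, W, gamma=1.0, lam=0.1):
    """Negative penalized joint log-likelihood for a binary LSIRM (illustrative).

    Y:     (N, I) array of 0/1 responses.
    theta: (N,) person intercepts; beta: (I,) item intercepts.
    Z:     (N, D) person positions; W: (I, D) item positions.
    gamma: weight of the person-item distance (assumed fixed here).
    lam:   ridge penalty weight (hypothetical choice of penalty).
    """
    # Euclidean distance between every person and every item position
    dist = np.linalg.norm(Z[:, None, :] - W[None, :, :], axis=2)
    eta = theta[:, None] + beta[None, :] - gamma * dist
    # Bernoulli log-likelihood, y*eta - log(1 + exp(eta)), computed stably
    loglik = np.sum(Y * eta - np.logaddexp(0.0, eta))
    penalty = lam * (np.sum(theta**2) + np.sum(beta**2)
                     + np.sum(Z**2) + np.sum(W**2))
    return -loglik + penalty

# Toy data: 50 persons, 8 items, 2-dimensional latent space
rng = np.random.default_rng(1)
N, I, D = 50, 8, 2
Y = rng.integers(0, 2, size=(N, I))
value = penalized_neg_loglik(
    Y, rng.normal(size=N), rng.normal(size=I),
    rng.normal(size=(N, D)), rng.normal(size=(I, D)),
)
```

Because person parameters are treated as fixed unknowns rather than integrated out, the objective is a plain sum over persons and items, which is what makes it cheap to evaluate and to hand to a generic optimizer.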

Citations: 0
Estimating Latent Distribution of Item Response Theory Using Kernel Density Method.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-08 DOI: 10.1017/psy.2026.10080
Seewoo Li, Guemin Lee

The article proposes a new approach to estimating the latent distribution of item response theory (IRT) using kernel density estimation (KDE), particularly the solve-the-equation (STE) algorithm developed by Sheather and Jones (1991). As with existing methods, the KDE method aims to estimate the latent distribution of IRT to reduce biases in parameter estimates when the normality assumption on the latent variable is violated. Simulation studies and an empirical example confirm the robustness of algorithmic convergence of the KDE approach, and show that the KDE approach yields parameter estimates that are more accurate than or comparable to existing methods. Unlike other approaches that require multiple model fits for smoothing parameter selection, KDE requires only a single model-fitting step, substantially reducing computation time. These findings highlight KDE as a practical and efficient method for estimating latent distributions in IRT.
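
The core idea of smoothing a latent distribution with a kernel density estimate can be illustrated roughly as follows. This is a minimal sketch under several assumptions: it applies a Gaussian KDE to simulated latent values rather than inside an IRT estimation routine, and it uses Silverman's rule of thumb in place of the Sheather-Jones solve-the-equation bandwidth, which NumPy does not provide. The function names `silverman_bandwidth` and `kde_on_grid` are hypothetical.

```python
import numpy as np

def silverman_bandwidth(x):
    """Silverman's rule-of-thumb bandwidth (a stand-in for the STE bandwidth)."""
    x = np.asarray(x, dtype=float)
    sd = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    return 0.9 * min(sd, (q75 - q25) / 1.34) * x.size ** (-0.2)

def kde_on_grid(x, grid, bandwidth):
    """Gaussian kernel density estimate of x, evaluated at each grid point."""
    z = (grid[:, None] - x[None, :]) / bandwidth
    kernel_sum = np.exp(-0.5 * z**2).sum(axis=1)
    return kernel_sum / (x.size * bandwidth * np.sqrt(2.0 * np.pi))

# Simulated non-normal (bimodal) latent values, purely for illustration
rng = np.random.default_rng(0)
theta = np.concatenate([rng.normal(-1.5, 0.5, 300), rng.normal(1.0, 0.8, 700)])

grid = np.linspace(-6.0, 6.0, 121)   # quadrature-style grid of latent values
h = silverman_bandwidth(theta)
density = kde_on_grid(theta, grid, h)
weights = density / density.sum()    # normalized weights on the grid
```

The bandwidth is computed once from the data, which mirrors the abstract's point that no repeated model fits are needed to choose the smoothing parameter.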

Citations: 0
Constructive Q-Matrix Identifiability via Novel Tensor Unfolding.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-06 DOI: 10.1017/psy.2025.10078
Yuqi Gu

This work establishes a new identifiability theory for a cornerstone of various cognitive diagnostic models (CDMs) popular in psychometrics: the Q-matrix. The key idea is a novel tensor-unfolding proof strategy. Representing the joint distribution of J categorical responses as a J-way tensor, we strategically unfold the tensor into matrices in multiple ways and use their rank properties to identify the unknown Q-matrix. This approach departs fundamentally from all prior identifiability analyses in CDMs. Our proof is constructive, elucidating a population-level procedure to exactly recover the Q-matrix within a parameter space where each latent attribute is measured by at least two "pure" items that solely measure this attribute. The theory has several desirable features: it can constructively identify both the Q-matrix and the number of latent attributes; it applies to broad classes of linear and nonlinear CDMs with main or all saturated effects of attributes; and it accommodates polytomous responses, extending beyond classical binary response settings. The new identifiability result unifies and strengthens identifiability guarantees across diverse CDMs. It provides rigorous theoretical foundations and indicates a future pathway toward using tensor unfolding for practical Q-matrix estimation.
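
The mechanics of representing a joint response distribution as a J-way tensor, unfolding it into a matrix, and reading off its rank can be sketched as below. The example assumes a toy latent class model with two classes and four binary items, not a CDM with a Q-matrix, so it only illustrates the unfolding operation and is not the paper's proof strategy. The helper names `joint_tensor` and `unfold` are hypothetical.

```python
import numpy as np

def joint_tensor(class_probs, item_probs):
    """Joint pmf of J binary items under a toy latent class model.

    class_probs: shape (K,), mixing weights of the K latent classes.
    item_probs:  shape (K, J), P(item j = 1 | class k).
    Returns a J-way tensor of shape (2,) * J whose entries sum to 1.
    """
    K, J = item_probs.shape
    tensor = np.zeros((2,) * J)
    for k in range(K):
        component = np.array(class_probs[k])
        for j in range(J):
            p = item_probs[k, j]
            # each outer product appends one tensor mode (one item)
            component = np.multiply.outer(component, np.array([1.0 - p, p]))
        tensor += component
    return tensor

def unfold(tensor, row_modes):
    """Unfold a tensor into a matrix: row_modes index rows, the rest index columns."""
    col_modes = [m for m in range(tensor.ndim) if m not in row_modes]
    permuted = np.transpose(tensor, list(row_modes) + col_modes)
    n_rows = int(np.prod([tensor.shape[m] for m in row_modes]))
    return permuted.reshape(n_rows, -1)

class_probs = np.array([0.4, 0.6])
item_probs = np.array([[0.90, 0.80, 0.70, 0.85],
                       [0.20, 0.30, 0.10, 0.25]])

T = joint_tensor(class_probs, item_probs)    # shape (2, 2, 2, 2)
M = unfold(T, row_modes=[0, 1])              # items 1-2 as rows, items 3-4 as columns
rank = np.linalg.matrix_rank(M, tol=1e-10)
```

In this toy setting the 4 x 4 unfolding is a sum of two rank-one matrices, one per latent class, so its rank reflects the number of classes; the paper exploits rank properties of this kind across several unfoldings.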

Citations: 0
A RECURSIVE STOCHASTIC ALGORITHM FOR REAL-TIME ONLINE PARAMETER ESTIMATION IN ITEM RESPONSE THEORY: ENHANCING COMPUTATIONAL EFFICIENCY FOR DYNAMIC EDUCATIONAL ASSESSMENT.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-23 DOI: 10.1017/psy.2025.10064
Sainan Xu, Jing Lu, Jiwei Zhang
Citations: 0
Plausible and Proper Multiple-Choice Items for Diagnostic Classification.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-19 DOI: 10.1017/psy.2025.10074
Chia-Yi Chiu, Hans Friedrich Koehn, Yu Wang
Citations: 0
Psychometric Model Framework for Multiple Response Items.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-19 DOI: 10.1017/psy.2025.10073
Wenjie Zhou, Lei Guo

Multiple response (MR) items, such as multiple true-false, multiple-select, and select-N items, are increasingly used in assessments to identify partial knowledge and differentiate latent abilities more accurately. Allowing multiple selections, MR items provide richer information and reduce guessing effects compared to single-answer multiple-choice items. However, traditional scoring methods (e.g., Dichotomous, Ripkey, Partial scoring) compress response combination (RC) data, losing valuable information and ignoring issues like local dependence and incompatibility across item types. To address these challenges, we introduce a novel psychometric model framework: the Multiple Response Model with Inter-option Local Dependencies (MRM-LD), and its simplified version, the Multiple Response Model (MRM). These models preserve RC data across MR item types, offering a more comprehensive understanding for MR assessment. Parameters for MRM-LD and MRM were estimated using Markov chain Monte Carlo algorithms in Stan and R. Empirical data from an eighth-grade physics test showed that MRM-LD and MRM outperform Graded Response Model and Nominal Response Model combined with three scoring methods, by retaining more test information, improving reliability and validity, and providing more detailed analysis of item characteristics. Simulation studies confirmed the proposed models perform robustly under various conditions, including small samples and few items, demonstrating their applicability across diverse testing scenarios.

Citations: 0
Reducing Differential Item Functioning via Process Data.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-10 DOI: 10.1017/psy.2025.10072
Ling Chen, Susu Zhang, Jingchen Liu

Test fairness is a major concern in psychometric and educational research. A typical approach for ensuring test fairness is through differential item functioning (DIF) analysis. DIF arises when a test item functions differently across subgroups that are typically defined by the respondents' demographic characteristics. Most of the existing research focuses on the statistical detection of DIF, yet less attention has been given to reducing or eliminating DIF. Simultaneously, the use of computer-based assessments has become increasingly popular. The data obtained from respondents interacting with an item are recorded in computer log files and are referred to as process data. In this article, we propose a novel method within the framework of generalized linear models that leverages process data to reduce and understand DIF. Specifically, we construct a nuisance trait surrogate with the features extracted from process data. With the constructed nuisance trait, we introduce a new scoring rule that incorporates respondents' behaviors captured through process data on top of the target latent trait. We demonstrate the efficiency of our approach through extensive simulation experiments and an application to 13 Problem Solving in Technology-Rich Environments items from the 2012 Programme for the International Assessment of Adult Competencies assessment.

Citations: 0
Bayesian Selection Policies for Human-in-the-Loop Anomaly Detectors with Applications in Test Security.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-10 DOI: 10.1017/psy.2025.10056
Michael Fauss, Xiang Liu, Chen Li, Ikkyu Choi, H Vincent Poor

This article investigates the problem of automatically flagging test takers who exhibit atypical responses or behaviors for further review by human experts. The objective is to develop a selection policy that maximizes the expected number of test takers correctly identified as warranting additional scrutiny while maintaining a manageable volume of reviews per test administration. The selection procedure should learn from the outcomes of the expert reviews. Since typically only a fraction of test takers are reviewed, this leads to a semi-supervised learning problem. The latter is formalized in a Bayesian setting, and the corresponding optimal selection policy is derived. Since calculating the policy and the underlying posterior distributions is computationally infeasible, a variational approximation and three heuristic selection policies are proposed. These policies are informed by properties of the optimal policy and correspond to different exploration/exploitation trade-offs. The performance of the approximate policies is assessed via numerical experiments using both synthetic and real-world data and is compared with procedures based on off-the-shelf algorithms as well as theoretical performance bounds.

Citations: 0
SELF-Tree: An Interpretable Model for Multivariate Causal Direction Heterogeneity Analysis.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-10 DOI: 10.1017/psy.2025.10067
Zhifei Li, Hongbo Wen
Citations: 0