
Psychometrika: Latest Articles

Reducing Differential Item Functioning via Process Data.
IF 3.1; Zone 2, Psychology; Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-12-10; DOI: 10.1017/psy.2025.10072
Ling Chen, Susu Zhang, Jingchen Liu

Test fairness is a major concern in psychometric and educational research. A typical approach for ensuring test fairness is through differential item functioning (DIF) analysis. DIF arises when a test item functions differently across subgroups that are typically defined by the respondents' demographic characteristics. Most of the existing research focuses on the statistical detection of DIF, yet less attention has been given to reducing or eliminating DIF. Simultaneously, the use of computer-based assessments has become increasingly popular. The data obtained from respondents interacting with an item are recorded in computer log files and are referred to as process data. In this article, we propose a novel method within the framework of generalized linear models that leverages process data to reduce and understand DIF. Specifically, we construct a nuisance trait surrogate with the features extracted from process data. With the constructed nuisance trait, we introduce a new scoring rule that incorporates respondents' behaviors captured through process data on top of the target latent trait. We demonstrate the efficiency of our approach through extensive simulation experiments and an application to 13 Problem Solving in Technology-Rich Environments items from the 2012 Programme for the International Assessment of Adult Competencies assessment.
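
The definition of DIF given in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it assumes a two-parameter logistic item with a hypothetical group shift `gamma`, and the function name `p_correct` and all parameter values are invented for the example. It shows two respondents with the same latent trait but different group membership receiving different success probabilities.

```python
import math

def p_correct(theta, a, b, gamma=0.0, group=0):
    """Probability of a correct response under a 2PL-style item.

    `gamma` is a hypothetical uniform-DIF shift that applies only to the
    focal group (group == 1); gamma == 0 means the item is DIF-free.
    """
    logit = a * (theta - b) + gamma * group
    return 1.0 / (1.0 + math.exp(-logit))

# Same ability, different group membership (illustrative values only).
theta = 0.0
ref = p_correct(theta, a=1.2, b=0.0, gamma=-0.5, group=0)
foc = p_correct(theta, a=1.2, b=0.0, gamma=-0.5, group=1)
dif_gap = ref - foc  # a nonzero gap at equal ability is what DIF means
print(f"reference={ref:.3f} focal={foc:.3f} gap={dif_gap:.3f}")
```

With these made-up values the focal group is about 0.12 less likely to answer correctly at the same ability level. Setting `gamma=0.0` makes the gap vanish.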

Citations: 0
Bayesian Selection Policies for Human-in-the-Loop Anomaly Detectors with Applications in Test Security.
IF 3.1; Zone 2, Psychology; Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-12-10; DOI: 10.1017/psy.2025.10056
Michael Fauss, Xiang Liu, Chen Li, Ikkyu Choi, H Vincent Poor

This article investigates the problem of automatically flagging test takers who exhibit atypical responses or behaviors for further review by human experts. The objective is to develop a selection policy that maximizes the expected number of test takers correctly identified as warranting additional scrutiny while maintaining a manageable volume of reviews per test administration. The selection procedure should learn from the outcomes of the expert reviews. Since typically only a fraction of test takers are reviewed, this leads to a semi-supervised learning problem. The latter is formalized in a Bayesian setting, and the corresponding optimal selection policy is derived. Since calculating the policy and the underlying posterior distributions is computationally infeasible, a variational approximation and three heuristic selection policies are proposed. These policies are informed by properties of the optimal policy and correspond to different exploration/exploitation trade-offs. The performance of the approximate policies is assessed via numerical experiments using both synthetic and real-world data and is compared with procedures based on off-the-shelf algorithms as well as theoretical performance bounds.

Citations: 0
SELF-Tree: An Interpretable Model for Multivariate Causal Direction Heterogeneity Analysis.
IF 3.1; Zone 2, Psychology; Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-12-10; DOI: 10.1017/psy.2025.10067
Zhifei Li, Hongbo Wen

Identifying causal directions among variables via data-driven approaches is a research hotspot. Researchers now focus on detecting causal direction heterogeneity among multiple (more than two) variables when covariates cause such heterogeneity. This study combines the structural equation likelihood function (SELF) method with a recursive partitioning method to obtain an interpretable model of causal direction heterogeneity in multivariable settings. Through simulation, we compared the performance of the SELF-Tree model in identifying heterogeneous causal directions under different conditions. Using a public drug consumption dataset, we demonstrated its application to real data. The SELF-Tree model offers researchers a new way to understand heterogeneity in causal direction among variables.

Citations: 0
The bit scale: A metric score scale for unidimensional item response theory models.
IF 3.1; Zone 2, Psychology; Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-12-10; DOI: 10.1017/psy.2025.10071
Joakim Wallmark, Marie Wiberg
Citations: 0
Bayesian Joint Modeling of Response Times with Dynamic Latent Ability in Educational Testing.
IF 3.1; Zone 2, Psychology; Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-12-02; DOI: 10.1017/psy.2025.10019
Xiaojing Wang, Abhisek Saha, Dipak K Dey

In educational testing, inferences about ability have been based mainly on item responses, while the time taken to complete an item is often ignored. To better infer ability, a new class of state space models, which jointly model response times and time series of dichotomous responses, is developed. Simulations for the proposed models demonstrate that the bias of ability estimation is reduced and its precision is improved. An empirical study is conducted using EdSphere datasets, where two competing relationships (i.e., monotone and inverted U-shape) for the distance between ability and difficulty are investigated in modeling response times. The model comparison results indicate that the inverted U-shape relationship better captures the behavior and psychology of examinees in the EdSphere datasets.

Citations: 0
Robust Estimation of Polychoric Correlation.
IF 3.1; Zone 2, Psychology; Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-12-01; DOI: 10.1017/psy.2025.10066
Max Welz, Patrick Mair, Andreas Alfons

Polychoric correlation is often an important building block in the analysis of rating data, particularly for structural equation models. However, the commonly employed maximum likelihood (ML) estimator is highly susceptible to misspecification of the polychoric correlation model, for instance, through violations of latent normality assumptions. We propose a novel estimator that is designed to be robust against partial misspecification of the polychoric model, that is, when the model is misspecified for an unknown fraction of observations, such as careless respondents. To this end, the estimator minimizes a robust loss function based on the divergence between observed frequencies and theoretical frequencies implied by the polychoric model. In contrast to existing literature, our estimator makes no assumption on the type or degree of model misspecification. It furthermore generalizes ML estimation, is consistent as well as asymptotically normally distributed, and comes at no additional computational cost. We demonstrate the robustness and practical usefulness of our estimator in simulation studies and an empirical application on a Big Five administration. In the latter, the polychoric correlation estimates of our estimator and ML differ substantially, which, after further inspection, is likely due to the presence of careless respondents that the estimator helps identify.
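
For orientation, here is a minimal sketch of the conventional two-step ML estimator that the abstract treats as the baseline. It is not the robust estimator proposed in the paper. The sketch assumes latent bivariate normality; the function names (`bvn_cdf`, `thresholds`, `golden_max`, `polychoric_ml`), the Simpson integration, the golden-section search, and the example table are all choices made for this illustration, using only the Python standard library.

```python
import math
from statistics import NormalDist

N = NormalDist()  # standard normal

def bvn_cdf(a, b, rho, steps=400):
    """P(X <= a, Y <= b) for a standard bivariate normal with correlation rho.

    Simpson's rule on the integral over x in [-8, a] of
    phi(x) * Phi((b - rho * x) / sqrt(1 - rho^2)).
    """
    lo = -8.0
    if a <= lo:
        return 0.0
    a = min(a, 8.0)
    s = math.sqrt(1.0 - rho * rho)
    h = (a - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * N.pdf(x) * N.cdf((b - rho * x) / s)
    return total * h / 3.0

def thresholds(margin_counts):
    """Step 1: thresholds from cumulative marginal proportions."""
    n, cum, out = sum(margin_counts), 0, [-math.inf]
    for c in margin_counts[:-1]:
        cum += c
        out.append(N.inv_cdf(cum / n))
    return out + [math.inf]

def golden_max(f, lo, hi, tol=1e-4):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = f(d)
        else:
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = f(c)
    return (lo + hi) / 2

def polychoric_ml(table):
    """Step 2: maximize the multinomial log-likelihood over rho."""
    tx = thresholds([sum(r) for r in table])
    ty = thresholds([sum(c) for c in zip(*table)])

    def loglik(rho):
        F = [[bvn_cdf(a, b, rho) for b in ty] for a in tx]
        ll = 0.0
        for i, row in enumerate(table):
            for j, n_ij in enumerate(row):
                p = F[i + 1][j + 1] - F[i][j + 1] - F[i + 1][j] + F[i][j]
                ll += n_ij * math.log(max(p, 1e-12))  # guard against log(0)
        return ll

    return golden_max(loglik, -0.99, 0.99)

# Made-up 3x3 contingency table of two rating items.
table = [[40, 15, 5], [15, 30, 15], [5, 15, 40]]
rho_hat = polychoric_ml(table)
print(f"ML polychoric correlation: {rho_hat:.3f}")
```

Because every cell contributes to the log-likelihood in proportion to its count, a block of careless responses concentrated in a few cells can pull `rho_hat` away from the value that describes the attentive respondents, which is the sensitivity the abstract describes.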

Citations: 0
A Generalized Definition of Multidimensional Item Response Theory Parameters.
IF 3.1; Zone 2, Psychology; Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-11-19; DOI: 10.1017/psy.2025.10063
Daniel Morillo-Cuadrado, Mario Luzardo-Verde

In this paper, we generalize the multidimensional discrimination and difficulty parameters in the multidimensional two-parameter logistic model to account for nonidentity latent covariances and negatively keyed items. We apply Reckase's maximum discrimination point method to define them in an arbitrary algebraic basis. Then, we define that basis to be a geometrical representation of the measured construct. This results in three different versions of the parameters: the original one, based on the item parameters solely; one that incorporates the covariance structure of the latent space; and one that uses the correlation structure instead. Importantly, we find that the items should be properly represented in a test space, distinct from the latent space. We also provide a procedure for the geometrical representation of the items in the test space and apply our results to examples from the literature to get a more accurate representation of the measurement properties of the items. We recommend using the covariance structure version for describing the properties of the parameters and the correlation structure version for graphical representation. Finally, we discuss the implications of this generalization for other multidimensional item response theory models and the parallels of our results in common factor model theory.
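
As a point of reference, the "original" version of the parameters mentioned above (based on the item parameters alone) can be sketched as follows. The sketch assumes the usual slope-intercept form of the multidimensional two-parameter logistic model, with logit a·θ + d, and the standard definitions from Reckase's maximum discrimination point method. The function name `reckase_params` and the item values are hypothetical. The covariance-based and correlation-based versions proposed in the paper are not reproduced here, since the abstract does not give their formulas.

```python
import math

def reckase_params(a, d):
    """Original (identity-covariance) multidimensional item parameters.

    a: list of discrimination parameters, one per latent dimension
    d: intercept of the item, so that logit = sum(a_k * theta_k) + d
    Returns (MDISC, MDIFF, direction angles in degrees).
    """
    mdisc = math.sqrt(sum(ak * ak for ak in a))   # length of the a-vector
    mdiff = -d / mdisc                            # signed distance from origin
    angles = [math.degrees(math.acos(ak / mdisc)) for ak in a]
    return mdisc, mdiff, angles

# Illustrative two-dimensional item (values are made up).
mdisc, mdiff, angles = reckase_params(a=[1.2, 0.9], d=-0.75)
print(f"MDISC={mdisc:.3f} MDIFF={mdiff:.3f} angles={[round(x, 2) for x in angles]}")
```

The angles give the direction in the latent space along which the item discriminates best. A negatively keyed item (some `a_k < 0`) yields an angle above 90 degrees on that axis, which is one of the cases the generalized definition is meant to handle.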

Citations: 0
Two Markov Solution Process Models for the Assessment of Planning in Problem Solving.
IF 3.1; Zone 2, Psychology; Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-11-13; DOI: 10.1017/psy.2025.10042
Andrea Brancaccio, Debora de Chiusole, Ottavia M Epifania, Pasquale Anselmi, Matilde Spinoso, Noemi Mazzoni, Alice Bacherini, Matteo Orsoni, Sara Giovagnoli, Irene Pierluigi, Mariagrazia Benassi, Giulia Balboni, Luca Stefanutti

Tower tasks are popular tools used to measure planning skills. The sequences of moves undertaken by the respondents in solving tower tasks might provide important and useful information to shed light on their planning skills. The article focuses on the distinction between a situation where planning occurs before action (pre-planning) and one where planning and action are interlaced throughout the execution of the task (interim-planning). While a model for pre-planning was already developed by Stefanutti et al. (2021), an alternative model for interim-planning is proposed here. The two models are compared in an empirical study. In accordance with the literature on the development of planning skills, the pre-planning model better fits data collected on individuals aged 14 and older, while the interim-planning model better fits data collected on individuals aged 4-8. This result is further corroborated by an analysis of time performance.

Citations: 0
Multidimensional Generalized Partial Preference Model for Forced-Choice Items.
IF 3.1; Zone 2, Psychology; Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-11-13; DOI: 10.1017/psy.2025.10054
Daniel C Furr, Jianbin Fu

A ranking pattern approach is proposed to build item response theory (IRT) models for forced-choice (FC) items. This new approach is an addition to the two existing approaches, sequential selection and Thurstone's law of pairwise comparison. A new dominance IRT model, the multidimensional generalized partial preference model (MGPPM), is proposed for FC items with any number (greater than 1) of statements. Maximum marginal likelihood estimation using an expectation-maximization algorithm (MML-EM) and Markov chain Monte Carlo (MCMC) estimation are developed. A simulation study is conducted to show satisfactory parameter recovery on triplet and tetrad data. The relationships between the newly proposed approach/model and the existing approaches/models are described, and the MGPPM, Thurstonian IRT (TIRT) model, and Triplet-2PLM are compared when applied to simulated and real triplet data. The new approach offers more flexible IRT modeling than the other two approaches under different assumptions, and the MGPPM is more statistically elegant than the TIRT and Triplet-2PLM.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12805200/pdf/
Citations: 0
Generative Adversarial Networks for High-Dimensional Item Factor Analysis: A Deep Adversarial Learning Algorithm.
IF 3.1; Zone 2, Psychology; Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2025-11-11; DOI: 10.1017/psy.2025.10059
Nanyu Luo, Feng Ji

Advances in deep learning and representation learning have transformed item factor analysis (IFA) in the item response theory (IRT) literature by enabling more efficient and accurate parameter estimation. Variational autoencoders (VAEs) are widely used to model high-dimensional latent variables in this context, but the limited expressiveness of their inference networks can still hinder performance. We introduce adversarial variational Bayes (AVB) and an importance-weighted extension (IWAVB) as more flexible inference algorithms for IFA. By combining VAEs with generative adversarial networks (GANs), AVB uses an auxiliary discriminator network to frame estimation as a two-player game and removes the restrictive standard normal assumption on the latent variables. Theoretically, AVB and IWAVB can achieve likelihoods that match or exceed those of VAEs and importance-weighted autoencoders (IWAEs). In exploratory analyses of empirical data, IWAVB attained higher likelihoods than IWAE, indicating greater expressiveness. In confirmatory simulations, IWAVB achieved comparable mean-square error in parameter recovery while consistently yielding higher likelihoods, and it clearly outperformed IWAE when the latent distribution was multimodal. These findings suggest that IWAVB can scale IFA to complex, large-scale, and potentially multimodal settings, supporting closer integration of psychometrics with modern multimodal data analysis.

Citations: 0