
British Journal of Mathematical & Statistical Psychology: Latest Articles

Inferences of associated latent variables by the observable test scores
IF 1.8 | Zone 3 (Psychology) | Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-06-18 | DOI: 10.1111/bmsp.70002
Rudy Ligtvoet

Test scores, like the sum score, can be useful for making inferences about the latent variables. The conditions under which such test scores allow for inferences of the latent variables based on a “weaker” stochastic ordering are generalized to any monotone latent variable model for which the latent variables are associated. The generality of these conditions places the sum score, or indeed any test score, well beyond a mere intuitive measure or a relic from classical test theory.

Volume 79, Issue 1, pp. 139-145 | PDF: https://bpspsychub.onlinelibrary.wiley.com/doi/epdf/10.1111/bmsp.70002
Citations: 0
Testing the validity of instrumental variables in just-identified linear non-Gaussian models
IF 1.8 | Zone 3 (Psychology) | Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-06-16 | DOI: 10.1111/bmsp.70000
Wolfgang Wiedermann, Dexin Shi

Instrumental variable (IV) estimation constitutes a powerful quasi-experimental tool to estimate causal effects in observational data. The IV approach, however, rests on two crucial assumptions—the instrument relevance assumption and the exclusion restriction assumption. The latter requirement (stating that the IV is not allowed to be related to the outcome via any path other than the one going through the predictor), cannot be empirically tested in just-identified models (i.e. models with as many IVs as predictors). The present study introduces properties of non-Gaussian IV models which enable one to test whether hidden confounding between an IV and the outcome is present. Detecting exclusion restriction violations due to a direct path between the IV and the outcome, however, is restricted to the over-identified case. Based on these insights, a two-step approach is presented to test IV validity against hidden confounding in just-identified models. The performance of the approach was evaluated using Monte-Carlo simulation experiments. An empirical example from psychological research is given to illustrate the approach in practice. Recommendations for best-practice applications and future research directions are discussed. Although the current study presents important insights for developing diagnostic procedures for IV models, sound universal IV validation in the just-identified case remains a challenging task.
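
As a rough illustration of the just-identified setting described above (one instrument, one predictor), the sketch below simulates non-Gaussian data with a hidden confounder between predictor and outcome and compares the naive regression slope with the simple IV (Wald-type) estimate. It does not implement the two-step validity test proposed in the paper; the coefficients, the uniform distributions, and all function names are assumptions chosen for the example.

```python
import random


def covariance(a, b):
    """Sample covariance of two equally long lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def iv_estimate(z, x, y):
    """Just-identified IV estimate: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)


def ols_estimate(x, y):
    """Naive regression slope of y on x (biased under hidden confounding)."""
    return covariance(x, y) / covariance(x, x)


rng = random.Random(0)      # fixed seed so the run is repeatable
true_effect = 0.5           # hypothetical causal effect of x on y
z, x, y = [], [], []
for _ in range(20000):
    zi = rng.uniform(-1, 1)                  # instrument (non-Gaussian)
    ui = rng.uniform(-1, 1)                  # hidden confounder of x and y
    xi = 0.8 * zi + ui + rng.uniform(-1, 1)  # predictor depends on z and u
    yi = true_effect * xi + ui + rng.uniform(-1, 1)  # z affects y only via x
    z.append(zi)
    x.append(xi)
    y.append(yi)

iv = iv_estimate(z, x, y)
ols = ols_estimate(x, y)
print("IV estimate:", round(iv, 3))
print("OLS estimate:", round(ols, 3))
```

In this simulated setup the instrument is valid by construction, so the IV estimate should land near the assumed effect while the naive slope is pulled upward by the confounder. With real data the exclusion restriction is exactly what cannot be taken for granted.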

Volume 79, Issue 1, pp. 111-138
Citations: 0
New developments in experience sampling methodology
IF 1.8 | Zone 3 (Psychology) | Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-06-10 | DOI: 10.1111/bmsp.12398
Francis Tuerlinckx, Peter Kuppens, Sigert Ariens, Leonie Cloos, Egon Dejonckheere, Ginette Lafit, Koen Niemeijer, Jordan Revol, Evelien Schat, Marieke Schreuder, Niels Vanhasbroeck, Eva Ceulemans

Experience Sampling Methodology (ESM) has been widely used over the past decades to study feelings, behaviour and thoughts as they occur in daily life. Typically, participants complete several assessments per day via a smartphone for multiple days. The growing adoption of ESM has spurred a number of methodological advancements. In this paper, we provide an overview of recent developments in ESM design, statistical analysis and implementation. In terms of design, we discuss considerations around what to measure—including the reliability and validity of self-report measures as well as mobile sensing—as well as when to measure, where we focus on the pros and cons of burst designs and advances in sample size planning methodology. Regarding statistical analysis, we highlight non-linear models, survival analysis for understanding time-to-event data and real-time monitoring of ESM time series. At the implementation level, we address open science practices and advances in data preprocessing. Although most of the topics discussed in this paper are generic, many of the examples are focused on the study of affect in daily life.

Volume 79, Issue 1, pp. 46-65 | PDF: https://bpspsychub.onlinelibrary.wiley.com/doi/epdf/10.1111/bmsp.12398
Citations: 0
Keeping Elo alive: Evaluating and improving measurement properties of learning systems based on Elo ratings
IF 1.8 | Zone 3 (Psychology) | Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-06-06 | DOI: 10.1111/bmsp.12395
Maria Bolsinova, Bence Gergely, Matthieu J. S. Brinkhuis

The Elo Rating System which originates from competitive chess has been widely utilised in large-scale online educational applications where it is used for on-the-fly estimation of ability, item calibration, and adaptivity. In this paper, we aim to critically analyse the shortcomings of the Elo rating system in an educational context, shedding light on its measurement properties and when these may fall short in accurately capturing student abilities and item difficulties. In a simulation study, we look at the asymptotic properties of the Elo rating system. Our results show that the Elo ratings are generally not unbiased and their variances are context-dependent. Furthermore, in scenarios where items are selected adaptively based on the current ratings and the item difficulties are updated alongside the student abilities, the variance of the ratings across items and students artificially increases over time and as a result the ratings do not converge. We propose a solution to this problem which entails using two parallel chains of ratings which remove the dependence of item selection on the current errors in the ratings.
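
For readers unfamiliar with how Elo ratings are typically adapted to learning systems, the following is a minimal sketch of one common form of the update: a logistic expected score and a symmetric adjustment of student ability and item difficulty. The step size `k` and the function names are assumptions made for illustration; the paper's exact variant and its two-chain proposal are not reproduced here.

```python
import math


def expected_correct(ability, difficulty):
    """Logistic probability that the student answers the item correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def elo_update(ability, difficulty, correct, k=0.3):
    """Return updated (ability, difficulty) after one response.

    correct is 1 for a right answer and 0 for a wrong one;
    k is a hypothetical step size.
    """
    surprise = correct - expected_correct(ability, difficulty)
    # The student moves in the direction of the surprise, the item moves against it.
    return ability + k * surprise, difficulty - k * surprise


ability, difficulty = 0.0, 0.0
ability, difficulty = elo_update(ability, difficulty, correct=1)
print(ability, difficulty)  # 0.15 -0.15
```

Because both ratings are updated from the same noisy response, errors in one feed into the other, which is the kind of dependence the abstract points to when items are also selected adaptively.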

Volume 79, Issue 1, pp. 95-110 | PDF: https://bpspsychub.onlinelibrary.wiley.com/doi/epdf/10.1111/bmsp.12395
Citations: 0
Modelling non-linear psychological processes: Reviewing and evaluating non-parametric approaches and their applicability to intensive longitudinal data
IF 1.5 | Zone 3 (Psychology) | Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-05-30 | DOI: 10.1111/bmsp.12397
Jan I Failenschmid, Leonie V D E Vogelsmeier, Joris Mulder, Joran Jongerling

Psychological concepts are increasingly understood as complex dynamic systems that change over time. To study these complex systems, researchers are increasingly gathering intensive longitudinal data (ILD), revealing non-linear phenomena such as asymptotic growth, mean-level switching, and regulatory oscillations. However, psychological researchers currently lack advanced statistical methods that are flexible enough to capture these non-linear processes accurately, which hinders theory development. While methods such as local polynomial regression, Gaussian processes and generalized additive models (GAMs) exist outside of psychology, they are rarely applied within the field because they have not yet been reviewed accessibly and evaluated within the context of ILD. To address this important gap, this article introduces these three methods for an applied psychological audience. We further conducted a simulation study, which demonstrates that all three methods infer non-linear processes that have been found in ILD more accurately than polynomial regression. Particularly, GAMs closely captured the underlying processes, performing almost as well as the data-generating parametric models. Finally, we illustrate how GAMs can be applied to explore idiographic processes and identify potential phenomena in ILD. This comprehensive analysis empowers psychological researchers to model non-linear processes accurately and select a method that aligns with their data and research goals.
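
To give a feel for the simplest of the three methods named above, here is a small sketch of local linear regression (a degree-1 local polynomial) with a Gaussian kernel, applied to a noise-free asymptotic growth curve. The bandwidth, the test curve, and the function name are assumptions for illustration; this is not the authors' simulation code, and GAMs or Gaussian processes would need a dedicated library.

```python
import math


def local_linear_fit(xs, ys, x0, bandwidth):
    """Local linear estimate of the curve at x0 using Gaussian kernel weights."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    # Weighted least-squares slope around the weighted means.
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return mean_y + slope * (x0 - mean_x)


# Hypothetical asymptotic growth process observed at 101 time points.
times = [0.1 * i for i in range(101)]
values = [1.0 - math.exp(-t) for t in times]

estimate = local_linear_fit(times, values, x0=5.0, bandwidth=0.5)
print("estimate at t=5:", round(estimate, 4))
print("true value at t=5:", round(1.0 - math.exp(-5.0), 4))
```

The bandwidth controls the trade-off between smoothness and local fidelity; in practice it would be chosen by cross-validation or a similar criterion rather than fixed by hand.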

Citations: 0
A path signature perspective of process data feature extraction
IF 1.8 | Zone 3 (Psychology) | Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-05-26 | DOI: 10.1111/bmsp.12390
Xueying Tang, Jingchen Liu, Zhiliang Ying

Computer-based interactive items have become prevalent in recent educational assessments. In such items, the entire human-computer interactive process is recorded in a log file and is known as the response process. These data are noisy, diverse, and in a nonstandard format. Several feature extraction methods have been developed to overcome the difficulties in process data analysis. However, these methods often focus on the action sequence and ignore the time sequence in response processes. In this paper, we introduce a new feature extraction method that incorporates the information in both the action sequence and the response time sequence. The method is based on the concept of path signature from stochastic analysis. We apply the proposed method to both simulated data and real response process data from PIAAC. A prediction framework is used to show that taking time information into account provides a more comprehensive understanding of respondents' behaviors.
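
The path signature mentioned above can be illustrated with its first two levels for a piecewise-linear path: level 1 is the total increment in each coordinate, and level 2 collects the iterated integrals, accumulated segment by segment via Chen's identity. How the paper encodes actions and timestamps as a path is not stated in the abstract, so the (time, action count) path below is purely a hypothetical encoding.

```python
def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    path is a list of points, each a sequence of d coordinates.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for start, end in zip(path, path[1:]):
        step = [e - s for s, e in zip(start, end)]
        # Chen's identity for appending one straight segment:
        # new level 2 = old level 2 + (old level 1) x step + 0.5 * step x step
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * step[j] + 0.5 * step[i] * step[j]
        for i in range(d):
            level1[i] += step[i]
    return level1, level2


# Hypothetical response process: (elapsed seconds, number of actions so far).
process = [(0.0, 0), (2.0, 1), (2.5, 2), (6.0, 3)]
level1, level2 = signature_level2(process)
print("level 1:", level1)
print("level 2:", level2)
```

The off-diagonal level-2 terms are sensitive to the order in which time and actions advance, which is why signature features can separate a respondent who acts early from one who acts late even when the totals match.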

Volume 78, Issue 3, pp. 939-964
Citations: 0
A supervised learning approach to estimating IRT models in small samples
IF 1.8 | Zone 3 (Psychology) | Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-05-15 | DOI: 10.1111/bmsp.12396
Dmitry I. Belov, Oliver Lüdtke, Esther Ulitzsch

Existing estimators of parameters of item response theory (IRT) models exploit the likelihood function. In small samples, however, the IRT likelihood oftentimes contains little informative value, potentially resulting in biased and/or unstable parameter estimates and large standard errors. To facilitate small-sample IRT estimation, we introduce a novel approach that does not rely on the likelihood. Our estimation approach derives features from response data and then maps the features to item parameters using a neural network (NN). We describe and evaluate our approach for the three-parameter logistic model; however, it is applicable to any model with an item characteristic curve. Three types of NNs are developed, supporting the obtainment of both point estimates and confidence intervals for IRT model parameters. The results of a simulation study demonstrate that these NNs perform better than Bayesian estimation using Markov chain Monte Carlo methods in terms of the quality of the point estimates and confidence intervals while also being much faster. These properties facilitate (1) pretesting items in a real-time testing environment, (2) pretesting more items and (3) pretesting items only in a secured environment to eradicate possible compromise of new items in online testing.
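
The three-parameter logistic model referred to above has a standard item characteristic curve, sketched below together with a small simulated sample and one simple summary of the responses (proportion correct). The parameter values and the choice of summary are assumptions for illustration only; the features and neural networks actually used in the paper are not described in the abstract and are not reproduced here.

```python
import math
import random


def icc_3pl(theta, a, b, c):
    """3PL probability of a correct response.

    a: discrimination, b: difficulty, c: lower asymptote (guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


rng = random.Random(1)
a, b, c = 1.2, 0.5, 0.2                              # hypothetical item parameters
abilities = [rng.gauss(0.0, 1.0) for _ in range(50)]  # a deliberately small sample
responses = [1 if rng.random() < icc_3pl(t, a, b, c) else 0 for t in abilities]

# One simple response-based summary that could serve as an input feature.
proportion_correct = sum(responses) / len(responses)
print("P(correct) at theta = b:", icc_3pl(b, a, b, c))
print("proportion correct in sample:", proportion_correct)
```

At `theta = b` the curve sits halfway between the guessing level `c` and 1, which is a quick sanity check when working with this parametrization.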

Volume 79, Issue 1, pp. 66-94
Citations: 0
A novel nonvisual procedure for screening for nonstationarity in time series as obtained from intensive longitudinal designs
IF 1.5 | Zone 3 (Psychology) | Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-04-25 | DOI: 10.1111/bmsp.12394
Steffen Zitzmann, Christoph Lindner, Julian F Lohmann, Martin Hecht

Researchers working with intensive longitudinal designs often encounter the challenge of determining whether to relax the assumption of stationarity in their models. Given that these designs typically involve data from a large number of subjects ($N \gg 1$), visual screening all time series can quickly become tedious. Even when conducted by experts, such screenings can lack accuracy. In this article, we propose a nonvisual procedure that enables fast and accurate screening. This procedure has potential to become a widely adopted approach for detecting nonstationarity and guiding model building in psychology and related fields, where intensive longitudinal designs are used and time series data are collected.

Citations: 0
A general diagnostic modelling framework for forced-choice assessments
IF 1.8 | Zone 3 (Psychology) | Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-04-23 | DOI: 10.1111/bmsp.12393
Pablo Nájera, Rodrigo S. Kreitchmann, Scarlett Escudero, Francisco J. Abad, Jimmy de la Torre, Miguel A. Sorrel

Diagnostic classification modelling (DCM) is a family of restricted latent class models often used in educational settings to assess students' strengths and weaknesses. Recently, there has been growing interest in applying DCM to noncognitive traits in fields such as clinical and organizational psychology, as well as personality profiling. To address common response biases in these assessments, such as social desirability, Huang (2023, Educational and Psychological Measurement, 83, 146) adopted the forced-choice (FC) item format within the DCM framework, developing the FC-DCM. This model assumes that examinees with no clear preference for any statements in an FC block will choose completely at random. Additionally, the unique parametrization of the FC-DCM poses challenges for integration with established DCM frameworks in the literature. In the present study, we enhance the capabilities of DCM by introducing a general diagnostic framework for FC assessments. We present an adaptation of the G-DINA model to accommodate FC responses. Simulation results show that the G-DINA model provides accurate classifications, item parameter estimates and attribute correlations, outperforming the FC-DCM in realistic scenarios where item discrimination varies. A real FC assessment example further illustrates the better model fit of the G-DINA. Practical recommendations for using the FC format in diagnostic assessments of noncognitive traits are provided.

诊断分类模型(DCM)是一类受限的潜在类别模型,通常用于教育环境中评估学生的优势和劣势。最近,人们对将DCM应用于临床和组织心理学以及人格分析等领域的非认知特征越来越感兴趣。为了解决这些评估中常见的反应偏差,例如社会期望,Huang (2023, Educational and Psychological Measurement, 83,146)在DCM框架中采用了强制选择(FC)项目格式,开发了FC-DCM。该模型假设考生对FC块中的任何语句没有明确的偏好,将完全随机选择。此外,FC-DCM的独特参数化对与文献中已建立的DCM框架的集成提出了挑战。在本研究中,我们通过引入FC评估的一般诊断框架来增强DCM的能力。我们提出了一种适应G-DINA模型以适应FC响应。仿真结果表明,G-DINA模型提供了准确的分类、项目参数估计和属性相关性,在项目区分变化的现实场景中优于FC-DCM模型。一个实际的FC评估实例进一步说明了G-DINA模型拟合效果较好。提供了在非认知特征的诊断评估中使用FC格式的实用建议。
British Journal of Mathematical & Statistical Psychology, 78(3), 1025-1047. Pub Date: 2025-04-23. DOI: 10.1111/bmsp.12393
Citations: 0
Score-based tests for parameter instability in ordinal factor models
IF 1.8 Zone 3 Psychology Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-04-23 DOI: 10.1111/bmsp.12392
Franz Classe, Rudolf Debelak, Christoph Kern

We present a novel approach for computing model scores for ordinal factor models, that is, graded response models (GRMs) fitted with a limited information (LI) estimator. The method makes it possible to compute score-based tests for parameter instability for ordinal factor models. This way, rapid execution of numerous parameter instability tests for multidimensional item response theory (MIRT) models is facilitated. We present a comparative analysis of the performance of the proposed score-based tests for ordinal factor models in comparison to tests for GRMs fitted with a full information (FI) estimator. The new method has a good Type I error rate, high power and is computationally faster than FI estimation. We further illustrate that the proposed method works well with complex models in real data applications. The method is implemented in the lavaan package in R.

British Journal of Mathematical & Statistical Psychology, 78(3), 996-1024. Pub Date: 2025-04-23. DOI: 10.1111/bmsp.12392
Citations: 0