
British Journal of Mathematical & Statistical Psychology: Latest Articles

Variational Bayesian inference for sparse item response theory models.
IF 1.8 Zone 3 (Psychology) Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-02-04 DOI: 10.1111/bmsp.70032
Yemao Xia, Yu Xue, Depeng Jiang

The item response theory (IRT) model is a widely used statistical tool for exploring the relationship between individual latent traits and item responses. In this paper, a sparse IRT model is established to address the sparsity of factor loadings. A global-local shrinkage prior is imposed to penalize the factor loadings: the global parameter controls the amount of shrinkage at the column level, while the local parameters adjust the penalty on the factor loadings within each column. We develop a variational Bayesian procedure to conduct posterior inference. By exploiting a stochastic representation of the logistic function, we frame the sparse IRT model as a mixture model with a Pólya-Gamma mixing distribution. This strategy admits a conjugate posterior for the latent quantities, leading to straightforward posterior computation. We assess the performance of the proposed method via a simulation study. A real example related to personality assessment is analysed to illustrate the usefulness of the methodology.
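
For orientation, the stochastic representation of the logistic function mentioned in this abstract is usually the Pólya-Gamma identity of Polson, Scott and Windle. The abstract itself does not state the formula, so the form below is the standard textbook version and the symbols $\psi$, $\kappa$, $\omega$, $a$, $b$ are chosen here for illustration only.

```latex
\frac{\left(e^{\psi}\right)^{a}}{\left(1+e^{\psi}\right)^{b}}
  = 2^{-b}\, e^{\kappa\psi} \int_{0}^{\infty} e^{-\omega\psi^{2}/2}\, p(\omega)\, d\omega,
\qquad \kappa = a - \frac{b}{2}, \qquad \omega \sim \mathrm{PG}(b, 0).
```

For a single binary response $y$ with logit $\psi$, one takes $a = y$ and $b = 1$, so $\kappa = y - 1/2$. Conditional on $\omega$, the likelihood is Gaussian in $\psi$, which is presumably the source of the conjugacy the abstract refers to.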

Citations: 0
Latent Poisson count models for action count data from technology-enhanced assessments.
IF 1.8 Zone 3 (Psychology) Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-02-03 DOI: 10.1111/bmsp.70036
Gregory Arbet, Hyeon-Ah Kang

Recent advances in computerized assessments have enabled the use of innovative item formats (e.g., drag-and-drop, scenario-based), necessitating a flexible model that can capture the systematic influence of item types on action counts. In this study, we present a refinement scheme that explicitly models common features of items and allows inference on item-type effects. We apply a multifaceted parameterization to characterize the common and unique features of items and implement the formulation in two existing models, the Rasch and Conway-Maxwell-Poisson count models. The inference procedures for the proposed models are presented using Stan and validated for estimation accuracy. Numerical experiments with simulated data suggest that the proposed inferential scheme adequately recovers the underlying model parameters. An empirical application demonstrates that the proposed refinement holds practical relevance when data exhibit distinct item-type effects. Based on the findings from the empirical investigation, we discuss practical considerations in applying the Poisson models for analysing count data.
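
As a rough illustration of what separating common (item-type) from unique (item-specific) features could look like, here is a minimal Python sketch. It assumes a log-linear Poisson rate of the form person + item-type + item-specific deviation. That parameterization, the function names and all numeric values are hypothetical and not taken from the paper, and the sketch covers only the plain Poisson case, not the Conway-Maxwell-Poisson extension.

```python
import math


def expected_count(theta, type_effect, item_effect):
    """Expected action count under an assumed log-linear Poisson rate."""
    return math.exp(theta + type_effect + item_effect)


def poisson_pmf(k, rate):
    """Probability of observing exactly k actions for a given Poisson rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


# Hypothetical item-type effects shared by every item of that type.
TYPE_EFFECTS = {"drag_and_drop": 0.8, "scenario_based": 1.5}

# Hypothetical items: (name, item type, item-specific deviation).
ITEMS = [
    ("item_1", "drag_and_drop", -0.2),
    ("item_2", "scenario_based", 0.1),
]


def rates_for_person(theta):
    """Expected counts on every item for a person with latent trait theta."""
    return {
        name: expected_count(theta, TYPE_EFFECTS[item_type], deviation)
        for name, item_type, deviation in ITEMS
    }


if __name__ == "__main__":
    for name, rate in rates_for_person(0.3).items():
        print(name, round(rate, 3), round(poisson_pmf(3, rate), 4))
```

In this toy setup, two items of the same type share the type effect and differ only through their deviations, which is the kind of structure that makes an item-type effect separately estimable.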

Citations: 0
Revisiting reliability with human and machine learning raters under scoring design and rater configuration in the many-facet Rasch model.
IF 1.8 Zone 3 (Psychology) Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-31 DOI: 10.1111/bmsp.70034
Xingyao Xiao, Richard J Patz, Mark R Wilson

Constructed-response (CR) items are widely used to assess higher order skills but require human scoring, which introduces variability and is costly at scale. Machine learning (ML)-based scoring offers a scalable alternative, yet its psychometric consequences in rater-mediated models remain underexplored. This study examines how scoring design, rater bias, ML inconsistency and model specification affect the reliability of ability estimation in polytomous CR assessments. Using Monte Carlo simulation, we manipulated human and ML rater bias, ML inconsistency and scoring density (complete, overlapping, isolated). Five estimation models were compared, including the Partial Credit Model (PCM) with fixed thresholds and the Many-Facet Partial Credit Model (MFPCM) with and without free calibration. Results showed that systematic bias, not random inconsistency, was the main source of error. Hybrid human-ML scoring improved estimation when raters were unbiased or exhibited opposing biases, but error compounded when biases aligned. Across designs, PCM with fixed thresholds consistently outperformed more complex alternatives, while anchoring CR items to selected-response metrics stabilized MFPCM estimation. The real data application replicated these patterns. Findings show that scoring design and bias structure, rather than model complexity, drive the benefits of hybrid scoring and that anchoring offers a practical strategy for stabilizing estimation.
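
The finding that opposing rater biases help while aligned biases compound can be illustrated with a deliberately simplified sketch. It treats scores as continuous and averages one human and one ML rating per examinee. It is not the PCM or MFPCM estimation used in the study, and every name and number below is a hypothetical choice made for illustration.

```python
import random
import statistics


def mean_error_of_hybrid_score(human_bias, ml_bias, ml_noise_sd,
                               n_examinees=5000, seed=1):
    """Average signed error of a human+ML mean score (toy continuous model)."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_examinees):
        true_score = rng.gauss(0.0, 1.0)
        # Each rating = true score + systematic bias + random inconsistency.
        human = true_score + human_bias + rng.gauss(0.0, 0.3)
        ml = true_score + ml_bias + rng.gauss(0.0, ml_noise_sd)
        hybrid = (human + ml) / 2.0
        errors.append(hybrid - true_score)
    return statistics.fmean(errors)


if __name__ == "__main__":
    print("opposing biases:", mean_error_of_hybrid_score(0.5, -0.5, 0.5))
    print("aligned biases: ", mean_error_of_hybrid_score(0.5, 0.5, 0.5))
```

The mean signed error isolates the systematic component: random inconsistency averages out over examinees, whereas aligned biases survive averaging at full size.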

Citations: 0
Bayesian inference for dynamic Q matrices and attribute trajectories in hidden Markov diagnostic classification models.
IF 1.8 Zone 3 (Psychology) Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-20 DOI: 10.1111/bmsp.70028
Chen-Wei Liu

Hidden Markov diagnostic classification models capture how students' cognitive attributes evolve over time. This paper introduces a Bayesian Markov chain Monte Carlo algorithm for diagnostic classification models that jointly estimates time-varying Q matrices, latent attributes, item parameters, attribute class proportions and transition matrices across multiple occasions. Using the R package hmdcm developed for this study, Monte Carlo simulations demonstrate accurate parameter recovery, and an empirical probability-concept assessment confirmed the algorithm's ability to trace attribute trajectories, supporting its value for longitudinal diagnostic classification in both research and instructional practice.
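
To make the notion of an attribute trajectory concrete, the sketch below propagates the marginal mastery probability of a single attribute across occasions using a fixed transition matrix. It is only the Markov-chain building block, not the MCMC algorithm or the hmdcm package described in the abstract. The two-state setup and all numbers are hypothetical.

```python
def step(distribution, transition):
    """Advance one occasion: new[j] = sum_i distribution[i] * transition[i][j]."""
    n = len(distribution)
    return [
        sum(distribution[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]


def trajectory(initial, transition, n_occasions):
    """Marginal state probabilities at occasions 1..n_occasions."""
    path = [list(initial)]
    for _ in range(n_occasions - 1):
        path.append(step(path[-1], transition))
    return path


# Hypothetical values for one attribute; states are (non-mastery, mastery).
INITIAL = [0.8, 0.2]
TRANSITION = [
    [0.70, 0.30],  # from non-mastery to (non-mastery, mastery)
    [0.05, 0.95],  # from mastery to (non-mastery, mastery)
]

if __name__ == "__main__":
    for occasion, probs in enumerate(trajectory(INITIAL, TRANSITION, 4), start=1):
        print(occasion, [round(p, 4) for p in probs])
```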

Citations: 0
Enhancing generalizability theory with mixed-effects models for heteroscedasticity in psychological measurement: A theoretical introduction with an application from EEG data.
IF 1.8 Zone 3 (Psychology) Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-18 DOI: 10.1111/bmsp.70026
Philippe Rast, Peter E Clayson

Generalizability theory (G-theory) defines a statistical framework for assessing measurement reliability by decomposing observed variance into meaningful components attributable to persons, facets, and error. Classic G-theory assumes homoscedastic residual variances across measurement conditions, an assumption that is often violated in psychological and behavioural data. The main focus of this work is to extend G-theory using a mixed-effects location-scale model (MELSM) that allows residual error variance to vary systematically across conditions and persons. By modeling heteroscedasticity, we can extend the computation of condition-specific generalizability ($G_t$) and dependability ($D_t$) coefficients to reflect local reliability under varying degrees of measurement precision. As an illustration, we apply the model to empirical data from an EEG experiment and show that failing to account for variance heterogeneity can mask meaningful differences in measurement quality. A simulation-based decision study further demonstrates how targeted increases in measurement density can improve reliability for low-precision conditions or participants. The proposed framework retains the interpretative character of classical G-theory while enhancing its flexibility. We argue that it supports finer-grained insights on conditions that influence reliability and better-informed design decisions in psychological measurements. We discuss implications for individualized reliability assessment, adaptive measurement strategies, and future extensions to multi-facet designs.
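
A small sketch may help show why condition-specific error variances matter for reliability and for decision studies. It uses the familiar one-facet generalizability coefficient, person variance divided by person variance plus error variance over the number of observations, and simply lets the error variance differ by condition. This is an illustrative simplification, not the MELSM-based estimator of the paper. The dependability coefficient is not covered, and all variance values are hypothetical.

```python
def generalizability(person_var, error_var, n_obs):
    """One-facet G coefficient: person_var / (person_var + error_var / n_obs)."""
    return person_var / (person_var + error_var / n_obs)


def observations_needed(person_var, error_var, target, max_n=10_000):
    """Smallest number of observations reaching the target coefficient."""
    for n_obs in range(1, max_n + 1):
        if generalizability(person_var, error_var, n_obs) >= target:
            return n_obs
    return None  # target not reachable within max_n observations


# Hypothetical variance components; error variance differs by condition.
PERSON_VAR = 1.0
ERROR_VAR_BY_CONDITION = {"condition_a": 1.2, "condition_b": 3.7}

if __name__ == "__main__":
    for name, error_var in ERROR_VAR_BY_CONDITION.items():
        g_at_8 = generalizability(PERSON_VAR, error_var, 8)
        n_for_80 = observations_needed(PERSON_VAR, error_var, 0.80)
        print(name, round(g_at_8, 3), n_for_80)
```

With a pooled error variance both conditions would receive the same coefficient. Letting it vary shows that the noisier condition needs more observations to reach the same target.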

Citations: 0
Comparing training window selection methods for prediction in non-stationary time series.
IF 1.8 Zone 3 (Psychology) Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-13 DOI: 10.1111/bmsp.70018
Fridtjof Petersen, Jonas M B Haslbeck, Jorge N Tendeiro, Anna M Langener, Martien J H Kas, Dimitris Rizopoulos, Laura F Bringmann

The widespread adoption of smartphones creates the possibility to passively monitor everyday behaviour via sensors. Sensor data have been linked to moment-to-moment psychological symptoms and mood of individuals and thus could alleviate the burden associated with repeated measurement of symptoms. Additionally, psychological care could be improved by predicting moments of high psychopathology and providing immediate interventions. Current research assumes that the relationship between sensor data and psychological symptoms is constant over time, or changes at a fixed rate: models are trained on all past data or on a fixed window, without comparing different window sizes with each other. This is problematic because choosing the wrong training window can negatively impact prediction accuracy, especially if the underlying rate of change varies. As a potential solution, we compare different methodologies for choosing the correct window size, ranging from frequent practice based on heuristics to super learning approaches. In a simulation study, we vary the rate of change in the underlying relationship over time. We show that, for both simulated and real-world data, even computing a simple average across different windows can reduce the prediction error relative to selecting a single best window.
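
The idea of averaging across training windows can be sketched as follows: fit the same simple model on the most recent w observations for several values of w, then average the resulting predictions. The simple linear regression, the window sizes and the data below are hypothetical stand-ins. The study's actual models and its super learning approach are not reproduced here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def window_predictions(xs, ys, x_new, windows):
    """Predict y at x_new from models trained on the last w points, per window w."""
    predictions = {}
    for w in windows:
        intercept, slope = fit_line(xs[-w:], ys[-w:])
        predictions[w] = intercept + slope * x_new
    return predictions


def averaged_prediction(predictions):
    """Unweighted mean of the per-window predictions."""
    return sum(predictions.values()) / len(predictions)


if __name__ == "__main__":
    # Hypothetical sensor/symptom series whose slope changes halfway through.
    sensor = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    symptom = [1, 2, 3, 4, 5, 7, 9, 11, 13, 15]
    preds = window_predictions(sensor, symptom, 11, [3, 5, 10])
    print(preds, averaged_prediction(preds))
```

Short windows track the recent relationship but are noisy in real data, while long windows are stable but lag behind change. Averaging avoids having to commit to one of them.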

Citations: 0
Simultaneous detection of gradual and abrupt structural changes in Bayesian longitudinal modelling using entropy and model fit measures.
IF 1.8 Zone 3 (Psychology) Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2026-01-07 DOI: 10.1111/bmsp.70029
Yanling Li, Xiaoyue Xiong, Zita Oravecz, Sy-Miin Chow

Although individuals may exhibit both gradual and abrupt changes in their dynamic properties, shaped by both slowly accumulating influences and acute events, existing statistical frameworks offer limited capacity for the simultaneous detection and representation of these distinct change patterns. We propose a Bayesian regime-switching (RS) modelling framework and an entropy measure adapted from the frequentist framework to facilitate simultaneous representation and testing of postulates of gradual and abrupt changes. Results from Monte Carlo simulation studies indicated that using a combination of entropy and information criterion measures, such as the Bayesian information criterion, was consistently most effective at facilitating the selection of the best-fitting model across varying magnitudes of abrupt changes. We found that slightly lower entropy thresholds may be helpful in facilitating the selection of longitudinal models with RS properties, as this class of models tended to yield entropy values lower than the conventional thresholds for reliable classification in cross-sectional mixture models, even under satisfactory parameter recovery and classification results. We fitted the proposed models and other candidate models to the data collected from an intervention study on the psychological well-being (PWB) of college-attending early adults. Results suggested abrupt, regime-related transitions in the intra-individual variability levels of PWB dynamics among some participants following the intervention period. Practical usage of the entropy measure in conjunction with other model selection measures, and guidelines to enhance simultaneous detection of true abrupt and gradual changes, are discussed.
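
The abstract does not spell out its entropy measure. A common choice in mixture modelling is the relative (normalized) entropy statistic, which equals 1 when every unit is classified with certainty and 0 when the class probabilities are uniform. The sketch below computes that statistic, on the assumption that the paper's measure is of this general kind. The function name and the example probabilities are hypothetical.

```python
import math


def relative_entropy(posteriors):
    """Normalized entropy: 1 - sum_i sum_k (-p_ik * ln p_ik) / (n * ln K).

    `posteriors` holds one row of class (regime) probabilities per unit,
    for example per person-by-time point in a regime-switching model.
    """
    n_units = len(posteriors)
    n_classes = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # the limit of p * ln p as p -> 0 is 0
                total -= p * math.log(p)
    return 1.0 - total / (n_units * math.log(n_classes))


if __name__ == "__main__":
    sharp = [[0.95, 0.05], [0.10, 0.90], [0.98, 0.02]]
    fuzzy = [[0.60, 0.40], [0.55, 0.45], [0.35, 0.65]]
    print(round(relative_entropy(sharp), 3), round(relative_entropy(fuzzy), 3))
```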

Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12875568/pdf/
Citations: 0
Level-specific reliability coefficients from the perspective of latent state-trait theory.
IF 1.8 Zone 3 (Psychology) Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-27 DOI: 10.1111/bmsp.70027
Lennart Nacke, Axel Mayer

The growing popularity of the ecological momentary assessment method in psychological research requires adequate statistical models for intensive longitudinal data (ILD), with multilevel latent state-trait (ML-LST) models based on the revised latent state-trait theory (LST-R theory) as one possible alternative. Besides the traditional LST-R coefficients of reliability, consistency and occasion-specificity, ML-LST models are also suitable for estimating reliability at Level 1 ("within-subject reliability") and Level 2 ("between-subject reliability"). However, these level-specific coefficients have not yet been defined in LST-R theory and, therefore, their interpretation has been unclear from the perspective of LST-R theory. In the current study, we discuss the interpretation and identification of these coefficients based on the (multilevel) versions of the Multistate-Singletrait (MSST) model, the Multistate-Indicator-specific trait (MSIT) model and the Multistate-Singletrait model with M-1 correlated method factors (MSST-M-1). We show that, in the MSST-M-1 model, the between-subject coefficient is a measure of the indicator-unspecificity of an item (i.e. the portion of between-level variance that a specific item shares with a common trait) or the unidimensionality of a scale. Moreover, we highlight differences between occasion-specificity and within-subject reliability. The performance of the ML-MSST-M-1 model and the corresponding theoretical findings are illustrated using data from an experience sampling study on the within-person fluctuations of narcissistic admiration (Heyde et al., 2023).
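
For readers less familiar with the traditional coefficients named in this abstract, the usual latent state-trait decomposition is sketched below. The notation (indicator $i$, occasion $t$, latent state $\tau$, latent trait $\xi$, latent state residual $\zeta$, measurement error $\varepsilon$) is the conventional one and is assumed here rather than taken from the paper. The level-specific coefficients that the paper discusses are not shown.

```latex
\begin{align}
Y_{it} &= \tau_{it} + \varepsilon_{it}, \qquad \tau_{it} = \xi_{it} + \zeta_{it}, \\
\mathrm{Rel}(Y_{it}) &= \frac{\mathrm{Var}(\tau_{it})}{\mathrm{Var}(Y_{it})}, \qquad
\mathrm{Con}(Y_{it}) = \frac{\mathrm{Var}(\xi_{it})}{\mathrm{Var}(Y_{it})}, \qquad
\mathrm{Spe}(Y_{it}) = \frac{\mathrm{Var}(\zeta_{it})}{\mathrm{Var}(Y_{it})}, \\
\mathrm{Rel}(Y_{it}) &= \mathrm{Con}(Y_{it}) + \mathrm{Spe}(Y_{it}).
\end{align}
```

The last line holds because the latent trait and the latent state residual are uncorrelated, so $\mathrm{Var}(\tau_{it}) = \mathrm{Var}(\xi_{it}) + \mathrm{Var}(\zeta_{it})$.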

Citations: 0
Power priors for latent variable mediation models under small sample sizes.
IF 1.8 Zone 3 Psychology Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-24 DOI: 10.1111/bmsp.70025
Lihan Chen, Milica Miočević, Carl F Falk

Latent variable models typically require large sample sizes for acceptable efficiency and reliable convergence. Appropriate informative priors are often required for gainfully employing Bayesian analysis with small samples. Power priors are informative priors built on historical data, weighted to account for non-exchangeability with the current sample. Many extant power prior approaches are designed for manifest variable models and are not easily adapted for latent variable models; for example, they may require integration over all model parameters. We examined two recent power prior approaches that are straightforward to adapt to these models: Mahalanobis weight (MW) priors, based on Golchi (Use of historical individual patient data in analysis of clinical trials, 2020), and univariate priors, based on Finch's (The Psychiatrist, 6, 2024, 45) application of Haddad et al. (Journal of Biopharmaceutical Statistics, 27, 2017, 1089) and Balcome et al. (bayesdp: Implementation of the Bayesian discount prior approach for clinical trials, 2022). We applied these approaches, along with diffuse and weakly informative priors, to a latent variable mediation model under various sample sizes and non-exchangeability conditions. We compared their performance in terms of convergence, bias, efficiency, and credible interval coverage when estimating an indirect effect. Diffuse priors and the univariate approach led to poor convergence. The weakly informative and MW approaches both improved convergence and yielded reasonable estimates, but MW performed poorly under some non-exchangeable conditions. We discuss the issues with these approaches and future research directions.

Citations: 0
Idiographic interrater reliability measures for intensive longitudinal multirater data.
IF 1.8 Zone 3 Psychology Q3 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-12-20 DOI: 10.1111/bmsp.70022
Tobias Koch, Miriam F Jaehne, Michaela Riediger, Antje Rauers, Jana Holtmann

Interrater reliability plays a crucial role in various areas of psychology. In this article, we propose a multilevel latent time series model for intensive longitudinal data with structurally different raters (e.g., self-reports and partner reports). The new MR-MLTS model enables researchers to estimate idiographic (person-specific) rater consistency coefficients for contemporaneous or dynamic rater agreement. Additionally, the model allows rater consistency coefficients to be linked to external explanatory or outcome variables. It can be implemented in Mplus as well as in the newly developed R package mlts. We illustrate the model using data from an intensive longitudinal multirater study involving 100 heterosexual couples (200 individuals) assessed across 86 time points. Our findings show that relationship duration and partner cognitive resources positively predict rater consistency for the innovations. Results from a simulation study indicate that the number of time points is critical for accurately estimating idiographic rater consistency coefficients, whereas the number of participants is important for accurately recovering the random effect variances. We discuss advantages, limitations, and future extensions of the MR-MLTS model.

Citations: 0