
Multivariate Behavioral Research: Latest Articles

Regression Discontinuity Analysis with Latent Variables.
IF 3.5 | Zone 3 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-11-21 | DOI: 10.1080/00273171.2025.2565591
Monica Morell, Muwon Kwon, Youngjin Han, Youjin Sung, Yang Liu, Ji Seung Yang

A regression discontinuity (RD) design is often employed to provide causal evidence when randomizing the treatment assignment is infeasible. When variables of interest are latent constructs measured by observed indicators, the conventional RD analysis using observed variable scores does not allow researchers to examine heterogeneity in the estimated local average treatment effect (ATE) or to generalize the ATE to participants away from the cutoff. We propose a novel methodological augmentation to the conventional RD analysis, which assumes the availability of multiple indicator variables (i.e., raw item responses) that measure the latent construct underlying the running variable. By specifying an explicit measurement model based on those indicator variables, our latent RD framework allows (1) defining the local ATE conditional on the latent construct, (2) disentangling the heterogeneity of the local ATE, and (3) generalizing the local ATE to running variable scores away from the cutoff. In a proof-of-concept simulation, we illustrate that the proposed augmentation recovers the parameters of interest well under practical test length and sample size conditions.
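
For orientation, the conventional observed-score RD analysis that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of a sharp RD estimate with separate linear fits on each side of the cutoff, not the authors' latent RD framework; the function names, the bandwidth, and the noise-free toy data are all assumptions made for the example.

```python
def ols_line(xs, ys):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def sharp_rd_estimate(running, outcome, cutoff=0.0, bandwidth=1.0):
    """Gap between the two fitted lines evaluated at the cutoff."""
    # Center the running variable so each intercept is the fit at the cutoff.
    left = [(r - cutoff, y) for r, y in zip(running, outcome)
            if cutoff - bandwidth <= r < cutoff]
    right = [(r - cutoff, y) for r, y in zip(running, outcome)
             if cutoff <= r <= cutoff + bandwidth]
    a_left, _ = ols_line(*zip(*left))
    a_right, _ = ols_line(*zip(*right))
    return a_right - a_left


# Hypothetical noise-free data: units at or above the cutoff are treated,
# and the built-in jump at the cutoff is 2.0.
running = [i / 10 for i in range(-10, 11)]
outcome = [1.0 + 0.5 * r + (2.0 if r >= 0 else 0.0) for r in running]
rd_effect = sharp_rd_estimate(running, outcome)
```

Because the estimate is a single gap at the cutoff computed from observed scores, it says nothing about how the effect varies with the underlying construct, which is the limitation the abstract sets out to address.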

Citations: 0
Detecting Transition Points in the Slope-Intercept Relation in Linear Latent Growth Models.
IF 3.5 | Zone 3 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-11-20 | DOI: 10.1080/00273171.2025.2583158
Dayeon Lee, Gregory R Hancock

In a linear latent growth model parameterized by intercept (α) and slope (β) factors, those factors' relation is often of interest. The model typically captures this through their covariance parameter, which inherently assumes linearity in their relation. However, this assumption may not always hold. For instance, α and β might be unrelated below a certain threshold along the α-axis but show a meaningful relation above it. That is, even though individual growth trajectories may follow a linear pattern over time, the relation between α and β can be nonlinear, potentially featuring distinct segments separated by a transition point. To address such relations, we propose a semiparametric approach that combines Bayesian P-splines for flexible nonlinear modeling of the α-β relation along with a segmented regression-based transition point detection method. This two-stage analytic approach provides for a more nuanced understanding of the α-β relation, including estimation of a potential transition point where the α-β relation structure fundamentally changes. Simulation results and an empirical data illustration support this approach's effectiveness with single transition point scenarios, offering deeper insights into aspects of the growth process.
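
The idea of a transition point can be illustrated with a deliberately simplified sketch: a grid search over candidate thresholds for a hinge regression in which β is flat below the threshold and linear above it. This is not the authors' two-stage procedure (it omits the Bayesian P-spline stage and works on observed values rather than latent factors); the function names and the toy data are assumptions for illustration only.

```python
def fit_hinge(xs, ys, tau):
    """Least-squares fit of y = b0 + b1 * max(x - tau, 0).

    Returns (b0, b1, sse), where sse is the residual sum of squares.
    """
    hs = [max(x - tau, 0.0) for x in xs]
    n = len(xs)
    mh = sum(hs) / n
    my = sum(ys) / n
    shh = sum((h - mh) ** 2 for h in hs)
    shy = sum((h - mh) * (y - my) for h, y in zip(hs, ys))
    b1 = shy / shh
    b0 = my - b1 * mh
    sse = sum((y - (b0 + b1 * h)) ** 2 for h, y in zip(hs, ys))
    return b0, b1, sse


def find_transition(xs, ys, candidates):
    """Pick the candidate threshold with the smallest residual sum of squares."""
    return min(candidates, key=lambda tau: fit_hinge(xs, ys, tau)[2])


# Hypothetical intercept (alpha) and slope (beta) values: beta is unrelated
# to alpha below 2.0 and rises with alpha above it.
alpha = [i * 0.25 for i in range(17)]            # 0.0, 0.25, ..., 4.0
beta = [1.0 + 0.8 * max(a - 2.0, 0.0) for a in alpha]
candidates = alpha[2:-2]                         # skip the extremes of the grid
tau_hat = find_transition(alpha, beta, candidates)
b0_hat, b1_hat, _ = fit_hinge(alpha, beta, tau_hat)
```

Excluding the extreme grid points keeps the hinge term from being constant, which would otherwise make the slope undefined.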

Citations: 0
Targeted Maximum Likelihood Estimation for Causal Inference With Observational Data-The Example of Private Tutoring.
IF 3.5 | Zone 3 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-11-20 | DOI: 10.1080/00273171.2025.2561942
Christoph Jindra, Karoline A Sachse

State-of-the-art causal inference methods for observational data promise to relax assumptions threatening valid causal inference. Targeted maximum likelihood estimation (TMLE), for example, is a template for constructing doubly robust, semiparametric, efficient substitution estimators, providing consistent estimates if the outcome or treatment model is correctly specified. Compared to standard approaches, it reduces the risk of misspecification bias by allowing (nonparametric) machine-learning techniques, including super learning, to estimate the relevant components of the data distribution. We briefly introduce TMLE and demonstrate its use by estimating the effects of private tutoring in mathematics during Year 7 on mathematics proficiency and grades using observational data from starting cohort 3 of the National Education Panel Study (N = 4,167). We contrast TMLE estimates with those from ordinary least squares, the parametric G-formula, and the augmented inverse-probability weighted estimator. Our findings reveal close agreement between methods for end-of-year grades. However, variations emerge when examining mathematics proficiency as the outcome, highlighting that substantive conclusions may depend on the analytical approach. The results underscore the significance of employing advanced causal inference methods, such as TMLE, when navigating the complexities of observational data and highlight the nuanced impact of methodological choices on the interpretation of study outcomes.
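
To make the notion of a doubly robust estimator concrete, the sketch below implements the augmented inverse-probability weighted (AIPW) estimator, one of the comparison methods named in the abstract. It is not TMLE, which adds a targeting step that is omitted here. The nuisance models are plain cell means over a single binary covariate, and the variable names and toy data are assumptions for illustration.

```python
from collections import defaultdict


def cell_means(rows):
    """Estimate nuisance functions from (w, a, y) rows by simple averaging.

    Returns the mean outcome per (w, a) cell and the treated share per w.
    """
    y_sum, y_n = defaultdict(float), defaultdict(int)
    a_sum, w_n = defaultdict(float), defaultdict(int)
    for w, a, y in rows:
        y_sum[(w, a)] += y
        y_n[(w, a)] += 1
        a_sum[w] += a
        w_n[w] += 1
    outcome = {cell: y_sum[cell] / y_n[cell] for cell in y_n}
    propensity = {w: a_sum[w] / w_n[w] for w in w_n}
    return outcome, propensity


def aipw_ate(rows):
    """AIPW estimate of the average treatment effect of a on y."""
    m, g = cell_means(rows)
    total = 0.0
    for w, a, y in rows:
        m1, m0 = m[(w, 1)], m[(w, 0)]
        # Outcome-model contrast plus inverse-probability weighted residuals.
        total += (m1 - m0
                  + a / g[w] * (y - m1)
                  - (1 - a) / (1 - g[w]) * (y - m0))
    return total / len(rows)


# Hypothetical data: w confounds treatment (treated share 1/4 vs. 3/4),
# and the built-in treatment effect is 2.0.
rows = [(w, a, 1.0 + 2.0 * a + 1.0 * w)
        for w, a, reps in [(0, 0, 3), (0, 1, 1), (1, 0, 1), (1, 1, 3)]
        for _ in range(reps)]
ate = aipw_ate(rows)
```

With one binary covariate and saturated cell means, both nuisance models are correct by construction; the double robustness property matters when one of them is misspecified, a case this toy example does not exercise.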

Citations: 0
Sample Size Determination for Optimal and Sub-Optimal Designs in Simplified Parametric Test Norming.
IF 3.5 | Zone 3 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-11-19 | DOI: 10.1080/00273171.2025.2580712
Francesco Innocenti, Alberto Cassese

Norms play a critical role in high-stakes individual assessments (e.g., diagnosing intellectual disabilities), where precision and stability are essential. To reduce fluctuations in norms due to sampling, normative studies must be based on sufficiently large and well-designed samples. This paper provides formulas, applicable to any sample composition, for determining the required sample size for normative studies under the simplified parametric norming framework. In addition to a sufficiently large sample size, precision can be further improved by sampling according to an optimal design, that is, a sample composition that minimizes sampling error in the norms. Here, optimal designs are derived for 45 (multivariate) multiple linear regression models, assuming normality and homoscedasticity. These models vary in the degree of interaction among three norm-predictors: a continuous variable (e.g., age), a categorical variable (e.g., sex), and a variable (e.g., education) that may be treated as either continuous or categorical. To support practical implementation, three interactive Shiny apps are introduced, enabling users to determine the sample size for their normative studies. Their use is demonstrated through the hypothetical planning of a normative study for the Trail Making Test, accompanied by a review of the most common models for this neuropsychological test in current practice.

Citations: 0
Multilevel Metamodels: Enhancing Inference, Interpretability, and Generalizability in Monte Carlo Simulation Studies.
IF 3.5 | Zone 3 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-11-19 | DOI: 10.1080/00273171.2025.2586631
Joshua B Gilbert, Luke W Miratrix

Metamodels, or the regression analysis of Monte Carlo simulation results, provide a powerful tool to summarize simulation findings. However, an underutilized approach is the multilevel metamodel (MLMM) that accounts for the dependent data structure that arises from fitting multiple models to the same simulated data set. In this study, we articulate the theoretical rationale for the MLMM and illustrate how it can improve the interpretability of simulation results, better account for complex simulation designs, and provide new insights into the generalizability of simulation findings.

Citations: 0
Bayesian Multilevel Compositional Data Analysis with the R Package multilevelcoda.
IF 3.5 | Zone 3 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-11-17 | DOI: 10.1080/00273171.2025.2565598
Flora Le, Dorothea Dumuid, Tyman E Stanford, Joshua F Wiley

Multilevel compositional data, such as data sampled over time that are non-negative and sum to a constant value, are common in various fields. However, there is currently no software specifically built to model compositional data in a multilevel framework. The R package multilevelcoda implements a collection of tools for modeling compositional data in a Bayesian multivariate, multilevel pipeline. The user-friendly setup only requires the data, model formula, and minimal specification of the analysis. This article outlines the statistical theory underlying the Bayesian compositional multilevel modeling approach and details the implementation of the functions available in multilevelcoda, using an example dataset of compositional daily sleep-wake behaviors. This innovative method can be used to robustly answer scientific questions from the increasingly available multilevel compositional data from intensive, longitudinal studies.
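
As background on what "compositional" means here, the sketch below closes a vector of non-negative parts to a constant sum and maps it to log-ratio coordinates. The isometric log-ratio (ilr) transform in pivot form is a common choice in compositional data analysis in general; the abstract does not say which transform the package uses, so that choice is an assumption. The example is written in Python rather than R, does not use the multilevelcoda API, and its function names and the minutes-per-day values are hypothetical.

```python
import math


def closure(parts, total=1.0):
    """Rescale non-negative parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]


def ilr_pivot(parts):
    """Pivot-coordinate ilr transform of a composition with D strictly positive parts.

    Coordinate j contrasts part j with the geometric mean of the parts after it,
    giving D - 1 unconstrained real-valued coordinates.
    """
    x = closure(parts)
    coords = []
    for j in range(len(x) - 1):
        rest = x[j + 1:]
        log_gmean_rest = sum(math.log(v) for v in rest) / len(rest)
        scale = math.sqrt(len(rest) / (len(rest) + 1))
        coords.append(scale * (math.log(x[j]) - log_gmean_rest))
    return coords


# Hypothetical 24-hour day split into sleep, sedentary, and active minutes.
day_minutes = [480.0, 720.0, 240.0]
composition = closure(day_minutes)
coords = ilr_pivot(day_minutes)
```

The transformed coordinates are no longer bound by the sum constraint, which is what allows standard multivariate models to be applied to them.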

Citations: 0
Correlated Residuals in Lagged-Effects Models: What They (Do Not) Represent in the Case of a Continuous-Time Process.
IF 3.5 | Zone 3 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-11-10 | DOI: 10.1080/00273171.2025.2557274
R M Kuiper, E L Hamaker

The appeal of lagged-effects models, like the first-order vector autoregressive (VAR(1)) model, is the interpretation of the lagged coefficients in terms of predictive (and possibly causal) relationships between variables over time. While the focus in VAR(1) applications has traditionally been on the strength and sign of the lagged relationships, there has been a growing interest in the residual relationships (i.e., the correlations between the innovations) as well. In this article, we will investigate what residual correlations can and cannot signal, for both the discrete-time (DT) and continuous-time (CT) VAR(1) model, when inspecting a CT process. We will show that one should not take on a DT perspective when investigating a CT process: Correlated (i.e., non-zero) DT residuals can flag omitted common causes and effects at shorter intervals (which is well known) but, when the underlying process is CT, also effects at longer intervals. Furthermore, when inspecting a CT process, uncorrelated (i.e., zero) DT residuals do not imply that the variables have no effect on each other at other intervals, nor do they preclude the risk of having omitted common causes. Additionally, we will show that residual correlations in a CT model signal omitted causes for one or more of the observed variables. This may bias the estimation of lagged relationships, implying that the found predictive lagged relationships do not equal the underlying causal lagged relationships. Unfortunately, the CT residual correlations do not reflect the magnitude of the distortion.
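
The link between the CT and DT versions of the model can be written out with the standard textbook relations for a first-order stochastic differential equation. The notation is an assumption and is not taken from the abstract: a mean-centered process η(t), drift matrix A, diffusion covariance Σ, observation interval Δt, DT lagged-effects matrix Φ, and DT residual covariance Ψ.

```latex
% Assumed notation (not from the abstract): A = drift matrix, \Sigma = diffusion covariance,
% \Delta t = observation interval, \Phi = DT lagged effects, \Psi = DT residual covariance.
\begin{align}
  d\eta(t) &= A\,\eta(t)\,dt + G\,dW(t), \qquad \Sigma = G G^{\top} \\
  \eta(t+\Delta t) &= \Phi(\Delta t)\,\eta(t) + \varepsilon(\Delta t), \qquad \Phi(\Delta t) = e^{A \Delta t} \\
  \Psi(\Delta t) &= \operatorname{Cov}\!\left[\varepsilon(\Delta t)\right]
                  = \int_{0}^{\Delta t} e^{A s}\,\Sigma\,e^{A^{\top} s}\,ds
\end{align}
```

Even when Σ is diagonal, off-diagonal entries in A generally make Ψ(Δt) non-diagonal, so a non-zero DT residual correlation need not mean that the CT innovations themselves are correlated.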

Citations: 0
The Effects of Data Preprocessing Choices on Behavioral RCT Outcomes: A Multiverse Analysis.
IF 3.5 | Zone 3 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-11-03 | DOI: 10.1080/00273171.2025.2575399
Giuseppe A Veltri

Seemingly routine data-preprocessing choices can exert outsized influence on the conclusions drawn from randomized controlled trials (RCTs), particularly in behavioral science where data are noisy, skewed, and replete with outliers. We demonstrate this influence with two fully specified multiverse analyses on simulated RCT data. Each analysis spans 180 analytical pathways, produced by crossing 36 preprocessing pipelines that vary outlier handling, missing-data imputation, and scale transformation, with five common model specifications. In Simulation A, which uses linear regression families, preprocessing decisions explain 76.9% of the total variance in estimated treatment effects, whereas model choice explains only 7.5%. In Simulation B, which replaces the linear models with advanced algorithms (generalized additive models, random forests, gradient boosting), the dominance of preprocessing is even clearer: 99.8% of the variance is attributable to data handling and just 0.1% to model specification. The ranges of mean effects show the same pattern (4.34 vs. 1.43 in Simulation A; 15.30 vs. 0.56 in Simulation B). Particular pipelines, most notably those that standardize or log-transform variables, shrink effect estimates by more than 90% relative to the raw-data baseline, while pipelines that leave the original scale intact can inflate effects by an order of magnitude. Because preprocessing choices can overshadow even large shifts in statistical methodology, we call for meticulous reporting of these steps and for routine sensitivity or multiverse analyses that make their impact transparent. Such practices are essential for improving the robustness and replicability of behavioral-science RCTs.
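
The arithmetic behind the 180 pathways (36 preprocessing pipelines crossed with 5 model specifications) can be sketched by enumerating the grid. The abstract does not say how the 36 pipelines break down, so the 3 x 3 x 4 split and every option label below are hypothetical placeholders.

```python
from itertools import product

# Hypothetical option lists; only the totals (36 and 5) come from the abstract.
outlier_rules = ["keep_all", "winsorize", "trim"]
imputation_methods = ["listwise_deletion", "mean_imputation", "multiple_imputation"]
scale_transforms = ["raw", "standardize", "log", "rank"]
model_specs = ["model_1", "model_2", "model_3", "model_4", "model_5"]

# Each preprocessing pipeline is one (outlier, imputation, transform) triple.
pipelines = list(product(outlier_rules, imputation_methods, scale_transforms))

# Each analytical pathway pairs one pipeline with one model specification.
pathways = [(pipeline, model) for pipeline, model in product(pipelines, model_specs)]

print(len(pipelines), "pipelines x", len(model_specs), "models =", len(pathways), "pathways")
```

In a full multiverse analysis each pathway would be run on the same data and the resulting effect estimates compared across the grid.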

Citations: 0
Detecting Model Misfit in Structural Equation Modeling with Machine Learning-A Proof of Concept.
IF 3.5 Zone 3 Psychology Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-02 DOI: 10.1080/00273171.2025.2552304
Melanie Viola Partsch, David Goretzko

Despite the popularity of structural equation modeling in psychological research, accurately evaluating the fit of these models to data is still challenging. Using fixed fit index cutoffs is error-prone due to the fit indices' dependence on various features of the model and data ("nuisance parameters"). Nonetheless, applied researchers mostly rely on fixed fit index cutoffs, neglecting the risk of falsely accepting (or rejecting) their model. With the goal of developing a broadly applicable method that is almost independent of nuisance parameters, we introduce a machine learning (ML)-based approach to evaluate the fit of multi-factorial measurement models. We trained an ML model based on 173 model and data features that we extracted from 1,323,866 simulated data sets and models fitted by means of confirmatory factor analysis. We evaluated the performance of the ML model based on 1,659,386 independent test observations. The ML model performed very well in detecting model (mis-)fit in most conditions, thereby outperforming commonly used fixed fit index cutoffs across the board. Only minor misspecifications, such as a single neglected residual correlation, proved to be challenging to detect. This proof-of-concept study shows that ML is very promising in the context of model fit evaluation.

Citations: 0
Unique Contributions of Dynamic Affect Indicators - Beyond Static Variability.
IF 3.5 Zone 3 Psychology Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-01 Epub Date: 2025-09-02 DOI: 10.1080/00273171.2025.2545367
Kenneth Koslowski, Jana Holtmann

Indicators of affect dynamics (IADs) capture temporal dependencies and instability in affective trajectories over time. However, the relevance of IADs for the prediction of time-invariant outcomes (e.g., depressive symptoms) was recently challenged due to results suggesting low predictive utility beyond intraindividual means and variances. We argue that these results may in part be explained by mathematical redundancies between IADs and static variability as well as the chosen modeling strategy. In three extensive simulation studies we investigate the accuracy and power for detecting non-null relations between IADs and an outcome variable in different relevant settings, illustrating the effect of the length of a time series, the presence of missing values or measurement error, as well as of erroneously fixing innovation variances to be equal across persons. We show that, if uncertainty in individual IAD estimates is not accounted for, relations between IADs (i.e., autoregressive effects) and a time-invariant outcome are underestimated even in large samples and propose the use of a latent multilevel one-step approach. In an empirical application we illustrate that the different modeling approaches can lead to different substantive conclusions regarding the role of negative affect inertia in the prediction of depressive symptoms.
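
The kind of mathematical redundancy the abstract refers to can be illustrated with one well-known relation, which is not necessarily the one the authors analyze: for a stationary series, the mean squared successive difference (MSSD) is approximately 2 × variance × (1 − lag-1 autocorrelation). The sketch below simulates a single person's affect series from an AR(1) process and computes the mean, variance, inertia, and MSSD. The function names, the AR(1) data-generating model, and all parameter values are assumptions made for illustration and do not come from the paper.

```python
import random
import statistics


def simulate_ar1(n, phi, innovation_sd, mean=0.0, seed=0):
    """Simulate n observations from a stationary AR(1) process (illustrative only)."""
    rng = random.Random(seed)
    # Draw the starting value from the stationary distribution of the process.
    stationary_sd = innovation_sd / (1 - phi ** 2) ** 0.5
    x = mean + rng.gauss(0, stationary_sd)
    series = []
    for _ in range(n):
        x = mean + phi * (x - mean) + rng.gauss(0, innovation_sd)
        series.append(x)
    return series


def affect_indicators(series):
    """Compute static (mean, variance) and dynamic (inertia, MSSD) indicators."""
    n = len(series)
    m = statistics.fmean(series)
    variance = sum((v - m) ** 2 for v in series) / n
    lag1_cov = sum((series[t] - m) * (series[t - 1] - m) for t in range(1, n)) / n
    inertia = lag1_cov / variance  # lag-1 autocorrelation
    mssd = sum((series[t] - series[t - 1]) ** 2 for t in range(1, n)) / (n - 1)
    return {"mean": m, "variance": variance, "inertia": inertia, "mssd": mssd}


# Hypothetical settings: 5000 time points, autoregressive effect of 0.5.
indicators = affect_indicators(simulate_ar1(n=5000, phi=0.5, innovation_sd=1.0))
implied_mssd = 2 * indicators["variance"] * (1 - indicators["inertia"])
print(indicators)
print("MSSD implied by variance and inertia:", implied_mssd)
```

Because the MSSD is almost fully determined by the variance and the autocorrelation, entering all three as predictors of an outcome leaves little unique information for any one of them. This toy example uses a long, error-free series; the abstract's concerns about short series, missing values, and measurement error are not modeled here.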

Citations: 0