
Multivariate Behavioral Research: Latest Articles

Targeted Maximum Likelihood Estimation for Causal Inference With Observational Data-The Example of Private Tutoring.
IF 3.5 Zone 3 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-20 DOI: 10.1080/00273171.2025.2561942
Christoph Jindra, Karoline A Sachse

State-of-the-art causal inference methods for observational data promise to relax assumptions threatening valid causal inference. Targeted maximum likelihood estimation (TMLE), for example, is a template for constructing doubly robust, semiparametric, efficient substitution estimators, providing consistent estimates if the outcome or treatment model is correctly specified. Compared to standard approaches, it reduces the risk of misspecification bias by allowing (nonparametric) machine-learning techniques, including super learning, to estimate the relevant components of the data distribution. We briefly introduce TMLE and demonstrate its use by estimating the effects of private tutoring in mathematics during Year 7 on mathematics proficiency and grades using observational data from starting cohort 3 of the National Education Panel Study (N = 4,167). We contrast TMLE estimates to those from ordinary least squares, the parametric G-formula, and the augmented inverse-probability weighted estimator. Our findings reveal close agreement between methods for end-of-year grades. However, variations emerge when examining mathematics proficiency as the outcome, highlighting that substantive conclusions may depend on the analytical approach. The results underscore the significance of employing advanced causal inference methods, such as TMLE, when navigating the complexities of observational data and highlight the nuanced impact of methodological choices on the interpretation of study outcomes.
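
As a reading aid, the two doubly robust estimators named in this abstract can be written in their common textbook form for a binary treatment A, covariates W, and an outcome Y scaled to [0, 1]. The notation below (\bar{Q}_n for the fitted outcome regression, g_n for the fitted propensity score, H_n for the so-called clever covariate, \hat\varepsilon for the fluctuation parameter) is an assumption chosen for illustration and is not taken from the article itself.

```latex
% Initial fits (e.g., by super learning):
%   \bar{Q}_n(A, W) \approx E[Y \mid A, W], \qquad g_n(W) \approx P(A = 1 \mid W)
H_n(A, W) = \frac{A}{g_n(W)} - \frac{1 - A}{1 - g_n(W)}

% Augmented inverse-probability weighted (AIPW) estimator of the average treatment effect
\hat\psi_{\mathrm{AIPW}}
  = \frac{1}{n} \sum_{i=1}^{n}
    \Big[ \bar{Q}_n(1, W_i) - \bar{Q}_n(0, W_i)
          + H_n(A_i, W_i)\,\big( Y_i - \bar{Q}_n(A_i, W_i) \big) \Big]

% TMLE: fluctuate the initial outcome fit along H_n, then plug in.
% \hat\varepsilon is the coefficient from a logistic regression of Y on H_n
% with offset logit \bar{Q}_n(A, W) and no intercept.
\operatorname{logit} \bar{Q}_n^{*}(A, W)
  = \operatorname{logit} \bar{Q}_n(A, W) + \hat\varepsilon \, H_n(A, W)

\hat\psi_{\mathrm{TMLE}}
  = \frac{1}{n} \sum_{i=1}^{n}
    \big[ \bar{Q}_n^{*}(1, W_i) - \bar{Q}_n^{*}(0, W_i) \big]
```

Both estimators combine an outcome model with a treatment model, which is the source of the double robustness the abstract refers to. TMLE differs in being a substitution (plug-in) estimator: the final estimate is an average of updated predictions rather than an average of bias-corrected scores.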

Citations: 0
Sample Size Determination for Optimal and Sub-Optimal Designs in Simplified Parametric Test Norming.
IF 3.5 Zone 3 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-19 DOI: 10.1080/00273171.2025.2580712
Francesco Innocenti, Alberto Cassese

Norms play a critical role in high-stakes individual assessments (e.g., diagnosing intellectual disabilities), where precision and stability are essential. To reduce fluctuations in norms due to sampling, normative studies must be based on sufficiently large and well-designed samples. This paper provides formulas, applicable to any sample composition, for determining the required sample size for normative studies under the simplified parametric norming framework. In addition to a sufficiently large sample size, precision can be further improved by sampling according to an optimal design, that is, a sample composition that minimizes sampling error in the norms. Here, optimal designs are derived for 45 (multivariate) multiple linear regression models, assuming normality and homoscedasticity. These models vary in the degree of interaction among three norm-predictors: a continuous variable (e.g., age), a categorical variable (e.g., sex), and a variable (e.g., education) that may be treated as either continuous or categorical. To support practical implementation, three interactive Shiny apps are introduced, enabling users to determine the sample size for their normative studies. Their use is demonstrated through the hypothetical planning of a normative study for the Trail Making Test, accompanied by a review of the most common models for this neuropsychological test in current practice.

Citations: 0
Multilevel Metamodels: Enhancing Inference, Interpretability, and Generalizability in Monte Carlo Simulation Studies.
IF 3.5 Zone 3 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-19 DOI: 10.1080/00273171.2025.2586631
Joshua B Gilbert, Luke W Miratrix

Metamodels, or the regression analysis of Monte Carlo simulation results, provide a powerful tool to summarize simulation findings. However, an underutilized approach is the multilevel metamodel (MLMM) that accounts for the dependent data structure that arises from fitting multiple models to the same simulated data set. In this study, we articulate the theoretical rationale for the MLMM and illustrate how it can improve the interpretability of simulation results, better account for complex simulation designs, and provide new insights into the generalizability of simulation findings.
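
To make the dependence structure concrete, one simple form such a metamodel could take is a random-intercept regression in which several analysis methods j are applied to the same simulated data set i. This is a hedged sketch only: the symbols, the single simulation factor x_i, and the method indicator m_j are assumptions for illustration, and the article may specify its models differently.

```latex
% y_{ij}: performance outcome (e.g., an estimate or its error) of method j on simulated data set i
% x_i   : a manipulated simulation factor (e.g., sample size), constant within data set i
% m_j   : indicator for the analysis method
y_{ij} = \beta_0 + \beta_1 x_i + \beta_2 m_j + \beta_3 x_i m_j + u_i + e_{ij},
\qquad u_i \sim N(0, \tau^2), \quad e_{ij} \sim N(0, \sigma^2)

% Share of residual variation that is common to all methods fitted to the same data set
\mathrm{ICC} = \frac{\tau^2}{\tau^2 + \sigma^2}
```

An ordinary single-level metamodel amounts to assuming \tau^2 = 0, that is, treating results from different methods on the same data set as independent.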

Citations: 0
Bayesian Multilevel Compositional Data Analysis with the R Package multilevelcoda.
IF 3.5 Zone 3 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-17 DOI: 10.1080/00273171.2025.2565598
Flora Le, Dorothea Dumuid, Tyman E Stanford, Joshua F Wiley

Multilevel compositional data, such as data sampled over time that are non-negative and sum to a constant value, are common in various fields. However, there is currently no software specifically built to model compositional data in a multilevel framework. The R package multilevelcoda implements a collection of tools for modeling compositional data in a Bayesian multivariate, multilevel pipeline. The user-friendly setup only requires the data, model formula, and minimal specification of the analysis. This article outlines the statistical theory underlying the Bayesian compositional multilevel modeling approach and details the implementation of the functions available in multilevelcoda, using an example dataset of compositional daily sleep-wake behaviors. This innovative method can be used to robustly answer scientific questions from the increasingly available multilevel compositional data from intensive, longitudinal studies.
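
Compositional data of this kind are usually mapped to unconstrained log-ratio coordinates before modeling. The sketch below illustrates that general idea with pivot (isometric log-ratio) coordinates in plain Python. It is not the multilevelcoda API (the package is written in R), and the function names and the example day of 480, 60, and 900 minutes are hypothetical.

```python
import math


def closure(parts, total=1.0):
    """Rescale non-negative parts so that they sum to `total`."""
    if any(p < 0 for p in parts):
        raise ValueError("compositional parts must be non-negative")
    s = sum(parts)
    if s <= 0:
        raise ValueError("parts must not all be zero")
    return [total * p / s for p in parts]


def pivot_ilr(parts):
    """Pivot isometric log-ratio coordinates of a strictly positive composition.

    Coordinate j contrasts part j with the geometric mean of the parts after it,
    so D parts yield D - 1 unconstrained real-valued coordinates.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("log-ratios require strictly positive parts")
    logs = [math.log(p) for p in parts]
    coords = []
    for j in range(len(parts) - 1):
        rest = logs[j + 1:]
        r = len(rest)  # number of remaining parts
        mean_log_rest = sum(rest) / r  # log of their geometric mean
        coords.append(math.sqrt(r / (r + 1)) * (logs[j] - mean_log_rest))
    return coords


# Hypothetical 24-hour day in minutes: sleep, physical activity, everything else.
day_minutes = [480.0, 60.0, 900.0]
composition = closure(day_minutes)  # proportions summing to 1
print(pivot_ilr(composition))
```

Because the coordinates depend only on ratios between parts, rescaling the day from minutes to proportions leaves them unchanged, which is why the constant-sum constraint no longer matters after the transformation.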

Citations: 0
Correlated Residuals in Lagged-Effects Models: What They (Do Not) Represent in the Case of a Continuous-Time Process.
IF 3.5 Zone 3 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-10 DOI: 10.1080/00273171.2025.2557274
R M Kuiper, E L Hamaker

The appeal of lagged-effects models, like the first-order vector autoregressive (VAR(1)) model, is the interpretation of the lagged coefficients in terms of predictive (and possibly causal) relationships between variables over time. While the focus in VAR(1) applications has traditionally been on the strength and sign of the lagged relationships, there has been a growing interest in the residual relationships (i.e., the correlations between the innovations) as well. In this article, we will investigate what residual correlations can and cannot signal, for both the discrete-time (DT) and continuous-time (CT) VAR(1) model, when inspecting a CT process. We will show that one should not take on a DT perspective when investigating a CT process: Correlated (i.e., non-zero) DT residuals can flag omitted common causes and effects at shorter intervals (which is well-known) but, when having a CT process, also effects at longer intervals. Furthermore, when inspecting a CT process, uncorrelated (i.e., zero) DT residuals do not imply that the variables have no effect on each other at other intervals, nor does it preclude the risk of having omitted common causes. Additionally, we will show that residual correlations in a CT model signal omitted causes for one or more of the observed variables. This may bias the estimation of lagged relationships, implying that the found predictive lagged relationships do not equal the underlying causal lagged relationships. Unfortunately, the CT residual correlations do not reflect the magnitude of the distortion.
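
The link between the two model types can be summarized with the standard relations for a mean-centered first-order CT process observed at interval \Delta t. The notation (drift matrix A, diffusion covariance Q, DT lagged-effects matrix \Phi, DT residual covariance \Sigma_\varepsilon) is an assumption for illustration and may differ from the symbols used in the article.

```latex
% CT VAR(1): stochastic differential equation with Wiener process W(t)
dX(t) = A\,X(t)\,dt + G\,dW(t), \qquad Q = G G^{\top}

% Implied DT VAR(1) for observations spaced \Delta t apart
X(t) = \Phi(\Delta t)\,X(t - \Delta t) + \varepsilon(t), \qquad
\Phi(\Delta t) = e^{A \Delta t}

% Implied DT residual (innovation) covariance
\Sigma_{\varepsilon}(\Delta t)
  = \int_{0}^{\Delta t} e^{A s}\, Q\, e^{A^{\top} s}\, ds
```

The integral mixes drift and diffusion: off-diagonal elements of \Sigma_\varepsilon(\Delta t) depend on the cross-effects in A as well as on Q, so a diagonal Q does not in general imply uncorrelated DT residuals, and the DT residual correlation changes with the interval \Delta t.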

Citations: 0
The Effects of Data Preprocessing Choices on Behavioral RCT Outcomes: A Multiverse Analysis.
IF 3.5 Zone 3 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-03 DOI: 10.1080/00273171.2025.2575399
Giuseppe A Veltri

Seemingly routine data-preprocessing choices can exert outsized influence on the conclusions drawn from randomized controlled trials (RCTs), particularly in behavioral science where data are noisy, skewed and replete with outliers. We demonstrate this influence with two fully specified multiverse analyses on simulated RCT data. Each analysis spans 180 analytical pathways, produced by crossing 36 preprocessing pipelines that vary outlier handling, missing-data imputation and scale transformation, with five common model specifications. In Simulation A, which uses linear regression families, preprocessing decisions explain 76.9% of the total variance in estimated treatment effects, whereas model choice explains only 7.5%. In Simulation B, which replaces the linear models with advanced algorithms (generalized additive models, random forests, gradient boosting), the dominance of preprocessing is even clearer: 99.8% of the variance is attributable to data handling and just 0.1% to model specification. The ranges of mean effects show the same pattern (4.34 vs. 1.43 in Simulation A; 15.30 vs. 0.56 in Simulation B). Particular pipelines, most notably those that standardize or log-transform variables, shrink effect estimates by more than 90% relative to the raw-data baseline, while pipelines that leave the original scale intact can inflate effects by an order of magnitude. Because preprocessing choices can overshadow even large shifts in statistical methodology, we call for meticulous reporting of these steps and for routine sensitivity or multiverse analyses that make their impact transparent. Such practices are essential for improving the robustness and replicability of behavioral-science RCTs.
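
The arithmetic behind the 180 pathways (36 preprocessing pipelines crossed with five model specifications) can be sketched as a full factorial crossing. The abstract names the three preprocessing dimensions but not their levels, so the level names and the 3 x 3 x 4 split below are purely hypothetical, as are the placeholder model labels.

```python
from itertools import product

# Hypothetical factor levels: the abstract gives only the dimensions and the totals.
OUTLIER_RULES = ("none", "winsorize", "trim")                  # 3 levels (assumed)
IMPUTATION_METHODS = ("listwise", "mean", "multiple")          # 3 levels (assumed)
SCALE_TRANSFORMS = ("raw", "standardize", "log", "rank")       # 4 levels (assumed)
MODEL_SPECS = ("model_1", "model_2", "model_3", "model_4", "model_5")  # 5 placeholders


def build_multiverse():
    """Cross every preprocessing pipeline with every model specification."""
    pipelines = list(product(OUTLIER_RULES, IMPUTATION_METHODS, SCALE_TRANSFORMS))
    pathways = list(product(pipelines, MODEL_SPECS))
    return pipelines, pathways


pipelines, pathways = build_multiverse()
print(len(pipelines), len(pathways))
```

In an actual multiverse analysis each pathway would be run on the same data and its treatment-effect estimate stored, so that the spread of estimates can be attributed to preprocessing versus model choice.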

Citations: 0
Detecting Model Misfit in Structural Equation Modeling with Machine Learning-A Proof of Concept.
IF 3.5 Zone 3 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-02 DOI: 10.1080/00273171.2025.2552304
Melanie Viola Partsch, David Goretzko

Despite the popularity of structural equation modeling in psychological research, accurately evaluating the fit of these models to data is still challenging. Using fixed fit index cutoffs is error-prone due to the fit indices' dependence on various features of the model and data ("nuisance parameters"). Nonetheless, applied researchers mostly rely on fixed fit index cutoffs, neglecting the risk of falsely accepting (or rejecting) their model. With the goal of developing a broadly applicable method that is almost independent of nuisance parameters, we introduce a machine learning (ML)-based approach to evaluate the fit of multi-factorial measurement models. We trained an ML model based on 173 model and data features that we extracted from 1,323,866 simulated data sets and models fitted by means of confirmatory factor analysis. We evaluated the performance of the ML model based on 1,659,386 independent test observations. The ML model performed very well in detecting model (mis-)fit in most conditions, thereby outperforming commonly used fixed fit index cutoffs across the board. Only minor misspecifications, such as a single neglected residual correlation, proved to be challenging to detect. This proof-of-concept study shows that ML is very promising in the context of model fit evaluation.

Citations: 0
Unique Contributions of Dynamic Affect Indicators - Beyond Static Variability.
IF 3.5 Zone 3 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-01 Epub Date: 2025-09-02 DOI: 10.1080/00273171.2025.2545367
Kenneth Koslowski, Jana Holtmann

Indicators of affect dynamics (IADs) capture temporal dependencies and instability in affective trajectories over time. However, the relevance of IADs for the prediction of time-invariant outcomes (e.g., depressive symptoms) was recently challenged due to results suggesting low predictive utility beyond intraindividual means and variances. We argue that these results may in part be explained by mathematical redundancies between IADs and static variability as well as the chosen modeling strategy. In three extensive simulation studies we investigate the accuracy and power for detecting non-null relations between IADs and an outcome variable in different relevant settings, illustrating the effect of the length of a time series, the presence of missing values or measurement error, as well as of erroneously fixing innovation variances to be equal across persons. We show that, if uncertainty in individual IAD estimates is not accounted for, relations between IADs (i.e., autoregressive effects) and a time-invariant outcome are underestimated even in large samples and propose the use of a latent multilevel one-step approach. In an empirical application we illustrate that the different modeling approaches can lead to different substantive conclusions regarding the role of negative affect inertia in the prediction of depressive symptoms.
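
The abstract does not say which redundancies it has in mind. One familiar example, given here only as an illustration, involves the mean squared successive difference (MSSD), a common instability indicator. For a stationary series X_t with variance \sigma^2 and lag-1 autocorrelation \rho_1:

```latex
E\big[(X_t - X_{t-1})^2\big]
  = \operatorname{Var}(X_t) + \operatorname{Var}(X_{t-1}) - 2\operatorname{Cov}(X_t, X_{t-1})
  = 2\sigma^2 - 2\rho_1\sigma^2
  = 2\sigma^2\,(1 - \rho_1)
```

Under these assumptions the expected MSSD is fully determined by the within-person variance and the autocorrelation, so it carries no information beyond those two quantities once both enter a prediction model.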

Citations: 0
A Systematic Evaluation of Wording Effects Modeling Under the Exploratory Structural Equation Modeling Framework.
IF 3.5 Zone 3 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-01 Epub Date: 2025-09-08 DOI: 10.1080/00273171.2025.2545362
Luis Eduardo Garrido, Alexander P Christensen, Hudson Golino, Agustín Martínez-Molina, Víctor B Arias, Kiero Guerra-Peña, María Dolores Nieto-Cañaveras, Flávio Azevedo, Francisco J Abad

Wording effects, the systematic method variance arising from the inconsistent responding to positively and negatively worded items of the same construct, are pervasive in the behavioral and health sciences. Although several factor modeling strategies have been proposed to mitigate their adverse effects, there is limited systematic research assessing their performance with exploratory structural equation models (ESEM). The present study evaluated the impact of different types of response bias related to wording effects (random and straight-line carelessness, acquiescence, item difficulty, and mixed) on ESEM models incorporating two popular method modeling strategies, the correlated traits-correlated methods minus one (CTC[M-1]) model and random intercept item factor analysis (RIIFA), as well as the "do nothing" approach. Five variables were manipulated using Monte Carlo methods: the type and magnitude of response bias, factor loadings, factor correlations, and sample size. Overall, the results showed that ignoring wording effects leads to poor model fit and serious distortions of the ESEM estimates. The RIIFA approach generally performed best at countering these adverse impacts and recovering unbiased factor structures, whereas the CTC(M-1) models struggled when biases affected both positively and negatively worded items. Our findings also indicated that method factors can sometimes reflect or absorb substantive variance, which may blur their associations with external variables and complicate their interpretation when embedded in broader structural models. A straightforward guide is offered to applied researchers who wish to use ESEM with mixed-worded scales.
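One of the bias types named above, acquiescence, can be sketched in a few lines. The example below is only an illustration under assumed values and is not the study's Monte Carlo design: item responses are continuous, acquiescence is a person-specific constant added to every item regardless of keying, and the loadings (0.7), noise level, and sample size are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simulate_scale(acquiescence_sd, n_persons=2000, n_items=4, seed=7):
    """Correlation between the positively worded half and the
    reverse-scored negatively worded half of a mixed-worded scale."""
    rng = random.Random(seed)
    pos_scores, neg_scores = [], []
    for _ in range(n_persons):
        trait = rng.gauss(0.0, 1.0)
        bias = rng.gauss(0.0, acquiescence_sd)  # tendency to agree with any item
        pos = [0.7 * trait + bias + rng.gauss(0.0, 0.7) for _ in range(n_items)]
        neg = [-0.7 * trait + bias + rng.gauss(0.0, 0.7) for _ in range(n_items)]
        pos_scores.append(sum(pos) / n_items)
        neg_scores.append(sum(-v for v in neg) / n_items)  # reverse-score negative items
    return pearson(pos_scores, neg_scores)

r_clean = simulate_scale(acquiescence_sd=0.0)
r_biased = simulate_scale(acquiescence_sd=0.5)
print(f"half-scale correlation without acquiescence: {r_clean:.2f}")
print(f"half-scale correlation with acquiescence:    {r_biased:.2f}")
```

Because reverse-scoring flips the sign of the bias on negatively worded items only, the two halves of the scale agree less than the shared trait alone would imply. That is the kind of method variance the CTC(M-1) and RIIFA strategies try to absorb.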

Citations: 0
Equilibrium Causal Models: Connecting Dynamical Systems Modeling and Cross-Sectional Data Analysis.
IF 3.5 Zone 3 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-11-01 Epub Date: 2025-09-04 DOI: 10.1080/00273171.2025.2522733
Oisín Ryan, Fabian Dablander

Many psychological phenomena can be understood as arising from systems of causally connected components that evolve over time within an individual. In current empirical practice, researchers frequently study these systems by fitting statistical models to data collected at a single moment in time, that is, cross-sectional data. This raises a central question: Can cross-sectional data analysis ever yield causal insights into systems that evolve over time, and if so, under what conditions? In this paper, we address this question by introducing Equilibrium Causal Models (ECMs) to the psychological literature. ECMs are causal abstractions of an underlying dynamical system that allow for inferences about the long-term effects of interventions, permit cyclic causal relations, and can in principle be estimated from cross-sectional data, as long as information about the resting state of the system is captured by those measurements. We explain the conditions under which ECM estimation is possible, show that they allow researchers to learn about within-person processes from cross-sectional data, and discuss how tools from both the psychological measurement modeling and the causal discovery literature can inform the ways in which researchers collect and analyze their data.
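The idea of reading long-term intervention effects off a system's resting state can be illustrated with a toy example. The sketch below assumes a two-variable linear system dx/dt = A x + c with a feedback loop between the variables. The matrix `A`, the input vector `c`, and both helper functions are hypothetical and are not the authors' model or notation.

```python
def equilibrium_2x2(a, c):
    """Resting state of dx/dt = a x + c, i.e. the solution of a x = -c (Cramer's rule)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x1 = (-c[0] * a[1][1] + c[1] * a[0][1]) / det
    x2 = (-a[0][0] * c[1] + a[1][0] * c[0]) / det
    return [x1, x2]

def simulate_to_rest(a, c, dt=0.01, n_steps=5000):
    """Euler-integrate the system from zero until it has settled."""
    x = [0.0, 0.0]
    for _ in range(n_steps):
        dx0 = a[0][0] * x[0] + a[0][1] * x[1] + c[0]
        dx1 = a[1][0] * x[0] + a[1][1] * x[1] + c[1]
        x = [x[0] + dt * dx0, x[1] + dt * dx1]
    return x

# Cyclic system: variable 1 affects variable 2 and vice versa; both self-dampen.
A = [[-1.0, 0.5],
     [0.3, -1.0]]
c_baseline = [1.0, 0.0]
c_intervened = [2.0, 0.0]   # intervention: raise the constant input to variable 1

rest_baseline = equilibrium_2x2(A, c_baseline)
rest_intervened = equilibrium_2x2(A, c_intervened)
simulated_baseline = simulate_to_rest(A, c_baseline)

# Long-term effect of the intervention on variable 2, read off the resting states.
long_run_effect_on_x2 = rest_intervened[1] - rest_baseline[1]
print("closed-form resting state:", rest_baseline)
print("simulated resting state:  ", simulated_baseline)
print("long-run effect on x2:    ", long_run_effect_on_x2)
```

In this toy setting the resting state alone determines the long-run effect, even though the two variables influence each other cyclically. Whether real cross-sectional measurements capture such a resting state is the condition the abstract emphasizes.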

Citations: 0