Pub Date: 2025-11-20 | DOI: 10.1080/00273171.2025.2561942
Christoph Jindra, Karoline A Sachse
State-of-the-art causal inference methods for observational data promise to relax assumptions threatening valid causal inference. Targeted maximum likelihood estimation (TMLE), for example, is a template for constructing doubly robust, semiparametric, efficient substitution estimators, providing consistent estimates if either the outcome or the treatment model is correctly specified. Compared to standard approaches, it reduces the risk of misspecification bias by allowing (nonparametric) machine-learning techniques, including super learning, to estimate the relevant components of the data distribution. We briefly introduce TMLE and demonstrate its use by estimating the effects of private tutoring in mathematics during Year 7 on mathematics proficiency and grades, using observational data from starting cohort 3 of the National Educational Panel Study (N = 4,167). We contrast TMLE estimates with those from ordinary least squares, the parametric G-formula, and the augmented inverse-probability weighted estimator. Our findings reveal close agreement between methods for end-of-year grades. However, variations emerge when examining mathematics proficiency as the outcome, highlighting that substantive conclusions may depend on the analytical approach. The results underscore the significance of employing advanced causal inference methods, such as TMLE, when navigating the complexities of observational data, and highlight the nuanced impact of methodological choices on the interpretation of study outcomes.
"Targeted Maximum Likelihood Estimation for Causal Inference With Observational Data-The Example of Private Tutoring." Multivariate Behavioral Research, pp. 1-20.
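As a rough illustration of the estimator template described above, here is a minimal Python sketch of TMLE for an average treatment effect with a binary treatment and binary outcome: an initial outcome model, a propensity model, and a single logistic fluctuation ("targeting") step along the clever covariate. The learners, truncation bounds, and variable names are illustrative assumptions, not the authors' implementation (which uses super learning and a continuous proficiency outcome; a continuous outcome is commonly handled by rescaling it to [0, 1] before the same steps).

```python
import numpy as np
import statsmodels.api as sm
from scipy.special import logit, expit
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier

def tmle_ate(Y, A, W):
    """Sketch of TMLE for the ATE with binary Y and A; W is an (n, p) confounder matrix."""
    # 1) Initial outcome model Q(A, W) = E[Y | A, W], here a single boosted learner
    Q_fit = GradientBoostingClassifier().fit(np.column_stack([A, W]), Y)
    Q1 = Q_fit.predict_proba(np.column_stack([np.ones_like(A), W]))[:, 1]
    Q0 = Q_fit.predict_proba(np.column_stack([np.zeros_like(A), W]))[:, 1]
    QA = np.where(A == 1, Q1, Q0)

    # 2) Treatment (propensity) model g(W) = P(A = 1 | W), truncated away from 0 and 1
    g = LogisticRegression(max_iter=1000).fit(W, A).predict_proba(W)[:, 1]
    g = np.clip(g, 0.025, 0.975)

    # 3) Targeting step: logistic fluctuation of the initial fit along the
    #    "clever covariate" H, with logit(QA) entering as an offset (no intercept)
    H = A / g - (1 - A) / (1 - g)
    eps = sm.GLM(Y, H.reshape(-1, 1), family=sm.families.Binomial(),
                 offset=logit(np.clip(QA, 1e-6, 1 - 1e-6))).fit().params[0]

    # 4) Updated counterfactual predictions and plug-in (substitution) estimate
    Q1_star = expit(logit(np.clip(Q1, 1e-6, 1 - 1e-6)) + eps / g)
    Q0_star = expit(logit(np.clip(Q0, 1e-6, 1 - 1e-6)) - eps / (1 - g))
    return np.mean(Q1_star - Q0_star)
```

The double robustness shows up in step 3: if either the initial outcome model or the propensity model is close to correct, the fluctuation removes the remaining first-order bias of the plug-in estimate.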
Pub Date: 2025-11-19 | DOI: 10.1080/00273171.2025.2580712
Francesco Innocenti, Alberto Cassese
Norms play a critical role in high-stakes individual assessments (e.g., diagnosing intellectual disabilities), where precision and stability are essential. To reduce fluctuations in norms due to sampling, normative studies must be based on sufficiently large and well-designed samples. This paper provides formulas, applicable to any sample composition, for determining the required sample size for normative studies under the simplified parametric norming framework. In addition to a sufficiently large sample size, precision can be further improved by sampling according to an optimal design, that is, a sample composition that minimizes sampling error in the norms. Optimal designs are derived here for 45 (multivariate) multiple linear regression models, assuming normality and homoscedasticity. These models vary in the degree of interaction among three norm predictors: a continuous variable (e.g., age), a categorical variable (e.g., sex), and a variable (e.g., education) that may be treated as either continuous or categorical. To support practical implementation, three interactive Shiny apps are introduced, enabling users to determine the sample size for their normative studies. Their use is demonstrated through the hypothetical planning of a normative study for the Trail Making Test, accompanied by a review of the most common models for this neuropsychological test in current practice.
"Sample Size Determination for Optimal and Sub-Optimal Designs in Simplified Parametric Test Norming." Multivariate Behavioral Research, pp. 1-25.
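As a rough sketch of the regression-based norming logic this framework builds on (our reading, not the authors' exact formulas): fit a linear model for the test score on the norm predictors, assume normal, homoscedastic residuals, and convert a raw score into a percentile relative to the predicted conditional distribution. The data, predictor names, and model below are invented for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import norm

# Invented normative sample: raw test scores with age, sex, and education
rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "age": rng.uniform(20, 80, n),
    "sex": rng.integers(0, 2, n),
    "edu": rng.integers(8, 19, n),
})
df["score"] = 30 + 0.6 * df["age"] - 1.2 * df["edu"] + rng.normal(0, 10, n)

# Simplified parametric norming: model the conditional mean and assume
# normal, homoscedastic residuals around it
fit = smf.ols("score ~ age + sex + edu", data=df).fit()
sigma = np.sqrt(fit.scale)  # residual SD

def percentile(raw, age, sex, edu):
    """Convert a raw score to a norm percentile for given predictor values."""
    pred = fit.predict(pd.DataFrame({"age": [age], "sex": [sex], "edu": [edu]})).iloc[0]
    z = (raw - pred) / sigma
    return 100 * norm.cdf(z)

print(percentile(raw=70, age=65, sex=1, edu=12))
```

Sampling error in the estimated coefficients and residual SD propagates into every norm computed this way, which is exactly the error the sample-size formulas and optimal designs are meant to control.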
Pub Date: 2025-11-19 | DOI: 10.1080/00273171.2025.2586631
Joshua B Gilbert, Luke W Miratrix
Metamodels, or the regression analysis of Monte Carlo simulation results, provide a powerful tool to summarize simulation findings. However, an underutilized approach is the multilevel metamodel (MLMM) that accounts for the dependent data structure that arises from fitting multiple models to the same simulated data set. In this study, we articulate the theoretical rationale for the MLMM and illustrate how it can improve the interpretability of simulation results, better account for complex simulation designs, and provide new insights into the generalizability of simulation findings.
"Multilevel Metamodels: Enhancing Inference, Interpretability, and Generalizability in Monte Carlo Simulation Studies." Multivariate Behavioral Research, pp. 1-24.
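A minimal sketch of what such a metamodel can look like in practice, assuming a long-format results table in which several estimators have been applied to each simulated data set: fixed effects capture the design factors and the estimator, and a random intercept per simulated data set captures the Monte Carlo noise those estimators share. The data generation, variable names, and formula below are illustrative, not the authors' specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative long-format simulation results: several estimators ("method")
# applied to the same simulated data sets ("dataset"), crossed with a design
# factor ("n" = sample size); "bias" is the recorded performance measure.
rng = np.random.default_rng(7)
rows = []
for dataset in range(200):
    n = rng.choice([50, 200, 500])
    shock = rng.normal(0, 0.05)          # data-set-level Monte Carlo noise shared by all methods
    for method in ["ols", "gformula", "aipw", "tmle"]:
        bias = 0.1 - 0.02 * (method != "ols") - 0.0001 * n + shock + rng.normal(0, 0.02)
        rows.append({"dataset": dataset, "method": method, "n": n, "bias": bias})
df = pd.DataFrame(rows)

# Multilevel metamodel: fixed effects for the estimator and design factor,
# random intercept for the simulated data set they were all fitted to
mlmm = smf.mixedlm("bias ~ C(method) * n", data=df, groups=df["dataset"]).fit()
print(mlmm.summary())
```

Ignoring the grouping (an ordinary regression on the same table) would treat the four rows per data set as independent, understating uncertainty for comparisons between methods.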
Pub Date: 2025-11-17 | DOI: 10.1080/00273171.2025.2565598
Flora Le, Dorothea Dumuid, Tyman E Stanford, Joshua F Wiley
Multilevel compositional data, such as data sampled over time that are non-negative and sum to a constant value, are common in various fields. However, there is currently no software specifically built to model compositional data in a multilevel framework. The R package multilevelcoda implements a collection of tools for modeling compositional data in a Bayesian multivariate, multilevel pipeline. The user-friendly setup only requires the data, model formula, and minimal specification of the analysis. This article outlines the statistical theory underlying the Bayesian compositional multilevel modeling approach and details the implementation of the functions available in multilevelcoda, using an example dataset of compositional daily sleep-wake behaviors. This innovative method can be used to robustly answer scientific questions from the increasingly available multilevel compositional data from intensive, longitudinal studies.
"Bayesian Multilevel Compositional Data Analysis with the R Package multilevelcoda." Multivariate Behavioral Research, pp. 1-19.
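multilevelcoda itself is an R package; purely to illustrate the log-ratio machinery that compositional (multilevel) models are typically built on, and which the package handles internally, here is a small numpy sketch using a hypothetical daily sleep-wake composition. The composition and the specific pivot-coordinate basis are illustrative assumptions.

```python
import numpy as np

# A day's time-use composition (hours): sleep, sedentary, light activity, MVPA.
# Compositions carry only relative information, so they are mapped to real
# coordinates with a log-ratio transform before (multilevel) modeling.
x = np.array([8.0, 9.5, 5.5, 1.0])
x = x / x.sum()                      # close the composition so the parts sum to 1

# Centered log-ratio (clr): each part relative to the geometric mean of all parts
clr = np.log(x) - np.log(x).mean()

def ilr(x):
    """Isometric log-ratio (pivot) coordinates: part i against the geometric
    mean of the remaining parts, scaled to give an orthonormal basis."""
    d = len(x)
    logx = np.log(x)
    coords = []
    for i in range(d - 1):
        r = d - i - 1                          # number of remaining parts
        coords.append(np.sqrt(r / (r + 1)) * (logx[i] - logx[i + 1:].mean()))
    return np.array(coords)

print(clr, ilr(x))
```

The d-part composition yields d-1 unconstrained coordinates, which can then enter a Bayesian multivariate multilevel model with person-level random effects, as the package does for intensive longitudinal data.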
Pub Date: 2025-11-10 | DOI: 10.1080/00273171.2025.2557274
R M Kuiper, E L Hamaker
The appeal of lagged-effects models, like the first-order vector autoregressive (VAR(1)) model, is the interpretation of the lagged coefficients in terms of predictive, and possibly causal, relationships between variables over time. While the focus in VAR(1) applications has traditionally been on the strength and sign of the lagged relationships, there has been growing interest in the residual relationships (i.e., the correlations between the innovations) as well. In this article, we will investigate what residual correlations can and cannot signal, for both the discrete-time (DT) and continuous-time (CT) VAR(1) model, when inspecting a CT process. We will show that one should not take on a DT perspective when investigating a CT process: correlated (i.e., non-zero) DT residuals can flag omitted common causes and effects at shorter intervals (which is well known), but, when the underlying process is in CT, also effects at longer intervals. Furthermore, when inspecting a CT process, uncorrelated (i.e., zero) DT residuals do not imply that the variables have no effect on each other at other intervals, nor do they preclude the risk of having omitted common causes. Additionally, we will show that residual correlations in a CT model signal omitted causes for one or more of the observed variables. This may bias the estimation of lagged relationships, implying that the estimated predictive lagged relationships do not equal the underlying causal lagged relationships. Unfortunately, the CT residual correlations do not reflect the magnitude of the distortion.
"Correlated Residuals in Lagged-Effects Models: What They (Do Not) Represent in the Case of a Continuous-Time Process." Multivariate Behavioral Research, pp. 1-27.
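For reference, the standard mapping between the CT and DT parameterizations that this argument rests on (notation ours): observing the CT process $d\boldsymbol{\eta}(t) = \mathbf{A}\,\boldsymbol{\eta}(t)\,dt + \mathbf{G}\,d\mathbf{W}(t)$ at interval $\Delta t$ yields a DT VAR(1) whose lagged coefficients and innovation covariance both depend on the interval and on the full drift matrix.

```latex
\begin{aligned}
\boldsymbol{\eta}_{t+\Delta t} &= \boldsymbol{\Phi}(\Delta t)\,\boldsymbol{\eta}_{t} + \boldsymbol{\varepsilon}_{t,\Delta t},
  & \boldsymbol{\Phi}(\Delta t) &= e^{\mathbf{A}\,\Delta t},\\
\operatorname{Cov}(\boldsymbol{\varepsilon}_{t,\Delta t}) &= \boldsymbol{\Psi}(\Delta t)
  = \int_{0}^{\Delta t} e^{\mathbf{A}s}\,\mathbf{Q}\,e^{\mathbf{A}^{\top}s}\,ds,
  & \mathbf{Q} &= \mathbf{G}\mathbf{G}^{\top}.
\end{aligned}
```

Because $\boldsymbol{\Psi}(\Delta t)$ is a function of the drift matrix, the diffusion matrix, and the sampling interval, a DT residual correlation observed at one interval is not an interval-free feature of the underlying process, which is why the DT and CT perspectives can lead to different readings of the same residual pattern.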
Pub Date: 2025-11-03 | DOI: 10.1080/00273171.2025.2575399
Giuseppe A Veltri
Seemingly routine data-preprocessing choices can exert outsized influence on the conclusions drawn from randomized controlled trials (RCTs), particularly in behavioral science, where data are noisy, skewed, and replete with outliers. We demonstrate this influence with two fully specified multiverse analyses on simulated RCT data. Each analysis spans 180 analytical pathways, produced by crossing 36 preprocessing pipelines, which vary outlier handling, missing-data imputation, and scale transformation, with five common model specifications. In Simulation A, which uses linear regression families, preprocessing decisions explain 76.9% of the total variance in estimated treatment effects, whereas model choice explains only 7.5%. In Simulation B, which replaces the linear models with advanced algorithms (generalized additive models, random forests, gradient boosting), the dominance of preprocessing is even clearer: 99.8% of the variance is attributable to data handling and just 0.1% to model specification. The ranges of mean effects show the same pattern (4.34 vs. 1.43 in Simulation A; 15.30 vs. 0.56 in Simulation B). Particular pipelines, most notably those that standardize or log-transform variables, shrink effect estimates by more than 90% relative to the raw-data baseline, while pipelines that leave the original scale intact can inflate effects by an order of magnitude. Because preprocessing choices can overshadow even large shifts in statistical methodology, we call for meticulous reporting of these steps and for routine sensitivity or multiverse analyses that make their impact transparent. Such practices are essential for improving the robustness and replicability of behavioral-science RCTs.
"The Effects of Data Preprocessing Choices on Behavioral RCT Outcomes: A Multiverse Analysis." Multivariate Behavioral Research, pp. 1-16.
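A minimal sketch of the multiverse bookkeeping described above, reduced to a handful of preprocessing options and two model specifications (the paper's design crosses 36 pipelines with five models): build every pathway with itertools.product, record the treatment-effect estimate, and compare how much of the spread is attributable to preprocessing versus model choice. The data, options, and the crude variance decomposition below are illustrative assumptions, not the authors' pipeline.

```python
import itertools
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Invented RCT-like data with a skewed outcome and one covariate
rng = np.random.default_rng(3)
n = 400
raw = pd.DataFrame({"treat": rng.integers(0, 2, n), "x": rng.normal(0, 1, n)})
raw["y"] = np.exp(1.0 + 0.3 * raw["treat"] + 0.5 * raw["x"] + rng.normal(0, 1, n))

# A reduced multiverse: outlier handling x transformation x model specification
outlier_rules = {"keep": lambda y: y,
                 "winsor95": lambda y: y.clip(upper=y.quantile(0.95))}
transforms = {"raw": lambda y: y,
              "log": np.log,
              "zscore": lambda y: (y - y.mean()) / y.std()}
models = {"unadjusted": "y ~ treat", "adjusted": "y ~ treat + x"}

results = []
for (o_name, o), (t_name, t), (m_name, formula) in itertools.product(
        outlier_rules.items(), transforms.items(), models.items()):
    d = raw.copy()
    d["y"] = t(o(d["y"]))                      # apply the preprocessing pipeline
    est = smf.ols(formula, data=d).fit().params["treat"]
    results.append({"outliers": o_name, "transform": t_name,
                    "model": m_name, "estimate": est})
multiverse = pd.DataFrame(results)

# Crude decomposition of the spread in estimates across the two kinds of choices
total = multiverse["estimate"].var()
prep_share = multiverse.groupby(["outliers", "transform"])["estimate"].mean().var() / total
model_share = multiverse.groupby("model")["estimate"].mean().var() / total
print(multiverse)
print(prep_share, model_share)
```

Even in this toy version, estimates from log- or z-transformed pathways are on a different scale from the raw-scale ones, which is the mechanism behind the large preprocessing-attributable variance reported in the abstract.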
Pub Date: 2025-11-02 | DOI: 10.1080/00273171.2025.2552304
Melanie Viola Partsch, David Goretzko
Despite the popularity of structural equation modeling in psychological research, accurately evaluating the fit of these models to data is still challenging. Using fixed fit index cutoffs is error-prone due to the fit indices' dependence on various features of the model and data ("nuisance parameters"). Nonetheless, applied researchers mostly rely on fixed fit index cutoffs, neglecting the risk of falsely accepting (or rejecting) their model. With the goal of developing a broadly applicable method that is almost independent of nuisance parameters, we introduce a machine learning (ML)-based approach to evaluate the fit of multi-factorial measurement models. We trained an ML model based on 173 model and data features that we extracted from 1,323,866 simulated data sets and models fitted by means of confirmatory factor analysis. We evaluated the performance of the ML model based on 1,659,386 independent test observations. The ML model performed very well in detecting model (mis-)fit in most conditions, thereby outperforming commonly used fixed fit index cutoffs across the board. Only minor misspecifications, such as a single neglected residual correlation, proved to be challenging to detect. This proof-of-concept study shows that ML is very promising in the context of model fit evaluation.
"Detecting Model Misfit in Structural Equation Modeling with Machine Learning-A Proof of Concept." Multivariate Behavioral Research, pp. 1-24.
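A minimal sketch of the general recipe (simulate fitted models, extract model and data features, train a classifier to flag misfit). The authors' 173 features, simulated CFA models, and learner are not reproduced here; the feature table and the toy label rule below are placeholders just to make the pipeline run end to end.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score

# Placeholder feature table: one row per simulated data set / fitted CFA;
# columns would be fit indices plus model and data characteristics
rng = np.random.default_rng(11)
m = 5000
X = pd.DataFrame({
    "cfi": rng.beta(20, 2, m), "rmsea": rng.beta(2, 30, m),
    "srmr": rng.beta(2, 40, m), "n": rng.choice([200, 500, 1000], m),
    "n_items": rng.choice([12, 18, 24], m), "avg_loading": rng.uniform(0.4, 0.8, m),
})
# Toy label (1 = misspecified model), loosely tied to RMSEA only so the demo runs
y = (X["rmsea"] + rng.normal(0, 0.01, m) > 0.08).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = GradientBoostingClassifier().fit(X_tr, y_tr)
print(balanced_accuracy_score(y_te, clf.predict(X_te)))
```

The substantive gain of the approach comes from training on features of both the model and the data, so that the learned decision rule can adapt to the nuisance parameters that break fixed cutoffs.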
Pub Date: 2025-11-01 | Epub Date: 2025-09-02 | DOI: 10.1080/00273171.2025.2545367
Kenneth Koslowski, Jana Holtmann
Indicators of affect dynamics (IADs) capture temporal dependencies and instability in affective trajectories over time. However, the relevance of IADs for predicting time-invariant outcomes (e.g., depressive symptoms) has recently been challenged by results suggesting low predictive utility beyond intraindividual means and variances. We argue that these results may in part be explained by mathematical redundancies between IADs and static variability, as well as by the chosen modeling strategy. In three extensive simulation studies, we investigate the accuracy and power for detecting non-null relations between IADs and an outcome variable in different relevant settings, illustrating the effects of the length of a time series, the presence of missing values or measurement error, and of erroneously fixing innovation variances to be equal across persons. We show that, if uncertainty in individual IAD estimates is not accounted for, relations between IADs (i.e., autoregressive effects) and a time-invariant outcome are underestimated even in large samples, and we propose the use of a latent multilevel one-step approach. In an empirical application, we illustrate that the different modeling approaches can lead to different substantive conclusions regarding the role of negative affect inertia in the prediction of depressive symptoms.
"Unique Contributions of Dynamic Affect Indicators - Beyond Static Variability." Multivariate Behavioral Research, pp. 1199-1220.
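A small numpy sketch of the naive two-step route that the abstract cautions against, under invented parameters: estimate each person's AR(1) inertia from a finite time series, then regress the outcome on that point estimate alongside the intraindividual mean and SD. Sampling error in the first step attenuates the second-step coefficient, which is the problem the proposed latent multilevel one-step approach is designed to avoid.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n_persons, t_len = 200, 50
rows = []
for i in range(n_persons):
    phi = np.clip(rng.normal(0.4, 0.15), -0.9, 0.9)   # true inertia, varying over persons
    x = np.zeros(t_len)
    for t in range(1, t_len):
        x[t] = phi * x[t - 1] + rng.normal(0, 1)       # person's affect time series
    outcome = 2.0 * phi + rng.normal(0, 0.5)           # outcome truly related to inertia

    # Step 1: per-person AR(1) estimate by OLS (carries sampling error)
    phi_hat = np.polyfit(x[:-1], x[1:], 1)[0]
    rows.append([phi_hat, x.mean(), x.std(ddof=1), outcome])

phi_hat, m, s, y = np.array(rows).T

# Step 2: regress the outcome on estimated inertia, mean, and SD; the coefficient
# on phi_hat is typically attenuated relative to the true value of 2.0
X = sm.add_constant(np.column_stack([phi_hat, m, s]))
print(sm.OLS(y, X).fit().params)
```

With short time series, missingness, or measurement error, the first-step estimates become noisier and the attenuation grows, which matches the simulation settings the abstract lists.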
Pub Date: 2025-11-01 | Epub Date: 2025-09-08 | DOI: 10.1080/00273171.2025.2545362
Luis Eduardo Garrido, Alexander P Christensen, Hudson Golino, Agustín Martínez-Molina, Víctor B Arias, Kiero Guerra-Peña, María Dolores Nieto-Cañaveras, Flávio Azevedo, Francisco J Abad
Wording effects, the systematic method variance arising from the inconsistent responding to positively and negatively worded items of the same construct, are pervasive in the behavioral and health sciences. Although several factor modeling strategies have been proposed to mitigate their adverse effects, there is limited systematic research assessing their performance with exploratory structural equation models (ESEM). The present study evaluated the impact of different types of response bias related to wording effects (random and straight-line carelessness, acquiescence, item difficulty, and mixed) on ESEM models incorporating two popular method modeling strategies, the correlated traits-correlated methods minus one (CTC[M-1]) model and random intercept item factor analysis (RIIFA), as well as the "do nothing" approach. Five variables were manipulated using Monte Carlo methods: the type and magnitude of response bias, factor loadings, factor correlations, and sample size. Overall, the results showed that ignoring wording effects leads to poor model fit and serious distortions of the ESEM estimates. The RIIFA approach generally performed best at countering these adverse impacts and recovering unbiased factor structures, whereas the CTC(M-1) models struggled when biases affected both positively and negatively worded items. Our findings also indicated that method factors can sometimes reflect or absorb substantive variance, which may blur their associations with external variables and complicate their interpretation when embedded in broader structural models. A straightforward guide is offered to applied researchers who wish to use ESEM with mixed-worded scales.
"A Systematic Evaluation of Wording Effects Modeling Under the Exploratory Structural Equation Modeling Framework." Multivariate Behavioral Research, pp. 1169-1198.
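For reference, the measurement decomposition that the RIIFA strategy adds to a factor model, written in our notation (a sketch, not the authors' exact specification): every item, regardless of wording direction, loads with a fixed unit loading on a person-specific random intercept that is uncorrelated with the content factors and absorbs agreement-driven (acquiescence-like) variance.

```latex
x_{pi} = \tau_i + \boldsymbol{\lambda}_i^{\top}\boldsymbol{\eta}_p + 1 \cdot m_p + \varepsilon_{pi},
\qquad m_p \sim \mathcal{N}\!\left(0, \sigma_m^2\right), \qquad
\operatorname{Cov}\!\left(m_p, \boldsymbol{\eta}_p\right) = \mathbf{0},
```

where $\boldsymbol{\eta}_p$ are the (ESEM-rotated) content factors and $m_p$ is the random-intercept method factor. The CTC(M-1) alternative instead attaches a method factor to only one wording direction, which is one reason the two strategies react differently when biases affect both positively and negatively worded items.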
Pub Date: 2025-11-01 | Epub Date: 2025-09-04 | DOI: 10.1080/00273171.2025.2522733
Oisín Ryan, Fabian Dablander
Many psychological phenomena can be understood as arising from systems of causally connected components that evolve over time within an individual. In current empirical practice, researchers frequently study these systems by fitting statistical models to data collected at a single moment in time, that is, cross-sectional data. This raises a central question: Can cross-sectional data analysis ever yield causal insights into systems that evolve over time, and if so, under what conditions? In this paper, we address this question by introducing Equilibrium Causal Models (ECMs) to the psychological literature. ECMs are causal abstractions of an underlying dynamical system that allow for inferences about the long-term effects of interventions, permit cyclic causal relations, and can in principle be estimated from cross-sectional data, as long as information about the resting state of the system is captured by those measurements. We explain the conditions under which ECM estimation is possible, show that they allow researchers to learn about within-person processes from cross-sectional data, and discuss how tools from both the psychological measurement modeling and the causal discovery literature can inform the ways in which researchers collect and analyze their data.
"Equilibrium Causal Models: Connecting Dynamical Systems Modeling and Cross-Sectional Data Analysis." Multivariate Behavioral Research, pp. 1116-1150.
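A small numpy illustration of the equilibrium logic, using an invented linear system (not taken from the paper): cross-sectional measurements of a system at rest sit at its equilibrium, and the long-run effect of a press intervention is the shift from the observed equilibrium to the post-intervention equilibrium.

```python
import numpy as np

# Linear dynamical system dx/dt = B x + c with a stable drift matrix B
B = np.array([[-1.0, 0.3, 0.0],
              [0.4, -1.0, 0.2],
              [0.0, 0.5, -1.0]])
c = np.array([1.0, 0.5, 0.8])

# Observed (resting-state) equilibrium: solve B x* + c = 0
x_eq = np.linalg.solve(-B, c)

# Long-run effect of a press intervention that clamps variable 0 at a new value:
# the clamped variable drops out of the dynamics and enters as a constant input
clamp_value = x_eq[0] + 1.0
B_free = B[1:, 1:]
c_free = c[1:] + B[1:, 0] * clamp_value
x_eq_do = np.linalg.solve(-B_free, c_free)

print(x_eq, x_eq_do)
```

An ECM abstracts away the transient dynamics and retains only this equilibrium-to-equilibrium mapping, which is what makes estimation from resting-state cross-sectional data conceivable while still permitting cyclic causal relations.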