In unrestricted or exploratory factor analysis (EFA), there is a wide range of recommendations about how large samples should be to attain correct and stable solutions. In general, however, these recommendations are either rules of thumb or based on simulation results. As it is hard to establish the extent to which a particular data set matches the conditions used in a simulation study, the advice produced by simulation studies is not direct enough to be of practical use. Instead of trying to provide general and complex recommendations, in this article we propose to estimate the sample size that is needed to analyze the data set at hand. The estimation takes into account the specified EFA model. The proposal is based on an intensive simulation process in which the sample correlation matrix is used as a basis for generating data sets from a pseudo-population in which the parent correlation holds exactly, and the criterion for determining the size required is a threshold that quantifies the closeness between the pseudo-population and the sample reproduced correlation matrices. The simulation results suggest that the proposal works well and that the determinants identified agree with those in the literature.
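The simulation idea described above can be illustrated with a deliberately simplified sketch. It treats a hypothetical correlation matrix as the pseudo-population, draws normal samples of increasing size, and returns the first candidate size whose average discrepancy falls below a threshold. Note the simplification: this sketch compares raw sample correlations with the pseudo-population matrix and skips the EFA fitting step, whereas the article compares model-reproduced correlation matrices. The matrix, threshold, candidate sizes, and function names are all assumptions made for illustration.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_correlation(data):
    """Pearson correlation matrix of a list of equally long rows."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    ss = [[sum(r[a] * r[b] for r in centered) for b in range(p)] for a in range(p)]
    return [[ss[a][b] / math.sqrt(ss[a][a] * ss[b][b]) for b in range(p)]
            for a in range(p)]


def mean_rmsr(pop_corr, n, reps, rng):
    """Average root mean squared off-diagonal discrepancy over replications."""
    lower = cholesky(pop_corr)
    p = len(pop_corr)
    total = 0.0
    for _ in range(reps):
        data = []
        for _ in range(n):
            z = [rng.gauss(0, 1) for _ in range(p)]
            # Correlated normal draw: multiply the Cholesky factor by z.
            data.append([sum(lower[i][k] * z[k] for k in range(i + 1))
                         for i in range(p)])
        r = sample_correlation(data)
        sq = [(r[a][b] - pop_corr[a][b]) ** 2 for a in range(p) for b in range(a)]
        total += math.sqrt(sum(sq) / len(sq))
    return total / reps


def required_n(pop_corr, threshold, candidates, reps=30, seed=1):
    """First candidate size whose mean discrepancy meets the threshold."""
    rng = random.Random(seed)
    for n in candidates:
        if mean_rmsr(pop_corr, n, reps, rng) <= threshold:
            return n
    return None


# Hypothetical pseudo-population: four variables, all correlations 0.5.
POP = [[1.0 if a == b else 0.5 for b in range(4)] for a in range(4)]
```

A call such as `required_n(POP, threshold=0.05, candidates=[50, 100, 200, 400, 800])` returns the smallest candidate size that meets the chosen closeness criterion; a stricter threshold leads to a larger required size.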
Ambulatory assessment (AA) is becoming an increasingly popular research method in the fields of psychology and life science. Nevertheless, knowledge about the effects that design choices, such as questionnaire length (i.e., number of items per questionnaire), have on AA data quality is still surprisingly restricted. Additionally, response styles (RS), which threaten data quality, have hardly been analyzed in the context of AA. The aim of the current research was to experimentally manipulate questionnaire length and investigate the association between questionnaire length and RS in an AA study. We expected that the group with the longer (82-item) questionnaire would show greater reliance on RS relative to the substantive traits than the group with the shorter (33-item) questionnaire. Students (n = 284) received questionnaires three times a day for 14 days. We used a multigroup two-dimensional item response tree model in a multilevel structural equation modeling framework to estimate midpoint and extreme RS in our AA study. We found that the long questionnaire group showed a greater reliance on RS relative to trait-based processes than the short questionnaire group. Although further validation of our findings is necessary, we hope that researchers consider our findings when planning an AA study in the future.
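Item response tree models are usually fitted to binary pseudo-items obtained by recoding each rating. The sketch below shows one common recoding for a 5-point scale into midpoint, direction, and extreme pseudo-items. The 5-point format, the specific tree, and the function name are assumptions; the abstract does not state which decomposition the study used.

```python
def to_pseudo_items(response):
    """Recode one 5-point rating (1..5) into three binary pseudo-items.

    Assumed tree: first decide midpoint vs. not; if not midpoint,
    decide direction (agree vs. disagree) and extremity (end category or not).
    """
    if response not in (1, 2, 3, 4, 5):
        raise ValueError("expected a rating between 1 and 5")
    if response == 3:
        # Later nodes are never reached, so they are treated as missing.
        return {"midpoint": 1, "direction": None, "extreme": None}
    return {
        "midpoint": 0,
        "direction": 1 if response > 3 else 0,
        "extreme": 1 if response in (1, 5) else 0,
    }


# Hypothetical ratings from one questionnaire occasion.
recoded = [to_pseudo_items(r) for r in [5, 3, 2]]
```

The midpoint and extreme pseudo-items carry the information about the two response styles, while the direction pseudo-item reflects the substantive trait.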
We implement an analytic approach for ordinal measures and use it to investigate the structure of self-worth and its changes over time in a sample of adolescent high school students. We represent the variations in self-worth and its various sub-domains using entropy-based measures that capture the observed uncertainty. We then study the evolution of the entropy across four time points throughout a semester of high school. Our analytic approach yields information about the configuration of the various dimensions of the self together with time-related changes and associations among these dimensions. We represent the results using a network that depicts self-worth changes over time. This approach also identifies groups of adolescent students who show different patterns of associations, thus emphasizing the need to consider heterogeneity in the data.
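As a small illustration of what an entropy-based measure of ordinal responses can look like, the sketch below computes the Shannon entropy of the observed response distribution and scales it by its maximum. Using Shannon entropy with this normalization is an assumption; the abstract does not specify the exact measure, and the data shown are made up.

```python
import math
from collections import Counter


def normalized_entropy(responses, n_categories):
    """Shannon entropy of observed responses, scaled to the range [0, 1]."""
    counts = Counter(responses)
    n = len(responses)
    h = 0.0
    for count in counts.values():
        p = count / n
        h -= p * math.log(p)
    # Dividing by log(K) makes 1.0 correspond to a uniform distribution.
    return h / math.log(n_categories)


# Hypothetical ratings on a 4-point self-worth item at two time points.
time1 = [1, 2, 3, 4, 1, 2, 3, 4]
time2 = [3, 3, 3, 3, 3, 3, 4, 4]
entropy_t1 = normalized_entropy(time1, 4)
entropy_t2 = normalized_entropy(time2, 4)
```

Here the responses at the first time point are spread evenly over the categories (maximal uncertainty), while those at the second time point are concentrated, so the entropy drops.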
Continuous-time modeling using differential equations is a promising technique to model change processes with longitudinal data. Among ways to fit this model, the Latent Differential Structural Equation Modeling (LDSEM) approach defines latent derivative variables within a structural equation modeling (SEM) framework, thereby allowing researchers to leverage advantages of the SEM framework for model building, estimation, inference, and comparison purposes. Still, a few issues remain unresolved, including performance of multilevel variations of the LDSEM under short time lengths (e.g., 14 time points), particularly when coupled multivariate processes and time-varying covariates are involved. Additionally, the possibility of using Bayesian estimation to facilitate the estimation of multilevel LDSEM (M-LDSEM) models with complex and higher-dimensional random effect structures has not been investigated. We present a series of Monte Carlo simulations to evaluate three possible approaches to fitting M-LDSEM: frequentist single-level and two-level robust estimators and a Bayesian two-level estimator. Our findings suggested that the Bayesian approach outperformed the frequentist approaches. The effects of time-varying covariates are well recovered, and coupling parameters are the least biased, especially when higher-order derivative information is used with the Bayesian estimator. Finally, an empirical example is provided to show the applicability of the approach.
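Latent derivative variables are commonly defined through a fixed loading matrix applied to time-delay-embedded observations, in which each row holds the Taylor-series weights for the level, first derivative, and second derivative. The sketch below builds such a matrix. Whether the article uses exactly this parameterization is an assumption, and the function and argument names are illustrative.

```python
import math


def derivative_loadings(embedding_dim, delta_t, order=2):
    """Fixed loadings linking embedded observations to latent derivatives.

    Row i holds offset**k / k! for k = 0..order, where offset is the time
    distance of the i-th embedded observation from the centre of the window.
    """
    center = (embedding_dim - 1) / 2
    rows = []
    for i in range(embedding_dim):
        offset = (i - center) * delta_t
        rows.append([offset ** k / math.factorial(k) for k in range(order + 1)])
    return rows


# Hypothetical setup: five consecutive occasions, one time unit apart.
LOADINGS = derivative_loadings(embedding_dim=5, delta_t=1.0)
```

In an SEM specification these values would be entered as fixed factor loadings, so that the factor means, variances, and regressions describe the latent level and derivatives rather than freely estimated factors.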
We present the R package galamm, whose goal is to provide common ground between structural equation modeling and mixed effect models. It supports estimation of models with an arbitrary number of crossed or nested random effects, smoothing splines, mixed response types, factor structures, heteroscedastic residuals, and data missing at random. Implementation using sparse matrix methods and automatic differentiation ensures computational efficiency. We here briefly present the implemented methodology, give an overview of the package and an example demonstrating its use.
Network psychometrics uses graphical models to assess the network structure of psychological variables. An important task in their analysis is determining which variables are unrelated in the network, i.e., are independent given the rest of the network variables. This conditional independence structure is a gateway to understanding the causal structure underlying psychological processes. Thus, it is crucial to have an appropriate method for evaluating conditional independence and dependence hypotheses. Bayesian approaches to testing such hypotheses allow researchers to differentiate between absence of evidence and evidence of absence of connections (edges) between pairs of variables in a network. Three Bayesian approaches to assessing conditional independence have been proposed in the network psychometrics literature. We believe that their theoretical foundations are not widely known, and therefore we provide a conceptual review of the proposed methods and highlight their strengths and limitations through a simulation study. We also illustrate the methods using an empirical example with data on Dark Triad Personality. Finally, we provide recommendations on how to choose the optimal method and discuss the current gaps in the literature on this important topic.
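The distinction between absence of evidence and evidence of absence can be made concrete with an inclusion Bayes factor, which compares the posterior odds of an edge being present with its prior odds. The sketch below is a generic illustration and not necessarily one of the three approaches reviewed in the article; the cutoff of 10 and the labels are conventional choices assumed here.

```python
def inclusion_bayes_factor(post_incl_prob, prior_incl_prob=0.5):
    """Posterior odds of edge inclusion divided by prior odds of inclusion."""
    for prob in (post_incl_prob, prior_incl_prob):
        if not 0.0 < prob < 1.0:
            raise ValueError("probabilities must lie strictly between 0 and 1")
    posterior_odds = post_incl_prob / (1.0 - post_incl_prob)
    prior_odds = prior_incl_prob / (1.0 - prior_incl_prob)
    return posterior_odds / prior_odds


def classify_edge(post_incl_prob, prior_incl_prob=0.5, cutoff=10.0):
    """Label an edge using an assumed, conventional evidence cutoff."""
    bf = inclusion_bayes_factor(post_incl_prob, prior_incl_prob)
    if bf >= cutoff:
        return "evidence for presence"
    if bf <= 1.0 / cutoff:
        return "evidence for absence"
    return "inconclusive"
```

With equal prior odds, a posterior inclusion probability of 0.6 gives a Bayes factor of 1.5, which is labeled inconclusive (absence of evidence), whereas a probability of 0.05 gives a Bayes factor of about 0.05, which is labeled evidence for absence.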
While Bayesian methodology is increasingly favored in behavioral research for its clear probabilistic inference and model structure, its widespread acceptance as a standard meta-analysis approach remains limited. Although some conventional Bayesian hierarchical models are frequently used for analysis, their performance has not been thoroughly examined. This study evaluates two commonly used Bayesian models for meta-analysis of standardized mean difference and identifies significant issues with these models. In response, we introduce a new Bayesian model equipped with novel features that address existing model concerns and a broader limitation of the current Bayesian meta-analysis. Furthermore, we introduce a simple computational approach to construct simultaneous credible intervals for the summary effect and between-study heterogeneity, based on their joint posterior samples. This fully captures the joint uncertainty in these parameters, a task that is challenging or impractical with frequentist models. Through simulation studies rooted in a joint Bayesian/frequentist paradigm, we compare our model's performance against existing ones under conditions that mirror realistic research scenarios. The results reveal that our new model outperforms others and shows enhanced statistical properties. We also demonstrate the practicality of our models using real-world examples, highlighting how our approach strengthens the robustness of inferences regarding the summary effect.
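One simple way to build simultaneous credible intervals from joint posterior samples is to widen both equal-tailed marginal intervals together until the resulting rectangle contains the desired share of the joint draws. The sketch below illustrates that idea only; it is not claimed to be the authors' procedure, and the simulated draws merely stand in for real MCMC output.

```python
import random


def quantile(sorted_vals, q):
    """Empirical quantile by linear interpolation on sorted values."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def simultaneous_intervals(mu_draws, tau_draws, level=0.95, steps=200):
    """Widen both marginal intervals until joint coverage reaches `level`."""
    mu_sorted, tau_sorted = sorted(mu_draws), sorted(tau_draws)
    n = len(mu_draws)
    start = (1 - level) / 2  # tail probability of the marginal intervals
    for k in range(steps + 1):
        tail = start * (1 - k / steps)  # shrinks from `start` down to 0
        mu_lo, mu_hi = quantile(mu_sorted, tail), quantile(mu_sorted, 1 - tail)
        tau_lo, tau_hi = quantile(tau_sorted, tail), quantile(tau_sorted, 1 - tail)
        inside = sum(1 for m, t in zip(mu_draws, tau_draws)
                     if mu_lo <= m <= mu_hi and tau_lo <= t <= tau_hi)
        if inside / n >= level:
            return (mu_lo, mu_hi), (tau_lo, tau_hi), inside / n
    # Unreachable: at tail == 0 the rectangle spans all draws.
    raise RuntimeError("no rectangle reached the requested coverage")


# Stand-in posterior draws (hypothetical, not from a fitted model).
_rng = random.Random(1)
MU = [_rng.gauss(0.3, 0.1) for _ in range(4000)]
TAU = [abs(_rng.gauss(0.2, 0.05)) for _ in range(4000)]
mu_int, tau_int, coverage = simultaneous_intervals(MU, TAU)
```

Because the rectangle must cover the joint draws, each simultaneous interval is at least as wide as the corresponding marginal 95% interval.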
There has been an increasing call to model multivariate time series data with measurement error. The combination of latent factors with a vector autoregressive (VAR) model leads to the dynamic factor model (DFM), in which dynamic relations are derived within factor series, among factors and observed time series, or both. However, a few limitations exist in current DFM representations and estimation: (1) the dynamic component contains either directed or undirected contemporaneous relations, but not both; (2) selecting the optimal model in exploratory DFM is a challenge; and (3) the consequences of structural misspecifications from model selection are barely studied. Our paper serves to advance DFM with a hybrid VAR representation and the utilization of LASSO regularization to select dynamic implied instrumental variable, two-stage least squares (MIIV-2SLS) estimation. Our proposed method highlights the flexibility in modeling the directions of dynamic relations with a robust estimation. We aim to offer researchers guidance on model selection and estimation in person-centered dynamic assessments.
Psychologists leverage longitudinal designs to examine the causal effects of a focal predictor (i.e., treatment or exposure) over time. But causal inference of naturally observed time-varying treatments is complicated by treatment-dependent confounding in which earlier treatments affect confounders of later treatments. In this tutorial article, we introduce psychologists to an established solution to this problem from the causal inference literature: the parametric g-computation formula. We explain why the g-formula is effective at handling treatment-dependent confounding. We demonstrate that the parametric g-formula is conceptually intuitive, easy to implement, and well-suited for psychological research. We first clarify that the parametric g-formula essentially utilizes a series of statistical models to estimate the joint distribution of all post-treatment variables. These statistical models can be readily specified as standard multiple linear regression functions. We leverage this insight to implement the parametric g-formula using lavaan, a widely adopted R package for structural equation modeling. Moreover, we describe how the parametric g-formula may be used to estimate a marginal structural model whose causal parameters parsimoniously encode time-varying treatment effects. We hope this accessible introduction to the parametric g-formula will equip psychologists with an analytic tool to address their causal inquiries using longitudinal data.
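A minimal sketch of the parametric g-formula for two time points follows. It assumes two linear models, one for the time-varying confounder L1 given the first treatment A0 and one for the outcome Y given A0, L1, and the second treatment A1, and then simulates post-treatment variables forward under a fixed treatment sequence. All coefficient values are hypothetical placeholders for estimates that would in practice come from fitted models (for example, from an SEM package), and the variable names are illustrative.

```python
import random

# Hypothetical fitted coefficients (placeholders, not real estimates).
L_MODEL = {"intercept": 0.5, "a0": 0.8, "sd": 1.0}
Y_MODEL = {"intercept": 1.0, "a0": 0.3, "l1": 0.5, "a1": 0.4, "sd": 1.0}


def g_formula_mean(a0, a1, n_draws=20_000, seed=0):
    """Monte Carlo mean of Y under the fixed treatment sequence (a0, a1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # Step 1: draw the confounder as it would arise under treatment a0.
        l1 = (L_MODEL["intercept"] + L_MODEL["a0"] * a0
              + rng.gauss(0, L_MODEL["sd"]))
        # Step 2: draw the outcome given a0, the simulated l1, and a1.
        y = (Y_MODEL["intercept"] + Y_MODEL["a0"] * a0 + Y_MODEL["l1"] * l1
             + Y_MODEL["a1"] * a1 + rng.gauss(0, Y_MODEL["sd"]))
        total += y
    return total / n_draws


def analytic_mean(a0, a1):
    """Closed form of the same quantity, available because both models are linear."""
    expected_l1 = L_MODEL["intercept"] + L_MODEL["a0"] * a0
    return (Y_MODEL["intercept"] + Y_MODEL["a0"] * a0
            + Y_MODEL["l1"] * expected_l1 + Y_MODEL["a1"] * a1)


# Contrast "always treated" with "never treated".
effect_mc = g_formula_mean(1, 1) - g_formula_mean(0, 0)
effect_exact = analytic_mean(1, 1) - analytic_mean(0, 0)
```

Under these placeholder values the joint effect is 0.3 + 0.5 × 0.8 + 0.4 = 1.1: the direct effect of A0, the part of A0's effect transmitted through L1, and the effect of A1. Simulating L1 under the intervention, rather than conditioning on its observed value, is what lets the approach handle treatment-dependent confounding.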