This study challenges three common methodological beliefs and practices. The first question examines whether ordinal reliability estimators are more accurate than continuous estimators for unidimensional data with uncorrelated errors. Continuous estimators (e.g., coefficient alpha) can be applied to both continuous and ordinal data, while ordinal estimators (e.g., ordinal alpha and categorical omega) are specific to ordinal data. Although ordinal estimators are often argued to have conceptual advantages, comprehensive investigations into their accuracy are limited. The second question concerns the effects of skewness and kurtosis in ordinal data. Previous simulation studies have primarily examined cases where skewness and kurtosis change in the same direction, leaving gaps in understanding their independent effects. The third question addresses item response theory (IRT) models: Should the scaling constant always be fixed at the same value (e.g., 1.7)? To answer these questions, this study conducted a Monte Carlo simulation comparing four continuous estimators and eight ordinal estimators. The results indicated that most estimators achieved acceptable levels of accuracy. On average, ordinal estimators were slightly less accurate than continuous estimators, though the difference was smaller than what most users would consider practically significant (e.g., less than 0.01). However, ordinal alpha stood out as a notable exception, severely overestimating reliability across various conditions. Regarding the scaling constant in IRT models, the results indicated that its optimal value varied depending on the data type (e.g., dichotomous vs. polytomous). In some cases, values below 1.7 were optimal, while in others, values above 1.8 were optimal. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
{"title":"Reliability in unidimensional ordinal data: A comparison of continuous and ordinal estimators.","authors":"Eunseong Cho, Sébastien Béland","doi":"10.1037/met0000739","DOIUrl":"https://doi.org/10.1037/met0000739","url":null,"abstract":"<p><p>This study challenges three common methodological beliefs and practices. The first question examines whether ordinal reliability estimators are more accurate than continuous estimators for unidimensional data with uncorrelated errors. Continuous estimators (e.g., coefficient alpha) can be applied to both continuous and ordinal data, while ordinal estimators (e.g., ordinal alpha and categorical omega) are specific to ordinal data. Although ordinal estimators are often argued to have conceptual advantages, comprehensive investigations into their accuracy are limited. The second question explores the relationship between skewness and kurtosis in ordinal data. Previous simulation studies have primarily examined cases where skewness and kurtosis change in the same direction, leaving gaps in understanding their independent effects. The third question addresses item response theory (IRT) models: Should the scaling constant always be fixed at the same value (e.g., 1.7)? To answer these questions, this study conducted a Monte Carlo simulation comparing four continuous estimators and eight ordinal estimators. The results indicated that most estimators achieved acceptable levels of accuracy. On average, ordinal estimators were slightly less accurate than continuous estimators, though the difference was smaller than what most users would consider practically significant (e.g., less than 0.01). However, ordinal alpha stood out as a notable exception, severely overestimating reliability across various conditions. Regarding the scaling constant in IRT models, the results indicated that its optimal value varied depending on the data type (e.g., dichotomous vs. polytomous). In some cases, values below 1.7 were optimal, while in others, values above 1.8 were optimal. (PsycInfo Database Record (c) 2025 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2025-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143391582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To study the dimensional structure of psychological phenomena, a precise definition of unidimensionality is essential. Most definitions of unidimensionality rely on factor analysis. However, the reliability of factor analysis depends on the input data, which primarily consists of Pearson correlations. A significant issue with Pearson correlations is that they are almost guaranteed to underestimate unidimensionality, rendering them unsuitable for evaluating the unidimensionality of a scale. This article formally demonstrates that the simple unidimensionality index H is always at least as high as the Pearson correlation for dichotomous and polytomous items (φ). Leveraging this inequality, a case is presented where five dichotomous items are perfectly unidimensional, yet factor analysis based on φ incorrectly suggests a two-dimensional solution. To illustrate that this issue extends beyond theoretical scenarios, an analysis of real data from a statistics exam (N = 133) is conducted, revealing the same problem. An in-depth analysis of the exam data shows that violations of unidimensionality are systematic and should not be dismissed as mere noise. Inconsistent answering patterns can indicate whether a participant blundered, cheated, or has conceptual misunderstandings, information typically overlooked by traditional scaling procedures based on correlations. The conclusion is that psychologists should consider unidimensionality not as a peripheral concern but as the foundation for any serious scaling attempt. The index H could play a crucial role in establishing this foundation. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
{"title":"The relationship between the phi coefficient and the unidimensionality index H: Improving psychological scaling from the ground up.","authors":"Johannes Titz","doi":"10.1037/met0000736","DOIUrl":"https://doi.org/10.1037/met0000736","url":null,"abstract":"<p><p>To study the dimensional structure of psychological phenomena, a precise definition of unidimensionality is essential. Most definitions of unidimensionality rely on factor analysis. However, the reliability of factor analysis depends on the input data, which primarily consists of Pearson correlations. A significant issue with Pearson correlations is that they are almost guaranteed to underestimate unidimensionality, rendering them unsuitable for evaluating the unidimensionality of a scale. This article formally demonstrates that the simple unidimensionality index <i>H</i> is always at least as high as, or higher than, the Pearson correlation for dichotomous and polytomous items (φ). Leveraging this inequality, a case is presented where five dichotomous items are perfectly unidimensional, yet factor analysis based on φ incorrectly suggests a two-dimensional solution. To illustrate that this issue extends beyond theoretical scenarios, an analysis of real data from a statistics exam (<i>N</i> = 133) is conducted, revealing the same problem. An in-depth analysis of the exam data shows that violations of unidimensionality are systematic and should not be dismissed as mere noise. Inconsistent answering patterns can indicate whether a participant blundered, cheated, or has conceptual misunderstandings, information typically overlooked by traditional scaling procedures based on correlations. The conclusion is that psychologists should consider unidimensionality not as a peripheral concern but as the foundation for any serious scaling attempt. The index <i>H</i> could play a crucial role in establishing this foundation. (PsycInfo Database Record (c) 2025 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2025-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143391502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wes Bonifay, Li Cai, Carl F Falk, Kristopher J Preacher
Model complexity is a critical consideration when evaluating a statistical model. To quantify complexity, one can examine fitting propensity (FP), or the ability of the model to fit well to diverse patterns of data. The scant foundational research on FP has focused primarily on proof of concept rather than practical application. To address this oversight, the present work joins a recently published study in examining the FP of models that are commonly applied in factor analysis. We begin with a historical account of statistical model evaluation, which refutes the notion that complexity can be fully understood by counting the number of free parameters in the model. We then present three sets of analytic examples to better understand the FP of exploratory and confirmatory factor analysis models that are widely used in applied research. We characterize our findings relative to previously disseminated claims about factor model FP. Finally, we provide some recommendations for future research on FP in latent variable modeling. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
{"title":"Reassessing the fitting propensity of factor models.","authors":"Wes Bonifay, Li Cai, Carl F Falk, Kristopher J Preacher","doi":"10.1037/met0000735","DOIUrl":"https://doi.org/10.1037/met0000735","url":null,"abstract":"<p><p>Model complexity is a critical consideration when evaluating a statistical model. To quantify complexity, one can examine fitting propensity (FP), or the ability of the model to fit well to diverse patterns of data. The scant foundational research on FP has focused primarily on proof of concept rather than practical application. To address this oversight, the present work joins a recently published study in examining the FP of models that are commonly applied in factor analysis. We begin with a historical account of statistical model evaluation, which refutes the notion that complexity can be fully understood by counting the number of free parameters in the model. We then present three sets of analytic examples to better understand the FP of exploratory and confirmatory factor analysis models that are widely used in applied research. We characterize our findings relative to previously disseminated claims about factor model FP. Finally, we provide some recommendations for future research on FP in latent variable modeling. (PsycInfo Database Record (c) 2025 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.6,"publicationDate":"2025-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143391579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-01 | Epub Date: 2023-06-12 | DOI: 10.1037/met0000593
Matthew J Valente, Judith J M Rijnhart, Oscar Gonzalez
Moderation analysis is used to study under what conditions or for which subgroups of individuals a treatment effect is stronger or weaker. When a moderator variable is categorical, such as assigned sex, treatment effects can be estimated for each group, resulting in a treatment effect for males and a treatment effect for females. If a moderator variable is a continuous variable, a strategy for investigating moderated treatment effects is to estimate conditional effects (i.e., simple slopes) via the pick-a-point approach. When conditional effects are estimated using the pick-a-point approach, the conditional effects are often given the interpretation of "the treatment effect for the subgroup of individuals…." However, the interpretation of these conditional effects as subgroup effects is potentially misleading because conditional effects are interpreted at a specific value of the moderator variable (e.g., +1 SD above the mean). We describe a simple solution that resolves this problem using a simulation-based approach. We then show how to apply this simulation-based approach to estimate subgroup effects by defining subgroups using a range of scores on the continuous moderator variable. We apply this method to three empirical examples to demonstrate how to estimate subgroup effects for moderated treatment and moderated mediated effects when the moderator variable is a continuous variable. Finally, we provide researchers with both SAS and R code to implement this method for situations similar to those described in this paper. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
{"title":"A novel approach to estimate moderated treatment effects and moderated mediated effects with continuous moderators.","authors":"Matthew J Valente, Judith J M Rijnhart, Oscar Gonzalez","doi":"10.1037/met0000593","DOIUrl":"10.1037/met0000593","url":null,"abstract":"<p><p>Moderation analysis is used to study under what conditions or for which subgroups of individuals a treatment effect is stronger or weaker. When a moderator variable is categorical, such as assigned sex, treatment effects can be estimated for each group resulting in a treatment effect for males and a treatment effect for females. If a moderator variable is a continuous variable, a strategy for investigating moderated treatment effects is to estimate conditional effects (i.e., simple slopes) via the pick-a-point approach. When conditional effects are estimated using the pick-a-point approach, the conditional effects are often given the interpretation of \"the treatment effect for the subgroup of individuals….\" However, the interpretation of these conditional effects as <i>subgroup</i> effects is potentially misleading because conditional effects are interpreted at a specific value of the moderator variable (e.g., +1 <i>SD</i> above the mean). We describe a simple solution that resolves this problem using a simulation-based approach. We describe how to apply this simulation-based approach to estimate subgroup effects by defining subgroups using a <i>range of scores</i> on the continuous moderator variable. We apply this method to three empirical examples to demonstrate how to estimate subgroup effects for moderated treatment and moderated mediated effects when the moderator variable is a continuous variable. Finally, we provide researchers with both SAS and R code to implement this method for similar situations described in this paper. (PsycInfo Database Record (c) 2025 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":"1-15"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10713862/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9620515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-01 | Epub Date: 2023-03-27 | DOI: 10.1037/met0000554
Beth Baribault, Anne G E Collins
Using Bayesian methods to apply computational models of cognitive processes, or Bayesian cognitive modeling, is an important new trend in psychological research. The rise of Bayesian cognitive modeling has been accelerated by the introduction of software that efficiently automates the Markov chain Monte Carlo sampling used for Bayesian model fitting, including the popular Stan and PyMC packages, which automate the dynamic Hamiltonian Monte Carlo and No-U-Turn Sampler (HMC/NUTS) algorithms that we spotlight here. Unfortunately, Bayesian cognitive models can struggle to pass the growing number of diagnostic checks required of Bayesian models. If any failures are left undetected, inferences about cognition based on the model's output may be biased or incorrect. As such, Bayesian cognitive models almost always require troubleshooting before being used for inference. Here, we present a deep treatment of the diagnostic checks and procedures that are critical for effective troubleshooting, but are often left underspecified by tutorial papers. After a conceptual introduction to Bayesian cognitive modeling and HMC/NUTS sampling, we outline the diagnostic metrics, procedures, and plots necessary to detect problems in model output with an emphasis on how these requirements have recently been changed and extended. Throughout, we explain how uncovering the exact nature of the problem is often the key to identifying solutions. We also demonstrate the troubleshooting process for an example hierarchical Bayesian model of reinforcement learning, including supplementary code. With this comprehensive guide to techniques for detecting, identifying, and overcoming problems in fitting Bayesian cognitive models, psychologists across subfields can more confidently build and use Bayesian cognitive models in their research. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
{"title":"Troubleshooting Bayesian cognitive models.","authors":"Beth Baribault, Anne G E Collins","doi":"10.1037/met0000554","DOIUrl":"10.1037/met0000554","url":null,"abstract":"<p><p>Using Bayesian methods to apply computational models of cognitive processes, or <i>Bayesian cognitive modeling</i>, is an important new trend in psychological research. The rise of Bayesian cognitive modeling has been accelerated by the introduction of software that efficiently automates the Markov chain Monte Carlo sampling used for Bayesian model fitting-including the popular Stan and PyMC packages, which automate the dynamic Hamiltonian Monte Carlo and No-U-Turn Sampler (HMC/NUTS) algorithms that we spotlight here. Unfortunately, Bayesian cognitive models can struggle to pass the growing number of diagnostic checks required of Bayesian models. If any failures are left undetected, inferences about cognition based on the model's output may be biased or incorrect. As such, Bayesian cognitive models almost always require <i>troubleshooting</i> before being used for inference. Here, we present a deep treatment of the diagnostic checks and procedures that are critical for effective troubleshooting, but are often left underspecified by tutorial papers. After a conceptual introduction to Bayesian cognitive modeling and HMC/NUTS sampling, we outline the diagnostic metrics, procedures, and plots necessary to detect problems in model output with an emphasis on how these requirements have recently been changed and extended. Throughout, we explain how uncovering the exact nature of the problem is often the key to identifying solutions. We also demonstrate the troubleshooting process for an example hierarchical Bayesian model of reinforcement learning, including supplementary code. With this comprehensive guide to techniques for detecting, identifying, and overcoming problems in fitting Bayesian cognitive models, psychologists across subfields can more confidently build and use Bayesian cognitive models in their research. (PsycInfo Database Record (c) 2025 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":"128-154"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10522800/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9188270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-01 | Epub Date: 2023-05-25 | DOI: 10.1037/met0000579
Pablo Nájera, Francisco J Abad, Miguel A Sorrel
The number of available factor analytic techniques has been increasing in the last decades. However, the lack of clear guidelines and exhaustive comparison studies between the techniques might prevent these valuable methodological advances from making their way into applied research. The present paper evaluates the performance of confirmatory factor analysis (CFA), CFA with sequential model modification using modification indices and the Saris procedure, exploratory factor analysis (EFA) with different rotation procedures (Geomin, target, and objectively refined target matrix), Bayesian structural equation modeling (BSEM), and a new set of procedures that, after fitting an unrestrictive model (i.e., EFA, BSEM), identify and retain only the relevant loadings to provide a parsimonious CFA solution (ECFA, BCFA). By means of an exhaustive Monte Carlo simulation study and a real data illustration, it is shown that CFA and BSEM are overly stiff and, consequently, do not appropriately recover the structure of slightly misspecified models. EFA usually provides the most accurate parameter estimates, although the rotation procedure choice is of major importance, especially depending on whether the latent factors are correlated or not. Finally, ECFA might be a sound option whenever an a priori structure cannot be hypothesized and the latent factors are correlated. Moreover, it is shown that the pattern of the results of a factor analytic technique can, to some extent, be predicted based on its positioning in the confirmatory-exploratory continuum. Applied recommendations are given for the selection of the most appropriate technique under different representative scenarios by means of a detailed flowchart. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
{"title":"Is exploratory factor analysis always to be preferred? A systematic comparison of factor analytic techniques throughout the confirmatory-exploratory continuum.","authors":"Pablo Nájera, Francisco J Abad, Miguel A Sorrel","doi":"10.1037/met0000579","DOIUrl":"10.1037/met0000579","url":null,"abstract":"<p><p>The number of available factor analytic techniques has been increasing in the last decades. However, the lack of clear guidelines and exhaustive comparison studies between the techniques might hinder that these valuable methodological advances make their way to applied research. The present paper evaluates the performance of confirmatory factor analysis (CFA), CFA with sequential model modification using modification indices and the Saris procedure, exploratory factor analysis (EFA) with different rotation procedures (Geomin, target, and objectively refined target matrix), Bayesian structural equation modeling (BSEM), and a new set of procedures that, after fitting an unrestrictive model (i.e., EFA, BSEM), identify and retain only the relevant loadings to provide a parsimonious CFA solution (ECFA, BCFA). By means of an exhaustive Monte Carlo simulation study and a real data illustration, it is shown that CFA and BSEM are overly stiff and, consequently, do not appropriately recover the structure of slightly misspecified models. EFA usually provides the most accurate parameter estimates, although the rotation procedure choice is of major importance, especially depending on whether the latent factors are correlated or not. Finally, ECFA might be a sound option whenever an a priori structure cannot be hypothesized and the latent factors are correlated. Moreover, it is shown that the pattern of the results of a factor analytic technique can be somehow predicted based on its positioning in the confirmatory-exploratory continuum. Applied recommendations are given for the selection of the most appropriate technique under different representative scenarios by means of a detailed flowchart. (PsycInfo Database Record (c) 2025 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":"16-39"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9876148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-01 | Epub Date: 2024-07-18 | DOI: 10.1037/met0000665
Charles C Driver
The interpretation of cross-effects from vector autoregressive models to infer structure and causality among constructs is widespread and sometimes problematic. I describe problems in the interpretation of cross-effects when processes that are thought to fluctuate continuously in time are, as is typically done, modeled as changing only in discrete steps (as in, e.g., structural equation modeling): zeroes in a discrete-time temporal matrix do not necessarily correspond to zero effects in the underlying continuous processes, and vice versa. This has implications for the common case when the presence or absence of cross-effects is used for inference about underlying causal processes. I demonstrate these problems via simulation, and also show that when an underlying set of processes are continuous in time, even relatively few direct causal links can result in much denser temporal effect matrices in discrete-time. I demonstrate one solution to these issues, namely parameterizing the system as a stochastic differential equation and focusing inference on the continuous-time temporal effects. I follow this with some discussion of issues regarding the switch to continuous-time, specifically regularization, appropriate measurement time lag, and model order. An empirical example using intensive longitudinal data highlights some of the complexities of applying such approaches to real data, particularly with respect to model specification, examining misspecification, and parameter interpretation. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
{"title":"Inference with cross-lagged effects-Problems in time.","authors":"Charles C Driver","doi":"10.1037/met0000665","DOIUrl":"10.1037/met0000665","url":null,"abstract":"<p><p>The interpretation of cross-effects from vector autoregressive models to infer structure and causality among constructs is widespread and sometimes problematic. I describe problems in the interpretation of cross-effects when processes that are thought to fluctuate continuously in time are, as is typically done, modeled as changing only in discrete steps (as in e.g., structural equation modeling)-zeroes in a discrete-time temporal matrix do not necessarily correspond to zero effects in the underlying continuous processes, and vice versa. This has implications for the common case when the presence or absence of cross-effects is used for inference about underlying causal processes. I demonstrate these problems via simulation, and also show that when an underlying set of processes are continuous in time, even relatively few direct causal links can result in much denser temporal effect matrices in discrete-time. I demonstrate one solution to these issues, namely parameterizing the system as a stochastic differential equation and focusing inference on the continuous-time temporal effects. I follow this with some discussion of issues regarding the switch to continuous-time, specifically regularization, appropriate measurement time lag, and model order. An empirical example using intensive longitudinal data highlights some of the complexities of applying such approaches to real data, particularly with respect to model specification, examining misspecification, and parameter interpretation. (PsycInfo Database Record (c) 2025 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":"174-202"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141634308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-01 | Epub Date: 2023-08-10 | DOI: 10.1037/met0000586
Philipp Sterner, David Goretzko, Florian Pargent
Psychology has seen an increase in the use of machine learning (ML) methods. In many applications, observations are classified into one of two groups (binary classification). Off-the-shelf classification algorithms assume that the costs of a misclassification (false positive or false negative) are equal. Because this is often not reasonable (e.g., in clinical psychology), cost-sensitive machine learning (CSL) methods can take different cost ratios into account. We present the mathematical foundations and introduce a taxonomy of the most commonly used CSL methods, before demonstrating their application and usefulness on psychological data, that is, the drug consumption data set (N = 1,885) from the University of California Irvine ML Repository. In our example, all demonstrated CSL methods noticeably reduced mean misclassification costs compared to regular ML algorithms. We discuss the necessity for researchers to perform small benchmarks of CSL methods for their own practical application. Thus, our open materials provide R code, demonstrating how CSL methods can be applied within the mlr3 framework (https://osf.io/cvks7/). (PsycInfo Database Record (c) 2025 APA, all rights reserved).
{"title":"Everything has its price: Foundations of cost-sensitive machine learning and its application in psychology.","authors":"Philipp Sterner, David Goretzko, Florian Pargent","doi":"10.1037/met0000586","DOIUrl":"10.1037/met0000586","url":null,"abstract":"<p><p>Psychology has seen an increase in the use of machine learning (ML) methods. In many applications, observations are classified into one of two groups (binary classification). Off-the-shelf classification algorithms assume that the costs of a misclassification (false positive or false negative) are equal. Because this is often not reasonable (e.g., in clinical psychology), cost-sensitive machine learning (CSL) methods can take different cost ratios into account. We present the mathematical foundations and introduce a taxonomy of the most commonly used CSL methods, before demonstrating their application and usefulness on psychological data, that is, the drug consumption data set (<i>N</i> = 1, 885) from the University of California Irvine ML Repository. In our example, all demonstrated CSL methods noticeably reduced mean misclassification costs compared to regular ML algorithms. We discuss the necessity for researchers to perform small benchmarks of CSL methods for their own practical application. Thus, our open materials provide R code, demonstrating how CSL methods can be applied within the mlr3 framework (https://osf.io/cvks7/). (PsycInfo Database Record (c) 2025 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":"112-127"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9967423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-01 | Epub Date: 2023-01-09 | DOI: 10.1037/met0000539
Diego G Campos, Mike W-L Cheung, Ronny Scherer
The increasing availability of individual participant data (IPD) in the social sciences offers new possibilities to synthesize research evidence across primary studies. Two-stage IPD meta-analysis represents a framework that can utilize these possibilities. While most of the methodological research on two-stage IPD meta-analysis focused on its performance compared with other approaches, dealing with the complexities of the primary and meta-analytic data has received little attention, particularly when IPD are drawn from complex sampling surveys. Complex sampling surveys often feature clustering, stratification, and multistage sampling to obtain nationally or internationally representative data from a target population. Furthermore, IPD from these studies is likely to provide more than one effect size. To address these complexities, we propose a two-stage meta-analytic approach that generates model-based effect sizes in Stage 1 and synthesizes them in Stage 2. We present a sequence of steps, illustrate their implementation, and discuss the methodological decisions and options within. Given its flexibility to deal with the complex nature of the primary and meta-analytic data and its ability to combine multiple IPD sets or IPD with aggregated data, the proposed two-stage approach opens up new analytic possibilities for synthesizing knowledge from complex sampling surveys. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
{"title":"A primer on synthesizing individual participant data obtained from complex sampling surveys: A two-stage IPD meta-analysis approach.","authors":"Diego G Campos, Mike W-L Cheung, Ronny Scherer","doi":"10.1037/met0000539","DOIUrl":"10.1037/met0000539","url":null,"abstract":"<p><p>The increasing availability of individual participant data (IPD) in the social sciences offers new possibilities to synthesize research evidence across primary studies. Two-stage IPD meta-analysis represents a framework that can utilize these possibilities. While most of the methodological research on two-stage IPD meta-analysis focused on its performance compared with other approaches, dealing with the complexities of the primary and meta-analytic data has received little attention, particularly when IPD are drawn from complex sampling surveys. Complex sampling surveys often feature clustering, stratification, and multistage sampling to obtain nationally or internationally representative data from a target population. Furthermore, IPD from these studies is likely to provide more than one effect size. To address these complexities, we propose a two-stage meta-analytic approach that generates model-based effect sizes in Stage 1 and synthesizes them in Stage 2. We present a sequence of steps, illustrate their implementation, and discuss the methodological decisions and options within. Given its flexibility to deal with the complex nature of the primary and meta-analytic data and its ability to combine multiple IPD sets or IPD with aggregated data, the proposed two-stage approach opens up new analytic possibilities for synthesizing knowledge from complex sampling surveys. (PsycInfo Database Record (c) 2025 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":"83-111"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10501727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-02-01 | Epub Date: 2024-02-15 | DOI: 10.1037/met0000643
Fabio Mason, Eva Cantoni, Paolo Ghisletta
The linear mixed model (LMM) and latent growth model (LGM) are frequently applied to within-subject two-group comparison studies to investigate group differences in the time effect, supposedly due to differential group treatments. Yet, research about LMM and LGM in the presence of outliers (defined as observations with a very low probability of occurrence if assumed from a given distribution) is scarce. Moreover, when such research exists, it focuses on estimation properties (bias and efficiency), neglecting inferential characteristics (e.g., power and type-I error). We study power and type-I error rates of Wald-type and bootstrap confidence intervals (CIs), as well as coverage and length of CIs and mean absolute error (MAE) of estimates, associated with classical and robust estimations of LMM and LGM, applied to a within-subject two-group comparison design. We conduct a Monte Carlo simulation experiment to compare CIs and MAEs under different conditions: data (a) without contamination, (b) contaminated by within-subject outliers, (c) contaminated by between-subject outliers, and (d) both contaminated by within- and between-subject outliers. Results show that without contamination, methods perform similarly, except CIs based on S, a robust LMM estimator, whose coverage is slightly further from nominal values. However, in the presence of both within- and between-subject outliers, CIs based on robust estimators, especially S, performed better than those of classical methods. In particular, the percentile CI with the wild bootstrap applied to the robust LMM estimators outperformed all other methods, especially with between-subject outliers, where we found the classical Wald-type CI based on the t statistic with the Satterthwaite approximation for LMM to be highly misleading. We provide R code to compute all methods presented here. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
{"title":"Linear mixed models and latent growth curve models for group comparison studies contaminated by outliers.","authors":"Fabio Mason, Eva Cantoni, Paolo Ghisletta","doi":"10.1037/met0000643","DOIUrl":"10.1037/met0000643","url":null,"abstract":"<p><p>The linear mixed model (LMM) and latent growth model (LGM) are frequently applied to within-subject two-group comparison studies to investigate group differences in the time effect, supposedly due to differential group treatments. Yet, research about LMM and LGM in the presence of outliers (defined as observations with a very low probability of occurrence if assumed from a given distribution) is scarce. Moreover, when such research exists, it focuses on estimation properties (bias and efficiency), neglecting inferential characteristics (e.g., power and type-I error). We study power and type-I error rates of Wald-type and bootstrap confidence intervals (CIs), as well as coverage and length of CIs and mean absolute error (MAE) of estimates, associated with classical and robust estimations of LMM and LGM, applied to a within-subject two-group comparison design. We conduct a Monte Carlo simulation experiment to compare CIs and MAEs under different conditions: data (a) without contamination, (b) contaminated by within-subject outliers, (c) contaminated by between-subject outliers, and (d) both contaminated by within- and between-subject outliers. Results show that without contamination, methods perform similarly, except CIs based on S, a robust LMM estimator, which are slightly less close to nominal values in their coverage. However, in the presence of both within- and between-subject outliers, CIs based on robust estimators, especially S, performed better than those of classical methods. In particular, the percentile CI with the wild bootstrap applied to the robust LMM estimators outperformed all other methods, especially with between-subject outliers, when we found the classical Wald-type CI based on the t statistic with Satterthwaite approximation for LMM to be highly misleading. We provide R code to compute all methods presented here. (PsycInfo Database Record (c) 2025 APA, all rights reserved).</p>","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":"155-173"},"PeriodicalIF":7.6,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139735975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}