{"title":"Cross-validation and predictive metrics in psychological research: Do not leave out the leave-one-out.","authors":"Diego Iglesias, Miguel A Sorrel, Ricardo Olmos","doi":"10.3758/s13428-024-02588-w","DOIUrl":null,"url":null,"abstract":"<p><p>There is growing interest in integrating explanatory and predictive research practices in psychological research. For this integration to be successful, the psychologist's toolkit must incorporate standard procedures that enable a direct estimation of the prediction error, such as cross-validation (CV). Despite their apparent simplicity, CV methods are intricate, and thus it is crucial to adapt them to specific contexts and predictive metrics. This study delves into the performance of different CV methods in estimating the prediction error in the <math> <msup><mrow><mi>R</mi></mrow> <mn>2</mn></msup> </math> and <math><mtext>MSE</mtext></math> metrics in regression analysis, ubiquitous in psychological research. Current approaches, which rely on the 5- or 10-fold rule of thumb or on the squared correlation between predicted and observed values, present limitations when computing the prediction error in the <math> <msup><mrow><mi>R</mi></mrow> <mn>2</mn></msup> </math> metric, a widely used statistic in the behavioral sciences. We propose the use of an alternative method that overcomes these limitations and enables the computation of the leave-one-out (LOO) in the <math> <msup><mrow><mi>R</mi></mrow> <mn>2</mn></msup> </math> metric. Through two Monte Carlo simulation studies and the application of CV to the data from the Many Labs Replication Project, we show that the LOO consistently has the best performance. The CV methods discussed in the present study have been implemented in the R package OutR2.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 3","pages":"85"},"PeriodicalIF":4.6000,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Behavior Research Methods","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.3758/s13428-024-02588-w","RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
Abstract
There is growing interest in integrating explanatory and predictive research practices in psychological research. For this integration to be successful, the psychologist's toolkit must incorporate standard procedures that enable a direct estimation of the prediction error, such as cross-validation (CV). Despite their apparent simplicity, CV methods are intricate, and thus it is crucial to adapt them to specific contexts and predictive metrics. This study delves into the performance of different CV methods in estimating the prediction error in the R² and MSE metrics in regression analysis, ubiquitous in psychological research. Current approaches, which rely on the 5- or 10-fold rule of thumb or on the squared correlation between predicted and observed values, present limitations when computing the prediction error in the R² metric, a widely used statistic in the behavioral sciences. We propose the use of an alternative method that overcomes these limitations and enables the computation of the leave-one-out (LOO) in the R² metric. Through two Monte Carlo simulation studies and the application of CV to the data from the Many Labs Replication Project, we show that the LOO consistently has the best performance. The CV methods discussed in the present study have been implemented in the R package OutR2.
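
To make the contrast in the abstract concrete, the following is a minimal base-R sketch (not the OutR2 implementation; the simulated data and variable names are illustrative assumptions) of a LOO estimate of the MSE and of a predictive R² based on the PRESS statistic, alongside the squared correlation between predicted and observed values that the abstract flags as problematic:

```r
# Illustrative sketch only: generic LOO cross-validation for a linear
# regression in base R, not the OutR2 API.
set.seed(123)

n <- 100
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)            # simulated data; population R^2 = 0.2
dat <- data.frame(x = x, y = y)

# Hold out each observation in turn, refit the model, and predict it.
loo_pred <- vapply(seq_len(n), function(i) {
  fit <- lm(y ~ x, data = dat[-i, ])
  as.numeric(predict(fit, newdata = dat[i, , drop = FALSE]))
}, numeric(1))

loo_mse <- mean((dat$y - loo_pred)^2)         # LOO estimate of the MSE
loo_r2  <- 1 - sum((dat$y - loo_pred)^2) /
               sum((dat$y - mean(dat$y))^2)   # predictive R^2 via PRESS

# Squared correlation between predicted and observed values, the
# estimator whose limitations the abstract points out:
r2_cor <- cor(dat$y, loo_pred)^2

c(loo_mse = loo_mse, loo_r2 = loo_r2, r2_cor = r2_cor)

# For ordinary least squares the refit loop is optional: the LOO
# residuals follow in closed form from the hat values.
fit_all <- lm(y ~ x, data = dat)
press   <- sum((residuals(fit_all) / (1 - hatvalues(fit_all)))^2)
```

The closed-form PRESS computation in the last lines yields exactly the same LOO residuals as the explicit refit loop for OLS, which is why LOO is computationally cheap in the regression setting the abstract addresses.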
About the journal:
Behavior Research Methods publishes articles concerned with the methods, techniques, and instrumentation of research in experimental psychology. The journal focuses particularly on the use of computer technology in psychological research. An annual special issue is devoted to this field.