Practical guide to building machine learning-based clinical prediction models using imbalanced datasets
Jacklyn Luu, Evgenia Borisenko, Valerie Przekop, Advait Patil, Joseph D Forrester, Jeff Choi
Trauma Surgery & Acute Care Open 2024;9(1):e001222. Published 2024-06-12. Journal Article.
DOI: https://doi.org/10.1136/tsaco-2023-001222
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11177772/pdf/
Abstract
Clinical prediction models often aim to predict rare, high-risk events, but building such models requires a robust understanding of imbalanced datasets and their unique study design considerations. This practical guide highlights foundational prediction model principles for surgeon-data scientists and readers who encounter clinical prediction models, from feature engineering and algorithm selection strategies to model evaluation and design techniques specific to imbalanced datasets. We walk through a clinical example using readable code to highlight important considerations and common pitfalls in developing machine learning-based prediction models. We hope this practical guide facilitates the development and critical appraisal of robust clinical prediction models for the surgical community.