Title: A matching-based machine learning approach to estimating optimal dynamic treatment regimes with time-to-event outcomes
Authors: Xuechen Wang, Hyejung Lee, Benjamin Haaland, Kathleen Kerrigan, Sonam Puri, Wallace Akerley, Jincheng Shen
Journal: Statistical Methods in Medical Research, pp. 794-806
Published: 2024-05-01 (Epub 2024-03-19)
Publication type: Journal Article
DOI: 10.1177/09622802241236954 (https://doi.org/10.1177/09622802241236954)
Impact factor: 1.6 (JCR Q3, Health Care Sciences & Services)
Citation count: 0
Abstract
Observational data (e.g. electronic health records) have become increasingly important in evidence-based research on dynamic treatment regimes, which tailor treatments over time to patients based on their characteristics and evolving clinical history. It is of great interest to clinicians and statisticians to identify an optimal dynamic treatment regime that can produce the best expected clinical outcome for each individual and thus maximize the treatment benefit over the population. Observational data pose various challenges for using statistical tools to estimate optimal dynamic treatment regimes. Notably, the task becomes more complex when the clinical outcome of primary interest is time-to-event. Here, we propose a matching-based machine learning method to identify the optimal dynamic treatment regime with time-to-event outcomes subject to right-censoring using electronic health record data. In contrast to the established inverse probability weighting-based dynamic treatment regime methods, our proposed approach provides better protection against model misspecification and extreme weights in the context of treatment sequences, effectively addressing a prevalent challenge in the longitudinal analysis of electronic health record data. In simulations, the proposed method demonstrates robust performance across a range of scenarios. In addition, we illustrate the method with an application to estimate optimal dynamic treatment regimes for patients with advanced non-small cell lung cancer using a real-world, nationwide electronic health record database from Flatiron Health.
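The "extreme weights in the context of treatment sequences" mentioned above can be illustrated with a small sketch. This is not the authors' matching method; it only shows, under assumed numbers, why inverse probability weights for a multi-stage treatment sequence can blow up: the weight is the reciprocal of the product of the stage-wise treatment probabilities, so a few unlikely sequences can dominate the analysis. The function names and the propensity values are hypothetical and chosen purely for illustration.

```python
import math


def sequence_ipw_weight(propensities):
    """Inverse probability weight for one patient's observed treatment sequence.

    `propensities` lists, for each stage, the (estimated) probability of the
    treatment the patient actually received, given their history so far.
    The sequence weight is 1 / (product of the stage-wise probabilities).
    """
    if not propensities:
        raise ValueError("at least one stage is required")
    if any(not (0.0 < p <= 1.0) for p in propensities):
        raise ValueError("each propensity must lie in (0, 1]")
    return 1.0 / math.prod(propensities)


def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical two-stage propensities for four patients (illustrative only).
patients = [
    [0.5, 0.5],
    [0.5, 0.5],
    [0.5, 0.5],
    [0.1, 0.1],  # an unlikely treatment sequence
]

weights = [sequence_ipw_weight(p) for p in patients]
print("weights:", weights)
print("effective sample size:", effective_sample_size(weights))
```

With these made-up values, the one patient who followed an unlikely sequence receives a weight roughly 25 times larger than the others, and the effective sample size falls well below the four patients actually observed. This is the kind of instability that the abstract says the matching-based approach protects against.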
About the journal:
Statistical Methods in Medical Research is a peer-reviewed scholarly journal. It is the leading vehicle for articles in all the main areas of medical statistics and an essential reference for all medical statisticians. The journal is devoted solely to statistics and medicine and aims to keep professionals abreast of the many powerful statistical techniques now available to the medical profession. It is a member of the Committee on Publication Ethics (COPE).