CURE: A deep learning framework pre-trained on large-scale patient data for treatment effect estimation
Ruoqi Liu, Pin-Yu Chen, Ping Zhang
Patterns, published 2024-05-01. DOI: 10.1016/j.patter.2024.100973 (https://doi.org/10.1016/j.patter.2024.100973)
Treatment effect estimation (TEE) aims to identify the causal effects of treatments on important outcomes. Current machine-learning-based methods, mainly trained on labeled data for specific treatments or outcomes, can be sub-optimal with limited labeled data. In this article, we propose a new pre-training and fine-tuning framework, CURE (causal treatment effect estimation), for TEE from observational data. CURE is pre-trained on large-scale unlabeled patient data to learn representative contextual patient representations and fine-tuned on labeled patient data for TEE. We present a new sequence encoding approach for longitudinal patient data embedding both structure and time. Evaluated on four downstream TEE tasks, CURE outperforms the state-of-the-art methods, marking a 7% increase in area under the precision-recall curve and an 8% rise in the influence-function-based precision of estimating heterogeneous effects. Validation with four randomized clinical trials confirms its efficacy in producing trial conclusions, highlighting CURE’s capacity to supplement traditional clinical trials.
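To make concrete what estimating a treatment effect from observational data involves, the following is a minimal sketch that does not represent CURE's model or data. It assumes a hypothetical toy cohort with one binary confounder (a risk group) and compares a naive difference in outcome rates with a stratification-adjusted estimate. All names and counts are illustrative assumptions, not taken from the article.

```python
from collections import defaultdict


def build_cohort():
    """Hypothetical toy cohort of (x, t, y) rows.

    x = risk group (confounder), t = 1 if treated, y = 1 if the outcome occurred.
    High-risk patients (x=1) are treated more often, which confounds the comparison.
    """
    # (x, t, number of patients, number with the outcome) -- made-up counts
    cells = [(0, 0, 80, 16), (0, 1, 20, 2), (1, 0, 20, 12), (1, 1, 80, 40)]
    rows = []
    for x, t, n, events in cells:
        rows += [(x, t, 1)] * events + [(x, t, 0)] * (n - events)
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_effect(rows):
    """Difference in outcome rates, ignoring the confounder."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def stratified_effect(rows):
    """Average of within-stratum differences, weighted by stratum size."""
    by_stratum = defaultdict(lambda: {0: [], 1: []})
    for x, t, y in rows:
        by_stratum[x][t].append(y)
    total = len(rows)
    effect = 0.0
    for arms in by_stratum.values():
        weight = (len(arms[0]) + len(arms[1])) / total
        effect += weight * (mean(arms[1]) - mean(arms[0]))
    return effect


rows = build_cohort()
naive = naive_effect(rows)
adjusted = stratified_effect(rows)
print(f"naive: {naive:+.2f}, adjusted: {adjusted:+.2f}")
# prints "naive: +0.14, adjusted: -0.10"
```

In this toy cohort the treatment lowers the outcome rate by 0.10 within each risk group, yet the naive comparison suggests harm because treated patients are mostly high risk. Adjusting for confounders is the basic problem any observational TEE method has to address. Stratifying on a single binary variable is only the simplest illustration of it.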