Jiacong Du, Youfei Yu, Min Zhang, Zhenke Wu, Andrew M Ryan, Bhramar Mukherjee
Title: Outcome adaptive propensity score methods for handling censoring and high-dimensionality: Application to insurance claims.
Journal: Statistical Methods in Medical Research
Publication date: 2025-02-27
DOI: 10.1177/09622802241306856 (https://doi.org/10.1177/09622802241306856)
Abstract
Propensity scores are commonly used to reduce confounding bias when estimating the average treatment effect in non-randomized observational studies. An important assumption underlying this approach is that all confounders that are associated with both the treatment and the outcome of interest are measured and included in the propensity score model. In the absence of strong prior knowledge about potential confounders, researchers may want to adjust agnostically for a high-dimensional set of pre-treatment variables. As such, a variable selection procedure is needed for propensity score estimation. In addition, studies show that including variables related only to the treatment in the propensity score model may inflate the variance of the treatment effect estimators, while including variables that are predictive of only the outcome can improve efficiency. In this article, we propose to incorporate the outcome-covariate relationship in the propensity score model by including the predicted binary outcome probability as a covariate. Our approach can be easily adapted to an ensemble of variable selection methods, including regularization methods and modern machine-learning tools based on classification and regression trees. We evaluate our method for estimating treatment effects on a binary outcome, which is possibly censored, across multiple treatment groups. Simulation studies indicate that incorporating the outcome probability when estimating the propensity scores can improve statistical efficiency and protect against model misspecification. The proposed methods are applied to a cohort of advanced-stage prostate cancer patients identified from a private insurance claims database to compare the adverse effects of four commonly used drugs for treating castration-resistant prostate cancer.
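The core idea in the abstract, using a predicted outcome probability as an extra covariate in the propensity score model, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a binary treatment (the paper considers multiple treatment groups), no censoring, plain unpenalized logistic regression in place of the regularization or tree-based selection methods the abstract mentions, and a simple normalized inverse probability weighting estimator. The simulated data and all function names (`simulate`, `fit_logistic`, `outcome_adaptive_ipw`) are hypothetical and exist only for illustration.

```python
import math
import random


def expit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, n_iter=300):
    """Unpenalized logistic regression fitted by batch gradient ascent.

    Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            eta = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
            resid = yi - expit(eta)
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def predict_proba(beta, X):
    """Predicted probabilities from a fitted coefficient vector."""
    return [expit(beta[0] + sum(b * v for b, v in zip(beta[1:], xi))) for xi in X]


def simulate(n=400, seed=0):
    """Hypothetical data: x1 is a confounder, x2 predicts only the outcome,
    x3 predicts only the treatment."""
    rng = random.Random(seed)
    X, A, Y = [], [], []
    for _ in range(n):
        x1, x2, x3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        a = 1 if rng.random() < expit(0.5 * x1 + 0.5 * x3) else 0
        y = 1 if rng.random() < expit(-0.5 + 0.5 * a + 0.7 * x1 + 0.7 * x2) else 0
        X.append([x1, x2, x3])
        A.append(a)
        Y.append(y)
    return X, A, Y


def outcome_adaptive_ipw(X, A, Y):
    """Sketch of a propensity score model augmented with a predicted outcome probability."""
    # Step 1 (assumption): model P(Y = 1 | X) using the covariates only.
    p_outcome = predict_proba(fit_logistic(X, Y), X)
    # Step 2: add the predicted outcome probability as a covariate in the
    # propensity score model. A real analysis would apply variable selection
    # here; this sketch keeps every covariate.
    X_aug = [xi + [pi] for xi, pi in zip(X, p_outcome)]
    ps = predict_proba(fit_logistic(X_aug, A), X_aug)
    # Step 3: normalized (Hajek) inverse probability weighting for the
    # risk difference between treated and untreated subjects.
    w1 = [a / e for a, e in zip(A, ps)]
    w0 = [(1 - a) / (1 - e) for a, e in zip(A, ps)]
    mu1 = sum(w * y for w, y in zip(w1, Y)) / sum(w1)
    mu0 = sum(w * y for w, y in zip(w0, Y)) / sum(w0)
    return ps, mu1 - mu0


if __name__ == "__main__":
    X, A, Y = simulate()
    ps, ate = outcome_adaptive_ipw(X, A, Y)
    print(f"Estimated risk difference: {ate:.3f}")
```

Fitting the outcome model first lets information about which covariates drive the outcome enter the treatment model through a single summary covariate. How the authors combine this covariate with variable selection, censoring adjustment, and multiple treatment groups is described in the paper itself and is not reproduced here.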
About the journal:
Statistical Methods in Medical Research is a peer-reviewed scholarly journal and is the leading vehicle for articles in all the main areas of medical statistics and an essential reference for all medical statisticians. This unique journal is devoted solely to statistics and medicine and aims to keep professionals abreast of the many powerful statistical techniques now available to the medical profession. This journal is a member of the Committee on Publication Ethics (COPE).