
Observational Studies: Latest Publications

Size-biased sensitivity analysis for matched pairs design to assess the impact of healthcare-associated infections
Pub Date : 2023-09-07 DOI: 10.1353/obs.2023.a906628
David Watson
Abstract:Healthcare-associated infections are serious adverse events that occur during a hospital admission. Quantifying the impact of these infections on inpatient length of stay and cost has important policy implications due to the Hospital-Acquired Conditions Reduction Program in the United States. However, most studies on this topic are flawed because they do not account for when a healthcare-associated infection occurred during a hospital admission. Such an approach leads to selection bias because patients with longer hospital stays are more likely to experience an infection due to their increased exposure time. Time of infection is often not incorporated into the estimation strategy because this information is unknown, yet there are no methods that account for the selection bias in this scenario. To address this problem, we propose a sensitivity analysis for matched pairs designs to assess the effect of healthcare-associated infections on length of stay and cost when time of infection is unknown. The approach models the probability of infection, or the assignment mechanism, as proportional to a power function of the uninfected length of stay, where the sensitivity parameter is the value of the power. The general idea is to incorporate the degree of exposure into the probability of an infection occurring. Under this size-biased assignment mechanism, we develop hypothesis tests under a sharp null hypothesis of constant multiplicative effects. The approach is demonstrated on a pediatric cohort of inpatient encounters and compared to benchmark estimates that properly account for time of infection. The results reaffirm the severe degree of bias when not accounting for time of infection and also show that the proposed sensitivity analysis captures the benchmark estimates for plausible and theoretically justified values of the sensitivity parameter.
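As a rough illustration of the assignment mechanism described above (not the paper's own code), the R sketch below draws the randomization distribution of a matched-pairs statistic when the probability of being the infected unit in a pair is proportional to a power `gamma` of the uninfected length of stay, under a sharp null of a constant multiplicative effect `tau0`. The toy data and all object names are hypothetical.

```r
set.seed(1)

## toy matched pairs: one infected and one uninfected patient per pair
n_pairs  <- 200
los_ctrl <- rexp(n_pairs, rate = 1/5)        # uninfected patients' LOS
los_trt  <- 1.5 * rexp(n_pairs, rate = 1/5)  # infected patients' LOS

size_biased_test <- function(los_trt, los_ctrl, tau0, gamma, n_draws = 2000) {
  ## infected patients' LOS had they not been infected, under the sharp null
  adj_trt <- los_trt / tau0
  ## size-biased assignment: P(unit is the infected one) proportional to (uninfected LOS)^gamma
  p_trt <- adj_trt^gamma / (adj_trt^gamma + los_ctrl^gamma)
  t_obs <- mean(log(adj_trt) - log(los_ctrl))      # observed pair statistic
  t_null <- replicate(n_draws, {                   # randomization distribution
    flip <- rbinom(length(p_trt), 1, p_trt)        # 1 = same unit infected as observed
    mean(ifelse(flip == 1,
                log(adj_trt) - log(los_ctrl),
                log(los_ctrl) - log(adj_trt)))
  })
  mean(abs(t_null) >= abs(t_obs))                  # two-sided Monte Carlo p-value
}

## test the null of no effect (tau0 = 1) at sensitivity parameter gamma = 1
size_biased_test(los_trt, los_ctrl, tau0 = 1, gamma = 1)
```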
{"title":"Size-biased sensitivity analysis for matched pairs design to assess the impact of healthcare-associated infections","authors":"David Watson","doi":"10.1353/obs.2023.a906628","DOIUrl":"https://doi.org/10.1353/obs.2023.a906628","url":null,"abstract":"Abstract:Healthcare-associated infections are serious adverse events that occur during a hospital admission. Quantifying the impact of these infections on inpatient length of stay and cost has important policy implications due to the Hospital-Acquired Conditions Reduction Program in the United States. However, most studies on this topic are flawed because they do not account for when a healthcare-associated infection occurred during a hospital admission. Such an approach leads to selection bias because patients with longer hospital stays are more likely to experience an infection due to their increased exposure time. Time of infection is often not incorporated into the estimation strategy because this information is unknown, yet there are no methods that account for the selection bias in this scenario. To address this problem, we propose a sensitivity analysis for matched pairs designs for assessing the effect of healthcare-associated infections on length of stay and cost when time of infection is unknown. The approach models the probability of infection, or the assignment mechanism, as proportional to a power function of the uninfected length of stay, where the sensitivity parameter is the value of the power. The general idea is to incorporate the degree of exposure into the probability of an infection occurring. Under this size-biased assignment mechanism, we develop hypothesis tests under a sharp null hypothesis of constant multiplicative effects. The approach is demonstrated on a pediatric cohort of inpatient encounters and compared to benchmark estimates that properly account for time of infection. The results reaffirm the severe degree of bias when not accounting for time of infection and also show that the proposed sensitivity analysis captures the benchmark estimates for plausible and theoretically justified values of the sensitivity parameter.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"9 1","pages":"1 - 24"},"PeriodicalIF":0.0,"publicationDate":"2023-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42324694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Software Tutorial for Matching in Clustered Observational Studies
Pub Date : 2023-09-07 DOI: 10.1353/obs.2023.a906624
Luke Keele, Matthew Lenard, Luke Miratrix, Lindsay Page
Abstract:Many interventions occur in settings where treatments are applied to groups. For example, a math intervention may be implemented for all students in some schools and withheld from students in other schools. When such treatments are non-randomly allocated, researchers can use statistical adjustment to make treated and control groups similar in terms of observed characteristics. Recent work in statistics has developed a form of matching, known as multilevel matching, that is designed for contexts where treatments are clustered. In this article, we provide a tutorial on how to analyze clustered treatment using multilevel matching. We use a real data application to explain the full set of steps for the analysis of a clustered observational study.
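A much-simplified sketch of the idea (matching whole clusters on aggregated covariates) using the MatchIt package rather than the multilevel-matching software the tutorial covers; the data frame `students` and its column names are hypothetical, and the full multilevel approach additionally balances student-level covariates within matched schools.

```r
library(MatchIt)
library(dplyr)

## collapse hypothetical student-level data to one row per school
school_level <- students |>
  group_by(school_id) |>
  summarise(treat      = first(treat),        # school-level treatment indicator
            mean_math  = mean(math_pretest),
            mean_ses   = mean(ses),
            n_students = n())

## 1:1 nearest-neighbor propensity score match of treated to control schools
m <- matchit(treat ~ mean_math + mean_ses + n_students,
             data = school_level, method = "nearest", ratio = 1)

summary(m)                        # covariate balance before and after matching
matched_schools <- match.data(m)  # matched schools for the outcome analysis
```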
{"title":"A Software Tutorial for Matching in Clustered Observational Studies","authors":"Luke Keele, Matthew Lenard, Luke Miratrix, Lindsay Page","doi":"10.1353/obs.2023.a906624","DOIUrl":"https://doi.org/10.1353/obs.2023.a906624","url":null,"abstract":"Abstract:Many interventions occur in settings where treatments are applied to groups. For example, a math intervention may be implemented for all students in some schools and withheld from students in other schools. When such treatments are non-randomly allocated, researchers can use statistical adjustment to make treated and control groups similar in terms of observed characteristics. Recent work in statistics has developed a form of matching, known as multilevel matching, that is designed for contexts where treatments are clustered. In this article, we provide a tutorial on how to analyze clustered treatment using multilevel matching. We use a real data application to explain the full set of steps for the analysis of a clustered observational study.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"9 1","pages":"73 - 96"},"PeriodicalIF":0.0,"publicationDate":"2023-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45559753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Doubly Robust Estimation of Average Treatment Effects on the Treated through Marginal Structural Models
Pub Date : 2023-05-11 DOI: 10.1353/obs.2023.0025
M. Schomaker, Philipp F. M. Baumann
Abstract:Some causal parameters are defined on subgroups of the observed data, such as the average treatment effect on the treated and variations thereof. We explain how such parameters can be defined through parameters in a marginal structural (working) model. We illustrate how existing software can be used for doubly robust effect estimation of those parameters. Our proposal for confidence interval estimation is based on the delta method. All concepts are illustrated by estimands and data from the data challenge of the 2022 American Causal Inference Conference.
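For orientation only, here is the textbook augmented-IPW form of the average treatment effect on the treated in base R, assuming a hypothetical data frame `d` with outcome `y`, binary treatment `a`, and covariates `x1`, `x2`; the paper itself reaches this kind of estimand through a marginal structural working model with delta-method confidence intervals.

```r
## nuisance models: propensity score and control-arm outcome regression
ps_fit  <- glm(a ~ x1 + x2, family = binomial, data = d)
out_fit <- glm(y ~ x1 + x2, data = d, subset = a == 0)

e_hat <- predict(ps_fit, type = "response")
mu0   <- predict(out_fit, newdata = d)   # predicted untreated outcome for everyone

## doubly robust ATT: consistent if either nuisance model is correct
att_dr <- with(d,
  sum(a * (y - mu0) - (1 - a) * e_hat / (1 - e_hat) * (y - mu0)) / sum(a))
att_dr
```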
{"title":"Doubly Robust Estimation of Average Treatment Effects on the Treated through Marginal Structural Models","authors":"M. Schomaker, Philipp F. M. Baumann","doi":"10.1353/obs.2023.0025","DOIUrl":"https://doi.org/10.1353/obs.2023.0025","url":null,"abstract":"Abstract:Some causal parameters are defined on subgroups of the observed data, such as the average treatment effect on the treated and variations thereof. We explain how such parameters can be defined through parameters in a marginal structural (working) model. We illustrate how existing software can be used for doubly robust effect estimation of those parameters. Our proposal for confidence interval estimation is based on the delta method. All concepts are illustrated by estimands and data from the data challenge of the 2022 American Causal Inference Conference.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"9 1","pages":"43 - 57"},"PeriodicalIF":0.0,"publicationDate":"2023-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41487639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Causal Methods Madness: Lessons Learned from the 2022 ACIC Competition to Estimate Health Policy Impacts
Pub Date : 2023-05-11 DOI: 10.1353/obs.2023.0023
Daniel Thal, M. Finucane
Abstract:Introducing novel causal estimators usually involves simulation studies run by the statistician developing the estimator, but this traditional approach can be fraught: simulation design is often favorable to the new method, unfavorable results might never be published, and comparison across estimators is difficult. The American Causal Inference Conference (ACIC) data challenges offer an alternative. As organizers of the 2022 challenge, we generated thousands of data sets similar to real-world policy evaluations and baked in true causal impacts unknown to participants. Participating teams then competed on an even playing field, using their cutting-edge methods to estimate those effects. In total, 20 teams submitted results from 58 estimators that used a range of approaches. We found several important factors driving performance that are not commonly used in business-as-usual applied policy evaluations, pointing to ways future evaluations could achieve more precise and nuanced estimates of policy impacts. Top-performing methods used flexible modeling of outcome-covariate and outcome-participation relationships as well as regularization of subgroup estimates. Furthermore, we found that model-based uncertainty intervals tended to outperform bootstrap-based ones. Lastly, and counter to our expectations, we found that analyzing large-n patient-level data does not improve performance relative to analyzing smaller-n data aggregated to the primary care practice level, given that in our simulated data sets practices (not individual patients) decided whether to join the intervention. Ultimately, we hope this competition helped identify methods that are best suited for evaluating which social policies move the needle for the individuals and communities they serve.
{"title":"Causal Methods Madness: Lessons Learned from the 2022 ACIC Competition to Estimate Health Policy Impacts","authors":"Daniel Thal, M. Finucane","doi":"10.1353/obs.2023.0023","DOIUrl":"https://doi.org/10.1353/obs.2023.0023","url":null,"abstract":"Abstract:Introducing novel causal estimators usually involves simulation studies run by the statistician developing the estimator, but this traditional approach can be fraught: simulation design is often favorable to the new method, unfavorable results might never be published, and comparison across estimators is difficult. The American Causal Inference Conference (ACIC) data challenges offer an alternative. As organizers of the 2022 challenge, we generated thousands of data sets similar to real-world policy evaluations and baked in true causal impacts unknown to participants. Participating teams then competed on an even playing field, using their cutting-edge methods to estimate those effects. In total, 20 teams submitted results from 58 estimators that used a range of approaches. We found several important factors driving performance that are not commonly used in business-as-usual applied policy evaluations, pointing to ways future evaluations could achieve more precise and nuanced estimates of policy impacts. Top-performing methods used flexible modeling of outcome-covariate and outcome-participation relationships as well as regularization of subgroup estimates. Furthermore, we found that model-based uncertainty intervals tended to outperform bootstrap-based ones. Lastly, and counter to our expectations, we found that analyzing large-n patient-level data does not improve performance relative to analyzing smaller-n data aggregated to the primary care practice level, given that in our simulated data sets practices (not individual patients) decided whether to join the intervention. Ultimately, we hope this competition helped identify methods that are best suited for evaluating which social policies move the needle for the individuals and communities they serve.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"9 1","pages":"27 - 3"},"PeriodicalIF":0.0,"publicationDate":"2023-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44338192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Estimating Treatment Effect with Propensity Score Weighted Regression and Double Machine Learning
Pub Date : 2023-05-11 DOI: 10.1353/obs.2023.0028
Jun Xue, Wei Zhong Goh, Dana Rotz
Abstract:We applied propensity score weighted regression and double machine learning in the 2022 American Causal Inference Conference Data Challenge. Our double machine learning method achieved the second lowest overall RMSE among all official submissions, but performed less well on heterogeneous treatment effect estimation due to lack of regularization.
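A rough cross-fitted double machine learning sketch for a partially linear model (not the authors' exact pipeline), assuming a hypothetical data frame `d` whose columns are a continuous outcome `y`, a numeric 0/1 treatment `a`, and covariates, and using random forests for both nuisance regressions.

```r
library(randomForest)
set.seed(1)

K     <- 2
folds <- sample(rep(1:K, length.out = nrow(d)))   # cross-fitting folds
res_y <- res_a <- numeric(nrow(d))

for (k in 1:K) {
  train <- d[folds != k, ]
  test  <- d[folds == k, ]
  m_y <- randomForest(y ~ . - a, data = train)    # nuisance: E[Y | X]
  m_a <- randomForest(a ~ . - y, data = train)    # nuisance: E[A | X]
  res_y[folds == k] <- test$y - predict(m_y, newdata = test)
  res_a[folds == k] <- test$a - predict(m_a, newdata = test)
}

## partialling-out estimate: regress outcome residuals on treatment residuals
theta_hat <- sum(res_a * res_y) / sum(res_a^2)
theta_hat
```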
{"title":"Estimating Treatment Effect with Propensity Score Weighted Regression and Double Machine Learning","authors":"Jun Xue, Wei Zhong Goh, Dana Rotz","doi":"10.1353/obs.2023.0028","DOIUrl":"https://doi.org/10.1353/obs.2023.0028","url":null,"abstract":"Abstract:We applied propensity score weighted regression and double machine learning in the 2022 American Causal Inference Conference Data Challenge. Our double machine learning method achieved the second lowest overall RMSE among all official submissions, but performed less well on heterogeneous treatment effect estimation due to lack of regularization.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"10 6","pages":"83 - 90"},"PeriodicalIF":0.0,"publicationDate":"2023-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41291815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Estimating Treatment Effects over Time with Causal Forests: An application to the ACIC 2022 Data Challenge
Pub Date : 2023-05-11 DOI: 10.1353/obs.2023.0026
Shu Wan, Guanghui Zhang
Abstract:In this paper, we present our winning modeling approach, DiConfounder, for the Atlantic Causal Inference Conference (ACIC) 2022 Data Science data challenge. Our method ranks 1st in RMSE and 5th in coverage among the 58 submissions. We propose a transformed outcome estimator by connecting the difference-in-differences and conditional average treatment effect estimation problems. Our comprehensive multistage pipeline encompasses feature engineering, missing value imputation, outcome and propensity score modeling, treatment effects modeling, and SATT and uncertainty estimations. Our model achieves remarkably accurate predictions, with an overall RMSE as low as 11 and 84.5% coverage. Further discussions explore various methods for constructing confidence intervals and analyzing the limitations of our approach under different data generating process settings. We provide evidence that the clustered data structure is the key to success. We also release the source code on GitHub for practitioners to adopt and adapt our methods.
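A loose sketch of the transformed-outcome idea (using the pre-to-post change as the outcome of a CATE problem and fitting a causal forest with the grf package), assuming a hypothetical data frame `d` with pre- and post-period outcomes `y_pre` and `y_post`, binary treatment `w`, and covariates; this is not the authors' multistage pipeline.

```r
library(grf)

X      <- as.matrix(d[, c("x1", "x2", "x3")])   # hypothetical covariate columns
Y_gain <- d$y_post - d$y_pre                    # transformed (gain-score) outcome
W      <- d$w

cf <- causal_forest(X, Y_gain, W)

tau_hat <- predict(cf)$predictions              # heterogeneous effect estimates
average_treatment_effect(cf, target.sample = "treated")   # SATT analogue with SE
```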
{"title":"Estimating Treatment Effects over Time with Causal Forests: An application to the ACIC 2022 Data Challenge","authors":"Shu Wan, Guanghui Zhang","doi":"10.1353/obs.2023.0026","DOIUrl":"https://doi.org/10.1353/obs.2023.0026","url":null,"abstract":"Abstract:In this paper, we present our winning modeling approach, DiConfounder, for the Atlantic Causal Inference Conference (ACIC) 2022 Data Science data challenge. Our method ranks 1st in RMSE and 5th in coverage among the 58 submissions. We propose a transformed outcome estimator by connecting the difference-in-difference and conditional average treatment effect estimation problems. Our comprehensive multistage pipeline encompasses feature engineering, missing value imputation, outcome and propensity score modeling, treatment effects modeling, and SATT and uncertainty estimations. Our model achieves remarkably accurate predictions, with an overall RMSE as low as 11 and 84.5% coverage. Further discussions explore various methods for constructing confidence intervals and analyzing the limitations of our approach under different data generating process settings. We provide evidence that the clustered data structure is the key to success. We also release the source code on GitHub for practitioners to adopt and adapt our methods.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"9 1","pages":"59 - 71"},"PeriodicalIF":0.0,"publicationDate":"2023-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43810955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Inverse Probability Weighting Difference-in-Differences (IPWDID)
Pub Date : 2023-05-11 DOI: 10.1353/obs.2023.0027
Yuqin Wei, M. Epland, Jingyuan Liu
Abstract:In this American Causal Inference Conference (ACIC) 2022 challenge submission, the canonical difference-in-differences (DID) estimator has been used with inverse probability weighting (IPW) and strong simplifying assumptions to produce a benchmark model of the sample average treatment effect on the treated (SATT). Despite the restrictive assumptions and simple model, satisfactory performance in both point estimate and confidence intervals was observed, ranking in the top half of the competition.
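For concreteness, a minimal Abadie-style IPW difference-in-differences estimator of the SATT in base R, assuming a hypothetical data frame `d` with pre/post outcomes `y_pre` and `y_post`, binary treatment `a`, and covariates `x1`, `x2`; the submission's exact specification may differ.

```r
ps_fit <- glm(a ~ x1 + x2, family = binomial, data = d)   # propensity score model
e_hat  <- predict(ps_fit, type = "response")

dy <- d$y_post - d$y_pre                          # within-unit change over time
w  <- ifelse(d$a == 1, 1, e_hat / (1 - e_hat))    # reweight controls toward the treated

att_ipwdid <- weighted.mean(dy[d$a == 1], w[d$a == 1]) -
              weighted.mean(dy[d$a == 0], w[d$a == 0])
att_ipwdid
```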
{"title":"Inverse Probability Weighting Difference-in-Differences (IPWDID)","authors":"Yuqin Wei, M. Epland, Jingyuan Liu","doi":"10.1353/obs.2023.0027","DOIUrl":"https://doi.org/10.1353/obs.2023.0027","url":null,"abstract":"Abstract:In this American Causal Inference Conference (ACIC) 2022 challenge submission, the canonical difference-in-differences (DID) estimator has been used with inverse probability weighting (IPW) and strong simplifying assumptions to produce a benchmark model of the sample average treatment effect on the treated (SATT). Despite the restrictive assumptions and simple model, satisfactory performance in both point estimate and confidence intervals was observed, ranking in the top half of the competition.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"9 1","pages":"73 - 81"},"PeriodicalIF":0.0,"publicationDate":"2023-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49451652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
lmtp: An R Package for Estimating the Causal Effects of Modified Treatment Policies
Pub Date : 2023-03-01 DOI: 10.1353/obs.2023.0019
Nicholas T Williams, I. Díaz
Abstract:We present the lmtp R package for causal inference from longitudinal observational or randomized studies. This package implements the estimators of Díaz et al. (2021) for estimating general non-parametric causal effects based on modified treatment policies. Modified treatment policies generalize static and dynamic interventions, making lmtp an all-purpose package for non-parametric causal inference in observational studies. The methods provided can be applied to both point-treatment and longitudinal settings, and can account for time-varying exposure, covariates, and right censoring, thereby providing a very general tool for causal inference. Additionally, two of the provided estimators are based on flexible machine learning regression algorithms, and avoid bias due to parametric model misspecification while maintaining valid statistical inference.
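A hypothetical point-treatment call, assuming the interface sketched in the lmtp documentation (argument names and defaults may differ by version); the modified treatment policy here shifts a continuous exposure down by one unit, and `d`, `a`, `y`, `x1`, `x2` are placeholder names.

```r
library(lmtp)

## modified treatment policy: reduce everyone's observed exposure by one unit
shift_down <- function(data, trt) data[[trt]] - 1

fit <- lmtp_tmle(
  data     = d,              # hypothetical data frame
  trt      = "a",
  outcome  = "y",
  baseline = c("x1", "x2"),
  shift    = shift_down
)
fit   # TMLE of the population mean outcome under the shifted exposure
```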
{"title":"lmtp: An R Package for Estimating the Causal Effects of Modified Treatment Policies","authors":"Nicholas T Williams, I. Díaz","doi":"10.1353/obs.2023.0019","DOIUrl":"https://doi.org/10.1353/obs.2023.0019","url":null,"abstract":"Abstract:We present the lmtp R package for causal inference from longitudinal observational or randomized studies. This package implements the estimators of Díaz et al. (2021) for estimating general non-parametric causal effects based on modified treatment policies. Modified treatment policies generalize static and dynamic interventions, making lmtp and all-purpose package for non-parametric causal inference in observational studies. The methods provided can be applied to both point-treatment and longitudinal settings, and can account for time-varying exposure, covariates, and right censoring thereby providing a very general tool for causal inference. Additionally, two of the provided estimators are based on flexible machine learning regression algorithms, and avoid bias due to parametric model misspecification while maintaining valid statistical inference.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"9 1","pages":"103 - 122"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47362691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Doubly-Robust Inference in R using drtmle
Pub Date : 2023-03-01 DOI: 10.1353/obs.2023.0017
D. Benkeser, N. Hejazi
Abstract:Inverse probability of treatment weighted estimators and doubly robust estimators (including augmented inverse probability of treatment weight and targeted minimum loss estimators) are widely used in causal inference to estimate and draw inference about the average effect of a treatment. As an intermediate step, these estimators require estimation of key nuisance parameters, which are often regression functions. Typically, regressions are estimated using maximum likelihood and parametric models. Confidence intervals and p-values may be computed based on standard asymptotic results, such as the central limit theorem, the delta method, and the nonparametric bootstrap. However, in high-dimensional settings, maximum likelihood estimation often breaks down and standard procedures no longer yield correct inference. Instead, we may rely on adaptive estimators of nuisance parameters to construct flexible regression estimators. However, use of adaptive estimators poses a challenge for performing statistical inference about an estimated treatment effect. While doubly robust estimators facilitate inference when all relevant regression functions are consistently estimated, the same cannot be said when at least one nuisance estimator is inconsistent. drtmle implements doubly robust confidence intervals and hypothesis tests for targeted minimum loss estimates of the average treatment effect, in addition to several other recently proposed estimators of the average treatment effect.
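A hypothetical call following the argument names in the drtmle vignette (they may differ across versions), with `d` a placeholder data frame; "SL.glm" and "SL.mean" are SuperLearner wrappers used for the nuisance fits.

```r
library(drtmle)

fit <- drtmle(
  W = d[, c("x1", "x2")],          # covariates
  A = d$a,                         # binary treatment
  Y = d$y,                         # outcome
  a_0   = c(0, 1),                 # treatment levels for which E[Y(a)] is estimated
  SL_Q  = c("SL.glm", "SL.mean"),  # outcome regression library
  SL_g  = c("SL.glm", "SL.mean"),  # propensity score library
  SL_Qr = "SL.glm",                # reduced-dimension regressions that make the
  SL_gr = "SL.glm"                 # Wald-style intervals doubly robust
)

ci(fit, contrast = c(-1, 1))       # doubly robust CI for E[Y(1)] - E[Y(0)]
```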
{"title":"Doubly-Robust Inference in R using drtmle","authors":"D. Benkeser, N. Hejazi","doi":"10.1353/obs.2023.0017","DOIUrl":"https://doi.org/10.1353/obs.2023.0017","url":null,"abstract":"Abstract:Inverse probability of treatment weighted estimators and doubly robust estimators (including augmented inverse probability of treatment weight and targeted minimum loss estimators) are widely used in causal inference to estimate and draw inference about the average effect of a treatment. As an intermediate step, these estimators require estimation of key nuisance parameters, which are often regression functions. Typically, regressions are estimated using maximum likelihood and parametric models. Confidence intervals and p-values may be computed based on standard asymptotic results, such as the central limit theorem, the delta method, and the nonparametric bootstrap. However, in high-dimensional settings, maximum likelihood estimation often breaks down and standard procedures no longer yield correct inference. Instead, we may rely on adaptive estimators of nuisance parameters to construct flexible regression estimators. However, use of adaptive estimators poses a challenge for performing statistical inference about an estimated treatment effect. While doubly robust estimators facilitate inference when all relevant regression functions are consistently estimated, the same cannot be said when at least one nuisance estimator is inconsistent. drtmle implements doubly robust confidence intervals and hypothesis tests for targeted minimum loss estimates of the average treatment effect, in addition to several other recently proposed estimators of the average treatment effect.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"9 1","pages":"43 - 78"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41508466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Comparison of dimension reduction methods for the identification of heart-healthy dietary patterns
Pub Date : 2023-03-01 DOI: 10.1353/obs.2023.0020
Natalie C. Gasca, R. McClelland
Abstract:Most nutritional epidemiology studies investigating diet-disease trends use unsupervised dimension reduction methods, like principal component regression (PCR) and sparse PCR (SPCR), to create dietary patterns. Supervised methods, such as partial least squares (PLS), sparse PLS (SPLS), and Lasso, offer the possibility of more concisely summarizing the foods most related to a disease. In this study we evaluate these five methods for interpretable reduction of food frequency questionnaire (FFQ) data when analyzing a univariate continuous cardiac-related outcome via a simulation study and data application. We also demonstrate that to control for covariates, various scientific premises require different adjustment approaches when using PLS. To emulate food groups, we generated blocks of normally distributed predictors with varying intra-block covariances; only nine of 24 predictors contributed to the normal response. When block covariances were informed by FFQ data, the only methods that performed variable selection were Lasso and SPLS, which selected two and four irrelevant variables, respectively. SPLS had the lowest prediction error, and both PLS-based methods constructed four patterns, while PCR and SPCR created 24 patterns. These methods were applied to 120 FFQ variables and baseline body mass index (BMI) from the Multi-Ethnic Study of Atherosclerosis, which includes 6814 participants aged 45-84, and we adjusted for age, gender, race/ethnicity, exercise, and total energy intake. From 120 variables, PCR created 17 BMI-related patterns and PLS selected one pattern; SPLS only used five variables to create two patterns. All methods exhibited similar predictive performance. Specifically, SPLS’s first pattern highlighted hamburger and diet soda intake (positive associations with BMI), reflecting a fast food diet. By selecting fewer patterns and foods, SPLS can create interpretable dietary patterns while maintaining predictive ability.
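A small sketch of three of the compared approaches (PCR, PLS, and the lasso), assuming a hypothetical numeric predictor matrix `ffq` of food-frequency items and a continuous outcome `bmi`, and using the pls and glmnet packages; the study's sparse variants (SPCR, SPLS) require additional packages.

```r
library(pls)
library(glmnet)

## principal component regression: components built without looking at the outcome
pcr_fit <- pcr(bmi ~ ffq, ncomp = 10, scale = TRUE, validation = "CV")

## partial least squares: components chosen to covary with the outcome
pls_fit <- plsr(bmi ~ ffq, ncomp = 4, scale = TRUE, validation = "CV")

## lasso: selects individual food items directly
lasso_fit <- cv.glmnet(ffq, bmi, alpha = 1)
coef(lasso_fit, s = "lambda.min")   # foods retained at the cross-validated penalty
```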
{"title":"Comparison of dimension reduction methods for the identification of heart-healthy dietary patterns","authors":"Natalie C. Gasca, R. McClelland","doi":"10.1353/obs.2023.0020","DOIUrl":"https://doi.org/10.1353/obs.2023.0020","url":null,"abstract":"Abstract:Most nutritional epidemiology studies investigating diet-disease trends use unsupervised dimension reduction methods, like principal component regression (PCR) and sparse PCR (SPCR), to create dietary patterns. Supervised methods, such as partial least squares (PLS), sparse PLS (SPLS), and Lasso, offer the possibility of more concisely summarizing the foods most related to a disease. In this study we evaluate these five methods for interpretable reduction of food frequency questionnaire (FFQ) data when analyzing a univariate continuous cardiac-related outcome via a simulation study and data application. We also demonstrate that to control for covariates, various scientific premises require different adjustment approaches when using PLS. To emulate food groups, we generated blocks of normally distributed predictors with varying intra-block covariances; only nine of 24 predictors contributed to the normal response. When block covariances were informed by FFQ data, the only methods that performed variable selection were Lasso and SPLS, which selected two and four irrelevant variables, respectively. SPLS had the lowest prediction error, and both PLS-based methods constructed four patterns, while PCR and SPCR created 24 patterns. These methods were applied to 120 FFQ variables and baseline body mass index (BMI) from the Multi-Ethnic Study of Atherosclerosis, which includes 6814 participants aged 45-84, and we adjusted for age, gender, race/ethnicity, exercise, and total energy intake. From 120 variables, PCR created 17 BMI-related patterns and PLS selected one pattern; SPLS only used five variables to create two patterns. All methods exhibited similar predictive performance. Specifically, SPLS’s first pattern highlighted hamburger and diet soda intake (positive associations with BMI), reflecting a fast food diet. By selecting fewer patterns and foods, SPLS can create interpretable dietary patterns while maintaining predictive ability.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"9 1","pages":"123 - 156"},"PeriodicalIF":0.0,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49570747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0