Background: Causal mediation analysis can improve understanding of the mechanisms underlying epidemiologic associations. However, the utility of natural direct and indirect effect estimation has been limited by the assumption of no confounder of the mediator-outcome relationship that is affected by prior exposure (which we call an intermediate confounder), an assumption that is frequently violated in practice.
Methods: We build on recent work that identified alternative estimands that do not require this assumption and propose a flexible and double robust targeted minimum loss-based estimator for stochastic direct and indirect effects. The proposed method intervenes stochastically on the mediator using a distribution which conditions on baseline covariates and marginalizes over the intermediate confounder.
Results: We demonstrate the estimator's finite sample and robustness properties in a simple simulation study. We apply the method to an example from the Moving to Opportunity experiment. In this application, randomization to receive a housing voucher is the treatment/instrument that influenced moving with the voucher out of public housing, which is the intermediate confounder. We estimate the stochastic direct effect of randomization to the voucher group on adolescent marijuana use not mediated by change in school district and the stochastic indirect effect mediated by change in school district. We find no evidence of mediation.
Conclusions: Our estimator is easy to implement in standard statistical software, and we provide annotated R code to further lower implementation barriers.
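The stochastic intervention described in the Methods above can be written out explicitly. The following is a sketch in notation that is assumed here rather than taken from the abstract: W denotes baseline covariates, A the exposure, Z the intermediate confounder, M the mediator, Y the outcome, and a and a* two exposure levels being compared.

```latex
% Mediator distribution: conditions on W, marginalizes over Z
\begin{align*}
g_{M \mid a^{*}, W}(m \mid W)
  &= \sum_{z} P(M = m \mid Z = z, A = a^{*}, W)\, P(Z = z \mid A = a^{*}, W) \\[4pt]
% Stochastic direct effect: change exposure, hold the mediator distribution fixed
\mathrm{SDE}
  &= E\!\left[ Y_{a,\, g_{M \mid a^{*}, W}} \right]
   - E\!\left[ Y_{a^{*},\, g_{M \mid a^{*}, W}} \right] \\[4pt]
% Stochastic indirect effect: hold exposure fixed, change the mediator distribution
\mathrm{SIE}
  &= E\!\left[ Y_{a,\, g_{M \mid a, W}} \right]
   - E\!\left[ Y_{a,\, g_{M \mid a^{*}, W}} \right]
\end{align*}
```

Under this notation the two effects sum to the contrast between setting exposure to a with the mediator drawn from its distribution under a, and setting exposure to a* with the mediator drawn from its distribution under a*.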
Compartmental model diagrams have been used for nearly a century to depict causal relationships in infectious disease epidemiology. Causal directed acyclic graphs (DAGs) have been used more broadly in epidemiology since the 1990s to guide analyses of a variety of public health problems. Using an example from chronic disease epidemiology, the effect of type 2 diabetes on dementia incidence, we illustrate how compartmental model diagrams can represent the same concepts as causal DAGs, including causation, mediation, confounding, and collider bias. We show how to use compartmental model diagrams to explicitly depict interaction and feedback cycles. While DAGs imply a set of conditional independencies, they do not define conditional distributions parametrically. Compartmental model diagrams parametrically (or semiparametrically) describe state changes based on known biological processes or mechanisms. Compartmental model diagrams are part of a long-term tradition of causal thinking in epidemiology and can parametrically express the same concepts as DAGs, as well as explicitly depict feedback cycles and interactions. As causal inference efforts in epidemiology increasingly draw on simulations and quantitative sensitivity analyses, compartmental model diagrams may be of use to a wider audience. Recognizing simple links between these two common approaches to representing causal processes may facilitate communication between researchers from different traditions.
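As a rough illustration of how a compartmental diagram parametrically describes state changes, the sketch below steps a closed cohort through four compartments (no diabetes, diabetes, dementia, dead) using simple Euler updates. The compartments, the function name `simulate_cohort`, and every rate are hypothetical values chosen for illustration; none of them come from the abstract.

```python
def simulate_cohort(years=20.0, dt=0.01, dementia_rate_dm=0.010):
    """Euler integration of a four-compartment model.

    All rates are hypothetical, expressed per person-year.
    """
    diabetes_onset = 0.010       # no diabetes -> diabetes
    dementia_rate_no_dm = 0.005  # no diabetes -> dementia
    death_rate = 0.020           # any living state -> dead

    # Fractions of the cohort in each compartment at time zero.
    no_dm, dm, dementia, dead = 1.0, 0.0, 0.0, 0.0
    cumulative_dementia = 0.0

    for _ in range(round(years / dt)):
        # Flows along each arrow of the diagram during one time step.
        onset = diabetes_onset * no_dm * dt
        dem_from_no_dm = dementia_rate_no_dm * no_dm * dt
        dem_from_dm = dementia_rate_dm * dm * dt
        death_no_dm = death_rate * no_dm * dt
        death_dm = death_rate * dm * dt
        death_dementia = death_rate * dementia * dt

        # Update compartments; every outflow is an inflow elsewhere.
        no_dm -= onset + dem_from_no_dm + death_no_dm
        dm += onset - dem_from_dm - death_dm
        dementia += dem_from_no_dm + dem_from_dm - death_dementia
        dead += death_no_dm + death_dm + death_dementia
        cumulative_dementia += dem_from_no_dm + dem_from_dm

    return {
        "no_dm": no_dm,
        "dm": dm,
        "dementia": dementia,
        "dead": dead,
        "cumulative_dementia": cumulative_dementia,
    }


# Compare a scenario where diabetes doubles the dementia rate with one
# where diabetes has no effect on dementia (equal rates on both arrows).
with_effect = simulate_cohort(dementia_rate_dm=0.010)
no_effect = simulate_cohort(dementia_rate_dm=0.005)
```

Setting the two dementia rates equal removes the arrow-specific effect of diabetes, which is the compartmental analogue of deleting the diabetes-to-dementia edge in a DAG.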
In a recent BMJ article, the authors conducted a meta-analysis to compare estimated treatment effects from randomized trials with those derived from observational studies based on routinely collected data (RCD). They calculated a pooled relative odds ratio (ROR) of 1.31 (95% confidence interval [CI]: 1.03-1.65) and concluded that RCD studies systematically over-estimated protective effects. However, their meta-analysis inverted results for some clinical questions to force all estimates from RCD to be below 1. We evaluated the statistical properties of this pooled ROR and found that the selective inversion rule employed in the original meta-analysis can positively bias the estimate of the ROR. We then repeated the random effects meta-analysis using a different inversion rule and found an estimated ROR of 0.98 (95% CI: 0.78-1.23), indicating that the ROR is highly dependent on the direction of comparisons. As an alternative to the ROR, we calculated the observed proportion of clinical questions where the RCD and trial CIs overlap, as well as the expected proportion assuming no systematic difference between the studies. Out of 16 clinical questions, the 50% CIs overlapped for 8 (50%; 95% CI: 25% to 75%), compared with an expected overlap of 60% assuming no systematic difference between RCD studies and trials. Thus, there was little evidence of a systematic difference in effect estimates between RCD studies and trials. Estimates of pooled RORs across distinct clinical questions are generally not interpretable and may be misleading.
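The bias from selective inversion can be seen in a small simulation. The sketch below is not the analysis from the paper: it assumes the log ROR is defined as the trial log odds ratio minus the RCD log odds ratio, uses a simple unweighted mean instead of a random effects model, and draws both estimates around the same true effect with hypothetical standard errors. The function name `simulate_mean_log_ror` and all parameter values are illustrative.

```python
import random


def simulate_mean_log_ror(n_questions=20000, true_log_or=0.0,
                          se_rcd=0.3, se_trial=0.3,
                          selective=True, seed=0):
    """Mean log ROR across simulated clinical questions.

    Both studies estimate the same true effect, so any nonzero mean
    is produced by the inversion rule alone.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_questions):
        log_or_rcd = rng.gauss(true_log_or, se_rcd)
        log_or_trial = rng.gauss(true_log_or, se_trial)

        if selective:
            # Invert the comparison whenever the RCD estimate is above 1,
            # forcing every RCD odds ratio below 1.
            flip = log_or_rcd > 0
        else:
            # Invert at random, independent of the observed estimates.
            flip = rng.random() < 0.5

        if flip:
            # Swapping the comparison groups negates both log odds ratios.
            log_or_rcd = -log_or_rcd
            log_or_trial = -log_or_trial

        # Assumed convention: log ROR = log(OR_trial) - log(OR_RCD).
        total += log_or_trial - log_or_rcd
    return total / n_questions


selective_mean = simulate_mean_log_ror(selective=True)
random_mean = simulate_mean_log_ror(selective=False)
```

Under these assumptions the selective rule yields a clearly positive mean log ROR even though the two study types agree on average, while an inversion rule that ignores the observed estimates yields a mean near zero.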
In conducting studies on an exposure of interest, a systematic roadmap should be applied for translating causal questions into statistical analyses and interpreting the results. In this paper we describe an application of one such roadmap to estimating the joint effect of both time to availability of a nurse-based triage system (low risk express care (LREC)) and individual enrollment in the program among HIV patients in East Africa. Our study population comprises 16,513 subjects found eligible for this task-shifting program within 15 clinics in Kenya between 2006 and 2009, with each clinic starting the LREC program between 2007 and 2008. After discretizing follow-up into 90-day time intervals, we targeted the population mean counterfactual outcome (i.e., the counterfactual probability of either dying or being lost to follow-up) at up to 450 days after initial LREC eligibility under three fixed treatment interventions. These were (i) no program availability during the entire follow-up, (ii) immediate program availability at initial eligibility, but non-enrollment during the entire follow-up, and (iii) immediate program availability and enrollment at initial eligibility. We further estimated the controlled direct effect of immediate program availability compared to no program availability, under a hypothetical intervention to prevent individual enrollment in the program. Targeted minimum loss-based estimation was used to estimate the mean outcome, while Super Learning was implemented to estimate the required nuisance parameters. Analyses were conducted with the ltmle R package; analysis code is available at an online repository as an R package. Results showed that at 450 days, the probability of in-care survival was 0.93 (95% CI: 0.91, 0.95) for subjects with immediate availability and enrollment and 0.87 (95% CI: 0.86, 0.87) for subjects with immediate availability who never enrolled. For subjects without LREC availability, it was 0.91 (95% CI: 0.90, 0.92). Immediate program availability without individual enrollment, compared to no program availability, was estimated to slightly, albeit significantly, decrease survival by 4% (95% CI: 0.03, 0.06; p<0.01). Immediate availability and enrollment resulted in a 7% higher in-care survival compared to immediate availability with non-enrollment after 450 days (95% CI: -0.08, -0.05; p<0.01). The results are consistent with a fairly small impact of both availability and enrollment in the LREC program on in-care survival.
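To make the relationship between the three counterfactual means and the two reported contrasts concrete, the sketch below simply differences the rounded point estimates quoted above. It is only arithmetic on the published numbers, not the ltmle analysis; the dictionary keys and the helper `risk_difference` are names introduced here for illustration.

```python
# Point estimates of in-care survival at 450 days, copied from the text above.
survival = {
    "no_availability": 0.91,
    "availability_no_enrollment": 0.87,
    "availability_and_enrollment": 0.93,
}


def risk_difference(regime_a, regime_b):
    """Difference in in-care survival between two regimes, rounded to 2 places."""
    return round(survival[regime_a] - survival[regime_b], 2)


# Controlled direct effect of availability, holding enrollment at "never".
availability_effect = risk_difference("availability_no_enrollment",
                                      "no_availability")

# Effect of enrollment given immediate availability.
enrollment_effect = risk_difference("availability_and_enrollment",
                                    "availability_no_enrollment")
```

Differencing the rounded point estimates gives a 4-point decrease for the first contrast and a 6-point increase for the second; the text reports 7% for the second contrast, which may reflect rounding of the underlying estimates, although the abstract does not say.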

