Electronic health record (EHR) systems capture patient information inconsistently, with patients generally contributing more data when they are sick than when they are healthy. This creates "informed presence," systematic differences between captured and non-captured data that can bias association estimates. There is growing interest in methods that account for informed presence, but practical approaches for conceptualizing, identifying, and addressing this bias in applied EHR research have received limited attention. Focusing on longitudinal settings, we present a conceptual framework for informed presence bias, which arises when data capture depends on both exposure and outcome, so that the visit process acts as a collider. We then describe methods that aim to reduce this bias by reweighting or resampling observed data to approximate conditional independence between the visit process and the outcome. We illustrate these methods using longitudinal EHR data from pediatric solid organ transplant recipients (N=271) to examine the association between steroids and cytomegalovirus viremia, where the frequency of cytomegalovirus testing varies across patients and over time. The incidence rate ratio decreased from 1.83 (95% CI 1.02, 3.28) in a naïve analysis to 1.37 (0.73, 2.57) when accounting for informed presence using inverse intensity weighting. The corresponding estimates were 1.37 (0.71, 2.27) from bootstrapped inverse intensity weighting and 1.40 (0.73, 2.68) from multiple outputation. These results show the anticipated attenuation of effect estimates after accounting for informed presence bias. When analyzing irregularly measured EHR data, we recommend (1) identifying the expected observation process using conceptual diagrams, (2) assessing dependence in the observation process, and (3) accounting for outcome dependence in statistical analysis.
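
To make the reweighting and resampling ideas concrete, the following is a minimal sketch in Python, not the analysis code used in this study. It assumes that a visit intensity has already been estimated for every observed visit (the `intensity` values below are hypothetical; in practice they would come from a fitted visit-process model), and it replaces the incidence rate ratio with a simpler ratio of per-visit event proportions. The function names and the toy data are illustrative assumptions. Inverse intensity weighting gives each visit a weight of one over its intensity; multiple outputation instead keeps each visit with probability inversely proportional to its intensity, analyzes each thinned dataset without weights, and averages the resulting log ratios.

```python
import numpy as np


def weighted_ratio(exposed, event, weights):
    """Ratio of weighted per-visit event proportions, exposed vs. unexposed.

    Returns NaN when either group is empty or the unexposed group has no events.
    """
    exposed = np.asarray(exposed, dtype=bool)
    event = np.asarray(event, dtype=float)
    weights = np.asarray(weights, dtype=float)
    w1, w0 = weights[exposed].sum(), weights[~exposed].sum()
    e1 = (weights * event)[exposed].sum()
    e0 = (weights * event)[~exposed].sum()
    if w1 == 0 or w0 == 0 or e0 == 0:
        return float("nan")
    return (e1 / w1) / (e0 / w0)


def iiw_ratio(exposed, event, intensity):
    """Inverse intensity weighting: weight each visit by 1 / visit intensity."""
    weights = 1.0 / np.asarray(intensity, dtype=float)
    return weighted_ratio(exposed, event, weights)


def multiple_outputation_ratio(exposed, event, intensity, n_outputations=200, seed=0):
    """Multiple outputation: thin visits, analyze unweighted, average log ratios.

    Simplification for this sketch: thinned datasets whose ratio is undefined
    or zero are skipped rather than handled by a fitted regression model.
    """
    rng = np.random.default_rng(seed)
    exposed = np.asarray(exposed, dtype=bool)
    event = np.asarray(event, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    # Retention probability is proportional to 1 / intensity and capped at 1.
    keep_prob = intensity.min() / intensity
    log_ratios = []
    for _ in range(n_outputations):
        keep = rng.random(intensity.size) < keep_prob
        ratio = weighted_ratio(exposed[keep], event[keep], np.ones(int(keep.sum())))
        if np.isfinite(ratio) and ratio > 0:
            log_ratios.append(np.log(ratio))
    if not log_ratios:
        return float("nan")
    return float(np.exp(np.mean(log_ratios)))


# Toy data (hypothetical): exposed visits with an event are over-represented,
# which is reflected in their higher assumed visit intensity.
exposed = [1, 1, 1, 0, 0, 0]
event = [1, 1, 0, 1, 0, 0]
intensity = [2.0, 2.0, 1.0, 1.0, 1.0, 1.0]

naive = weighted_ratio(exposed, event, np.ones(len(event)))
iiw = iiw_ratio(exposed, event, intensity)
mo = multiple_outputation_ratio(exposed, event, intensity)
print(naive, iiw, mo)
```

In this toy example the weighted estimate is smaller than the unweighted one because over-sampled event visits are down-weighted, which mirrors the direction of attenuation reported above. The magnitudes carry no meaning for the transplant cohort.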
