A Quantitative Bias Analysis Approach to Informative Presence Bias in Electronic Health Records
Hanxi Zhang, Amy S Clark, Rebecca A Hubbard
Epidemiology. Journal Article. Published 2024-04-18.
DOI: https://doi.org/10.1097/ede.0000000000001714
Abstract
Accurate outcome and exposure ascertainment in electronic health record (EHR) data, referred to as EHR phenotyping, relies on the completeness and accuracy of EHR data for each individual. However, some individuals, such as those with a greater comorbidity burden, visit the health care system more frequently and thus have more complete data, compared with others. Ignoring such dependence of exposure and outcome misclassification on visit frequency can bias estimates of associations in EHR analysis. We developed a framework for describing the structure of outcome and exposure misclassification due to informative visit processes in EHR data and assessed the utility of a quantitative bias analysis approach to adjusting for bias induced by informative visit patterns. Using simulations, we found that this method produced unbiased estimates across all informative visit structures, if the phenotype sensitivity and specificity were correctly specified. We applied this method in an example where the association between diabetes and progression-free survival in metastatic breast cancer patients may be subject to informative presence bias. The quantitative bias analysis approach allowed us to evaluate robustness of results to informative presence bias and indicated that findings were unlikely to change across a range of plausible values for phenotype sensitivity and specificity. Researchers using EHR data should carefully consider the informative visit structure reflected in their data and use appropriate approaches such as the quantitative bias analysis approach described here to evaluate robustness of study findings.
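To make the idea of adjusting for visit-dependent misclassification concrete, the sketch below applies the standard textbook back-calculation for a misclassified binary exposure within strata of visit frequency. It is an illustration under stated assumptions, not the authors' implementation: it uses a binary outcome and an odds ratio rather than progression-free survival, it pools corrected counts into a crude odds ratio, and the function names, dictionary keys, and all counts and sensitivity/specificity values are hypothetical.

```python
def correct_exposed_count(observed_exposed, total, sensitivity, specificity):
    """Back-calculate the expected true number exposed from misclassified counts.

    Uses observed = Se * true + (1 - Sp) * (total - true), solved for `true`.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    corrected = (observed_exposed - (1.0 - specificity) * total) / denom
    # Keep the corrected count inside the feasible range [0, total].
    return min(max(corrected, 0.0), float(total))


def corrected_odds_ratio(strata):
    """Correct each visit-frequency stratum, pool the counts, return a crude odds ratio."""
    a = b = c = d = 0.0
    for s in strata:
        cases_exp = correct_exposed_count(
            s["cases_exposed"], s["cases_total"], s["se"], s["sp"])
        ctrl_exp = correct_exposed_count(
            s["controls_exposed"], s["controls_total"], s["se"], s["sp"])
        a += cases_exp                          # exposed, with outcome
        b += s["cases_total"] - cases_exp       # unexposed, with outcome
        c += ctrl_exp                           # exposed, without outcome
        d += s["controls_total"] - ctrl_exp     # unexposed, without outcome
    return (a * d) / (b * c)


if __name__ == "__main__":
    # Hypothetical counts: frequent visitors are assumed to have a more
    # sensitive exposure phenotype than infrequent visitors.
    strata = [
        {"cases_exposed": 45, "cases_total": 120,
         "controls_exposed": 30, "controls_total": 150, "se": 0.95, "sp": 0.98},
        {"cases_exposed": 15, "cases_total": 80,
         "controls_exposed": 12, "controls_total": 150, "se": 0.60, "sp": 0.98},
    ]
    print(round(corrected_odds_ratio(strata), 3))
```

Rerunning the calculation over a grid of plausible sensitivity and specificity values, rather than a single pair per stratum, gives the kind of robustness check the abstract describes.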
About the journal:
Epidemiology publishes original research from all fields of epidemiology. The journal also welcomes review articles and meta-analyses, novel hypotheses, descriptions and applications of new methods, and discussions of research theory or public health policy. We give special consideration to papers from developing countries.