Electronic health records (EHRs) may be a promising alternative to traditional health surveys for population health surveillance because of their detailed health and patient information, low cost, and minimal respondent burden. However, concerns about the population representativeness of EHRs raise questions about their validity for public health monitoring and for tracking social determinants of health. This study addresses these concerns by evaluating the representativeness of EHRs from UNC Health, a large integrated health delivery system in North Carolina. We link individual-level EHRs (2018-2022; n = 2.12 million unique patients) with individual-level microdata from the nationally representative American Community Survey (ACS, 2018-2022). Specifically, we evaluate how demographic factors (age, sex, race/ethnicity), socioeconomic factors (education, employment, poverty, food stamps, public assistance), and health insurance affect the likelihood that a North Carolina ACS respondent appears in the UNC Health EHRs. Linear probability models indicate that although UNC Health patients are not fully representative of the state population, selection biases are small and align with known patterns of healthcare utilization (e.g., overrepresentation of females, older adults, and individuals with health insurance). Moderate selection is observed by race/ethnicity and socioeconomic status, with overrepresentation at both the high and low ends of the socioeconomic spectrum. These findings provide cautious reassurance for the use of appropriately weighted EHR data in population health monitoring, and they demonstrate the value of evaluating and improving the utility of EHRs in public health research through linkages with individual-level, nationally representative data.
