Algorithm to Identify Type 2 Diabetes Using Electronic Health Record and Self-Reported Data
Ben T Varghese, Marlene E Girardo, Ruchi Gupta, Karen M Fischer, Madison Duellman, Michelle M Mielke, Aoife M Egan, Janet E Olson, Adrian Vella, Kent R Bailey, Sagar B Dugani
Metabolic Syndrome and Related Disorders, pages 186-192. Published 2025-05-01 (Epub 2025-04-07). Journal Article.
DOI: https://doi.org/10.1089/met.2024.0133
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12369842/pdf/
Abstract
Aims: Identifying participants with type 2 diabetes (T2D) based only on electronic health record (EHR) data or only on self-reported data has limited accuracy. The objective of this study was therefore to develop an algorithm that uses both EHR and self-reported data to identify participants with and without T2D.
Methods: We included participants enrolled in the Mayo Clinic Biobank. At enrollment, participants completed a baseline questionnaire on health conditions, including T2D, and provided access to their EHR data. T2D status was based on self-report and EHR data (International Classification of Diseases codes, hemoglobin A1c [HbA1c], plasma glucose, and glucose-regulating medications) recorded from 5 years before to 2 months after enrollment. Participants who self-reported T2D but lacked corroborating EHR data were categorized separately ("only self-reported T2D"). After identifying participants with T2D, we identified participants without T2D based on normal HbA1c and plasma glucose. Participants who self-reported the absence of T2D but lacked corroborating EHR data were categorized separately ("only self-reported no T2D"). Using manual chart review as the gold standard, we calculated the positive predictive value (PPV) and negative predictive value (NPV) of the algorithm for identifying T2D.
Results: Of 57,000 participants, the algorithm classified participants as having T2D (n = 6,238), no T2D (n = 38,883), "only self-reported T2D" (n = 757), and "only self-reported no T2D" (n = 9,759). The algorithm had a high PPV (96.0% [91.5%-98.5%]), NPV (100% [98.0%-100%]), and accuracy (99.5% [98.3%-99.8%]). Median (range) participant age varied across categories from 52 (18-98) years (only self-reported T2D) to 67 (19-99) years (T2D) (P < 0.0001), and the proportion of women varied from 45.3% (T2D) to 69.6% (only self-reported no T2D) (P < 0.0001). Most participants were White (84.0%-92.7%) and non-Hispanic (97.6%-98.6%).
Conclusions: In this study, we developed an algorithm to accurately identify participants with and without T2D, which may be generalizable to cohorts with linked EHR data.
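The classification steps described in the Methods can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state the exact decision rules or laboratory cut-offs. The `Participant` fields, the function names, the rule that any single EHR criterion is enough, the window lengths in days, and the thresholds (HbA1c of 6.5% or higher and plasma glucose of 126 mg/dL or higher as evidence of T2D; HbA1c below 5.7% and glucose below 100 mg/dL as normal, borrowed from common diagnostic criteria) are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Participant:
    # All field names are hypothetical; values are assumed to be already
    # restricted to the look-back window around enrollment.
    self_reported_t2d: Optional[bool]      # None = question not answered
    has_t2d_icd_code: bool = False
    on_glucose_medication: bool = False
    max_hba1c: Optional[float] = None      # percent
    max_glucose: Optional[float] = None    # mg/dL


def in_window(event: date, enrollment: date) -> bool:
    """Approximate the '5 years before to 2 months after enrollment' window."""
    start = enrollment - timedelta(days=5 * 365)   # assumption: 365-day years
    end = enrollment + timedelta(days=61)          # assumption: ~2 months
    return start <= event <= end


def classify(p: Participant) -> str:
    # Assumption: any one EHR criterion counts as corroborating evidence.
    ehr_positive = (
        p.has_t2d_icd_code
        or p.on_glucose_medication
        or (p.max_hba1c is not None and p.max_hba1c >= 6.5)
        or (p.max_glucose is not None and p.max_glucose >= 126)
    )
    if ehr_positive:
        return "T2D"
    if p.self_reported_t2d is True:
        return "only self-reported T2D"
    # Assumption: both labs must be present and in the normal range.
    labs_normal = (
        p.max_hba1c is not None and p.max_hba1c < 5.7
        and p.max_glucose is not None and p.max_glucose < 100
    )
    if labs_normal:
        return "no T2D"
    if p.self_reported_t2d is False:
        return "only self-reported no T2D"
    return "unclassified"


def predictive_values(tp: int, fp: int, tn: int, fn: int):
    """Return (PPV, NPV, accuracy) from chart-review counts; None if undefined."""
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else None
    return ppv, npv, accuracy
```

Here `predictive_values` compares algorithm labels against chart review: PPV is the share of algorithm-labelled T2D cases that chart review confirms, and NPV is the share of algorithm-labelled non-cases that chart review confirms. Checking the EHR evidence before the self-report mirrors the order in the Methods, where participants with T2D are identified first and those without T2D afterwards.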
About the journal:
Metabolic Syndrome and Related Disorders is the only peer-reviewed journal focusing solely on the pathophysiology, recognition, and treatment of this major health condition. The Journal meets the imperative for comprehensive research, data, and commentary on metabolic disorder as a suspected precursor to a wide range of diseases, including type 2 diabetes, cardiovascular disease, stroke, cancer, polycystic ovary syndrome, gout, and asthma.
Metabolic Syndrome and Related Disorders coverage includes:
- Insulin resistance
- Central obesity
- Glucose intolerance
- Dyslipidemia with elevated triglycerides
- Low HDL-cholesterol
- Microalbuminuria
- Predominance of small dense LDL-cholesterol particles
- Hypertension
- Endothelial dysfunction
- Oxidative stress
- Inflammation
- Related disorders of polycystic ovarian syndrome, fatty liver disease (NASH), and gout