Andrew Guide, Lina Sulieman, Shawn Garbett, Robert M Cronin, Matthew Spotnitz, Karthik Natarajan, Robert J. Carroll, Paul Harris, Qingxia Chen
Journal of Biomedical Informatics, volume 155, Article 104660. Published 2024-05-23. doi: 10.1016/j.jbi.2024.104660. https://www.sciencedirect.com/science/article/pii/S1532046424000789
Identifying erroneous height and weight values from adult electronic health records in the All of Us research program
Introduction
Electronic Health Records (EHR) are a useful data source for research, but their usability is hindered by measurement errors. This study investigated an automatic error detection algorithm for adult height and weight measurements in EHR for the All of Us Research Program (All of Us).
Methods
We developed reference charts for adult heights and weights, stratified by participant sex. Our analysis included 4,076,534 height and 5,207,328 weight measurements from approximately 150,000 participants. Errors were identified using modified standard deviation scores, differences from expected values, and significant changes between consecutive measurements. We evaluated our method against chart-reviewed heights (8,092) and weights (9,039) from 250 randomly selected participants and compared it with the current cleaning algorithm in All of Us.
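The abstract does not define its modified standard deviation scores or its thresholds, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a median/MAD-based robust z-score and a relative-change rule for consecutive measurements. The function names, the cutoffs (`z_cutoff`, `jump_cutoff`), and the sample heights are all hypothetical, and the sex-stratified reference charts and expected-value differences are left out.

```python
from statistics import median


def modified_z_scores(values):
    """Robust z-scores built from the median and the median absolute deviation (MAD)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half of the values equal the median, so no robust scale is available.
        return [0.0] * len(values)
    # 0.6745 rescales the MAD to be comparable with a standard deviation under normality.
    return [0.6745 * (v - med) / mad for v in values]


def is_spike(values, i, jump_cutoff):
    """True if values[i] differs from every adjacent value by more than jump_cutoff (relative)."""
    neighbours = [values[j] for j in (i - 1, i + 1) if 0 <= j < len(values)]
    if not neighbours:
        return False
    return all(abs(values[i] - n) > jump_cutoff * abs(n) for n in neighbours)


def flag_errors(values, z_cutoff=3.5, jump_cutoff=0.10):
    """Flag a value if it is a robust outlier or an isolated jump between consecutive measurements."""
    scores = modified_z_scores(values)
    return [
        abs(score) > z_cutoff or is_spike(values, i, jump_cutoff)
        for i, score in enumerate(scores)
    ]


# Hypothetical heights in cm for one participant; 67.0 looks like inches entered as cm.
heights_cm = [170.0, 171.0, 170.5, 67.0, 170.0, 169.5]
print(flag_errors(heights_cm))  # [False, False, False, True, False, False]
```

Requiring a value to differ from every adjacent measurement keeps the values next to an error from being flagged along with it. A real pipeline would also need to handle units, measurement dates, and genuine weight change over time.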
Results
The proposed algorithm classified 1.4% of height and 1.5% of weight measurements in the full cohort as errors. Sensitivity was 90.4% (95% CI: 79.0–96.8%) for heights and 65.9% (95% CI: 56.9–74.1%) for weights. Precision was 73.4% (95% CI: 60.9–83.7%) for heights and 62.9% (95% CI: 54.0–71.1%) for weights. In comparison, the current cleaning algorithm had lower sensitivity (55.8%) and precision (16.5%) for height errors, and higher precision (94.0%) but lower sensitivity (61.9%) for weight errors.
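Sensitivity and precision can be computed from chart-review counts as in the sketch below. The abstract does not report the underlying counts or say how its confidence intervals were calculated, so the counts here are hypothetical and the Wilson score interval is an assumption chosen because it needs only the standard library.

```python
from math import sqrt


def sensitivity_precision(tp, fp, fn):
    """Sensitivity = TP / (TP + FN); precision = TP / (TP + FP)."""
    return tp / (tp + fn), tp / (tp + fp)


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts: errors confirmed by chart review and flagged (tp),
# flagged but not confirmed (fp), confirmed but missed by the algorithm (fn).
tp, fp, fn = 45, 15, 5
sens, prec = sensitivity_precision(tp, fp, fn)
sens_lo, sens_hi = wilson_interval(tp, tp + fn)
prec_lo, prec_hi = wilson_interval(tp, tp + fp)
print(f"sensitivity {sens:.3f} (95% CI {sens_lo:.3f} to {sens_hi:.3f})")
print(f"precision   {prec:.3f} (95% CI {prec_lo:.3f} to {prec_hi:.3f})")
```

Sensitivity is the share of true errors that were flagged and precision is the share of flagged values that were true errors. This is why a cleaning rule can score high on one and low on the other, as the comparison above shows for weights.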
Discussion
Our proposed algorithm performed better at detecting height errors than weight errors. It can serve as a valuable addition to the current All of Us cleaning algorithm for identifying erroneous height values.
About the journal:
The Journal of Biomedical Informatics reflects a commitment to high-quality original research papers, reviews, and commentaries in the area of biomedical informatics methodology. Although we publish articles motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, and translational bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices; evaluations of implemented systems (including clinical trials of information technologies); or papers that provide insight into a biological process, a specific disease, or treatment options would generally be more suitable for publication in other venues. Papers on applications of signal processing and image analysis are often more suitable for biomedical engineering journals or other informatics journals, although we do publish papers that emphasize the information management and knowledge representation/modeling issues that arise in the storage and use of biological signals and images. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report and an effort is made to address the generalizability and/or range of application of that methodology. Note also that, given the international nature of JBI, papers that deal with specific languages other than English, or with country-specific health systems or approaches, are acceptable for JBI only if they offer generalizable lessons that are relevant to the broad JBI readership, regardless of their country, language, culture, or health system.