Application of a Data Quality Framework to Ductal Carcinoma In Situ Using Electronic Health Record Data From the All of Us Research Program
Lew Berman, Yechiam Ostchega, John Giannini, Lakshmi Priya Anandan, Emily Clark, Matthew Spotnitz, Lina Sulieman, Michael Volynski, Andrea Ramirez
JCO Clinical Cancer Informatics, published 2024-08-01. DOI: 10.1200/CCI.24.00052 (https://doi.org/10.1200/CCI.24.00052)
Citation count: 0
Abstract
Purpose: The specific aims of this paper are to (1) develop and operationalize an electronic health record (EHR) data quality framework, (2) apply the dimensions of the framework to the phenotype and treatment pathways of ductal carcinoma in situ (DCIS) using All of Us Research Program data, and (3) propose and apply a checklist to evaluate the application of the framework.
Methods: We developed a framework of five data quality dimensions (DQDs): completeness, concordance, conformance, plausibility, and temporality. Participants signed a consent form and a Health Insurance Portability and Accountability Act authorization to share EHR data and responded to demographic questions in the Basics questionnaire. We evaluated the internal characteristics of the data and compared the data with external benchmarks using descriptive and inferential statistics. We developed a DQD checklist to evaluate concept selection, internal verification, and external validity for each DQD. The Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) concept ID codes for DCIS were used to select a cohort of 2,209 females 18 years and older.
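The cohort selection described in the Methods can be sketched roughly as follows. This is a minimal illustration over toy rows, not the authors' query. The field names follow the OMOP CDM `person` and `condition_occurrence` tables, and 8532 and 8507 are the standard OMOP gender concept IDs for female and male. The DCIS concept ID set and the reference year used to compute age are placeholders, because the abstract does not list them.

```python
from dataclasses import dataclass

FEMALE_CONCEPT_ID = 8532          # standard OMOP gender concept for FEMALE
DCIS_CONCEPT_IDS = {1111, 2222}   # hypothetical placeholders, NOT real DCIS concept IDs
REFERENCE_YEAR = 2024             # assumption: year against which age is computed


@dataclass(frozen=True)
class Person:
    person_id: int
    gender_concept_id: int
    year_of_birth: int


@dataclass(frozen=True)
class ConditionOccurrence:
    person_id: int
    condition_concept_id: int


def select_cohort(persons, conditions, min_age=18):
    """Return sorted person_ids of females aged >= min_age with a DCIS condition record."""
    with_dcis = {c.person_id for c in conditions
                 if c.condition_concept_id in DCIS_CONCEPT_IDS}
    return sorted(
        p.person_id for p in persons
        if p.person_id in with_dcis
        and p.gender_concept_id == FEMALE_CONCEPT_ID
        and REFERENCE_YEAR - p.year_of_birth >= min_age
    )


# Toy data only; these rows are invented for illustration.
persons = [
    Person(1, 8532, 1960),   # female, adult, has DCIS record: included
    Person(2, 8507, 1955),   # male: excluded
    Person(3, 8532, 2010),   # female, under 18: excluded
    Person(4, 8532, 1970),   # female, adult, no DCIS record: excluded
]
conditions = [
    ConditionOccurrence(1, 1111),
    ConditionOccurrence(2, 2222),
    ConditionOccurrence(3, 1111),
    ConditionOccurrence(4, 9999),
]
cohort = select_cohort(persons, conditions)
print(cohort)
```

In practice the age criterion would likely be evaluated against a diagnosis or enrollment date rather than a fixed year; the fixed year here only keeps the sketch short.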
Results: Using the proposed DQD checklist criteria, (1) concepts were selected and internally verified for conformance; (2) concepts were selected and internally verified for completeness; (3) concepts were selected, internally verified, and externally validated for concordance; (4) concepts were selected, internally verified, and externally validated for plausibility; and (5) concepts were selected, internally verified, and externally validated for temporality.
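The checklist outcome reported above can be captured in a small data structure. The sketch below is one possible encoding, not the authors' instrument: the class and field names are hypothetical, and `externally_validated=False` means only that the abstract does not report external validation for that dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistEntry:
    """One row of a DQD checklist (hypothetical structure)."""
    dimension: str
    concepts_selected: bool
    internally_verified: bool
    externally_validated: bool  # False = not reported in the abstract

    def criteria_met(self) -> int:
        """Count how many of the three checklist criteria are satisfied."""
        return sum((self.concepts_selected,
                    self.internally_verified,
                    self.externally_validated))


# Values transcribed from the Results paragraph of the abstract.
CHECKLIST = [
    ChecklistEntry("conformance", True, True, False),
    ChecklistEntry("completeness", True, True, False),
    ChecklistEntry("concordance", True, True, True),
    ChecklistEntry("plausibility", True, True, True),
    ChecklistEntry("temporality", True, True, True),
]

# Dimensions for which all three criteria were reported as met.
fully_evaluated = [e.dimension for e in CHECKLIST if e.criteria_met() == 3]
print(fully_evaluated)
```

Keeping the three criteria as separate booleans rather than a single status string makes it straightforward to query, for example, which dimensions still lack an external benchmark.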
Conclusion: This assessment and evaluation provided insights into data quality for the DCIS phenotype using EHR data from the All of Us Research Program. The review demonstrates that salient clinical measures can be selected, applied, and operationalized within a conceptual framework and evaluated for fitness for use by applying a proposed checklist.