Implementing Accuracy, Completeness, and Traceability for Data Reliability
Daniel Jay Riskin, Keri L Monda, Joshua J Gagne, Robert Reynolds, A Reshad Garan, Nancy Dreyer, Paul Muntner, Brian D Bradbury
JAMA Network Open. 2025;8(3):e250128. Published 2025-03-03. DOI: 10.1001/jamanetworkopen.2025.0128
https://doi.org/10.1001/jamanetworkopen.2025.0128
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11894483/pdf/
Abstract
Importance: While it is well known that data quality underlies evidence validity, the measurement and impacts of data reliability are less well understood. The need has been highlighted in the 21st Century Cures Act of 2016 and US Food and Drug Administration (FDA) Real-World Evidence Program framework in 2018, draft guidance in 2021 and final guidance in 2024. Timely visibility into implementation may be provided by the Transforming Real-World Evidence With Unstructured and Structured Data to Advance Tailored Therapy (TRUST) study, a Verantos Inc-led FDA-funded demonstration project to explore data quality and inform regulatory decision-making.
Objective: To report early learnings from the TRUST study on distilling data reliability to practice including developing a practical approach to quantify accuracy, completeness, and traceability of real-world data (routinely collected patient health data) and comparing traditional to advanced data and technologies on these dimensions.
Design, setting, and participants: This quality improvement study was performed using data from 58 hospitals and more than 1180 associated outpatient clinics from academic and community settings in the US. Participants included patients with asthma treated between January 1, 2014, and December 31, 2022. Data were analyzed from January 1 to June 30, 2024.
Exposures: The traditional approach used medical and pharmacy claims as source documentation. The advanced approach used medical and pharmacy claims, electronic health records with unstructured data extracted using artificial intelligence methods, and mortality registry data.
Main outcomes and measures: Accuracy was assessed using the F1 score. Completeness was estimated as a weighted mean of available data sources during each calendar year under study for each patient. Traceability was estimated as the proportion of data elements identified in clinical source documentation.
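The three measures can be illustrated with a minimal Python sketch. It is not the TRUST study's implementation: the abstract does not give the confusion counts, the source weights, or the definition of a data element, so the function names, source labels, weights, and example numbers below are all hypothetical.

```python
from typing import Mapping, Sequence


def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Accuracy as F1: the harmonic mean of precision and recall."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


def completeness(available: Mapping[str, bool], weights: Mapping[str, float]) -> float:
    """Weighted mean of data-source availability for one patient-year.

    The weights are an assumption; the abstract does not state how sources were weighted.
    """
    total_weight = sum(weights.values())
    present_weight = sum(w for source, w in weights.items() if available.get(source, False))
    return present_weight / total_weight


def traceability(linked_to_source: Sequence[bool]) -> float:
    """Proportion of data elements that can be found in clinical source documentation."""
    if not linked_to_source:
        return 0.0
    return sum(linked_to_source) / len(linked_to_source)


# Hypothetical example values, chosen only to exercise the functions.
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
example_completeness = completeness(
    available={"claims": True, "ehr": False},
    weights={"claims": 1.0, "ehr": 1.0},
)
example_traceability = traceability([True, False, False, True])
```

In this sketch, per-patient-year completeness values would then be averaged over patients and calendar years to give a cohort-level score, which is one plausible reading of "for each calendar year under study for each patient".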
Results: In total, 120 616 patients met the minimum data requirements (mean [SD] age, 43.2 [18.5] years; 41 011 male [34.0%]). For accuracy, traditional approaches had F1 scores of 59.5% and advanced approaches had scores of 93.4%. For completeness, traditional approaches yielded mean scores of 46.1% (95% CI, 38.2%-54.0%); advanced approaches, 96.6% (95% CI, 85.8%-1.1%). For traceability, traditional approaches had 11.5% (95% CI, 11.4%-11.5%) and advanced approaches had 77.3% (95% CI, 77.3%-77.3%) of data elements traceable to clinical source data.
Conclusions and relevance: In this study, practical implementation of data reliability measurement is described. Findings suggest the potential of using multiple data sources and applying advanced methods to increase real-world data reliability. The inclusion of data reliability standards when generating evidence from these sources has the potential to strengthen support for the use of real-world evidence in the prescription, reimbursement, and approval of medications.
About the journal:
JAMA Network Open, a member of the JAMA Network, is an international, peer-reviewed, open-access general medical journal. The publication is dedicated to disseminating research across various health disciplines and countries, encompassing clinical care, innovation in health care, health policy, and global health.
JAMA Network Open caters to clinicians, investigators, and policymakers, providing a platform for valuable insights and advancements in the medical field. As part of the JAMA Network, a consortium of peer-reviewed general medical and specialty publications, JAMA Network Open contributes to the collective knowledge and understanding within the medical community.