Lanjing Wang, Vihaan Manchanda, Holly Picotte, Chandler Beon, Jennifer L Hall, Juan Zhao, Xue Feng
Synthetic Data for the Get With The Guidelines-Stroke Registry.
Journal of the American Heart Association, e039667. Published 2025-02-26. DOI: 10.1161/JAHA.124.039667 (https://doi.org/10.1161/JAHA.124.039667)
Abstract
The American Heart Association's Get With The Guidelines-Quality Improvement registry is a vital resource for real-world cardiovascular and stroke data and research, containing >14 million records from >2800 participating hospitals. To facilitate and streamline research, we aim to generate a synthetic data set that increases access to real-world data and facilitates exploration of the Get With The Guidelines-Stroke registry. We first randomly sampled 1000 records from the full registry data set, which contains 7.8 million records from 2005 to 2021. To preserve privacy and break the links to the original data, we shifted all date and time variables and replaced all patient identifiers. To evaluate the generated synthetic data, we compared its distributions of patient demographics (eg, age, race, sex) and other key stroke-related measures with those of the original registry. The synthetic data exhibited similar distributions in age, race, sex, and time-sensitive metrics such as door-to-needle time and time to intravenous thrombolytic therapy, demonstrating that this open access data set can give all researchers the opportunity to explore real-world cardiovascular and stroke data.
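The sampling and de-identification steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the time variables were shifted or how identifiers were replaced, so the per-record random offset, the `SYN-` identifier scheme, the field names (`patient_id`, `arrival_time`, `needle_time`, `sex`), and the toy records are all assumptions made for illustration.

```python
import random
from collections import Counter
from datetime import datetime, timedelta


def make_synthetic_sample(records, n, date_fields, id_field,
                          max_shift_days=365, seed=0):
    """Sample n records, shift their date-time fields, replace identifiers.

    Hypothetical sketch: the shift range, the per-record offset, and the
    identifier format are assumptions, not the registry's procedure.
    """
    rng = random.Random(seed)
    sampled = rng.sample(records, n)  # simple random sample, no replacement
    out = []
    for i, rec in enumerate(sampled):
        new = dict(rec)  # copy so the input records are left unchanged
        # One offset per record, so intervals between events are preserved.
        shift = timedelta(days=rng.randint(-max_shift_days, max_shift_days))
        for field in date_fields:
            if new.get(field) is not None:
                new[field] = new[field] + shift
        # New identifier with no relation to the original one.
        new[id_field] = f"SYN-{i:06d}"
        out.append(new)
    return out


def proportions(records, field):
    """Share of records in each category of a field (e.g. sex, race)."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def door_to_needle_minutes(rec):
    """Minutes between arrival and thrombolytic treatment (assumed fields)."""
    return (rec["needle_time"] - rec["arrival_time"]).total_seconds() / 60


# Toy stand-in for the registry; values are invented for the example.
toy = [
    {
        "patient_id": f"P{i}",
        "sex": "F" if i % 2 else "M",
        "arrival_time": datetime(2015, 1, 1) + timedelta(hours=i),
        "needle_time": datetime(2015, 1, 1)
        + timedelta(hours=i, minutes=30 + i % 40),
    }
    for i in range(5000)
]

synthetic = make_synthetic_sample(
    toy, 1000, ["arrival_time", "needle_time"], "patient_id"
)

# Compare a demographic distribution between source and sample.
print(proportions(toy, "sex"))
print(proportions(synthetic, "sex"))
```

Because each record in this sketch gets a single offset applied to all of its time fields, derived intervals such as door-to-needle time are unchanged, so their distribution in the sample can be compared directly with the source.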
About the journal:
JAHA (Journal of the American Heart Association) is an authoritative, peer-reviewed Open Access journal focusing on cardiovascular and cerebrovascular disease. It provides a global forum for basic and clinical research and timely reviews on cardiovascular disease and stroke. As an Open Access journal, its content is free on publication to read, download, and share, accelerating the translation of strong science into effective practice.