ECG-Image-Kit: a synthetic image generation toolbox to facilitate deep learning-based electrocardiogram digitization

Physiological Measurement | Pub Date: 2024-05-27 | DOI: 10.1088/1361-6579/ad4954 | IF 2.3 | CAS Zone 4 (Medicine) | JCR Q3 (Biophysics)

Kshama Kodthalu Shivashankara, Deepanshi, Afagh Mehri Shervedani, Gari D Clifford, Matthew A Reyna and Reza Sameni

Abstract

Objective. Cardiovascular diseases are a major cause of mortality globally, and electrocardiograms (ECGs) are crucial for diagnosing them. Traditionally, ECGs are stored in printed formats. However, these printouts, even when scanned, are incompatible with advanced ECG diagnosis software that requires time-series data. Digitizing ECG images is vital for training machine learning models in ECG diagnosis, leveraging the extensive global archives collected over decades. Deep learning models for image processing are promising in this regard, although the lack of clinical ECG archives with reference time-series data is challenging. Data augmentation techniques using realistic generative data models provide a solution.

Approach. We introduce ECG-Image-Kit, an open-source toolbox for generating synthetic multi-lead ECG images with realistic artifacts from time-series data, aimed at automating the conversion of scanned ECG images to ECG data points. The tool synthesizes ECG images from real time-series data, applying distortions like text artifacts, wrinkles, and creases on a standard ECG paper background.

Main results. As a case study, we used ECG-Image-Kit to create a dataset of 21 801 ECG images from the PhysioNet QT database. We developed a model that combines traditional computer vision with a deep neural network and trained it on this dataset to convert synthetic images into time-series data for evaluation. We assessed digitization quality by calculating the signal-to-noise ratio and compared clinical parameters like QRS width, RR, and QT intervals recovered from this pipeline with the ground truth extracted from ECG time-series. The results show that this deep learning pipeline accurately digitizes paper ECGs, maintaining clinical parameters, and highlights a generative approach to digitization.

Significance. The toolbox has broad applications, including model development for ECG image digitization and classification. The toolbox currently supports data augmentation for the 2024 PhysioNet Challenge, focusing on digitizing and classifying paper ECG images.
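
The abstract names the signal-to-noise ratio as the measure of digitization quality but does not give its formula. The sketch below assumes the common definition, the ratio of reference-signal power to the power of the reconstruction error, expressed in decibels. The function name `snr_db` and the toy sample values are illustrative only and are not taken from the paper or from ECG-Image-Kit.

```python
import math


def snr_db(reference, digitized):
    """Return the SNR in dB of a digitized signal against its reference.

    Assumed definition (not confirmed by the abstract):
        SNR = 10 * log10( sum(x^2) / sum((x - x_hat)^2) )
    where x is the reference time series and x_hat the digitized one.
    """
    if len(reference) != len(digitized):
        raise ValueError("signals must have the same number of samples")

    signal_power = sum(x * x for x in reference)
    noise_power = sum((x - y) ** 2 for x, y in zip(reference, digitized))

    if noise_power == 0:
        return math.inf   # perfect reconstruction
    if signal_power == 0:
        return -math.inf  # all-zero reference with a non-zero error
    return 10.0 * math.log10(signal_power / noise_power)


# Toy example: the digitized trace underestimates every peak by 10 %.
reference = [0.0, 1.0, 0.0, -1.0]
digitized = [0.0, 0.9, 0.0, -0.9]
print(f"{snr_db(reference, digitized):.2f} dB")  # prints "20.00 dB"
```

In practice the two signals would first need to be aligned in time and resampled to a common rate; that step is omitted here.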

Source journal

Physiological Measurement (Biology; Engineering: Biomedical)

CiteScore: 5.50
Self-citation rate: 9.40%
Articles published: 124
Review time: 3 months

Journal description: Physiological Measurement publishes papers about the quantitative assessment and visualization of physiological function in clinical research and practice, with an emphasis on the development of new methods of measurement and their validation. Papers are published on topics including: applied physiology in illness and health; electrical bioimpedance, optical and acoustic measurement techniques; advanced methods of time series and other data analysis; biomedical and clinical engineering; in-patient and ambulatory monitoring; point-of-care technologies; novel clinical measurements of cardiovascular, neurological, and musculoskeletal systems; measurements in molecular, cellular and organ physiology and electrophysiology; physiological modeling and simulation; novel biomedical sensors, instruments, devices and systems; measurement standards and guidelines.