Classification of imbalanced ECGs through segmentation models and augmented by conditional diffusion model
Authors: Jinhee Kwak, Jaehee Jung
Journal: PeerJ Computer Science (Journal Article)
Published: 2024-09-04
DOI: 10.7717/peerj-cs.2299 (https://doi.org/10.7717/peerj-cs.2299)
Impact factor: 3.5; JCR: Q2 (Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Electrocardiograms (ECGs) provide essential data for diagnosing arrhythmias, which can cause serious health complications. Early detection through continuous monitoring is crucial for timely intervention. The Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) arrhythmia dataset, which is used in arrhythmia analysis research, is class-imbalanced, so accurate arrhythmia classification requires a model that is robust to that imbalance. To mitigate the pronounced class imbalance, this study augments the dataset with two generative techniques: a variational autoencoder (VAE) and a conditional diffusion model. Accurately segmenting the continuous heartbeat recording into individual heartbeats is also crucial for reliable arrhythmia detection. This research therefore compared a model that used annotation-based segmentation, which relies on R-peak labels, with a model that used an automated segmentation method based on deep learning. In our experiments, the proposed model, which combines MobileNetV2 with annotation-based segmentation and conditional diffusion augmentation of the minority classes, improved the F1 score by 1.23% and the precision by 1.73% compared with the model that classifies arrhythmia classes using the original imbalanced dataset. This research presents a model that accurately classifies a wide range of arrhythmias, including minority classes, moving beyond previous, more limited arrhythmia classification models. It can serve as a basis for better data utilization and improved model performance in arrhythmia diagnosis and medical service research. These results broaden applicability in the medical field and help improve the quality of healthcare services by providing more sophisticated and reliable diagnostic tools.
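As an illustration of the annotation-based segmentation mentioned in the abstract, the sketch below cuts a fixed-length window around each annotated R-peak. It is a minimal example under stated assumptions, not the authors' implementation: the function name `segment_beats`, the `half_window` parameter and its value, and the synthetic signal are all hypothetical and do not come from the paper.

```python
import numpy as np

def segment_beats(signal, r_peaks, half_window=128):
    """Cut a fixed-length window centred on each annotated R-peak.

    half_window is a hypothetical choice, not a value from the paper.
    Peaks too close to either end of the recording are skipped.
    """
    signal = np.asarray(signal, dtype=float)
    beats, kept = [], []
    for idx in r_peaks:
        start, stop = idx - half_window, idx + half_window
        if start < 0 or stop > len(signal):
            continue  # window would run past the recording
        beats.append(signal[start:stop])
        kept.append(idx)
    if not beats:
        return np.empty((0, 2 * half_window)), np.array([], dtype=int)
    return np.stack(beats), np.array(kept, dtype=int)

# Synthetic recording: a flat line with a unit spike at each "R-peak".
signal = np.zeros(1000)
r_peaks = [50, 300, 600, 950]
signal[r_peaks] = 1.0

beats, kept = segment_beats(signal, r_peaks, half_window=100)
print(beats.shape, kept)
```

In this toy case, the first and last peaks are dropped because their windows would extend beyond the recording. With real data, the R-peak indices would come from the dataset's annotation files rather than from a hand-written list.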
About the journal:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.