Comparing ECG Lead Subsets for Heart Arrhythmia/ECG Pattern Classification: Convolutional Neural Networks and Random Forest
Serhii Reznichenko MS, John Whitaker MD, PhD, Zixuan Ni PhD, Shijie Zhou PhD
CJC Open, vol. 7, no. 2, pages 176-186. Published 2025-02-01. DOI: 10.1016/j.cjco.2024.10.012
https://www.sciencedirect.com/science/article/pii/S2589790X24005213
Abstract
Background
Despite the growing popularity of deep learning (DL), limited research has compared the performance of DL and conventional machine learning (CML) methods in heart arrhythmia/electrocardiography (ECG) pattern classification. In addition, accurate classification of heart arrhythmias/ECG patterns often depends on specific ECG leads, and it remains unknown how DL and CML methods perform on reduced subsets of ECG leads. In this study, we sought to assess the accuracy of convolutional neural network (CNN) and random forest (RF) models, representing DL and CML methods respectively, for classifying arrhythmias/ECG patterns using reduced ECG lead subsets.
Methods
We used a public data set from the PhysioNet Cardiology Challenge 2020. For the DL method, we trained a CNN classifier that extracts features from each ECG lead; these features were then passed to a feedforward neural network. For the CML method, we used a random forest classifier with manually extracted features. For both methods, optimal ECG lead subsets were identified by means of recursive feature elimination.
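The lead-selection step can be illustrated with a minimal sketch. It is not the authors' implementation: it uses greedy backward elimination at the lead level (drop the lead whose removal costs the least score, then repeat), which is a simplified stand-in for recursive feature elimination. The names `backward_eliminate` and `toy_score`, and the weights inside `toy_score`, are hypothetical; in practice the score function would be something like a cross-validated macro F1 of a CNN or RF trained on the given leads.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

# The 12 standard ECG leads.
LEADS = ["I", "II", "III", "aVR", "aVL", "aVF",
         "V1", "V2", "V3", "V4", "V5", "V6"]


def backward_eliminate(
    leads: Sequence[str],
    score_fn: Callable[[FrozenSet[str]], float],
    min_leads: int = 1,
) -> Dict[int, Tuple[List[str], float]]:
    """Greedy lead-level backward elimination.

    Returns a mapping {subset size: (leads kept, score of that subset)}.
    """
    current = list(leads)
    history = {len(current): (list(current), score_fn(frozenset(current)))}
    while len(current) > min_leads:
        # Score every subset obtained by dropping exactly one lead.
        candidates = []
        for lead in current:
            remaining = [l for l in current if l != lead]
            candidates.append((score_fn(frozenset(remaining)), remaining))
        # Keep the subset with the highest score (the cheapest removal).
        best_score, current = max(candidates, key=lambda c: c[0])
        history[len(current)] = (list(current), best_score)
    return history


def toy_score(subset: FrozenSet[str]) -> float:
    # Hypothetical stand-in for a validation metric; the weights are
    # invented for illustration and carry no clinical meaning.
    weights = {"I": 0.2, "II": 0.3, "V5": 0.15, "V6": 0.1}
    return sum(weights.get(lead, 0.01) for lead in sorted(subset))


history = backward_eliminate(LEADS, toy_score)
print(history[4][0])  # leads kept in the 4-lead subset
```

With a real model, each call to the score function means retraining or re-evaluating a classifier, so the loop costs on the order of 12 + 11 + ... + 2 evaluations for a full sweep from 12 leads down to 1.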
Results
The CML method required 19% more leads (equating to approximately 2 leads) than the DL method. Four common leads (I, II, V5, V6) were identified in each of the ECG lead subsets using the CML method, whereas no leads were consistently present across subsets for the DL method. The average macro F1 scores were 0.761 for the DL method and 0.759 for the CML method.
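For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The sketch below is illustrative only: the function names `f1_binary` and `macro_f1` and the toy label matrices are assumptions, and the multi-label indicator layout (one column per class) is one plausible way to encode the task rather than the authors' stated pipeline.

```python
from typing import List, Sequence


def f1_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for one class, given 0/1 ground truth and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention assumed here: a class with no positives and no
    # predicted positives scores 0.0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(Y_true: List[List[int]], Y_pred: List[List[int]]) -> float:
    """Unweighted mean of per-class F1 over the columns of the label matrix."""
    n_classes = len(Y_true[0])
    per_class = [
        f1_binary([row[k] for row in Y_true], [row[k] for row in Y_pred])
        for k in range(n_classes)
    ]
    return sum(per_class) / n_classes


# Toy example: 4 recordings, 2 classes (rows = recordings, columns = classes).
Y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
Y_pred = [[1, 0], [0, 1], [0, 1], [1, 0]]
print(macro_f1(Y_true, Y_pred))  # class F1s are 0.5 and 1.0, so this prints 0.75
```

On this scale, the reported averages of 0.761 and 0.759 differ by 0.002.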
Conclusions
For both DL and CML methods, optimal ECG lead subsets provide classification accuracy similar to that obtained with all 12 leads. The DL method achieved slightly higher classification accuracy on larger data sets and required fewer ECG leads than the CML method.