A. Kulkarni, Ashwini A Patel, Kanchan V Pipal, Sujeet G Jaiswal, Manisha T Jaisinghani, Vidya Thulkar, Lumbini Gajbhiye, Preeti Gondane, Archana B Patel, M. Mamtani, H. Kulkarni
Title: Machine-learning algorithm to non-invasively detect diabetes and pre-diabetes from electrocardiogram
Journal: BMJ Innovations, pages 32-42
DOI: 10.1136/bmjinnov-2021-000759 (https://doi.org/10.1136/bmjinnov-2021-000759)
Published: 2022-08-09
Impact factor: 1.4; JCR: Q3 (Health Care Sciences & Services)
Citations: 8
Abstract
Objectives: Early detection is of crucial importance for prevention of type 2 diabetes and pre-diabetes. Diagnosis of these conditions relies on the oral glucose tolerance test and haemoglobin A1c estimation, which are invasive and challenging for large-scale screening. We aimed to combine the non-invasive nature of ECG with the power of machine learning to detect diabetes and pre-diabetes.

Methods: Data for this study come from the Diabetes in Sindhi Families in Nagpur study of an ethnically endogamous Sindhi population from central India. The final dataset included clinical data from 1262 individuals and 10 461 time-aligned heartbeats recorded digitally. The dataset was split into a training set, a validation set and an independent test set (8892, 523 and 1046 beats, respectively). The ECG recordings were processed with median filtering, band-pass filtering and standard scaling. Minority oversampling was undertaken to balance the training dataset before training began. Extreme gradient boosting (XGBoost) was used to train a classifier that took the signal-processed ECG as input and predicted membership of the ‘no diabetes’, pre-diabetes or type 2 diabetes classes (defined according to American Diabetes Association criteria).

Results: Prevalence of type 2 diabetes and pre-diabetes was ~30% and ~14%, respectively. Training was smooth and quick (convergence achieved within 40 epochs). In the independent test set, the DiaBeats algorithm predicted the classes with 97.1% precision, 96.2% recall, 96.8% accuracy and 96.6% F1 score. The calibrated model had a low calibration error (0.06). The feature importance maps indicated that leads III, augmented Vector Left (aVL), V4, V5 and V6 contributed most to the classification performance. The predictions matched the clinical expectations based on the biological mechanisms of cardiac involvement in diabetes.

Conclusions: The machine-learning-based DiaBeats algorithm, using ECG signal data, accurately predicted diabetes-related classes. This algorithm can help in early detection of diabetes and pre-diabetes after robust validation in external datasets.
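The figures reported in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not the authors' code: the function names are hypothetical, and it assumes the reported F1 score is the harmonic mean of the aggregate precision and recall, which the abstract does not state (a per-class average could differ slightly).

```python
# Figures copied from the abstract; everything else is an illustrative assumption.
TRAIN_BEATS, VALIDATION_BEATS, TEST_BEATS = 8892, 523, 1046
TOTAL_BEATS = 10461
PRECISION, RECALL = 0.971, 0.962


def split_fractions(train, validation, test):
    """Return each subset's share of all beats (hypothetical helper)."""
    total = train + validation + test
    return tuple(n / total for n in (train, validation, test))


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (assumed aggregation)."""
    return 2 * precision * recall / (precision + recall)


fractions = split_fractions(TRAIN_BEATS, VALIDATION_BEATS, TEST_BEATS)
f1 = f1_from(PRECISION, RECALL)

# The three subsets should add up to the full set of heartbeats.
print(TRAIN_BEATS + VALIDATION_BEATS + TEST_BEATS == TOTAL_BEATS)
# Share of beats in the training, validation and test sets.
print([round(f, 2) for f in fractions])
# F1 implied by the reported precision and recall.
print(round(f1, 3))
```

Under these assumptions, the three subsets sum to the 10 461 beats, the split works out to roughly 85%, 5% and 10%, and the implied F1 of about 0.966 is consistent with the reported 96.6%.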
About the journal:
Healthcare is undergoing a revolution, and novel medical technologies are being developed to treat patients in better and faster ways. The mobile revolution has put a handheld computer in the pockets of billions, and we are ushering in an era of mHealth. In the developed and developing world alike, healthcare costs are a concern, and frugal innovations are being promoted to bring down the costs of healthcare. BMJ Innovations aims to promote innovative research that creates new, cost-effective medical devices, technologies, processes and systems that improve patient care, with a particular focus on the needs of patients, physicians and the healthcare industry as a whole, and to act as a platform to catalyse and seed more innovations. Submissions to BMJ Innovations will be considered from all clinical areas of medicine, along with business and process innovations that make healthcare accessible and affordable. Submissions from groups of investigators engaged in international collaborations are especially encouraged. The broad areas of innovation that this journal aims to chronicle include, but are not limited to: medical devices, mHealth and wearable health technologies, assistive technologies, diagnostics, health IT, systems and process innovation.