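Deciphering the impact of diversity in CNN-based ensembles on overcoming data imbalance and scarcity in medical datasets: A case study on diabetic retinopathy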

Inamullah, Saima Hassan, Samir Brahim Belhaouari, Ibrar Amin
Informatics in Medicine Unlocked, Volume 49, Article 101557. Journal article, published 2024-01-01.
DOI: 10.1016/j.imu.2024.101557
Full text: https://www.sciencedirect.com/science/article/pii/S2352914824001138

Abstract

Early detection of diabetic retinopathy (DR) is critical in preventing vision loss. However, building accurate artificial intelligence (AI) models for multiple classes, including early-stage (Class 1) detection, is challenging because medical datasets are limited and imbalanced. The availability of such datasets is restricted by ethical and privacy concerns. Traditional ensemble models also struggle with raw medical images because they require structured data, which further complicates the issue. This study presents a novel deep learning-based ensemble model (EM) designed for multi-class DR classification and, specifically, for precise early-stage (Class 1) classification. The EM uses eight diverse Convolutional Neural Networks (CNNs) with carefully crafted strategies to enhance diversity. Data augmentation and generation techniques address imbalanced data through data diversity, while parameter and architectural diversity within the CNN-based EM maximize predictive performance. Evaluation on the publicly available Kaggle APTOS DR dataset demonstrates significant improvement over individual models and existing approaches. The proposed EM achieves a multi-class accuracy of 93.00%, precision of 93.00%, sensitivity of 98.00%, and specificity of 99.00%. This research highlights the effectiveness of diversified CNN ensembles in overcoming the challenges posed by imbalanced and scarce data for multi-class DR classification. This approach paves the way for developing robust and accurate AI-powered diagnostic tools for improved diabetic retinopathy screening.
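
The abstract does not say how the eight CNN outputs are combined or how the multi-class sensitivity and specificity are averaged. As a rough illustration only, the sketch below assumes soft voting (averaging the per-model class probabilities) and one-vs-rest sensitivity and specificity per class. The function names `soft_vote` and `per_class_sens_spec`, the three toy models, and the three toy classes are hypothetical and are not taken from the paper.

```python
import numpy as np


def soft_vote(prob_stack):
    """Combine models by averaging their class probabilities (assumed rule).

    prob_stack has shape (n_models, n_samples, n_classes).
    Returns the predicted class index for each sample.
    """
    prob_stack = np.asarray(prob_stack, dtype=float)
    mean_probs = prob_stack.mean(axis=0)  # average over the model axis
    return mean_probs.argmax(axis=1)


def per_class_sens_spec(y_true, y_pred, n_classes):
    """One-vs-rest sensitivity and specificity for every class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    sens = np.full(n_classes, np.nan)
    spec = np.full(n_classes, np.nan)
    for c in range(n_classes):
        tp = np.sum((y_true == c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        tn = np.sum((y_true != c) & (y_pred != c))
        fp = np.sum((y_true != c) & (y_pred == c))
        if tp + fn > 0:
            sens[c] = tp / (tp + fn)  # recall on class c
        if tn + fp > 0:
            spec[c] = tn / (tn + fp)  # true-negative rate on class c
    return sens, spec


# Toy data: 3 hypothetical models, 4 samples, 3 classes.
probs = np.array([
    [[0.7, 0.2, 0.1], [0.2, 0.5, 0.3], [0.1, 0.3, 0.6], [0.4, 0.5, 0.1]],
    [[0.6, 0.3, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6], [0.5, 0.4, 0.1]],
    [[0.5, 0.4, 0.1], [0.1, 0.7, 0.2], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]],
])
y_true = np.array([0, 1, 2, 1])

y_pred = soft_vote(probs)
sens, spec = per_class_sens_spec(y_true, y_pred, n_classes=3)
print("predictions:", y_pred.tolist())
print("sensitivity per class:", sens)
print("specificity per class:", spec)
```

In this toy run the first model alone would label the last sample correctly, but the averaged vote does not, which shows that an ensemble helps only when its members err in different ways. Reporting sensitivity per class, rather than a single pooled figure, keeps a weak minority class such as early-stage DR visible.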

Source journal
Informatics in Medicine Unlocked (Medicine-Health Informatics)
CiteScore: 9.50
Self-citation rate: 0.00%
Articles published: 282
Review time: 39 days
About the journal: Informatics in Medicine Unlocked (IMU) is an international gold open access journal covering a broad spectrum of topics within medical informatics, including (but not limited to) papers focusing on imaging, pathology, teledermatology, public health, ophthalmological, nursing and translational medicine informatics. The full papers published in the journal are accessible to all who visit the website.