Agustin Castillo-Munguia, Gibran Benitez-Garcia, J. Olivares-Mercado, Hiroki Takahashi
Diabetic Retinopathy Grading based on a Sparse Network Fusion of Heterogeneous ConvNeXt Models with Category Attention
2023 18th International Conference on Machine Vision and Applications (MVA), published 2023-07-23
DOI: https://doi.org/10.23919/MVA57639.2023.10216129
Abstract
Diabetic retinopathy (DR) is an eye disease caused by high blood sugar levels, which can damage blood vessels in the retina and lead to partial or complete vision loss in later stages. In recent years, convolutional neural networks (CNNs) have been used to help grade DR severity. However, because the differences between classes are subtle and the datasets are imbalanced, standard CNNs often struggle to distinguish accurately between DR grades. To overcome these challenges, we propose combining a novel CNN model (ConvNeXt) with category-attention blocks incorporated at multiple levels of the architecture. This yields several heterogeneous models that effectively extract fine-grained features and minimize the impact of dataset imbalance. Finally, we introduce a Sparse Network Fusion technique that learns to combine the outputs of all models, consolidating their individual decisions. Extensive experiments on the challenging DDR dataset show that our proposal achieves new state-of-the-art performance, improving grading accuracy by about 3% over existing methods.
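The abstract does not specify how the Sparse Network Fusion layer is wired, so the following is only a minimal sketch of one plausible reading: a sparsely connected fusion layer in which each fused class score depends only on the same class's score from each model, rather than on every output of every model. The names `fuse` and `train_step`, the per-class weight layout, the five-class toy data, and the plain gradient-descent update are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(model_probs, weights, bias):
    """Combine per-model class probabilities into one fused distribution.

    model_probs: M lists of C probabilities (one list per model).
    weights:     M x C learnable weights (hypothetical layout). Class c of
                 the fused output only sees class c of each model, which is
                 the assumed "sparse" connectivity.
    bias:        C learnable biases.
    """
    num_models = len(model_probs)
    fused = [
        bias[c] + sum(weights[m][c] * model_probs[m][c] for m in range(num_models))
        for c in range(len(bias))
    ]
    return softmax(fused)


def train_step(model_probs, label, weights, bias, lr=0.1):
    """One gradient-descent step on cross-entropy; returns the loss before the update."""
    probs = fuse(model_probs, weights, bias)
    for c in range(len(bias)):
        # Gradient of softmax cross-entropy w.r.t. the fused score of class c.
        grad = probs[c] - (1.0 if c == label else 0.0)
        bias[c] -= lr * grad
        for m in range(len(model_probs)):
            weights[m][c] -= lr * grad * model_probs[m][c]
    return -math.log(probs[label])


if __name__ == "__main__":
    # Toy outputs from three hypothetical models over five illustrative grades.
    outputs = [
        [0.1, 0.2, 0.4, 0.2, 0.1],
        [0.2, 0.2, 0.3, 0.2, 0.1],
        [0.1, 0.1, 0.5, 0.2, 0.1],
    ]
    w = [[1.0 / 3] * 5 for _ in range(3)]
    b = [0.0] * 5
    for step in range(20):
        loss = train_step(outputs, label=2, weights=w, bias=b)
    print("fused distribution:", fuse(outputs, w, b))
```

Under this assumed layout the fusion layer has only M × C weights plus C biases, far fewer than a dense layer over all concatenated outputs, which keeps it cheap to train on top of frozen base models.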