Controllable retinal image synthesis using conditional StyleGAN and latent space manipulation for improved diagnosis and grading of diabetic retinopathy

Somayeh Pakdelmoez, Saba Omidikia, Seyyed Ali Seyyedsalehi (Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran); Seyyede Zohreh Seyyedsalehi (Department of Biomedical Engineering, Faculty of Health, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran)
arXiv - EE - Image and Video Processing · Published 2024-09-11 · DOI: arxiv-2409.07422

Abstract

Diabetic retinopathy (DR) is a consequence of diabetes mellitus characterized by vascular damage within the retinal tissue. Timely detection is paramount to mitigate the risk of vision loss. However, training robust grading models is hindered by a shortage of annotated data, particularly for severe cases. This paper proposes a framework for controllably generating high-fidelity and diverse DR fundus images, thereby improving classifier performance in DR grading and detection. We achieve comprehensive control over DR severity and visual features (optic disc, vessel structure, lesion areas) within generated images solely through a conditional StyleGAN, eliminating the need for feature masks or auxiliary networks. Specifically, leveraging the SeFa algorithm to identify meaningful semantics within the latent space, we manipulate the DR images generated conditionally on grades, further enhancing the dataset diversity. Additionally, we propose a novel, effective SeFa-based data augmentation strategy, helping the classifier focus on discriminative regions while ignoring redundant features. Using this approach, a ResNet50 model trained for DR detection achieves 98.09% accuracy, 99.44% specificity, 99.45% precision, and an F1-score of 98.09%. Moreover, incorporating synthetic images generated by conditional StyleGAN into ResNet50 training for DR grading yields 83.33% accuracy, a quadratic kappa score of 87.64%, 95.67% specificity, and 72.24% precision. Extensive experiments conducted on the APTOS 2019 dataset demonstrate the exceptional realism of the generated images and the superior performance of our classifier compared to recent studies.
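The SeFa step the abstract describes finds semantically meaningful latent directions in closed form, by factorizing the generator's style-modulation weights rather than training an auxiliary network. A minimal sketch of that idea follows; the weight matrix `W` here is a random placeholder standing in for a trained StyleGAN modulation layer, and the direction index `k` and step size `alpha` are illustrative choices, not values from the paper:

```python
import numpy as np

# Placeholder for the trained generator's style-modulation weight matrix
# (out_features x latent_dim). In the real method this is read from the
# conditional StyleGAN checkpoint.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 128))

# SeFa in closed form: semantic directions are the eigenvectors of W^T W,
# i.e. the right singular vectors of W, ordered by singular value (the
# largest-variance directions tend to encode the strongest visual factors).
_, singular_values, Vt = np.linalg.svd(W, full_matrices=False)
directions = Vt  # each row is a unit-norm direction in latent space

# Edit a grade-conditioned latent code along the k-th direction; varying
# alpha sweeps the corresponding visual attribute while keeping the rest
# of the image largely fixed.
z = rng.standard_normal(128)
k, alpha = 0, 3.0
z_edited = z + alpha * directions[k]
```

In the paper's pipeline, decoding `z_edited` through the conditional generator yields a new fundus image variant of the same DR grade, which is what drives the SeFa-based augmentation strategy.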