Optimizing deep learning models for glaucoma screening with vision transformers for resource efficiency and the pie augmentation method.

IF 2.9 | CAS Tier 3, Multidisciplinary | Q1 MULTIDISCIPLINARY SCIENCES | PLoS ONE | Pub Date: 2025-03-21 | eCollection Date: 2025-01-01 | DOI: 10.1371/journal.pone.0314111
Sirikorn Sangchocanonta, Pakinee Pooprasert, Nichapa Lerthirunvibul, Kanyarak Patchimnan, Phongphan Phienphanich, Adirek Munthuli, Sujittra Puangarom, Rath Itthipanichpong, Kitiya Ratanawongphaibul, Sunee Chansangpetch, Anita Manassakorn, Visanee Tantisevi, Prin Rojanapongpun, Charturong Tantibundhit
{"title":"Optimizing deep learning models for glaucoma screening with vision transformers for resource efficiency and the pie augmentation method.","authors":"Sirikorn Sangchocanonta, Pakinee Pooprasert, Nichapa Lerthirunvibul, Kanyarak Patchimnan, Phongphan Phienphanich, Adirek Munthuli, Sujittra Puangarom, Rath Itthipanichpong, Kitiya Ratanawongphaibul, Sunee Chansangpetch, Anita Manassakorn, Visanee Tantisevi, Prin Rojanapongpun, Charturong Tantibundhit","doi":"10.1371/journal.pone.0314111","DOIUrl":null,"url":null,"abstract":"<p><p>Glaucoma is the leading cause of irreversible vision impairment, emphasizing the critical need for early detection. Typically, AI-based glaucoma screening relies on fundus imaging. To tackle the resource and time challenges in glaucoma screening with convolutional neural network (CNN), we chose the Data-efficient image Transformers (DeiT), a vision transformer, known for its reduced computational demands, with preprocessing time decreased by a factor of 10. Our approach utilized the meticulously annotated GlauCUTU-DATA dataset, curated by ophthalmologists through consensus, encompassing both unanimous agreement (3/3) and majority agreement (2/3) data. However, DeiT's performance was initially lower than CNN. Therefore, we introduced the \"pie method,\" an augmentation method aligned with the ISNT rule. Along with employing polar transformation to improved cup region visibility and alignment with the vision transformer's input to elevated performance levels. The classification results demonstrated improvements comparable to CNN. Using the 3/3 data, excluding the superior and nasal regions, especially in glaucoma suspects, sensitivity increased by 40.18% from 47.06% to 88.24%. The average area under the curve (AUC) ± standard deviation (SD) for glaucoma, glaucoma suspects, and no glaucoma were 92.63 ± 4.39%, 92.35 ± 4.39%, and 92.32 ± 1.45%, respectively. With the 2/3 data, excluding the superior and temporal regions, sensitivity for diagnosing glaucoma increased by 11.36% from 47.73% to 59.09%. The average AUC ± SD for glaucoma, glaucoma suspects, and no glaucoma were 68.22 ± 4.45%, 68.23 ± 4.39%, and 73.09 ± 3.05%, respectively. For both datasets, the AUC values for glaucoma, glaucoma suspects, and no glaucoma were 84.53%, 84.54%, and 91.05%, respectively, which approach the performance of a CNN model that achieved 84.70%, 84.69%, and 93.19%, respectively. Moreover, the incorporation of attention maps from DeiT facilitated the precise localization of clinically significant areas, such as the disc rim and notching, thereby enhancing the overall effectiveness of glaucoma screening.</p>","PeriodicalId":20189,"journal":{"name":"PLoS ONE","volume":"20 3","pages":"e0314111"},"PeriodicalIF":2.9000,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11927916/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PLoS ONE","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1371/journal.pone.0314111","RegionNum":3,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
Citations: 0

Abstract

Glaucoma is the leading cause of irreversible vision impairment, emphasizing the critical need for early detection. Typically, AI-based glaucoma screening relies on fundus imaging. To tackle the resource and time challenges of glaucoma screening with convolutional neural networks (CNNs), we chose the Data-efficient image Transformer (DeiT), a vision transformer known for its reduced computational demands, with preprocessing time decreased by a factor of 10. Our approach utilized the meticulously annotated GlauCUTU-DATA dataset, curated by ophthalmologists through consensus and encompassing both unanimous-agreement (3/3) and majority-agreement (2/3) data. However, DeiT's performance was initially lower than that of the CNN. Therefore, we introduced the "pie method," an augmentation method aligned with the ISNT rule, and employed a polar transformation to improve cup-region visibility and align the images with the vision transformer's input, elevating performance. The classification results demonstrated improvements comparable to the CNN. Using the 3/3 data and excluding the superior and nasal regions, sensitivity, especially for glaucoma suspects, increased by 41.18 percentage points, from 47.06% to 88.24%. The average area under the curve (AUC) ± standard deviation (SD) for glaucoma, glaucoma suspects, and no glaucoma was 92.63 ± 4.39%, 92.35 ± 4.39%, and 92.32 ± 1.45%, respectively. With the 2/3 data and excluding the superior and temporal regions, sensitivity for diagnosing glaucoma increased by 11.36 percentage points, from 47.73% to 59.09%. The average AUC ± SD for glaucoma, glaucoma suspects, and no glaucoma was 68.22 ± 4.45%, 68.23 ± 4.39%, and 73.09 ± 3.05%, respectively. Across both datasets, the AUC values for glaucoma, glaucoma suspects, and no glaucoma were 84.53%, 84.54%, and 91.05%, respectively, approaching the performance of a CNN model that achieved 84.70%, 84.69%, and 93.19%, respectively. Moreover, incorporating attention maps from DeiT facilitated the precise localization of clinically significant areas, such as the disc rim and notching, thereby enhancing the overall effectiveness of glaucoma screening.
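The abstract describes two preprocessing ideas: a polar transformation that unwraps the fundus image around the optic disc so the cup and rim become axis-aligned bands better matched to the transformer's input grid, and a "pie" augmentation that excludes angular sectors in line with the ISNT rule. The sketch below illustrates both ideas in plain NumPy; the function names (polar_unwrap, pie_mask), the sector boundaries, the assumed disc centre, and the 224-by-224 output size are illustrative assumptions, not the authors' published implementation.

```python
# Minimal sketch of the polar transform and sector ("pie") exclusion ideas.
# All names, sizes, and sector angles here are illustrative assumptions.
import numpy as np


def polar_unwrap(image: np.ndarray, center: tuple, radius: int,
                 out_h: int = 224, out_w: int = 224) -> np.ndarray:
    """Resample an (H, W, C) image into polar coordinates around `center`.

    Output rows correspond to angles (0..2*pi), columns to radii (0..radius),
    using nearest-neighbour sampling.
    """
    cy, cx = center
    thetas = np.linspace(0.0, 2.0 * np.pi, out_h, endpoint=False)
    radii = np.linspace(0.0, radius, out_w)
    # One (row, col) source pixel per output cell.
    yy = (cy + radii[None, :] * np.sin(thetas[:, None])).round().astype(int)
    xx = (cx + radii[None, :] * np.cos(thetas[:, None])).round().astype(int)
    yy = np.clip(yy, 0, image.shape[0] - 1)
    xx = np.clip(xx, 0, image.shape[1] - 1)
    return image[yy, xx]


def pie_mask(image: np.ndarray, center: tuple,
             excluded_sectors: list) -> np.ndarray:
    """Zero out angular sectors (in degrees) around `center`.

    `excluded_sectors` is a list of (start_deg, end_deg) pairs. Which angles
    correspond to the superior/nasal/inferior/temporal quadrants depends on
    eye laterality and image orientation.
    """
    h, w = image.shape[:2]
    cy, cx = center
    ys, xs = np.mgrid[0:h, 0:w]
    angles = np.degrees(np.arctan2(ys - cy, xs - cx)) % 360.0
    keep = np.ones((h, w), dtype=bool)
    for start, end in excluded_sectors:
        keep &= ~((angles >= start) & (angles < end))
    return image * keep[..., None]


if __name__ == "__main__":
    # Synthetic stand-in for a fundus photograph; a real pipeline would load
    # an RGB fundus image and locate the optic-disc centre first.
    fundus = np.random.rand(512, 512, 3).astype(np.float32)
    disc_center = (256, 256)  # assumed (row, col) of the optic disc
    # Drop two adjacent 90-degree sectors, mimicking the region exclusions
    # reported above, then unwrap to a ViT-sized input.
    masked = pie_mask(fundus, disc_center, excluded_sectors=[(0, 90), (90, 180)])
    polar = polar_unwrap(masked, disc_center, radius=200)
    print(polar.shape)  # (224, 224, 3)
```

In a pipeline like the one described, the polar (and optionally sector-masked) image would then be fed to a DeiT classifier; unwrapping keeps the neuroretinal rim in a consistent position across images, which is consistent with the abstract's rationale for improved alignment with the vision transformer's input.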

Source journal
PLoS ONE (Biology)
CiteScore: 6.20
Self-citation rate: 5.40%
Articles published: 14242
Review time: 3.7 months
Journal description: PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open access: freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage