{"title":"Optimizing deep learning models for glaucoma screening with vision transformers for resource efficiency and the pie augmentation method.","authors":"Sirikorn Sangchocanonta, Pakinee Pooprasert, Nichapa Lerthirunvibul, Kanyarak Patchimnan, Phongphan Phienphanich, Adirek Munthuli, Sujittra Puangarom, Rath Itthipanichpong, Kitiya Ratanawongphaibul, Sunee Chansangpetch, Anita Manassakorn, Visanee Tantisevi, Prin Rojanapongpun, Charturong Tantibundhit","doi":"10.1371/journal.pone.0314111","DOIUrl":null,"url":null,"abstract":"<p><p>Glaucoma is the leading cause of irreversible vision impairment, emphasizing the critical need for early detection. Typically, AI-based glaucoma screening relies on fundus imaging. To tackle the resource and time challenges in glaucoma screening with convolutional neural network (CNN), we chose the Data-efficient image Transformers (DeiT), a vision transformer, known for its reduced computational demands, with preprocessing time decreased by a factor of 10. Our approach utilized the meticulously annotated GlauCUTU-DATA dataset, curated by ophthalmologists through consensus, encompassing both unanimous agreement (3/3) and majority agreement (2/3) data. However, DeiT's performance was initially lower than CNN. Therefore, we introduced the \"pie method,\" an augmentation method aligned with the ISNT rule. Along with employing polar transformation to improved cup region visibility and alignment with the vision transformer's input to elevated performance levels. The classification results demonstrated improvements comparable to CNN. Using the 3/3 data, excluding the superior and nasal regions, especially in glaucoma suspects, sensitivity increased by 40.18% from 47.06% to 88.24%. The average area under the curve (AUC) ± standard deviation (SD) for glaucoma, glaucoma suspects, and no glaucoma were 92.63 ± 4.39%, 92.35 ± 4.39%, and 92.32 ± 1.45%, respectively. With the 2/3 data, excluding the superior and temporal regions, sensitivity for diagnosing glaucoma increased by 11.36% from 47.73% to 59.09%. The average AUC ± SD for glaucoma, glaucoma suspects, and no glaucoma were 68.22 ± 4.45%, 68.23 ± 4.39%, and 73.09 ± 3.05%, respectively. For both datasets, the AUC values for glaucoma, glaucoma suspects, and no glaucoma were 84.53%, 84.54%, and 91.05%, respectively, which approach the performance of a CNN model that achieved 84.70%, 84.69%, and 93.19%, respectively. Moreover, the incorporation of attention maps from DeiT facilitated the precise localization of clinically significant areas, such as the disc rim and notching, thereby enhancing the overall effectiveness of glaucoma screening.</p>","PeriodicalId":20189,"journal":{"name":"PLoS ONE","volume":"20 3","pages":"e0314111"},"PeriodicalIF":2.9000,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11927916/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PLoS ONE","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1371/journal.pone.0314111","RegionNum":3,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
引用次数: 0
Abstract
Glaucoma is the leading cause of irreversible vision impairment, emphasizing the critical need for early detection. Typically, AI-based glaucoma screening relies on fundus imaging. To tackle the resource and time challenges in glaucoma screening with convolutional neural network (CNN), we chose the Data-efficient image Transformers (DeiT), a vision transformer, known for its reduced computational demands, with preprocessing time decreased by a factor of 10. Our approach utilized the meticulously annotated GlauCUTU-DATA dataset, curated by ophthalmologists through consensus, encompassing both unanimous agreement (3/3) and majority agreement (2/3) data. However, DeiT's performance was initially lower than CNN. Therefore, we introduced the "pie method," an augmentation method aligned with the ISNT rule. Along with employing polar transformation to improved cup region visibility and alignment with the vision transformer's input to elevated performance levels. The classification results demonstrated improvements comparable to CNN. Using the 3/3 data, excluding the superior and nasal regions, especially in glaucoma suspects, sensitivity increased by 40.18% from 47.06% to 88.24%. The average area under the curve (AUC) ± standard deviation (SD) for glaucoma, glaucoma suspects, and no glaucoma were 92.63 ± 4.39%, 92.35 ± 4.39%, and 92.32 ± 1.45%, respectively. With the 2/3 data, excluding the superior and temporal regions, sensitivity for diagnosing glaucoma increased by 11.36% from 47.73% to 59.09%. The average AUC ± SD for glaucoma, glaucoma suspects, and no glaucoma were 68.22 ± 4.45%, 68.23 ± 4.39%, and 73.09 ± 3.05%, respectively. For both datasets, the AUC values for glaucoma, glaucoma suspects, and no glaucoma were 84.53%, 84.54%, and 91.05%, respectively, which approach the performance of a CNN model that achieved 84.70%, 84.69%, and 93.19%, respectively. Moreover, the incorporation of attention maps from DeiT facilitated the precise localization of clinically significant areas, such as the disc rim and notching, thereby enhancing the overall effectiveness of glaucoma screening.
期刊介绍:
PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open-access—freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage