Chubin Ou, Xifei Wei, Lin An, Jia Qin, Min Zhu, Mei Jin, Xiangbin Kong
{"title":"A Deep Learning Network for Accurate Retinal Multidisease Diagnosis Using Multiview Fusion of En Face and B-Scan Images: A Multicenter Study.","authors":"Chubin Ou, Xifei Wei, Lin An, Jia Qin, Min Zhu, Mei Jin, Xiangbin Kong","doi":"10.1167/tvst.13.12.31","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>Accurate diagnosis of retinal disease based on optical coherence tomography (OCT) requires scrutiny of both B-scan and en face images. The aim of this study was to investigate the effectiveness of fusing en face and B-scan images for better diagnostic performance of deep learning models.</p><p><strong>Methods: </strong>A multiview fusion network (MVFN) with a decision fusion module to integrate fast-axis and slow-axis B-scans and en face information was proposed and compared with five state-of-the-art methods: a model using B-scans, a model using en face imaging, a model using three-dimensional volume, and two other relevant methods. They were evaluated using the OCTA-500 public dataset and a private multicenter dataset with 2330 cases; cases from the first center were used for training and cases from the second center were used for external validation. Performance was assessed by averaged area under the curve (AUC), accuracy, sensitivity, specificity, and precision.</p><p><strong>Results: </strong>In the private external test set, our MVFN achieved the highest AUC of 0.994, significantly outperforming the other models (P < 0.01). Similarly, for the OCTA-500 public dataset, our proposed method also outperformed the other methods with the highest AUC of 0.976, further demonstrating its effectiveness. 
Typical cases were demonstrated using activation heatmaps to illustrate the synergy of combining en face and B-scan images.</p><p><strong>Conclusions: </strong>The fusion of en face and B-scan information is an effective strategy for improving the diagnostic accuracy of deep learning models.</p><p><strong>Translational relevance: </strong>Multiview fusion models combining B-scan and en face images demonstrate great potential in improving AI performance for retina disease diagnosis.</p>","PeriodicalId":23322,"journal":{"name":"Translational Vision Science & Technology","volume":"13 12","pages":"31"},"PeriodicalIF":2.6000,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11668356/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Translational Vision Science & Technology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1167/tvst.13.12.31","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"OPHTHALMOLOGY","Score":null,"Total":0}
Citations: 0
Abstract
Purpose: Accurate diagnosis of retinal disease based on optical coherence tomography (OCT) requires scrutiny of both B-scan and en face images. The aim of this study was to investigate the effectiveness of fusing en face and B-scan images for better diagnostic performance of deep learning models.
Methods: A multiview fusion network (MVFN) with a decision fusion module integrating fast-axis B-scans, slow-axis B-scans, and en face information was proposed and compared with five state-of-the-art methods: a model using only B-scans, a model using only en face images, a model using the full three-dimensional volume, and two other relevant methods. The models were evaluated on the OCTA-500 public dataset and on a private multicenter dataset of 2330 cases, with cases from the first center used for training and cases from the second center used for external validation. Performance was assessed by averaged area under the curve (AUC), accuracy, sensitivity, specificity, and precision.
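The decision-fusion idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the view names, the uniform weighting, and the toy probability vectors are assumptions for illustration; in the actual MVFN each view would be a trained network producing per-class softmax probabilities.

```python
# Minimal sketch of decision-level fusion for a multiview classifier.
# Each "view" (fast-axis B-scan, slow-axis B-scan, en face) is assumed to
# yield a per-class probability vector; fusion averages them into one
# prediction. Weights are hypothetical and default to uniform.

def fuse_decisions(view_probs, weights=None):
    """Weighted average of per-view class probabilities."""
    n_views = len(view_probs)
    n_classes = len(view_probs[0])
    if weights is None:
        weights = [1.0 / n_views] * n_views
    fused = [0.0] * n_classes
    for w, probs in zip(weights, view_probs):
        for c, p in enumerate(probs):
            fused[c] += w * p
    return fused

# Toy example: two views favor class 0, one favors class 1;
# fusion follows the consensus.
fast_axis = [0.7, 0.2, 0.1]
slow_axis = [0.6, 0.3, 0.1]
en_face   = [0.2, 0.7, 0.1]
fused = fuse_decisions([fast_axis, slow_axis, en_face])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Fusing at the decision level (rather than concatenating features) lets each branch be trained and inspected independently, which also makes per-view activation heatmaps straightforward to produce.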
Results: In the private external test set, our MVFN achieved the highest AUC of 0.994, significantly outperforming the other models (P < 0.01). Similarly, for the OCTA-500 public dataset, our proposed method also outperformed the other methods with the highest AUC of 0.976, further demonstrating its effectiveness. Typical cases were demonstrated using activation heatmaps to illustrate the synergy of combining en face and B-scan images.
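The averaged AUC reported above is, in the multiclass setting, typically a one-vs-rest AUC computed per class and averaged. A small self-contained sketch of that metric (the paper does not specify its exact implementation, so the pairwise Mann-Whitney formulation here is an assumption):

```python
# Sketch of macro-averaged one-vs-rest AUC, assuming the standard
# Mann-Whitney formulation: AUC = fraction of (positive, negative)
# pairs ranked correctly, with ties counted as 0.5.

def binary_auc(labels, scores):
    """AUC for binary labels (0/1) against continuous scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def macro_auc(labels, prob_matrix, n_classes):
    """One-vs-rest AUC per class, averaged over classes."""
    aucs = []
    for c in range(n_classes):
        bin_labels = [1 if l == c else 0 for l in labels]
        class_scores = [row[c] for row in prob_matrix]
        aucs.append(binary_auc(bin_labels, class_scores))
    return sum(aucs) / n_classes

# Toy binary check: positives {0.35, 0.8} vs negatives {0.1, 0.4}
# -> 3 of 4 pairs correctly ranked.
auc = binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The quadratic pairwise loop is fine for illustration; production code would use a rank-based formulation or a library routine.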
Conclusions: The fusion of en face and B-scan information is an effective strategy for improving the diagnostic accuracy of deep learning models.
Translational relevance: Multiview fusion models combining B-scan and en face images demonstrate great potential for improving AI performance in retinal disease diagnosis.
About the journal
Translational Vision Science & Technology (TVST) is an official journal of the Association for Research in Vision and Ophthalmology (ARVO), an international organization whose purpose is to advance research worldwide into understanding the visual system and preventing, treating, and curing its disorders. TVST is an online, open-access, peer-reviewed journal emphasizing multidisciplinary research that bridges the gap between basic research and clinical care. A highly qualified and diverse group of Associate Editors and Editorial Board Members is led by Editor-in-Chief Marco Zarbin, MD, PhD, FARVO.
The journal covers a broad spectrum of work, including but not limited to:
Applications of stem cell technology for regenerative medicine,
Development of new animal models of human diseases,
Tissue bioengineering,
Chemical engineering to improve virus-based gene delivery,
Nanotechnology for drug delivery,
Design and synthesis of artificial extracellular matrices,
Development of a true microsurgical operating environment,
Refining data analysis algorithms to improve in vivo imaging technology,
Results of Phase 1 clinical trials,
Reverse translational ("bedside to bench") research.
TVST seeks manuscripts from scientists and clinicians with diverse backgrounds ranging from basic chemistry to ophthalmic surgery that will advance or change the way we understand and/or treat vision-threatening diseases. TVST encourages the use of color, multimedia, hyperlinks, program code and other digital enhancements.