Estimating Uncertainty of Geographic Atrophy Segmentations with Bayesian Deep Learning
Ophthalmology Science. Published 2024-07-24. DOI: 10.1016/j.xops.2024.100587
https://www.sciencedirect.com/science/article/pii/S2666914524001234
Abstract
Purpose
To apply methods for quantifying uncertainty of deep learning segmentation of geographic atrophy (GA).
Design
Retrospective analysis of OCT images and model comparison.
Participants
One hundred twenty-six eyes from 87 participants with GA in the SWAGGER cohort of the Nonexudative Age-Related Macular Degeneration Imaged with Swept-Source OCT (SS-OCT) study.
Methods
Manual segmentations of GA lesions were performed on structural subretinal pigment epithelium en face images derived from the SS-OCT scans. Models were developed using 2 approximate Bayesian deep learning techniques, Monte Carlo dropout and ensembling, to assess the uncertainty of GA semantic segmentation, and were compared with a traditional deep learning model.
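As a rough illustration of the two techniques named here, the sketch below shows how each yields a per-pixel mean probability map. Monte Carlo dropout averages several stochastic forward passes with dropout left on at inference, and an ensemble averages the outputs of independently trained models. The abstract does not describe the network architecture, dropout rate, number of passes, or ensemble size, so `toy_forward`, `make_weights`, and every parameter value are hypothetical stand-ins, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def make_weights(rng, hidden_units=8):
    """Hypothetical random weights standing in for one trained model."""
    return {
        "w1": rng.normal(size=hidden_units),
        "b1": rng.normal(size=hidden_units),
        "w2": rng.normal(size=hidden_units),
        "b2": float(rng.normal()),
    }

def toy_forward(image, weights, rng=None, drop_rate=0.0):
    """Toy per-pixel classifier (assumed stand-in for a segmentation network).

    image: (H, W) array. Returns an (H, W) map of GA probabilities.
    Dropout is applied to the hidden layer only when rng is given.
    """
    hidden = np.maximum(0.0, image[..., None] * weights["w1"] + weights["b1"])
    if rng is not None and drop_rate > 0.0:
        keep = rng.random(hidden.shape) >= drop_rate
        hidden = hidden * keep / (1.0 - drop_rate)  # inverted dropout scaling
    logits = hidden @ weights["w2"] + weights["b2"]
    return sigmoid(logits)

def mc_dropout_predict(image, weights, n_samples=20, drop_rate=0.5, seed=0):
    """Average n_samples stochastic passes with dropout active at inference."""
    rng = np.random.default_rng(seed)
    samples = [toy_forward(image, weights, rng, drop_rate) for _ in range(n_samples)]
    return np.mean(samples, axis=0)

def ensemble_predict(image, members):
    """Average deterministic predictions from independently trained members."""
    return np.mean([toy_forward(image, w) for w in members], axis=0)
```

In both cases the averaged probability map is what a pixel-wise uncertainty measure would then be computed from.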
Main Outcome Measures
Model performance was compared using the Dice score. Uncertainty was calculated using the formula for Shannon entropy.
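The two outcome measures can be written down compactly. The sketch below assumes a binary (GA versus background) segmentation and entropy measured in bits (log base 2); the abstract specifies neither, so both are assumptions. For binary masks A and B, the Dice score is 2|A ∩ B| / (|A| + |B|). For a pixel with predicted GA probability p, the Shannon entropy is -(p log p + (1 - p) log(1 - p)).

```python
import numpy as np

def dice_score(pred_mask, true_mask, eps=1e-8):
    """Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    # eps avoids division by zero when both masks are empty
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)

def binary_entropy(prob, eps=1e-12):
    """Pixel-wise Shannon entropy (in bits) of a GA probability map."""
    # Clip so that log2(0) is never evaluated
    p = np.clip(np.asarray(prob, dtype=float), eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))
```

Under these assumptions, entropy peaks at 1 bit when p = 0.5 (the model is maximally unsure about a pixel) and approaches 0 as p approaches 0 or 1.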
Results
The outputs of both Bayesian models contained more high-entropy pixels than the output of the traditional model. Dice scores for the Monte Carlo dropout method (0.90, 95% confidence interval 0.87–0.93) and the ensemble method (0.88, 95% confidence interval 0.85–0.91) were significantly higher (P < 0.001) than for the traditional model (0.82, 95% confidence interval 0.78–0.86).
Conclusions
Quantifying the uncertainty in a prediction of GA may improve trustworthiness of the models and aid clinicians in decision-making. The Bayesian deep learning techniques generated pixel-wise estimates of model uncertainty for segmentation, while also improving model performance compared with traditionally trained deep learning models.
Financial Disclosures
Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.