Oleksandr Kovalyk-Borodyak, Juan Morales-Sánchez, Rafael Verdú-Monedero, José-Luis Sancho-Gómez
{"title":"Glaucoma detection: Binocular approach and clinical data in machine learning","authors":"Oleksandr Kovalyk-Borodyak, Juan Morales-Sánchez, Rafael Verdú-Monedero, José-Luis Sancho-Gómez","doi":"10.1016/j.artmed.2024.103050","DOIUrl":null,"url":null,"abstract":"<div><div>In this work, we present a multi-modal machine learning method to automate early glaucoma diagnosis. The proposed methodology introduces two novel aspects for automated diagnosis not previously explored in the literature: simultaneous use of ocular fundus images from both eyes and integration with the patient’s additional clinical data. We begin by establishing a baseline, termed <em>monocular mode</em>, which adheres to the traditional approach of considering the data from each eye as a separate instance. We then explore the <em>binocular mode</em>, investigating how combining information from both eyes of the same patient can enhance glaucoma diagnosis accuracy. This exploration employs the PAPILA dataset, comprising information from both eyes, clinical data, ocular fundus images, and expert segmentation of these images. Additionally, we compare two image-derived data modalities: direct ocular fundus images and morphological data from manual expert segmentation. Our method integrates Gradient-Boosted Decision Trees (GBDT) and Convolutional Neural Networks (CNN), specifically focusing on the MobileNet, VGG16, ResNet-50, and Inception models. SHAP values are used to interpret GBDT models, while the Deep Explainer method is applied in conjunction with SHAP to analyze the outputs of convolutional-based models. Our findings show the viability of considering both eyes, which improves the model performance. The binocular approach, incorporating information from morphological and clinical data yielded an AUC of 0.796 (<span><math><mrow><mo>±</mo><mn>0</mn><mo>.</mo><mn>003</mn></mrow></math></span> at a 95% confidence interval), while the CNN, using the same approach (both eyes), achieved an AUC of 0.764 (<span><math><mrow><mo>±</mo><mn>0</mn><mo>.</mo><mn>005</mn></mrow></math></span> at a 95% confidence interval).</div></div>","PeriodicalId":55458,"journal":{"name":"Artificial Intelligence in Medicine","volume":"160 ","pages":"Article 103050"},"PeriodicalIF":6.1000,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence in Medicine","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0933365724002926","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
In this work, we present a multi-modal machine learning method to automate early glaucoma diagnosis. The proposed methodology introduces two novel aspects for automated diagnosis not previously explored in the literature: the simultaneous use of ocular fundus images from both eyes and the integration of the patient’s additional clinical data. We begin by establishing a baseline, termed monocular mode, which adheres to the traditional approach of treating the data from each eye as a separate instance. We then explore the binocular mode, investigating how combining information from both eyes of the same patient can enhance glaucoma diagnosis accuracy. This exploration employs the PAPILA dataset, comprising information from both eyes, clinical data, ocular fundus images, and expert segmentations of these images. Additionally, we compare two image-derived data modalities: direct ocular fundus images and morphological data obtained from manual expert segmentation. Our method integrates Gradient-Boosted Decision Trees (GBDT) and Convolutional Neural Networks (CNN), specifically focusing on the MobileNet, VGG16, ResNet-50, and Inception models. SHAP values are used to interpret the GBDT models, while the Deep Explainer method is applied in conjunction with SHAP to analyze the outputs of the convolution-based models. Our findings show the viability of jointly considering both eyes, which improves model performance. The binocular approach, incorporating morphological and clinical data, yielded an AUC of 0.796 (±0.003 at a 95% confidence interval), while the CNN, using the same binocular approach, achieved an AUC of 0.764 (±0.005 at a 95% confidence interval).
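The binocular, feature-based arm of this pipeline can be sketched as follows: morphological measurements from both eyes of a patient are merged with the patient's clinical record into a single instance, a gradient-boosted tree classifier is fitted, and SHAP's TreeExplainer attributes each prediction to individual features. The sketch below uses scikit-learn's GradientBoostingClassifier and synthetic data with hypothetical feature names as stand-ins for the PAPILA dataset; it illustrates the structure of the approach, not the authors' implementation.

# Minimal sketch (assumptions: GradientBoostingClassifier as the GBDT, synthetic
# data and feature names standing in for PAPILA's morphological/clinical tables).
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200  # number of patients

# Binocular instance layout: one row per patient, with morphological features
# from the right (OD) and left (OS) eyes plus clinical variables side by side.
X = pd.DataFrame({
    "cup_disc_ratio_od": rng.uniform(0.2, 0.9, n),
    "cup_disc_ratio_os": rng.uniform(0.2, 0.9, n),
    "rim_area_od": rng.uniform(0.8, 2.0, n),
    "rim_area_os": rng.uniform(0.8, 2.0, n),
    "age": rng.integers(40, 85, n),
    "iop": rng.uniform(10.0, 30.0, n),  # intraocular pressure (mmHg)
})
# Synthetic label loosely tied to the worse eye's cup/disc ratio.
worse_cdr = X[["cup_disc_ratio_od", "cup_disc_ratio_os"]].max(axis=1)
y = (worse_cdr + 0.1 * rng.standard_normal(n) > 0.6).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

gbdt = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, gbdt.predict_proba(X_te)[:, 1]))

# SHAP attributes each prediction to individual features, showing how much each
# eye's morphology and each clinical variable contributed to the decision.
explainer = shap.TreeExplainer(gbdt)
shap_values = explainer.shap_values(X_te)
shap.summary_plot(shap_values, X_te)

The same instance layout (one row per patient with paired _od and _os columns) can be reduced to the monocular baseline by splitting each row back into two single-eye instances, which is how the two modes would be compared under a common evaluation protocol.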
About the journal:
Artificial Intelligence in Medicine publishes original articles from a wide variety of interdisciplinary perspectives concerning the theory and practice of artificial intelligence (AI) in medicine, medically-oriented human biology, and health care.
Artificial intelligence in medicine may be characterized as the scientific discipline concerned with research studies, projects, and applications that aim to support decision-based medical tasks through knowledge- and/or data-intensive computer-based solutions, ultimately supporting and improving the performance of human care providers.