{"title":"Understanding Spatial Context in Convolutional Neural Networks using Explainable Methods: Application to Interpretable GREMLIN","authors":"K. Hilburn","doi":"10.1175/aies-d-22-0093.1","DOIUrl":null,"url":null,"abstract":"\nConvolutional neural networks (CNNs) are opening new possibilities in the realm of satellite remote sensing. CNNs are especially useful for capturing the information in spatial patterns that is evident to the human eye but has eluded classical pixelwise retrieval algorithms. However, the black box nature of CNN predictions makes them difficult to interpret, hindering their trustworthiness. This paper explores a new way to simplify CNNs that allows them to be implemented in a fully transparent and interpretable framework. This clarity is accomplished by moving the inner workings of the CNN out into a feature engineering step and replacing the CNN with a regression model. The specific example of GREMLIN (GOES Radar Estimation via Machine Learning to Inform NWP) is used to demonstrate that such simplifications are possible and show the benefits of the interpretable approach. GREMLIN translates images of GOES radiances and lightning into images of radar reflectivity, and previous research used Explainable AI (XAI) approaches to explain some aspects of how GREMLIN makes predictions. However, the Interpretable GREMLIN model shows that XAI missed several strategies, and XAI does not provide guarantees on how the model will respond when confronted with new scenarios. In contrast, the interpretable model establishes well defined relationships between inputs and outputs, offering a clear mapping of the spatial context utilized by the CNN to make accurate predictions; and providing guarantees on how the model will respond to new inputs. The significance of this work is that it provides a new approach for developing trustworthy AI models.","PeriodicalId":94369,"journal":{"name":"Artificial intelligence for the earth systems","volume":"33 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial intelligence for the earth systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1175/aies-d-22-0093.1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Convolutional neural networks (CNNs) are opening new possibilities in the realm of satellite remote sensing. CNNs are especially useful for capturing the information in spatial patterns that is evident to the human eye but has eluded classical pixelwise retrieval algorithms. However, the black-box nature of CNN predictions makes them difficult to interpret, hindering their trustworthiness. This paper explores a new way to simplify CNNs so that they can be implemented in a fully transparent and interpretable framework. This clarity is accomplished by moving the inner workings of the CNN out into a feature engineering step and replacing the CNN with a regression model. The specific example of GREMLIN (GOES Radar Estimation via Machine Learning to Inform NWP) is used to demonstrate that such simplifications are possible and to show the benefits of the interpretable approach. GREMLIN translates images of GOES radiances and lightning into images of radar reflectivity, and previous research used Explainable AI (XAI) approaches to explain some aspects of how GREMLIN makes predictions. However, the Interpretable GREMLIN model shows that XAI missed several strategies, and XAI does not provide guarantees on how the model will respond when confronted with new scenarios. In contrast, the interpretable model establishes well-defined relationships between inputs and outputs, offering a clear mapping of the spatial context the CNN uses to make accurate predictions and providing guarantees on how the model will respond to new inputs. The significance of this work is that it provides a new approach for developing trustworthy AI models.
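To make the "feature engineering plus regression" idea concrete, the minimal Python sketch below illustrates one way such a pipeline could look: hand-crafted spatial-context features (local means, local standard deviations, gradient magnitudes) are computed from satellite-like input images, and a transparent linear model maps those features to a reflectivity-like target. The specific feature choices, neighborhood sizes, and the Ridge regressor are illustrative assumptions, not the actual Interpretable GREMLIN implementation described in the paper.

```python
# Hypothetical sketch of replacing a CNN with explicit spatial-context features
# plus a transparent regression model. Feature choices and scales are assumptions
# for illustration only; they are not the Interpretable GREMLIN design.

import numpy as np
from scipy.ndimage import uniform_filter, sobel
from sklearn.linear_model import Ridge

def spatial_context_features(channel, sizes=(3, 9, 27)):
    """Per-pixel features capturing the neighborhood context of one input channel."""
    feats = [channel]                                    # raw pixel value
    for s in sizes:
        mean = uniform_filter(channel, size=s)           # local mean at scale s
        sq_mean = uniform_filter(channel**2, size=s)
        std = np.sqrt(np.maximum(sq_mean - mean**2, 0))  # local standard deviation
        feats += [mean, std]
    feats.append(np.hypot(sobel(channel, 0), sobel(channel, 1)))  # gradient magnitude
    return np.stack(feats, axis=-1)                      # shape (H, W, n_features)

def build_design_matrix(channels):
    """Stack spatial-context features from all input channels into (H*W, F)."""
    per_channel = [spatial_context_features(c) for c in channels]
    X = np.concatenate(per_channel, axis=-1)
    return X.reshape(-1, X.shape[-1])

# Synthetic stand-ins: three "GOES-like" input images and a target "reflectivity"
# image, all 64x64. A real application would train on many observed scenes.
rng = np.random.default_rng(0)
channels = [rng.normal(size=(64, 64)) for _ in range(3)]
reflectivity = (0.5 * channels[0]
                + uniform_filter(channels[1], size=9)
                + rng.normal(0, 0.1, (64, 64)))

X = build_design_matrix(channels)
y = reflectivity.ravel()

model = Ridge(alpha=1.0).fit(X, y)   # transparent linear model in place of the CNN
pred = model.predict(X).reshape(64, 64)

# Each regression coefficient is a direct, interpretable weight on one engineered
# spatial-context feature, so the input-output relationship is fully explicit.
print(model.coef_.round(3))
```

Because the spatial context is captured in explicit, named features rather than learned convolutional filters, the fitted coefficients can be inspected directly, which is the kind of input-output transparency the abstract attributes to the interpretable approach.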