A Primer on Topological Data Analysis to Support Image Analysis Tasks in Environmental Science

Lander Ver Hoef, Henry Adams, Emily J. King, Imme Ebert-Uphoff
Published in Artificial Intelligence for the Earth Systems, 2023. DOI: 10.1175/aies-d-22-0039.1 (https://doi.org/10.1175/aies-d-22-0039.1)

Abstract

Topological data analysis (TDA) is a tool from data science and mathematics that is beginning to make waves in environmental science. In this work, we seek to provide an intuitive and understandable introduction to a tool from TDA that is particularly useful for the analysis of imagery, namely, persistent homology. We briefly discuss the theoretical background but focus primarily on understanding the output of this tool and discussing what information it can glean. To this end, we frame our discussion around a guiding example of classifying satellite images from the sugar, fish, flower, and gravel dataset produced for the study of mesoscale organization of clouds by Rasp et al. We demonstrate how persistent homology and its vectorization, persistence landscapes, can be used in a workflow with a simple machine learning algorithm to obtain good results, and we explore in detail how we can explain this behavior in terms of image-level features. One of the core strengths of persistent homology is how interpretable it can be, so throughout this paper we discuss not just the patterns we find but why those results are to be expected given what we know about the theory of persistent homology. Our goal is that readers of this paper will leave with a better understanding of TDA and persistent homology, will be able to identify problems and datasets of their own for which persistent homology could be helpful, and will gain an understanding of the results they obtain from applying the included GitHub example code.

Significance Statement

Information such as the geometric structure and texture of image data can greatly support the inference of the physical state of an observed Earth system, for example, in remote sensing to determine whether wildfires are active or to identify local climate zones. Persistent homology is a branch of topological data analysis that allows one to extract such information in an interpretable way, unlike black-box methods like deep neural networks. The purpose of this paper is to explain in an intuitive manner what persistent homology is and how researchers in environmental science can use it to create interpretable models. We demonstrate the approach to identify certain cloud patterns from satellite imagery and find that the resulting model is indeed interpretable.
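
To make the idea of persistent homology on imagery a little more concrete, here is a minimal sketch in plain Python. It is not the authors' GitHub example code. It assumes a sublevel-set filtration of grayscale pixel values with 4-connectivity, which the abstract does not specify, and it computes only 0-dimensional features (connected components), not loops or persistence landscapes. The function name `sublevel_persistence_h0` and the toy image are hypothetical, chosen for illustration.

```python
import math


def sublevel_persistence_h0(image):
    """Return sorted (birth, death) pairs for connected components of the
    sublevel sets of a 2D grayscale image (list of equal-length rows).

    Assumptions for this sketch: 4-connectivity, non-empty image.
    """
    rows, cols = len(image), len(image[0])
    flat = [float(v) for row in image for v in row]
    # Add pixels one at a time, from darkest to brightest.
    order = sorted(range(len(flat)), key=flat.__getitem__)
    parent = [-1] * len(flat)   # -1 marks a pixel not yet added
    birth = [0.0] * len(flat)   # birth value, meaningful at component roots

    def find(i):
        # Walk to the root of i's component, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for idx in order:
        r, c = divmod(idx, cols)
        value = flat[idx]
        parent[idx] = idx       # the new pixel starts its own component
        birth[idx] = value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if not (0 <= rr < rows and 0 <= cc < cols):
                continue
            nidx = rr * cols + cc
            if parent[nidx] == -1:
                continue        # neighbor is brighter and not added yet
            a, b = find(idx), find(nidx)
            if a == b:
                continue
            # Elder rule: the component born later dies at the current value.
            young, old = (a, b) if birth[a] > birth[b] else (b, a)
            if birth[young] < value:
                pairs.append((birth[young], value))
            parent[young] = old
    # The component of the global minimum never dies.
    pairs.append((flat[order[0]], math.inf))
    return sorted(pairs)


# Four dark pixels (0, 1, 2, 3) separated by a bright cross of 5s.
toy = [[0, 5, 1],
       [5, 5, 5],
       [2, 5, 3]]
print(sublevel_persistence_h0(toy))
# → [(0.0, inf), (1.0, 5.0), (2.0, 5.0), (3.0, 5.0)]
```

Each pair can be read directly against the image: the dark pixels with values 1, 2, and 3 each start a component that merges into the oldest one (born at 0) once the threshold reaches 5. This kind of direct correspondence between output and image-level features is what the abstract means by interpretability.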