Evaluating the accuracy of binary classifiers for geomorphic applications

Earth Surface Dynamics | IF 2.8 | Zone 2 (Earth Sciences) | JCR Q2 (GEOGRAPHY, PHYSICAL) | Pub Date: 2024-05-17 | DOI: 10.5194/esurf-12-765-2024
M. Rossi
Citations: 0

Evaluating the accuracy of binary classifiers for geomorphic applications
Abstract. Increased access to high-resolution topography has revolutionized our ability to map out fine-scale topographic features at watershed to landscape scales. As our “vision” of the land surface has improved, so has the need for more robust quantification of the accuracy of the geomorphic maps we derive from these data. One broad class of mapping challenges is that of binary classification whereby remote sensing data are used to identify the presence or absence of a given feature. Fortunately, there is a large suite of metrics developed in the data sciences well suited to quantifying the pixel-level accuracy of binary classifiers. This analysis focuses on how these metrics perform when there is a need to quantify how the number and extent of landforms are expected to vary as a function of the environmental forcing (e.g., due to climate, ecology, material property, erosion rate). Results from a suite of synthetic surfaces show how the most widely used pixel-level accuracy metric, the F1 score, is particularly poorly suited to quantifying accuracy for this kind of application. Well-known biases to imbalanced data are exacerbated by methodological strategies that calibrate and validate classifiers across settings where feature abundances vary. The Matthews correlation coefficient largely removes this bias over a wide range of feature abundances such that the sensitivity of accuracy scores to geomorphic setting instead embeds information about the size and shape of features and the type of error. If error is random, the Matthews correlation coefficient is insensitive to feature size and shape, though preferential modification of the dominant class can limit the domain over which scores can be compared. If the error is systematic (e.g., due to co-registration error between remote sensing datasets), this metric shows strong sensitivity to feature size and shape such that smaller features with more complex boundaries induce more classification error. Future studies should build on this analysis by interrogating how pixel-level accuracy metrics respond to different kinds of feature distributions indicative of different types of surface processes.
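
The contrast the abstract draws between the F1 score and the Matthews correlation coefficient (MCC) can be illustrated with the standard textbook definitions of both metrics. The sketch below is not taken from the paper: the function names and the pixel counts (a hypothetical 1000-pixel map with 100 feature pixels and 10% of each class misclassified) are assumptions chosen only for illustration. It shows one well-known property of imbalanced data: F1 ignores true negatives, so its value depends on which class is labeled "positive", whereas MCC is unchanged when the classes are swapped.

```python
import math


def f1_score(tp, fp, fn, tn):
    """F1 = 2TP / (2TP + FP + FN); true negatives do not enter the score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def swap_classes(tp, fp, fn, tn):
    """Relabel the map so that 'feature absent' becomes the positive class."""
    return tn, fn, fp, tp


# Hypothetical counts (assumption): 100 feature pixels, 900 background pixels,
# 10% of each class misclassified. Order is (tp, fp, fn, tn).
feature_as_positive = (90, 90, 10, 810)
background_as_positive = swap_classes(*feature_as_positive)

for label, counts in [("feature = positive", feature_as_positive),
                      ("background = positive", background_as_positive)]:
    print(f"{label}: F1 = {f1_score(*counts):.3f}, MCC = {mcc(*counts):.3f}")
```

With these assumed counts, F1 is about 0.64 when the rare feature is the positive class and about 0.94 when the abundant background is, while MCC is about 0.62 in both cases. This is only a toy demonstration of class-label symmetry; it does not reproduce the synthetic-surface experiments described in the abstract.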
Source journal
Earth Surface Dynamics
Categories: GEOGRAPHY, PHYSICAL; GEOSCIENCES, MULTIDISCIPLINARY
CiteScore: 5.40
Self-citation rate: 5.90%
Articles published: 56
Review time: 20 weeks
Journal description: Earth Surface Dynamics (ESurf) is an international scientific journal dedicated to the publication and discussion of high-quality research on the physical, chemical, and biological processes shaping Earth's surface and their interactions on all scales.
Latest articles from this journal
Pliocene shorelines and the epeirogenic motion of continental margins: a target dataset for dynamic topography models
Decadal-scale decay of landslide-derived fluvial suspended sediment after Typhoon Morakot
Exotic tree plantations in the Chilean Coastal Range: balancing the effects of discrete disturbances, connectivity, and a persistent drought on catchment erosion
Role of the forcing sources in morphodynamic modelling of an embayed beach
Equilibrium distance from long-range dune interactions