Uncertainty is not sufficient for identifying noisy labels in training data for binary segmentation of building footprints

Hannah Ulman, Jonas Gütter, Julia Niebling
Frontiers in Remote Sensing. Published 2023-01-10.
DOI: 10.3389/frsen.2022.1100012 (https://doi.org/10.3389/frsen.2022.1100012)
Citations: 1

Abstract

Obtaining high-quality labels is a major challenge when applying deep neural networks in the remote sensing domain. A common way to acquire labels is crowdsourcing, which can provide much-needed training data sets but often yields incorrect labels that can significantly affect the training of a deep neural network. In this paper, we exploit uncertainty to identify a specific type of label noise in the semantic segmentation of buildings in satellite imagery. This type of label noise is known as “omission noise,” i.e., missing labels for whole buildings that still appear in the satellite image. According to the literature, uncertainty during training can help identify the “sweet spot” between generalizing well and overfitting to label noise, which is then used to differentiate between noisy and clean labels. This differentiation is based on pixel-wise uncertainty estimation and on fitting beta distributions to the uncertainty estimates. For our study, we create a data set for building segmentation with different levels of omission noise to evaluate the impact of the noise level on the performance of the deep neural network during training. In doing so, we show that established uncertainty-based methods for identifying noisy labels are, in general, not sufficient for our kind of remote sensing data. On the other hand, for some noise levels, we observe promising differences between noisy and clean data, which opens up the possibility of refining the state-of-the-art methods further.
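To make the notion of omission noise concrete, the following is a minimal sketch of how whole-building labels could be removed from a binary footprint mask at a chosen noise level. It is an illustration only and not the procedure used in the paper: the function names `label_buildings` and `add_omission_noise`, the use of 4-connectivity to define a "building", the rounding of the number of dropped buildings, and the toy mask are all assumptions.

```python
import random
from collections import deque


def label_buildings(mask):
    """Group foreground pixels (value 1) into 4-connected components.

    Assumption: one connected component stands for one building.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for r in range(height):
        for c in range(width):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill collects the pixels of one footprint.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def add_omission_noise(mask, noise_level, seed=0):
    """Return (noisy_mask, dropped), leaving the input mask unchanged.

    A fraction `noise_level` of the buildings (rounded with Python's
    built-in round) is relabeled as background, which mimics labels
    that are missing for entire buildings.
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must lie in [0, 1]")
    buildings = label_buildings(mask)
    rng = random.Random(seed)  # seeded so the noise is reproducible
    n_drop = round(noise_level * len(buildings))
    dropped = rng.sample(buildings, n_drop)
    noisy = [row[:] for row in mask]
    for pixels in dropped:
        for y, x in pixels:
            noisy[y][x] = 0
    return noisy, dropped


# Hypothetical 5x5 label mask containing four separate buildings.
clean_mask = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 1, 1],
]
noisy_mask, dropped = add_omission_noise(clean_mask, noise_level=0.5, seed=1)
print(len(label_buildings(clean_mask)), "buildings before,",
      len(label_buildings(noisy_mask)), "after")
```

Keeping the list of dropped components alongside the noisy mask gives a pixel-level ground truth of which labels are noisy, which is what any method for separating clean from noisy labels would have to be evaluated against.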