The illusion of success: Test set disproportion causes inflated accuracy in remote sensing mapping research

Yuanjun Xiao, Zhen Zhao, Jingfeng Huang, Ran Huang, Wei Weng, Gerui Liang, Chang Zhou, Qi Shao, Qiyu Tian
DOI: 10.1016/j.jag.2024.104256
Journal: International Journal of Applied Earth Observation and Geoinformation: ITC Journal, vol. 135, Article 104256 (impact factor 7.6; JCR Q1, Remote Sensing)
Published: 2024-11-16
URL: https://www.sciencedirect.com/science/article/pii/S1569843224006125

Abstract

In remote sensing mapping studies, selecting an appropriate test set is critical to evaluating results accurately. An imprecise accuracy assessment can be misleading and can fail to validate the applicability of mapping products. Starting from the WHU-Hi-HanChuan dataset, this paper reveals the impact of test-set sample size ratios on accuracy metrics by generating a series of test sets with varying ratios of positive to negative samples and using each one to evaluate the same map. A rigorous approach to accuracy assessment is proposed, and a tea plantation mapping example is used to demonstrate the process and to analyse potential issues in traditional approaches. A scale factor (λ) is constructed to measure the discrepancy in sample size ratios between test sets and actual conditions. Accuracy adjustment formulas based on λ are developed and applied to adjust the accuracy of 42 previous maps. Results show that a higher ratio of positive to negative samples in the test set leads to inflated user’s accuracy (UA), F1-score (F1), and overall accuracy (OA), but has little impact on producer’s accuracy. When the ratio matches that of the target area, UA, F1, and OA closely match the true values, indicating that the proportion of positive and negative samples in the test set should be consistent with the actual situation. The accuracies reported by traditional approaches, including sampling the test set from labelled data and 5-fold cross-validation, were far from the true accuracy and could not reflect the performance of the map. Among the 42 previous maps, nearly 60% had UAs overestimated by 10%, and 9.5% had UA and F1 deviations of more than 25%. The conclusions of this study provide a clear caution for future mapping research and assist in producing and identifying truly excellent maps.
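The ratio effect described in the abstract can be illustrated with a small sketch. It is not the paper's method: the abstract does not give the definition of λ or the adjustment formulas, so the code below assumes λ is the test-set positive-to-negative ratio divided by the ratio in the target area, and it adjusts a confusion matrix by scaling the negative-class counts by λ. The names `Confusion`, `expected_confusion`, `reweight_negatives`, and the rates 0.95 and 0.90 are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: float  # reference positives mapped as positive
    fn: float  # reference positives mapped as negative
    fp: float  # reference negatives mapped as positive
    tn: float  # reference negatives mapped as negative


def metrics(c: Confusion) -> dict:
    """User's accuracy, producer's accuracy, F1 and overall accuracy."""
    ua = c.tp / (c.tp + c.fp)
    pa = c.tp / (c.tp + c.fn)
    f1 = 2 * ua * pa / (ua + pa)
    oa = (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)
    return {"UA": ua, "PA": pa, "F1": f1, "OA": oa}


def expected_confusion(n_pos: float, n_neg: float,
                       tpr: float, tnr: float) -> Confusion:
    """Expected counts when the same map is checked against a test set
    holding n_pos positive and n_neg negative samples (hypothetical helper)."""
    return Confusion(tp=n_pos * tpr, fn=n_pos * (1 - tpr),
                     fp=n_neg * (1 - tnr), tn=n_neg * tnr)


def reweight_negatives(c: Confusion, lam: float) -> Confusion:
    """Scale negative-class counts by lam so that the positive:negative
    ratio matches the target area. This is an assumed adjustment, not
    necessarily the formula used in the paper."""
    return Confusion(tp=c.tp, fn=c.fn, fp=c.fp * lam, tn=c.tn * lam)


# Hypothetical map quality: 95% of positives and 90% of negatives correct.
TPR, TNR = 0.95, 0.90
TRUE_RATIO = 1 / 9  # assumed positive:negative ratio in the target area

balanced = expected_confusion(1000, 1000, TPR, TNR)      # 1:1 test set
proportional = expected_confusion(1000, 9000, TPR, TNR)  # 1:9 test set

lam = (1000 / 1000) / TRUE_RATIO  # assumed definition of the scale factor
adjusted = reweight_negatives(balanced, lam)

for name, c in [("balanced", balanced),
                ("proportional", proportional),
                ("adjusted", adjusted)]:
    print(name, {k: round(v, 3) for k, v in metrics(c).items()})
```

Under these assumptions the balanced test set reports a UA of about 0.90 and an OA of 0.925, while the proportional test set gives a UA of about 0.51 and an OA of 0.905; PA stays at 0.95 in both. Reweighting the balanced matrix by λ reproduces the proportional values, which is the behaviour the abstract attributes to matching the test-set ratio to the target area.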
Source journal
International Journal of Applied Earth Observation and Geoinformation: ITC Journal
Subject areas: Global and Planetary Change; Management, Monitoring, Policy and Law; Earth-Surface Processes; Computers in Earth Sciences
CiteScore: 12.00
Self-citation rate: 0.00%
Articles published: 0
Review time: 77 days
Journal description: The International Journal of Applied Earth Observation and Geoinformation publishes original papers that utilize earth observation data for natural resource and environmental inventory and management. These data primarily originate from remote sensing platforms, including satellites and aircraft, supplemented by surface and subsurface measurements. Addressing natural resources such as forests, agricultural land, soils, and water, as well as environmental concerns like biodiversity, land degradation, and hazards, the journal explores conceptual and data-driven approaches. It covers geoinformation themes like capturing, databasing, visualization, interpretation, data quality, and spatial uncertainty.