VCDFormer: Investigating cloud detection approaches in sub-second-level satellite videos

Xianyu Jin, Jiang He, Yi Xiao, Ziyang Lihe, Jie Li, Qiangqiang Yuan
International Journal of Applied Earth Observation and Geoinformation: ITC Journal, vol. 138, Article 104465. Published 2025-04-01 (Epub 2025-03-15). Impact factor 8.6; JCR Q1 (Remote Sensing).
DOI: 10.1016/j.jag.2025.104465
https://www.sciencedirect.com/science/article/pii/S1569843225001128

Abstract

Satellite video, as an emerging data source for Earth observation, enables dynamic monitoring and has wide-ranging applications in diverse fields. Nevertheless, cloud occlusion hinders the ability of satellite video to provide uninterrupted monitoring of the Earth’s surface. To mitigate the interference of clouds, cloud-free areas need to be selected before application, or an optimized solution like a cloud removal algorithm can be utilized to recover the occluded regions, both of which inherently demand the precise detection of clouds. However, no existing methods are capable of robust cloud detection in satellite videos. We propose the first sub-second-level satellite video cloud detection model VCDFormer to handle this problem. In VCDFormer, a spatial–temporal-enhanced transformer consisting of a local spatial–temporal reconfiguration block and a spatial-enhanced block is introduced to explore global spatial–temporal correspondence efficiently. Additionally, we construct WHU-VCD, the first sub-second-level synthetic dataset specifically designed to capture the more realistic motion characteristics of both thick and thin clouds in satellite videos. Compared to the state-of-the-art cloud detection methods, VCDFormer achieves an approximate 10%–15% improvement in the IoU metric and a 5%–8% increase in the F1-Score on the simulated test set. Experimental evaluations on Jilin-1 satellite videos, involving both synthetic and real-world scenarios, demonstrate that our proposed VCDFormer achieves superior performance in satellite video cloud detection tasks. The source code is available at https://github.com/XyJin99/VCDFormer.
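
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how IoU and F1-Score are commonly computed for a pair of binary cloud masks. It is not taken from the VCDFormer repository: the function name `iou_and_f1`, the toy masks, and the convention of returning 1.0 when both masks are empty are illustrative assumptions.

```python
from typing import Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2-D binary mask: 1 = cloud, 0 = clear


def iou_and_f1(pred: Mask, truth: Mask) -> Tuple[float, float]:
    """Return (IoU, F1) for two binary masks of the same shape.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # cloud predicted, cloud present
            elif p and not t:
                fp += 1          # cloud predicted, surface is clear
            elif t and not p:
                fn += 1          # cloud missed
    union = tp + fp + fn
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


# Toy 2x3 masks (made-up values, not data from the paper).
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
iou, f1 = iou_and_f1(pred, truth)
print(f"IoU = {iou:.3f}, F1 = {f1:.3f}")
```

For a single mask pair the two metrics are tied by F1 = 2·IoU / (1 + IoU), so they always rank predictions in the same order. That identity does not carry over to averages, so when scores are averaged over frames or videos the two metrics can show gains of different size.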