Self-supervised learning of Vision Transformers for digital soil mapping using visual data

IF 5.6; Zone 1, Agricultural and Forestry Sciences; Q1 SOIL SCIENCE; Geoderma; Pub Date: 2024-10-01; DOI: 10.1016/j.geoderma.2024.117056
Paul Tresson, Maxime Dumont, Marc Jaeger, Frédéric Borne, Stéphane Boivin, Loïc Marie-Louise, Jérémie François, Hassan Boukcim, Hervé Goëau
Geoderma, volume 450, Article 117056. https://www.sciencedirect.com/science/article/pii/S0016706124002854
Citations: 0

Abstract

In arid environments, prospecting for cultivable land is challenging because of harsh climatic conditions and vast, hard-to-access areas. However, the soil is often bare, with little vegetation cover, making it easy to observe from above. Hence, remote sensing can drastically reduce the cost of exploring these areas. Over the past few years, deep learning has extended remote sensing analysis, first with Convolutional Neural Networks (CNNs), then with Vision Transformers (ViTs). The main drawback of deep learning methods is their reliance on large calibration datasets, since data collection is cumbersome and costly, particularly in drylands. However, recent studies demonstrate that ViTs can be pre-trained in a self-supervised manner on large amounts of unlabelled data. These backbone models can then be fine-tuned to learn a supervised regression model from little labelled data.
In our study, we trained ViTs in a self-supervised way on a 9500 km² satellite image of drylands in Saudi Arabia with a spatial resolution of 1.5 m per pixel. The resulting models were used to extract features describing the bare soil and to predict soil attributes (pH H₂O, pH KCl, Si composition). Using only RGB data, we can accurately predict these soil properties and achieve, for instance, an RMSE of 0.40 ± 0.03 when predicting alkaline soil pH. We also assess the effectiveness of adding covariates, such as elevation. The pretrained models can also be used as visual feature extractors. These features can be used to automatically generate a clustered map of an area or as input to random forest models, providing a versatile way to generate maps with limited labelled data and input variables.
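As a reading aid for a figure such as "RMSE of 0.40 ± 0.03", the following is a minimal sketch of how a root-mean-square error can be computed per evaluation fold and then summarized as mean ± standard deviation. The abstract does not say what the ± denotes; treating it as the standard deviation across folds is an assumption made here. The function names `rmse` and `summarize` and all pH values are hypothetical and purely illustrative; they are not the authors' code or data.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def summarize(scores):
    """Return (mean, sample standard deviation) of per-fold scores."""
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical measured vs. predicted soil pH for three folds (made-up numbers,
# not taken from the paper).
folds = [
    ([8.1, 8.6, 9.0], [8.4, 8.3, 9.4]),
    ([7.9, 8.8, 8.5], [8.3, 8.4, 8.9]),
    ([8.2, 8.7, 9.1], [8.6, 8.2, 8.8]),
]

fold_rmse = [rmse(measured, predicted) for measured, predicted in folds]
mean, std = summarize(fold_rmse)
print(f"RMSE = {mean:.2f} ± {std:.2f}")
```

Because RMSE is expressed in the units of the target, a value reported for pH can be read directly as a typical error in pH units.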
Source journal
Geoderma (Agricultural and Forestry Sciences, Soil Science)
CiteScore: 11.80
Self-citation rate: 6.60%
Articles published: 597
Review time: 58 days
Journal description: Geoderma - the global journal of soil science - welcomes authors, readers and soil research from all parts of the world, encourages worldwide soil studies, and embraces all aspects of soil science and its associated pedagogy. The journal particularly welcomes interdisciplinary work focusing on dynamic soil processes and functions across space and time.