MoMFormer: Mixture of modality transformer model for vegetation extraction under shadow conditions
Yingxuan He, Wei Chen, Zhou Huang, Qingpeng Wang
Ecological Informatics (JCR Q1, Ecology; IF 5.8), published 2024-09-12. DOI: 10.1016/j.ecoinf.2024.102818
Code and datasets: https://github.com/hhhxiaohe/MoMFormer
Citations: 0
Abstract
Accurate estimation of fractional vegetation coverage (FVC) is essential for assessing the ecological environment and acquiring ecological information. However, under natural lighting conditions, shadows in vegetation scenes can easily cause shadowed vegetation and shadowed soil to be confused, producing misclassification and omission errors. This issue limits the precision of both vegetation extraction and FVC estimation. To address this challenge, this study introduces a novel deep learning model, the Mixture of Modality Transformer (MoMFormer), which is specifically designed to mitigate shadow interference in vegetation extraction. Our model uses the Swin Transformer V2 as a feature extractor, effectively capturing vegetation features from a dual-modality (regular-exposure RGB and high-dynamic-range HDR) dataset. A dynamic aggregation module (DAM) is integrated to adaptively blend the most relevant vegetation features. We conducted extensive experiments on a self-annotated dataset featuring diverse vegetation–soil scenes and compared our model with several state-of-the-art (SOTA) methods. The results demonstrate that MoMFormer achieves an accuracy of 89.43% on the HDR-RGB dual-modality dataset, with an FVC accuracy of 87.57%, outperforming the other algorithms and demonstrating high vegetation extraction accuracy and adaptability under natural lighting conditions. This research offers new insights into accurate vegetation information extraction in naturally lit environments with shadows, providing robust technical support for high-precision validation of vegetation coverage products and algorithms based on multimodal data. The code and datasets used in this study are publicly available at https://github.com/hhhxiaohe/MoMFormer.
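The two quantities the abstract leans on can be made concrete: FVC is simply the fraction of pixels classified as vegetation, and the DAM-style fusion blends the two modality feature maps with per-pixel weights favoring the more informative modality. The sketch below is illustrative only and is not the paper's implementation: the softmax-weighted blend and the `score_rgb`/`score_hdr` relevance maps are assumptions standing in for whatever the actual dynamic aggregation module learns.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def blend_modalities(feat_rgb, feat_hdr, score_rgb, score_hdr):
    """Per-pixel weighted blend of two modality feature maps (C, H, W).

    Hypothetical stand-in for a dynamic aggregation module: a softmax over
    per-modality relevance scores yields weights summing to 1 at each pixel,
    so the more relevant modality dominates the fused feature there.
    """
    scores = np.stack([score_rgb, score_hdr])            # (2, H, W)
    w = softmax(scores, axis=0)                          # weights sum to 1
    return w[0][None] * feat_rgb + w[1][None] * feat_hdr # (C, H, W)

def fvc(mask):
    """Fractional vegetation coverage: vegetation pixels / total pixels."""
    return np.asarray(mask, dtype=bool).mean()
```

For example, a 2x2 mask with three vegetation pixels gives an FVC of 0.75; in the blend, a pixel whose RGB relevance score far exceeds its HDR score takes its fused feature almost entirely from the RGB map.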
About the journal:
The journal Ecological Informatics is devoted to the publication of high-quality, peer-reviewed articles on all aspects of computational ecology, data science and biogeography. The scope of the journal takes into account the data-intensive nature of ecology, the growing capacity of information technology to access, harness and leverage complex data, and the critical need for informing sustainable management in view of global environmental and climate change.
The journal is interdisciplinary, at the interface of ecology and informatics. It focuses on novel concepts and techniques for image- and genome-based monitoring and interpretation, sensor- and multimedia-based data acquisition, internet-based data archiving and sharing, data assimilation, and the modelling and prediction of ecological data.