Dynamic Token Augmentation Mamba for Cross-Scene Classification of Hyperspectral Image

IF 8.6 | Zone 1 (Earth Sciences) | JCR Q1 (Engineering, Electrical & Electronic) | IEEE Transactions on Geoscience and Remote Sensing | Pub Date: 2024-11-26 | DOI: 10.1109/TGRS.2024.3506749

Xizeng Huang;Yuxiang Zhang;Fulin Luo;Yanni Dong

IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-13. Publication type: Journal Article. Open access: no. Listing: https://ieeexplore.ieee.org/document/10768958/

Citations: 0

Abstract

Cross-scene classification of hyperspectral images (HSIs) based on single-source domain generalization (SDG) aims to develop a model that can classify images from unseen target domains using only source-domain images, without retraining. Most existing SDG approaches for cross-scene classification rely on convolutional neural networks (CNNs). However, the convolutional kernel operation causes the model to emphasize local object features, which can lead to overfitting on the source domain and limit its ability to generalize. Recently, methods based on the state space model (SSM) have demonstrated excellent performance in image classification by capturing global features across different image patches. Building on this, we propose a novel approach called Dynamic Token Augmentation Mamba (DTAM), which explores the potential of SSMs in the cross-scene classification of HSIs. The method gradually focuses on the global features of the image by constructing hidden states for HSIs unfolded into long sequences. To further enhance the global features of HSIs, we design a dynamic token augmentation (DTA) module that transforms the sample features by perturbing the contextual information while preserving the object information tokens. Additionally, we introduce a classification compensation loss, combined with the labels of random samples, to suppress excessive narrowing of the feature range learned by the model. Extensive experiments on three publicly available HSI datasets show that the proposed method outperforms state-of-the-art (SOTA) methods. Our code is available at https://github.com/Varro-pepsi/DTAM.
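To make the two ideas in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' DTAM implementation (see their repository for that). It assumes a toy HSI patch stored as nested lists, a fixed scalar-decay linear recurrence standing in for a full selective SSM, and a hypothetical rule that treats only the center pixel as the "object" token while Gaussian noise perturbs every other (context) token. The function names `unfold_patch`, `ssm_scan`, and `perturb_context` and all parameter values are made up for illustration.

```python
import random


def unfold_patch(patch):
    """Flatten an H x W x C patch (nested lists) into H*W tokens, row by row."""
    return [list(pixel) for row in patch for pixel in row]


def ssm_scan(tokens, a=0.9, b=0.1):
    """Linear state-space recurrence h_t = a * h_{t-1} + b * x_t, per channel.

    Assumption: a and b are fixed scalars. Mamba-style models make these
    input-dependent; this sketch only shows how a hidden state accumulates
    information along the unfolded sequence.
    """
    h = [0.0] * len(tokens[0])
    states = []
    for x in tokens:
        h = [a * h_i + b * x_i for h_i, x_i in zip(h, x)]
        states.append(h)
    return states


def perturb_context(tokens, keep, noise_std=0.1, seed=0):
    """Add Gaussian noise to every token whose index is not in `keep`.

    Assumption: `keep` is supplied by the caller. How object tokens are
    actually selected in DTAM is not described in the abstract.
    """
    rng = random.Random(seed)
    keep = set(keep)
    augmented = []
    for i, token in enumerate(tokens):
        if i in keep:
            augmented.append(list(token))  # object token: left unchanged
        else:
            augmented.append([v + rng.gauss(0.0, noise_std) for v in token])
    return augmented


if __name__ == "__main__":
    # Toy 3 x 3 patch with 2 spectral bands per pixel.
    patch = [[[float(r + c), float(r * c)] for c in range(3)] for r in range(3)]
    tokens = unfold_patch(patch)
    center = len(tokens) // 2  # hypothetical "object" token
    augmented = perturb_context(tokens, keep=[center])
    states = ssm_scan(augmented)
    print(len(tokens), "tokens; final hidden state has", len(states[-1]), "channels")
```

In this sketch the last hidden state depends on every token in the sequence, which is the sense in which a recurrent state can summarize global context, while the perturbation step changes only the context tokens and leaves the chosen object token intact.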


Source journal

IEEE Transactions on Geoscience and Remote Sensing (Engineering & Technology: Geochemistry and Geophysics)

CiteScore: 11.50

Self-citation rate: 28.00%

Articles published: 1912

Review time: 4.0 months

Journal description: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.