Dynamic Token Augmentation Mamba for Cross-Scene Classification of Hyperspectral Image
Authors: Xizeng Huang; Yuxiang Zhang; Fulin Luo; Yanni Dong
IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1-13, published 2024-11-26. DOI: 10.1109/TGRS.2024.3506749
https://ieeexplore.ieee.org/document/10768958/
Cross-scene classification of hyperspectral images (HSIs) based on single-source domain generalization (SDG) aims to develop a model that can classify images from unseen target domains using only source-domain images, without retraining. Most existing SDG approaches for cross-scene classification rely on convolutional neural networks (CNNs). However, the convolutional kernel operation causes the model to emphasize local object features, which can lead to overfitting on the source domain and limit its ability to generalize. Recently, methods based on the state space model (SSM) have demonstrated excellent performance in image classification by capturing global features across different image patches. Building on this, we propose a novel approach called dynamic token augmentation Mamba (DTAM), which explores the potential of SSMs in the cross-scene classification of HSIs. The method gradually focuses on the global features of the image by constructing hidden states for HSIs unfolded into long sequences. To further enhance the global features of HSIs, we design a dynamic token augmentation (DTA) module that transforms the sample features by perturbing the contextual information while preserving the object information tokens. Additionally, we introduce a loss of classified compensation, combined with labels of random samples, to suppress excessive narrowing of the feature range learned by the model. Extensive experiments on three publicly available HSI datasets show that the proposed method outperforms state-of-the-art (SOTA) methods. Our code is available at https://github.com/Varro-pepsi/DTAM.
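The two ideas named in the abstract, scanning an HSI patch as a token sequence with a hidden state and perturbing contextual tokens while leaving object tokens intact, can be illustrated with a minimal Python sketch. This is not the DTAM implementation: the function names, the fixed diagonal parameters `a`, `b`, `c`, the externally supplied `keep_mask`, and the Gaussian noise are all assumptions made for illustration. Mamba-style models use input-dependent (selective) parameters, and the abstract does not state how DTA selects or perturbs tokens.

```python
import random


def unfold_patch(patch):
    """Flatten an H x W x C patch (nested lists) into a sequence of H*W tokens."""
    return [pixel for row in patch for pixel in row]


def ssm_scan(tokens, a, b, c):
    """Run a diagonal linear state-space recurrence over a token sequence.

    h_t = a * h_{t-1} + b * x_t   (elementwise, one state per channel)
    y_t = c * h_t
    The parameters are fixed here (hypothetical simplification).
    """
    state = [0.0] * len(a)
    outputs = []
    for x in tokens:
        state = [a_i * h_i + b_i * x_i
                 for a_i, h_i, b_i, x_i in zip(a, state, b, x)]
        outputs.append([c_i * h_i for c_i, h_i in zip(c, state)])
    return outputs


def perturb_context(tokens, keep_mask, scale, rng):
    """Add Gaussian noise to tokens whose mask entry is False.

    Tokens flagged True stand in for "object" tokens and are copied unchanged.
    How the mask is obtained is left open (assumption).
    """
    out = []
    for tok, keep in zip(tokens, keep_mask):
        if keep:
            out.append(list(tok))
        else:
            out.append([v + rng.gauss(0.0, scale) for v in tok])
    return out


# A toy 2 x 2 patch with 2 spectral channels per pixel.
patch = [[[1.0, 2.0], [3.0, 4.0]],
         [[5.0, 6.0], [7.0, 8.0]]]
tokens = unfold_patch(patch)

# The last output depends on every earlier token through the hidden state.
outputs = ssm_scan(tokens, a=[0.5, 0.5], b=[1.0, 1.0], c=[1.0, 1.0])
print(outputs[-1])  # [10.375, 12.25]

# Perturb the second and fourth tokens only.
augmented = perturb_context(tokens, [True, False, True, False],
                            scale=0.1, rng=random.Random(0))
```

Because each hidden state folds in all previous tokens, the final output summarizes the whole unfolded patch rather than a local neighborhood, which is the property the abstract contrasts with convolutional kernels.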
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.