Xu He;Shilin Zhou;Qiang Ling;Miao Li;Zhaoxu Li;Yuyuan Zhang;Zaiping Lin
DOI: 10.1109/TGRS.2024.3456799
Journal: IEEE Transactions on Geoscience and Remote Sensing (Impact Factor 7.5; JCR Q1, Engineering, Electrical & Electronic)
Published: 2024-09-10
Source: https://ieeexplore.ieee.org/document/10670704/
Global-to-Local Spatial–Spectral Awareness Transformer Network for Hyperspectral Anomaly Detection
Hyperspectral anomaly detection (HAD) is a key technology in Earth observation and remote sensing monitoring. Owing to their strong deep feature extraction abilities, deep convolutional networks (DCNs) perform well in HAD. However, because their receptive fields are local, DCN-based detection methods struggle to capture long-range dependencies from a global perspective. In contrast, vision transformers (ViTs) are better at global feature extraction but neglect local dependencies. To address this, we propose the global-to-local spatial-spectral awareness transformer (G2LSSAT) network, in which a global transformer block (GTB) and a local transformer block (LTB) are deployed in sequence to capture deep reconstruction characteristics from the global view to the local view in the spatial-spectral domain. Specifically, the GTB explores global spatial-spectral characteristics using a crossbar-based global sparse attention module. The globally glanced image is then divided into multiple local patches, and the LTB learns local spatial-spectral features using a patch-based local self-invisible attention module. In addition, because anomalous pixels tend to be unexpectedly reconstructed by the conventional self-attention module in ViTs, we introduce an invisible diagonal mask (IDM), embedded in the LTB, that hides each pixel from itself within the receptive field so that the pixel is reconstructed from global and local spatial-spectral dependencies. Extensive experiments on six datasets show that the proposed G2LSSAT outperforms other state-of-the-art detectors.
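The idea behind the invisible diagonal mask can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `masked_self_attention`, the random projection matrices standing in for learned weights, and the single-head layout are all assumptions made for illustration. It shows only the masking step, in which the diagonal of the attention score matrix is set to negative infinity so that no pixel in a patch can attend to itself.

```python
import numpy as np

def masked_self_attention(x, rng=None):
    """Single-head self-attention over one local patch with the diagonal masked.

    x: array of shape (n_pixels, n_bands), with n_pixels >= 2.
    Returns (reconstruction, attention_weights).
    """
    n, d = x.shape
    rng = np.random.default_rng(0) if rng is None else rng
    # Hypothetical random projections; a trained network would learn these.
    w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = x @ w_q, x @ w_k, x @ w_v

    scores = q @ k.T / np.sqrt(d)          # (n, n) pairwise similarities
    np.fill_diagonal(scores, -np.inf)      # hide each pixel from itself

    # Numerically stable row-wise softmax; masked entries become exactly 0.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Usage: a 3x3 patch flattened to 9 pixels with 5 spectral bands.
patch = np.random.default_rng(1).standard_normal((9, 5))
recon, attn = masked_self_attention(patch)
```

With the diagonal masked, each pixel's output is a weighted sum of the values of the other pixels in the patch only, so an anomalous pixel cannot contribute its own value to its reconstruction.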
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.