Tensor Transformer for hyperspectral image classification

Pattern Recognition · IF 9.1 · Zone 1, Computer Science · JCR Q1 (Computer Science, Artificial Intelligence) · Pub Date: 2025-07-01 · Epub Date: 2025-02-22 · DOI: 10.1016/j.patcog.2025.111470

Wei-Tao Zhang, Yv Bai, Sheng-Di Zheng, Jian Cui, Zhen-zhen Huang

Pattern Recognition, volume 163, Article 111470. Full text: https://www.sciencedirect.com/science/article/pii/S003132032500130X

Citations: 0

Abstract

Hyperspectral images (HSIs) are widely used in real-world classification tasks because they contain rich spatial and spectral features spread across hundreds of contiguous bands. In recent years, deep learning-based HSI classification methods, such as convolutional neural networks (CNNs) and Transformers, have achieved good performance. Transformer-based networks, owing to their capacity to extract long-range features, frequently outperform CNN-based networks in HSI classification. However, Transformer-based methods require the raw 3-D HSI data to be sequentialized, which can disrupt the spatial–spectral structural features and, in turn, degrade classification accuracy. In this paper, we propose a Tensor Transformer (TT) framework for HSI classification. The TT model is an end-to-end network that takes the raw HSI tensor data directly as the input sample, with no sequentialization step. The core component of the framework is the Tensor Self-Attention Mechanism (TSAM), which enables the network to efficiently extract long-range spatial–spectral structural features without losing the inherent structural relationships within the sample. In extensive experiments on four widely used HSI datasets, the proposed TT model outperforms state-of-the-art methods at discriminating land features with similar spectra.
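The sequentialization concern can be made concrete with a small sketch. It assumes a row-major flattening of a toy 3×3 patch with 4 bands, which is one common way to build a token sequence and not necessarily the ordering used by any particular method; the helper names (`sequentialize`, `token_index`, `sequence_distance`) are hypothetical. It does not implement TSAM, whose details the abstract does not give.

```python
# Illustrative only: shows how flattening a 3-D HSI patch into a token
# sequence changes which pixels sit next to each other.
# Assumption: row-major flattening, one token per pixel (its spectrum).

def sequentialize(patch):
    """Flatten an H x W x B patch (nested lists) into H*W spectral tokens."""
    return [pixel for row in patch for pixel in row]


def token_index(row, col, width):
    """Position of pixel (row, col) in the row-major token sequence."""
    return row * width + col


def sequence_distance(a, b, width):
    """How far apart two pixels end up in the flattened sequence."""
    return abs(token_index(a[0], a[1], width) - token_index(b[0], b[1], width))


# Toy patch: 3 x 3 pixels, 4 spectral bands per pixel.
H, W, B = 3, 3, 4
patch = [
    [[float(r * 100 + c * 10 + b) for b in range(B)] for c in range(W)]
    for r in range(H)
]

tokens = sequentialize(patch)

center = (1, 1)
# Both neighbours are exactly one pixel away from the centre in the image.
horizontal = sequence_distance(center, (1, 2), W)
vertical = sequence_distance(center, (2, 1), W)

print(len(tokens), len(tokens[0]))  # number of tokens, bands per token
print(horizontal, vertical)         # sequence gaps for the two neighbours
```

In this toy case the horizontal neighbour stays adjacent in the sequence while the vertical neighbour lands `W` positions away, so the 2-D neighbourhood is no longer explicit in the token order. A model that consumes the patch as a tensor, as the TT framework is described as doing, keeps the `(row, col, band)` indexing intact.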


Source journal

Pattern Recognition (Engineering Technology; Engineering: Electrical & Electronic)

CiteScore: 14.40

Self-citation rate: 16.20%

Articles published: 683

Review time: 5.6 months

Journal description: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.