Explicit High-Level Semantic Network for Domain Generalization in Hyperspectral Image Classification

IF 7.5 · Zone 1 (Earth Sciences) · Q1 ENGINEERING, ELECTRICAL & ELECTRONIC · IEEE Transactions on Geoscience and Remote Sensing · Pub Date: 2024-11-11 · DOI: 10.1109/TGRS.2024.3495765
Xusheng Wang;Shoubin Dong;Xiaorou Zheng;Runuo Lu;Jianxin Jia
Volume 62, pages 1-14. IEEE Xplore: https://ieeexplore.ieee.org/document/10750220/
Citations: 0

Abstract

When applied across different scenes, hyperspectral image (HSI) classification models often struggle to generalize because of disparities in data distribution and the scarcity of labels, which leads to domain shift (DS) problems. Recently, high-level semantics from text have shown potential to address the DS problem: aligning image-text pairs improves the generalization capability of image encoders. However, two main challenges remain: crafting appropriate texts that accurately represent the intricate interrelationships and fragmented nature of land cover in HSIs, and effectively extracting spectral-spatial features from HSI data. This article proposes a domain generalization (DG) method, EHSnet, that addresses these issues by leveraging multilayered explicit high-level semantic (EHS) information from different types of texts to provide precisely relevant semantic information for the image encoder. A multilayered EHS information paradigm is defined to capture the HSI's intricate interrelationships and fragmented land-cover features. A dual-residual encoder connected by a 2-D convolution is also designed; it combines CNNs with residual structure and Vision Transformers (ViTs) with short-range cross-layer connections to explore the spectral-spatial features of HSIs. By aligning text features with image features in the semantic space, EHSnet improves the representation capability of the image encoder and gains zero-shot generalization ability for cross-scene tasks. Extensive experiments on three hyperspectral datasets (Houston, Pavia, and XS) validate the effectiveness and superiority of EHSnet, with the Kappa coefficient improved by 8.17%, 3.22%, and 3.62%, respectively, compared to state-of-the-art (SOTA) methods. The code is available at https://github.com/SCUT-CCNL/EHSnet.
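The abstract mentions two ideas that a small example may make more concrete: zero-shot prediction by comparing image features with text features in a shared semantic space, and evaluation with the Kappa coefficient. The sketch below is not the EHSnet implementation. The embeddings are hand-made toy vectors, the function names (`cosine`, `zero_shot_predict`, `cohen_kappa`) are hypothetical, and cosine similarity is assumed as the matching score because the abstract does not state which similarity measure is used.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_predict(image_feature, text_features):
    """Return the index of the class text embedding closest to the image feature."""
    scores = [cosine(image_feature, t) for t in text_features]
    return max(range(len(scores)), key=scores.__getitem__)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (overall accuracy) and p_e is the agreement
    expected by chance from the label marginals.
    """
    n = len(y_true)
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Kappa is undefined when both label sets are one identical constant
        # class; returning 1.0 here is an assumption of this sketch.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy stand-ins for text embeddings of three land-cover classes (assumed data).
text_features = [
    [1.0, 0.0, 0.0],  # class 0
    [0.0, 1.0, 0.0],  # class 1
    [0.0, 0.0, 1.0],  # class 2
]

# Toy stand-ins for image-encoder outputs on a target scene (assumed data).
image_features = [
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.1],
    [0.1, 0.2, 0.9],
    [0.6, 0.5, 0.0],
    [0.1, 0.7, 0.3],
    [0.5, 0.1, 0.6],
]
y_true = [0, 1, 2, 1, 1, 2]

y_pred = [zero_shot_predict(f, text_features) for f in image_features]
kappa = cohen_kappa(y_true, y_pred)
print("predictions:", y_pred)
print(f"kappa = {kappa:.2f}")  # prints "kappa = 0.75"
```

In this toy run, five of six samples are classified correctly (p_o = 5/6) and chance agreement is p_e = 1/3, so kappa is 0.75. Kappa is lower than raw accuracy because it discounts the agreement that the class frequencies alone would produce.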
Source journal: IEEE Transactions on Geoscience and Remote Sensing
Category: Engineering & Technology, Geochemistry & Geophysics
CiteScore: 11.50
Self-citation rate: 28.00%
Articles published: 1912
Review time: 4.0 months
Journal description: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.