{"title":"FGRMNet: Fully graph relational matching network for few-shot remote sensing scene classification","authors":"Jacob Regan, Mahdi Khodayar","doi":"10.1016/j.eswa.2025.126823","DOIUrl":null,"url":null,"abstract":"<div><div>Few-shot remote sensing scene classification (FS-RSSC) is an essential task within remote sensing (RS) and aims to develop models that can quickly and accurately adapt to new aerial scene categories provided only a few labeled examples of the novel scenes. Convolutional neural network (CNN)-based methods have demonstrated decent performance for remote sensing scene classification (RSSC) and FS-RSSC, but they cannot handle irregular patterns well. Vision Transformer (ViT) does not suffer from this drawback, but its large data dependency makes it less viable for few-shot learning. To alleviate these weaknesses, we propose a novel end-to-end, fully graph-based framework for FS-RSSC called the fully graph relational matching network (FGRMNet). This framework consists of three principle components: (1) a deep graph neural network (GNN) embedding network comprised of dynamic GCN layers to extract long-range and irregular patterns from aerial scene samples. Unlike CNN, our GNN has a dynamic receptive field allowing it to extract richer, relational connections from object features. (2) A graph contrastive matching module (GCM) consisting of a local–global and global-global contrastive learning objective to improve the robustness and generalization of the embedding network for graph similarity learning by improving how the GNN encoder adapts its receptive field between latent layers. (3) A graph relational attention (GRAT) module, which consists of a graph attention network that learns to measure the similarity between the global graph representations of a query and the support samples by incorporating high-level node information with global graph context in the relational learning step. More precisely, the GRAT module improves the quality of the relational scores by assigning higher value to the parts of a query’s node embeddings most relevant to the comparison between the global representation of the query and the global representation of the support class. Extensive experimentation conducted for FGRMNet on three popular RS datasets demonstrates that our framework achieves state-of-the-art performance.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"274 ","pages":"Article 126823"},"PeriodicalIF":7.5000,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Expert Systems with Applications","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0957417425004452","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Few-shot remote sensing scene classification (FS-RSSC) is an essential task within remote sensing (RS) that aims to develop models that can quickly and accurately adapt to new aerial scene categories given only a few labeled examples of the novel scenes. Convolutional neural network (CNN)-based methods have demonstrated decent performance for remote sensing scene classification (RSSC) and FS-RSSC, but they cannot handle irregular patterns well. The Vision Transformer (ViT) does not suffer from this drawback, but its heavy data requirements make it less viable for few-shot learning. To alleviate these weaknesses, we propose a novel end-to-end, fully graph-based framework for FS-RSSC called the fully graph relational matching network (FGRMNet). This framework consists of three principal components: (1) a deep graph neural network (GNN) embedding network composed of dynamic graph convolutional network (GCN) layers that extract long-range and irregular patterns from aerial scene samples. Unlike a CNN, our GNN has a dynamic receptive field, allowing it to extract richer relational connections from object features. (2) A graph contrastive matching (GCM) module consisting of local–global and global–global contrastive learning objectives that improve the robustness and generalization of the embedding network for graph similarity learning by improving how the GNN encoder adapts its receptive field between latent layers. (3) A graph relational attention (GRAT) module, which consists of a graph attention network that learns to measure the similarity between the global graph representations of a query and the support samples by incorporating high-level node information with global graph context in the relational learning step. More precisely, the GRAT module improves the quality of the relational scores by assigning higher weight to the parts of a query's node embeddings most relevant to the comparison between the global representation of the query and the global representation of the support class. Extensive experiments with FGRMNet on three popular RS datasets demonstrate that our framework achieves state-of-the-art performance.
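The abstract describes three architectural ideas: a GNN encoder whose receptive field is rebuilt dynamically between layers, contrastive objectives that regularize the embedding space, and an attention-based relational head that scores a query against each support class. The sketch below, in plain PyTorch, is only an illustrative reading of the first and third ideas for a toy 5-way episode; every module and design choice (DynamicGCNLayer, GraphEncoder, RelationalAttentionHead, the k-NN graph construction, mean pooling, and the fusion inside the relation head) is an assumption rather than FGRMNet's actual design, and the GCM contrastive losses are not shown.

```python
# Minimal, hypothetical PyTorch sketch of the ideas outlined in the abstract.
# This is NOT the authors' FGRMNet implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicGCNLayer(nn.Module):
    """Graph conv layer that rebuilds a k-NN graph from its *input* features, so the
    receptive field adapts between latent layers (the 'dynamic GCN' idea)."""

    def __init__(self, in_dim: int, out_dim: int, k: int = 8):
        super().__init__()
        self.k = k
        self.proj = nn.Linear(2 * in_dim, out_dim)  # [self feature || mean of neighbours]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, in_dim) node features of one scene graph
        dist = torch.cdist(x, x)                                    # (N, N) pairwise distances
        idx = dist.topk(self.k + 1, largest=False).indices[:, 1:]   # k nearest neighbours, self excluded
        neigh = x[idx].mean(dim=1)                                  # (N, in_dim) neighbourhood summary
        return F.relu(self.proj(torch.cat([x, neigh], dim=-1)))     # (N, out_dim)


class GraphEncoder(nn.Module):
    """Stack of dynamic GCN layers; returns node embeddings and a pooled global embedding."""

    def __init__(self, in_dim: int, hid_dim: int, depth: int = 3, k: int = 8):
        super().__init__()
        dims = [in_dim] + [hid_dim] * depth
        self.layers = nn.ModuleList(
            [DynamicGCNLayer(dims[i], dims[i + 1], k) for i in range(depth)]
        )

    def forward(self, x: torch.Tensor):
        for layer in self.layers:
            x = layer(x)
        return x, x.mean(dim=0)  # node embeddings (N, hid_dim), global embedding (hid_dim,)


class RelationalAttentionHead(nn.Module):
    """Scores a query against one support class, up-weighting the query nodes most
    relevant to that class prototype before fusing node-level and global context."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, query_nodes, query_global, support_global):
        attn = F.softmax(query_nodes @ support_global, dim=0)     # (N,) node relevance weights
        context = (attn.unsqueeze(-1) * query_nodes).sum(dim=0)   # attended query summary (dim,)
        pair = torch.cat([query_global + context, support_global])
        return self.score(pair).squeeze(-1)                       # scalar relation score


# Toy 5-way, 1-shot episode: score one query graph against five support-class prototypes.
encoder, head = GraphEncoder(in_dim=64, hid_dim=128), RelationalAttentionHead(128)
support_globals = torch.stack([encoder(torch.randn(49, 64))[1] for _ in range(5)])
query_nodes, query_global = encoder(torch.randn(49, 64))
logits = torch.stack([head(query_nodes, query_global, s) for s in support_globals])
pred = logits.argmax()  # index of the best-matching support class
```

In an actual episodic training loop, such relation scores would typically be trained with a cross-entropy or regression loss over the support classes, alongside the local–global and global–global contrastive objectives described in the abstract.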
Journal introduction:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.