PatchOut: A novel patch-free approach based on a transformer-CNN hybrid framework for fine-grained land-cover classification on large-scale airborne hyperspectral images
Renjie Ji , Kun Tan , Xue Wang , Shuwei Tang , Jin Sun , Chao Niu , Chen Pan
DOI: 10.1016/j.jag.2025.104457
Journal: International Journal of Applied Earth Observation and Geoinformation, Volume 138, Article 104457 (published 2025-03-05; impact factor 7.6; JCR Q1, Remote Sensing)
URL: https://www.sciencedirect.com/science/article/pii/S1569843225001049
Citations: 0
Abstract
Airborne hyperspectral systems can provide high-resolution hyperspectral images (HSIs) covering large scenes, enabling fine-grained land-cover classification. However, the most popular patch-based methods suffer from low computational efficiency and fragmented classification results, which hinders the full use of this technology in Earth observation applications. To meet the efficiency requirements of large-scale land-cover classification, this paper proposes PatchOut, a novel patch-free approach based on a Transformer-CNN hybrid framework. PatchOut adopts an encoder-decoder architecture, enabling rapid semantic segmentation for HSI classification. In the encoder, a computationally efficient reduced Transformer module is integrated with a convolutional neural network (CNN) to leverage their complementary strengths in long-range and local feature extraction, respectively. A multi-scale spatial-spectral feature fusion (MSSSFF) module is also proposed to fuse features from different encoder levels, which enhances the overall feature representation. Then, to address the loss of semantic detail and resolution inherent in multi-level feature extraction, a novel feature reconstruction module (FRM) is applied to recover high-quality semantic features. Finally, a large-scale benchmark dataset, Qingpu-HSI, is presented, comprising airborne HSIs covering 33.91 km² with 20 land-cover classes. Experiments on Qingpu-HSI and another public dataset demonstrate the superior accuracy and efficiency of the proposed PatchOut framework, which outperforms several well-known patch-free and patch-based methods. The Qingpu-HSI dataset and the PatchOut framework code will be released at https://github.com/busbyjrj/PatchOut.
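The efficiency gap between patch-based and patch-free inference can be illustrated with a small counting exercise. The sketch below is not the PatchOut implementation and does not reproduce any measurement from the paper. It assumes the usual patch-based setup, in which each pixel is classified from a square window centred on it, and it counts only how many input values are read, not the real model cost. The image size, the patch size, and all function names are hypothetical.

```python
def extract_patch(image, row, col, patch_size):
    """Return the patch_size x patch_size window centred on (row, col).

    Positions outside the image are zero-padded. patch_size is assumed odd.
    """
    half = patch_size // 2
    height, width = len(image), len(image[0])
    patch = []
    for r in range(row - half, row + half + 1):
        patch_row = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < height and 0 <= c < width
            patch_row.append(image[r][c] if inside else 0)
        patch.append(patch_row)
    return patch


def count_patch_based_reads(image, patch_size):
    """Input values read when every pixel gets its own patch (padding included)."""
    reads = 0
    for row in range(len(image)):
        for col in range(len(image[0])):
            patch = extract_patch(image, row, col, patch_size)
            reads += sum(len(patch_row) for patch_row in patch)
    return reads


def count_patch_free_reads(image):
    """Input values read when the whole image is segmented in a single pass."""
    return sum(len(row) for row in image)


# Hypothetical toy scene: 4 x 6 pixels, one band, 5 x 5 patches.
image = [[0] * 6 for _ in range(4)]
patch_based = count_patch_based_reads(image, patch_size=5)
patch_free = count_patch_free_reads(image)
print(patch_based, patch_free, patch_based // patch_free)  # 600 24 25
```

Under these assumptions, the patch-based route reads each neighbourhood once per pixel, so the redundancy grows with the square of the patch size, while the patch-free route reads every pixel once regardless of scene size.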
About the journal:
The International Journal of Applied Earth Observation and Geoinformation publishes original papers that utilize earth observation data for natural resource and environmental inventory and management. These data primarily originate from remote sensing platforms, including satellites and aircraft, supplemented by surface and subsurface measurements. Addressing natural resources such as forests, agricultural land, soils, and water, as well as environmental concerns like biodiversity, land degradation, and hazards, the journal explores conceptual and data-driven approaches. It covers geoinformation themes like capturing, databasing, visualization, interpretation, data quality, and spatial uncertainty.