A Dual-Contrast Adaptation Network Coupling Global Context and Geometry Information for Cross-Domain Building Extraction

IF 8.6 | Zone 1 (Earth Sciences) | JCR Q1 (Engineering, Electrical & Electronic) | IEEE Transactions on Geoscience and Remote Sensing | Pub Date: 2025-02-14 | DOI: 10.1109/TGRS.2025.3542481

Gang Xu;Min Deng;Jingru Zhu;Geng Sun;Zhenbo Gou;Ya Guo;Jie Chen

IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-16, 2025. Journal article, not open access. https://ieeexplore.ieee.org/document/10891046/

Citations: 0

Abstract

Semantic segmentation of high-resolution remote sensing imagery (HRSI) suffers from domain shift: a model trained in one domain performs poorly in an unseen domain. Unsupervised domain adaptation (UDA) for semantic segmentation aims to adapt a model trained on a labeled source domain to an unlabeled target domain. However, existing UDA semantic segmentation models tend to focus only on the semantic mask information of buildings and neglect their structural and boundary information, which leads to irregular and inaccurate predictions. We propose a dual-contrast adaptation network that couples global context and geometry information for cross-domain building extraction from HRSI. First, it uses a hierarchical transformer to extract multilevel global context information of buildings from HRSI, and a multilevel context fusion module fuses the context features extracted at the different transformer levels to improve the model's ability to learn domain-invariant knowledge. Then, it employs hierarchical supervision signals from attraction field maps (AFMs) and masks to guide the transformer-based network to transfer building knowledge in a shape-aware manner. Finally, it applies prototype contrastive learning (CL) to the high-level mask prediction and the AFM prediction to bridge the discrepancy between the source and target domains in mask semantic information and geometry information, respectively. Extensive experiments on four cross-domain tasks indicate that the proposed method clearly outperforms state-of-the-art methods.
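The abstract does not give the form of the prototype contrastive loss, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes per-class prototypes computed as mean source-domain feature vectors, cosine similarity, an InfoNCE-style objective with a temperature, and target-domain pseudo-labels. The function names (`mean_prototypes`, `prototype_contrastive_loss`) and the temperature value are hypothetical.

```python
import math


def mean_prototypes(features, labels, num_classes):
    """Average the feature vectors of each class into one prototype per class."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for vec, lab in zip(features, labels):
        counts[lab] += 1
        for i, v in enumerate(vec):
            sums[lab][i] += v
    # A class with no samples gets no prototype (None).
    return [
        [s / counts[c] for s in sums[c]] if counts[c] else None
        for c in range(num_classes)
    ]


def cosine(a, b):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def prototype_contrastive_loss(features, labels, prototypes, temperature=0.1):
    """InfoNCE-style loss (an assumed form): pull each feature toward the
    prototype of its own class and push it away from the other prototypes."""
    valid = [c for c, p in enumerate(prototypes) if p is not None]
    total, used = 0.0, 0
    for vec, lab in zip(features, labels):
        if prototypes[lab] is None:
            continue  # no prototype available for this class
        logits = {c: cosine(vec, prototypes[c]) / temperature for c in valid}
        top = max(logits.values())
        # Numerically stable log-sum-exp over all valid prototypes.
        log_denom = top + math.log(sum(math.exp(v - top) for v in logits.values()))
        total += log_denom - logits[lab]
        used += 1
    return total / used if used else 0.0


# Toy data: class 1 = building, class 0 = background (illustrative values only).
source_features = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.8]]
source_labels = [1, 1, 0, 0]
prototypes = mean_prototypes(source_features, source_labels, num_classes=2)

target_features = [[0.8, 0.2], [0.2, 0.9]]
target_pseudo_labels = [1, 0]  # assumed to come from the model's own predictions
print(prototype_contrastive_loss(target_features, target_pseudo_labels, prototypes))
```

In a dual-contrast setup as described above, a loss of this kind would be applied twice, once to features tied to the mask prediction and once to features tied to the AFM prediction. How the paper defines those features and prototypes is not stated in the abstract.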




Source journal: IEEE Transactions on Geoscience and Remote Sensing (Engineering & Technology: Geochemistry & Geophysics)

CiteScore: 11.50

Self-citation rate: 28.00%

Articles published: 1912

Review time: 4.0 months

About the journal: IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.