A Dual-Contrast Adaptation Network Coupling Global Context and Geometry Information for Cross-Domain Building Extraction
Gang Xu, Min Deng, Jingru Zhu, Geng Sun, Zhenbo Gou, Ya Guo, Jie Chen
IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-16. Published 2025-02-14.
DOI: 10.1109/TGRS.2025.3542481
URL: https://ieeexplore.ieee.org/document/10891046/
Journal metrics: impact factor 8.6; JCR Q1 (Engineering, Electrical & Electronic)
Citations: 0
Abstract
Semantic segmentation of high-resolution remote sensing imagery (HRSI) suffers from domain shift: a model trained on one domain performs poorly on an unseen domain. Unsupervised domain adaptation (UDA) for semantic segmentation aims to adapt a model trained on a labeled source domain to an unlabeled target domain. However, existing UDA semantic segmentation models tend to focus only on the semantic mask information of buildings and neglect their structural and boundary information, which leads to irregular and inaccurate predictions. We propose a dual-contrast adaptation network that couples global context and geometry information for cross-domain building extraction from HRSI. First, it uses a hierarchical transformer to extract multilevel global context information of buildings from HRSI, and a multilevel context fusion module fuses the context features extracted at the different transformer levels to improve the model's ability to learn domain-invariant knowledge. Then, it employs hierarchical supervision signals from attraction field maps (AFMs) and masks to guide the transformer-based network to transfer building knowledge in a shape-aware manner. Finally, it applies prototype contrastive learning (CL) to the high-level mask prediction and the AFM prediction to bridge the source-target discrepancy in mask semantic information and geometry information, respectively. Extensive experiments on four cross-domain tasks indicate that the proposed method clearly outperforms state-of-the-art methods.
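The prototype contrastive step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formulation, so everything below is an assumption: the function names (`class_prototypes`, `prototype_contrast_loss`), the cosine similarity, the temperature value, and the InfoNCE-style form are a generic way to write prototype contrast, not the authors' implementation. Under these assumptions, prototypes are per-class mean features from the labeled source domain, and target features (with pseudo-labels) are pulled toward the prototype of their class and pushed away from the other one. The same form could in principle be applied once to mask-branch features and once to AFM-branch features, which is one reading of "dual-contrast".

```python
import math

def class_prototypes(features, labels, num_classes):
    """Return one prototype per class: the mean feature vector of that class."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, label in zip(features, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += feat[d]
    # max(c, 1) avoids division by zero for a class with no samples.
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]

def cosine(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)

def prototype_contrast_loss(features, labels, prototypes, temperature=0.1):
    """InfoNCE-style loss (assumed form): cross-entropy over similarities
    between each feature and all class prototypes."""
    total = 0.0
    for feat, label in zip(features, labels):
        logits = [cosine(feat, p) / temperature for p in prototypes]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum - logits[label]
    return total / len(features)

# Toy 2-D features; class 1 = building, class 0 = background (hypothetical data).
source_feats = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]
source_labels = [1, 1, 0, 0]
protos = class_prototypes(source_feats, source_labels, num_classes=2)

# Target features with pseudo-labels: one batch close to the source
# prototypes, one batch shifted toward the class boundary.
target_pseudo = [1, 0]
aligned = [[0.95, 0.05], [0.05, 0.95]]
shifted = [[0.6, 0.5], [0.5, 0.6]]

loss_aligned = prototype_contrast_loss(aligned, target_pseudo, protos)
loss_shifted = prototype_contrast_loss(shifted, target_pseudo, protos)
print(f"aligned: {loss_aligned:.4f}  shifted: {loss_shifted:.4f}")
```

In this toy setup the shifted target batch yields a larger loss than the aligned one, so minimizing the loss moves target features toward the source prototypes of their class. That is the sense in which a prototype contrast term can reduce a source-target discrepancy.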
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.