Peng Chen, Peixian Li, Bing Wang, Sihai Zhao, Yongliang Zhang, Tao Zhang, Xingcheng Ding

ISPRS Journal of Photogrammetry and Remote Sensing, Volume 218, Pages 408-429. Published 2024-11-14. DOI: 10.1016/j.isprsjprs.2024.10.021. Article: https://www.sciencedirect.com/science/article/pii/S0924271624003988
B3-CDG: A pseudo-sample diffusion generator for bi-temporal building binary change detection
Building change detection (CD) plays a crucial role in urban planning, land resource management, and disaster monitoring. Deep learning has become a key approach to building CD, but challenges persist: obtaining large-scale, accurately registered bi-temporal images is difficult, and annotation is time-consuming. We therefore propose B3-CDG, a bi-temporal building binary CD pseudo-sample generator based on the principle of latent diffusion. The generator treats building change processes as local semantic state transformations. It uses textual instructions and mask prompts to generate specific class changes in designated regions of single-temporal images, creating different temporal images with clear semantic transitions. B3-CDG is driven by large-scale pretrained models and uses external adapters to guide the model in learning remote sensing image distributions. To generate seamless building boundaries, B3-CDG adopts a simple and effective approach, dilation masks, which compels the model to learn boundary details. In addition, B3-CDG incorporates diffusion guidance and data augmentation to enhance image realism. In the generation experiments, B3-CDG achieved the best performance, with the lowest FID (26.40) and the highest IS (4.60), compared with previous baseline methods such as Inpaint and IAug. The method effectively addresses challenges such as boundary continuity, shadow generation, and vegetation occlusion while ensuring that the generated building roof structures and colors are realistic and diverse. In the application experiments, B3-CDG improved the IoU of the validation model (SFFNet) by 6.34 % and 7.10 % on the LEVIR and WHUCD datasets, respectively. When real data are extremely limited (using only 5 % of the original data), the improvements reach 33.68 % and 32.40 %. Moreover, B3-CDG can enhance the baseline performance of advanced CD models, such as SNUNet and ChangeFormer.
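The dilation-mask idea can be sketched minimally. The following is an illustrative reconstruction, not the authors' released code: the binary building mask supplied as the generation prompt is expanded outward by a few pixels (here via `scipy.ndimage.binary_dilation`), so the region the diffusion model must synthesize includes the building boundary rather than only the interior.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def dilate_prompt_mask(mask: np.ndarray, iterations: int = 3) -> np.ndarray:
    """Expand a binary building mask outward so the generation region
    covers boundary pixels, pushing the model to learn edge details.
    The number of iterations (pixels of growth) is a free parameter."""
    return binary_dilation(mask, iterations=iterations)

# Toy 8x8 mask with a 2x2 building footprint.
mask = np.zeros((8, 8), dtype=bool)
mask[3:5, 3:5] = True
dilated = dilate_prompt_mask(mask, iterations=1)
print(int(mask.sum()), int(dilated.sum()))  # 4 12
```

With the default cross-shaped structuring element, one iteration adds the eight 4-connected neighbors of the 2x2 footprint, growing the masked region from 4 to 12 pixels.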
Ablation studies further confirm the effectiveness of the B3-CDG design. This study introduces a novel research paradigm for building CD, potentially advancing the field. Source code and datasets will be available at https://github.com/ABCnutter/B3-CDG.
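For context on the reported gains, IoU (intersection over union) for binary change detection counts pixels predicted as changed against pixels labeled as changed. A minimal sketch, illustrative rather than the paper's evaluation code:

```python
import numpy as np

def binary_iou(pred: np.ndarray, target: np.ndarray) -> float:
    """IoU between two binary change masks (1 = changed pixel)."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:  # both masks empty: define IoU as 1
        return 1.0
    inter = np.logical_and(pred, target).sum()
    return float(inter) / float(union)

pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
print(binary_iou(pred, target))  # 0.5
```

Here one of the two predicted change pixels overlaps the single labeled change pixel, so the intersection is 1 and the union is 2, giving IoU = 0.5.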
About the journal:
The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive.
P&RS publishes high-quality, peer-reviewed research papers, preferably original work that has not been published elsewhere. Papers may address scientific/research, technological-development, or application/practical aspects. The journal also welcomes papers based on presentations at ISPRS meetings, provided they constitute significant contributions to the aforementioned fields.
In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.