{"title":"transnn - hae:用于盲图像绘制的变压器- cnn混合自编码器","authors":"Haoru Zhao, Zhaorui Gu, Bing Zheng, Haiyong Zheng","doi":"10.1145/3503161.3547848","DOIUrl":null,"url":null,"abstract":"Blind image inpainting is extremely challenging due to the unknown and multi-property complexity of contamination in different contaminated images. Current mainstream work decomposes blind image inpainting into two stages: mask estimating from the contaminated image and image inpainting based on the estimated mask, and this two-stage solution involves two CNN-based encoder-decoder architectures for estimating and inpainting separately. In this work, we propose a novel one-stage Transformer-CNN Hybrid AutoEncoder (TransCNN-HAE) for blind image inpainting, which intuitively follows the inpainting-then-reconstructing pipeline by leveraging global long-range contextual modeling of Transformer to repair contaminated regions and local short-range contextual modeling of CNN to reconstruct the repaired image. Moreover, a Cross-layer Dissimilarity Prompt (CDP) is devised to accelerate the identifying and inpainting of contaminated regions. Ablation studies validate the efficacy of both TransCNN-HAE and CDP, and extensive experiments on various datasets with multi-property contaminations show that our method achieves state-of-the-art performance with much lower computational cost on blind image inpainting. Our code is available at https://github.com/zhenglab/TransCNN-HAE.","PeriodicalId":412792,"journal":{"name":"Proceedings of the 30th ACM International Conference on Multimedia","volume":"52 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"TransCNN-HAE: Transformer-CNN Hybrid AutoEncoder for Blind Image Inpainting\",\"authors\":\"Haoru Zhao, Zhaorui Gu, Bing Zheng, Haiyong Zheng\",\"doi\":\"10.1145/3503161.3547848\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Blind image inpainting is extremely challenging due to the unknown and multi-property complexity of contamination in different contaminated images. Current mainstream work decomposes blind image inpainting into two stages: mask estimating from the contaminated image and image inpainting based on the estimated mask, and this two-stage solution involves two CNN-based encoder-decoder architectures for estimating and inpainting separately. In this work, we propose a novel one-stage Transformer-CNN Hybrid AutoEncoder (TransCNN-HAE) for blind image inpainting, which intuitively follows the inpainting-then-reconstructing pipeline by leveraging global long-range contextual modeling of Transformer to repair contaminated regions and local short-range contextual modeling of CNN to reconstruct the repaired image. Moreover, a Cross-layer Dissimilarity Prompt (CDP) is devised to accelerate the identifying and inpainting of contaminated regions. Ablation studies validate the efficacy of both TransCNN-HAE and CDP, and extensive experiments on various datasets with multi-property contaminations show that our method achieves state-of-the-art performance with much lower computational cost on blind image inpainting. 
Our code is available at https://github.com/zhenglab/TransCNN-HAE.\",\"PeriodicalId\":412792,\"journal\":{\"name\":\"Proceedings of the 30th ACM International Conference on Multimedia\",\"volume\":\"52 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 30th ACM International Conference on Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3503161.3547848\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 30th ACM International Conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3503161.3547848","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
TransCNN-HAE: Transformer-CNN Hybrid AutoEncoder for Blind Image Inpainting
Blind image inpainting is extremely challenging due to the unknown and multi-property complexity of contamination in different contaminated images. Current mainstream work decomposes blind image inpainting into two stages: mask estimation from the contaminated image and image inpainting based on the estimated mask; this two-stage solution requires two CNN-based encoder-decoder architectures, one for estimation and one for inpainting. In this work, we propose a novel one-stage Transformer-CNN Hybrid AutoEncoder (TransCNN-HAE) for blind image inpainting, which intuitively follows an inpainting-then-reconstructing pipeline: the global long-range contextual modeling of the Transformer repairs contaminated regions, and the local short-range contextual modeling of the CNN reconstructs the repaired image. Moreover, a Cross-layer Dissimilarity Prompt (CDP) is devised to accelerate the identification and inpainting of contaminated regions. Ablation studies validate the efficacy of both TransCNN-HAE and CDP, and extensive experiments on various datasets with multi-property contamination show that our method achieves state-of-the-art performance on blind image inpainting at a much lower computational cost. Our code is available at https://github.com/zhenglab/TransCNN-HAE.
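
The abstract only gives the high-level architecture, but the one-stage pipeline is easy to picture as code. Below is a minimal PyTorch sketch of the inpainting-then-reconstructing idea: a Transformer encoder applies global long-range contextual modeling over patch tokens of the contaminated image (no mask is provided, matching the blind setting), and a CNN decoder applies local short-range modeling to reconstruct the repaired image. All class names, layer counts, patch size, and channel widths are illustrative assumptions, and the Cross-layer Dissimilarity Prompt is omitted; the authors' actual implementation is at https://github.com/zhenglab/TransCNN-HAE.

# Minimal sketch of the one-stage Transformer-CNN hybrid autoencoder described in the
# abstract. Module names, patch size, depths, and channel widths are assumptions, not
# the authors' configuration; the CDP module is not included here.
import torch
import torch.nn as nn


class PatchEmbed(nn.Module):
    """Split the contaminated image into patch tokens (assumed 8x8 patches)."""
    def __init__(self, in_ch=3, dim=256, patch=8):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)

    def forward(self, x):
        x = self.proj(x)                             # B x dim x H/8 x W/8
        b, c, h, w = x.shape
        return x.flatten(2).transpose(1, 2), (h, w)  # B x N x dim tokens


class TransCNNHAESketch(nn.Module):
    def __init__(self, dim=256, depth=4, heads=8, patch=8):
        super().__init__()
        self.embed = PatchEmbed(dim=dim, patch=patch)
        # Transformer encoder: global long-range context repairs contaminated tokens.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=4 * dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # CNN decoder: local short-range context reconstructs the repaired image.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(dim, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        tokens, (h, w) = self.embed(x)     # tokenize the contaminated image
        repaired = self.encoder(tokens)    # repair via global contextual modeling
        feat = repaired.transpose(1, 2).reshape(x.size(0), -1, h, w)
        return self.decoder(feat)          # reconstruct via local contextual modeling


if __name__ == "__main__":
    model = TransCNNHAESketch()
    contaminated = torch.randn(1, 3, 256, 256)  # no mask is given (blind setting)
    clean = model(contaminated)
    print(clean.shape)                          # torch.Size([1, 3, 256, 256])

In this sketch the Transformer attends over 8x8 patch tokens so its receptive field spans the whole image, while three stride-2 transposed convolutions restore the original resolution; both are placeholder choices standing in for the paper's actual design.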