Uncertainty-aware Active Domain Adaptive Salient Object Detection

Guanbin Li, Chaowei Fang, Zhuohua Chen, Mingzhi Mao, Liang Lin
DOI: 10.1109/TIP.2024.3413598
Journal: IEEE Transactions on Image Processing (a publication of the IEEE Signal Processing Society)
Published: 2024-06-18
Citations: 0

Abstract

Due to the advancement of deep learning, the performance of salient object detection (SOD) has been significantly improved. However, deep learning-based techniques require a sizable amount of pixel-wise annotations. To relieve the burden of data annotation, a variety of deep weakly-supervised and unsupervised SOD methods have been proposed, yet the performance gap between them and fully supervised methods remains significant. In this paper, we propose a novel, cost-efficient salient object detection framework, which can adapt models from synthetic data to real-world data with the help of a limited number of actively selected annotations. Specifically, we first construct a synthetic SOD dataset by copying and pasting foreground objects into pure background images. With the masks of foreground objects taken as the ground-truth saliency maps, this dataset can be used for training the SOD model initially. However, due to the large domain gap between synthetic images and real-world images, the performance of the initially trained model on the real-world images is deficient. To transfer the model from the synthetic dataset to the real-world datasets, we further design an uncertainty-aware active domain adaptive algorithm to generate labels for the real-world target images. The prediction variances against data augmentations are utilized to calculate the superpixel-level uncertainty values. For those superpixels with relatively low uncertainty, we directly generate pseudo labels according to the network predictions. Meanwhile, we select a few superpixels with high uncertainty scores and assign labels to them manually. This labeling strategy is capable of generating high-quality labels without incurring too much annotation cost. Experimental results on six benchmark SOD datasets demonstrate that our method outperforms the existing state-of-the-art weakly-supervised and unsupervised SOD methods and is even comparable to the fully supervised ones. 
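The copy-paste construction of the synthetic dataset can be sketched as follows. This is a minimal illustration, not the paper's released code: the `composite` helper and its array conventions are assumptions, and the key point is only that the pasted object's mask doubles as the ground-truth saliency map.

```python
import numpy as np

def composite(foreground, mask, background):
    """Paste a foreground object onto a pure background image.

    foreground, background: HxWx3 uint8 arrays; mask: HxW binary array
    marking the foreground object. Returns the composite image and its
    ground-truth saliency map, which is simply the object mask.
    """
    m = mask.astype(bool)[..., None]              # HxWx1 for broadcasting
    image = np.where(m, foreground, background)   # copy-paste compositing
    saliency_gt = mask.astype(np.uint8)           # object mask = GT saliency
    return image, saliency_gt
```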
Code will be released at: https://github.com/czh-3/UADA.
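The uncertainty-aware labeling step described in the abstract can be sketched as below. This is a hedged approximation: the function names, the use of a fixed annotation budget, and the simple top-k selection rule are illustrative assumptions; the paper's actual scoring and selection criteria may differ.

```python
import numpy as np

def superpixel_uncertainty(pred_maps, superpixels):
    """Superpixel-level uncertainty from prediction variance.

    pred_maps: (A, H, W) saliency predictions under A data augmentations,
               aligned back to the original image geometry, values in [0, 1].
    superpixels: (H, W) integer superpixel labels, 0..S-1.
    Returns a length-S array: mean per-pixel variance inside each superpixel.
    """
    var_map = pred_maps.var(axis=0)               # pixel-wise prediction variance
    num_sp = superpixels.max() + 1
    unc = np.zeros(num_sp)
    for s in range(num_sp):
        unc[s] = var_map[superpixels == s].mean()
    return unc

def split_labeling(unc, budget):
    """Send the `budget` most uncertain superpixels to a human annotator;
    the remainder receive pseudo labels from the network prediction."""
    order = np.argsort(unc)[::-1]                 # highest uncertainty first
    manual = set(order[:budget].tolist())
    pseudo = set(order[budget:].tolist())
    return manual, pseudo
```

A superpixel whose predictions barely move under augmentation gets a near-zero score and is pseudo-labeled for free; only the few unstable regions consume annotation budget, which is the cost-saving mechanism the abstract describes.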
