Iterative segmentation and propagation based interactive video object segmentation

International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023) Pub Date : 2024-01-09 DOI:10.1117/12.3014487

Sihan Luo, Sizhe Yang, Xia Yuan



Abstract

Interactive video object segmentation (iVOS) aims to efficiently produce high-quality segmentation masks of a target object in a video using user interactions. Numerous methods have recently been proposed to advance iVOS, but their use of user intent is limited. First, typical modules try to generate the segmentation directly, without further exploring the input interaction, and so miss valuable information. Second, recent iVOS approaches do not consider the raw interaction information. As a result, the final segmentation can be corrupted by erroneous information carried over from the previous round’s segmentation masks. To address these weaknesses, this paper proposes ISP, an Iterative Segmentation and Propagation based iVOS method that explores user intent more thoroughly. ISP models user intent directly in both the PGI2M module and the TP module. First, ISP extracts a coarse-grained segmentation mask by analyzing the user’s input, and this mask serves as a prior to aid the PGI2M module. Second, ISP introduces a new interaction-driven self-attention module that recalls the user’s intent in the TP module. Extensive experiments on two public datasets show that ISP outperforms existing methods.
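The data flow described in the abstract (raw interaction, then a coarse prior, then a refined mask on the annotated frame, then propagation to the other frames) can be illustrated with a toy sketch. This is not the authors' implementation: it does not reproduce the PGI2M module, the TP module, or the interaction-driven self-attention, and every function name, the intensity-threshold rule, and the `radius` and `tol` parameters are hypothetical stand-ins chosen only to make one interaction round runnable on synthetic data.

```python
import numpy as np

def coarse_prior(scribble, radius=2):
    """Toy coarse mask: a square window around every scribbled pixel (assumption)."""
    prior = np.zeros_like(scribble, dtype=bool)
    for y, x in zip(*np.nonzero(scribble)):
        prior[max(0, y - radius):y + radius + 1,
              max(0, x - radius):x + radius + 1] = True
    return prior

def interaction_to_mask(frame, scribble, prior, tol=0.1):
    """Hypothetical refinement: keep prior pixels whose intensity is close
    to the mean intensity under the scribble."""
    ref = frame[scribble].mean()
    return prior & (np.abs(frame - ref) <= tol)

def propagate(frame_src, scribble, frame_dst, tol=0.1):
    """Toy propagation: the reference appearance comes from the raw scribble
    on the annotated frame, not from a previously predicted mask."""
    ref = frame_src[scribble].mean()
    return np.abs(frame_dst - ref) <= tol

def run_round(frames, frame_idx, scribble):
    """One interaction round: segment the annotated frame, then propagate."""
    prior = coarse_prior(scribble)
    masks = []
    for t, frame in enumerate(frames):
        if t == frame_idx:
            masks.append(interaction_to_mask(frame, scribble, prior))
        else:
            masks.append(propagate(frames[frame_idx], scribble, frame))
    return masks

# Synthetic video: a bright 3x3 square moving diagonally over a dark background.
frames = [np.zeros((8, 8)) for _ in range(3)]
for t, frame in enumerate(frames):
    frame[2 + t:5 + t, 2 + t:5 + t] = 1.0

# The "user" clicks one pixel inside the object on frame 0.
scribble = np.zeros((8, 8), dtype=bool)
scribble[3, 3] = True

masks = run_round(frames, 0, scribble)
print([int(m.sum()) for m in masks])  # [9, 9, 9]
```

In this toy setting, taking the reference from the scribble rather than from an earlier mask mirrors the abstract's concern that errors in the previous round's masks should not leak into later results. A real system would replace each stand-in with a learned network.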


