基于迭代分割和传播的交互式视频对象分割

Sihan Luo, Sizhe Yang, Xia Yuan
{"title":"基于迭代分割和传播的交互式视频对象分割","authors":"Sihan Luo, Sizhe Yang, Xia Yuan","doi":"10.1117/12.3014487","DOIUrl":null,"url":null,"abstract":"Interactive video object segmentation (iVOS), which aims to efficiently produce high-quality segmentation masks of the target object in a video with user interactions. Recently, numerous works are proposed to advance the task of iVOS. However, their usages on user intent are limited. First, typical modules usually try to direct generate the segmentation without any further exploration on the input interaction, which misses valuable information. Second, recent iVOS approaches also do not consider the raw interactive information. As a result, the final segmentation results will be poisoned by the erroneous information given by the previous round’s segmentation masks. To solve the aforementioned weaknesses, in this paper, an Iterative Segmentation and Propagation based iVOS method is proposed to conduct better user intent exploration, namely ISP. ISP directly models user intent into the PGI2M module and TP module. Specifically, ISP first extracts a coarse-grained segmentation mask by analyzing the user’s input. Subsequently, this mask is used as a prior to aid the PGI2M module. Secondly, ISP presents a new interaction-driven self-attention module to recall the user’s intent in the TP module. Extensive experiments on two public datasets show the superiority of ISP over existing methods.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":"43 5","pages":"129691A - 129691A-10"},"PeriodicalIF":0.0000,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Iterative segmentation and propagation based interactive video object segmentation\",\"authors\":\"Sihan Luo, Sizhe Yang, Xia Yuan\",\"doi\":\"10.1117/12.3014487\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Interactive video object segmentation (iVOS), which aims to efficiently produce high-quality segmentation masks of the target object in a video with user interactions. Recently, numerous works are proposed to advance the task of iVOS. However, their usages on user intent are limited. First, typical modules usually try to direct generate the segmentation without any further exploration on the input interaction, which misses valuable information. Second, recent iVOS approaches also do not consider the raw interactive information. As a result, the final segmentation results will be poisoned by the erroneous information given by the previous round’s segmentation masks. To solve the aforementioned weaknesses, in this paper, an Iterative Segmentation and Propagation based iVOS method is proposed to conduct better user intent exploration, namely ISP. ISP directly models user intent into the PGI2M module and TP module. Specifically, ISP first extracts a coarse-grained segmentation mask by analyzing the user’s input. Subsequently, this mask is used as a prior to aid the PGI2M module. Secondly, ISP presents a new interaction-driven self-attention module to recall the user’s intent in the TP module. Extensive experiments on two public datasets show the superiority of ISP over existing methods.\",\"PeriodicalId\":516634,\"journal\":{\"name\":\"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)\",\"volume\":\"43 5\",\"pages\":\"129691A - 129691A-10\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-01-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1117/12.3014487\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.3014487","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

交互式视频对象分割(iVOS),旨在通过用户交互有效地生成视频中目标对象的高质量分割掩码。最近,有许多作品被提出来推进 iVOS 任务。然而,它们在用户意图方面的应用都很有限。首先,典型的模块通常试图直接生成分割,而不对输入交互进行任何进一步的探索,这就错过了有价值的信息。其次,最近的 iVOS 方法也没有考虑原始交互信息。因此,最终的分割结果会受到上一轮分割掩码所提供的错误信息的影响。为了解决上述缺陷,本文提出了一种基于迭代分割和传播的 iVOS 方法,即 ISP,以更好地探索用户意图。ISP 将用户意图直接建模到 PGI2M 模块和 TP 模块中。具体来说,ISP 首先通过分析用户输入提取粗粒度分割掩码。然后,将该掩码作为先验,辅助 PGI2M 模块。其次,ISP 提出了一种新的交互驱动型自我注意模块,用于在 TP 模块中回忆用户的意图。在两个公开数据集上进行的广泛实验表明,ISP 比现有方法更具优势。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Iterative segmentation and propagation based interactive video object segmentation
Interactive video object segmentation (iVOS), which aims to efficiently produce high-quality segmentation masks of the target object in a video with user interactions. Recently, numerous works are proposed to advance the task of iVOS. However, their usages on user intent are limited. First, typical modules usually try to direct generate the segmentation without any further exploration on the input interaction, which misses valuable information. Second, recent iVOS approaches also do not consider the raw interactive information. As a result, the final segmentation results will be poisoned by the erroneous information given by the previous round’s segmentation masks. To solve the aforementioned weaknesses, in this paper, an Iterative Segmentation and Propagation based iVOS method is proposed to conduct better user intent exploration, namely ISP. ISP directly models user intent into the PGI2M module and TP module. Specifically, ISP first extracts a coarse-grained segmentation mask by analyzing the user’s input. Subsequently, this mask is used as a prior to aid the PGI2M module. Secondly, ISP presents a new interaction-driven self-attention module to recall the user’s intent in the TP module. Extensive experiments on two public datasets show the superiority of ISP over existing methods.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
The ship classification and detection method of optical remote sensing image based on improved YOLOv7-tiny Collaborative filtering recommendation method based on graph convolutional neural networks Research on the simplification of building complex model under multi-factor constraints Improved ant colony algorithm based on artificial gravity field for adaptive dynamic path planning Application analysis of three-dimensional laser scanning technology in the protection of dong drum tower in Sanjiang county
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1