利用自适应记忆网络为长期视频提供高效的半监督物体分割技术

IF 5 3区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE IEEE Transactions on Cognitive and Developmental Systems Pub Date : 2024-04-08 DOI:10.1109/TCDS.2024.3385849
Shan Zhong;Guoqiang Li;Wenhao Ying;Fuzhou Zhao;Gengsheng Xie;Shengrong Gong
{"title":"利用自适应记忆网络为长期视频提供高效的半监督物体分割技术","authors":"Shan Zhong;Guoqiang Li;Wenhao Ying;Fuzhou Zhao;Gengsheng Xie;Shengrong Gong","doi":"10.1109/TCDS.2024.3385849","DOIUrl":null,"url":null,"abstract":"Video object segmentation (VOS) uses the first annotated video mask to achieve consistent and precise segmentation in subsequent frames. Recently, memory-based methods have received significant attention owing to their substantial performance enhancements. However, these approaches rely on a fixed global memory strategy, which poses a challenge to segmentation accuracy and speed in the context of longer videos. To alleviate this limitation, we propose a novel semisupervised VOS model, founded on the principles of the adaptive memory network. Our proposed model adaptively extracts object features by focusing on the object area while effectively filtering out extraneous background noise. An identification mechanism is also thoughtfully applied to discern each object in multiobject scenarios. To further reduce storage consumption without compromising the saliency of object information, the outdated features residing in the memory pool are compressed into salient features through the employment of a self-attention mechanism. Furthermore, we introduce a local matching module, specifically devised to refine object features by fusing the contextual information from historical frames. We demonstrate the efficiency of our approach through experiments, substantially augmenting both the speed and precision of segmentation for long-term videos, while maintaining comparable performance for short videos.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 5","pages":"1789-1802"},"PeriodicalIF":5.0000,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Efficient Semisupervised Object Segmentation for Long-Term Videos Using Adaptive Memory Network\",\"authors\":\"Shan Zhong;Guoqiang Li;Wenhao Ying;Fuzhou Zhao;Gengsheng Xie;Shengrong Gong\",\"doi\":\"10.1109/TCDS.2024.3385849\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Video object segmentation (VOS) uses the first annotated video mask to achieve consistent and precise segmentation in subsequent frames. Recently, memory-based methods have received significant attention owing to their substantial performance enhancements. However, these approaches rely on a fixed global memory strategy, which poses a challenge to segmentation accuracy and speed in the context of longer videos. To alleviate this limitation, we propose a novel semisupervised VOS model, founded on the principles of the adaptive memory network. Our proposed model adaptively extracts object features by focusing on the object area while effectively filtering out extraneous background noise. An identification mechanism is also thoughtfully applied to discern each object in multiobject scenarios. To further reduce storage consumption without compromising the saliency of object information, the outdated features residing in the memory pool are compressed into salient features through the employment of a self-attention mechanism. Furthermore, we introduce a local matching module, specifically devised to refine object features by fusing the contextual information from historical frames. We demonstrate the efficiency of our approach through experiments, substantially augmenting both the speed and precision of segmentation for long-term videos, while maintaining comparable performance for short videos.\",\"PeriodicalId\":54300,\"journal\":{\"name\":\"IEEE Transactions on Cognitive and Developmental Systems\",\"volume\":\"16 5\",\"pages\":\"1789-1802\"},\"PeriodicalIF\":5.0000,\"publicationDate\":\"2024-04-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Cognitive and Developmental Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10494676/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Cognitive and Developmental Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10494676/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

视频对象分割(VOS)利用首次注释的视频掩码来实现后续帧的一致和精确分割。最近,基于内存的方法因其性能大幅提升而备受关注。然而,这些方法依赖于固定的全局内存策略,这对较长视频的分割精度和速度提出了挑战。为了缓解这一限制,我们根据自适应记忆网络的原理,提出了一种新颖的半监督 VOS 模型。我们提出的模型通过聚焦对象区域自适应地提取对象特征,同时有效地过滤掉无关的背景噪声。此外,我们还贴心地采用了一种识别机制,以辨别多物体场景中的每个物体。为了在不影响对象信息显著性的前提下进一步减少存储消耗,我们采用了自我关注机制,将内存池中的过时特征压缩成显著特征。此外,我们还引入了一个局部匹配模块,专门用于通过融合历史帧的上下文信息来完善对象特征。我们通过实验证明了这一方法的高效性,大大提高了长期视频的分割速度和精度,同时保持了与短视频相当的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Efficient Semisupervised Object Segmentation for Long-Term Videos Using Adaptive Memory Network
Video object segmentation (VOS) uses the first annotated video mask to achieve consistent and precise segmentation in subsequent frames. Recently, memory-based methods have received significant attention owing to their substantial performance enhancements. However, these approaches rely on a fixed global memory strategy, which poses a challenge to segmentation accuracy and speed in the context of longer videos. To alleviate this limitation, we propose a novel semisupervised VOS model, founded on the principles of the adaptive memory network. Our proposed model adaptively extracts object features by focusing on the object area while effectively filtering out extraneous background noise. An identification mechanism is also thoughtfully applied to discern each object in multiobject scenarios. To further reduce storage consumption without compromising the saliency of object information, the outdated features residing in the memory pool are compressed into salient features through the employment of a self-attention mechanism. Furthermore, we introduce a local matching module, specifically devised to refine object features by fusing the contextual information from historical frames. We demonstrate the efficiency of our approach through experiments, substantially augmenting both the speed and precision of segmentation for long-term videos, while maintaining comparable performance for short videos.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
7.20
自引率
10.00%
发文量
170
期刊介绍: The IEEE Transactions on Cognitive and Developmental Systems (TCDS) focuses on advances in the study of development and cognition in natural (humans, animals) and artificial (robots, agents) systems. It welcomes contributions from multiple related disciplines including cognitive systems, cognitive robotics, developmental and epigenetic robotics, autonomous and evolutionary robotics, social structures, multi-agent and artificial life systems, computational neuroscience, and developmental psychology. Articles on theoretical, computational, application-oriented, and experimental studies as well as reviews in these areas are considered.
期刊最新文献
Table of Contents IEEE Transactions on Cognitive and Developmental Systems Publication Information IEEE Transactions on Cognitive and Developmental Systems Information for Authors Guest Editorial: Special Issue on Advancing Machine Intelligence With Neuromorphic Computing IEEE Computational Intelligence Society Information
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1