Unsupervised learning of optical flow in a multi-frame dynamic environment using temporal dynamic modeling

IF 5; Zone 2, Computer Science; Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Complex & Intelligent Systems; Pub Date: 2023-10-31; DOI: 10.1007/s40747-023-01266-2
Zitang Sun, Zhengbo Luo, Shin’ya Nishida
Citations: 0

Abstract

Unsupervised learning of optical flow in a multi-frame dynamic environment using temporal dynamic modeling

Optical flow estimation is crucial for various vision analyses, and unsupervised learning by view synthesis has emerged as a promising alternative to supervised methods because ground-truth flow is not readily available in many cases. However, unsupervised learning is likely to be unstable when pixel tracking is lost through occlusion and motion blur, or when pixel correspondence is impaired by variations in image content and spatial structure over time. Recognizing that dynamic occlusions and object variations usually exhibit a smooth temporal transition in natural settings, we shifted our focus to modeling unsupervised learning of optical flow from multi-frame sequences of such dynamic scenes. Specifically, we simulated various dynamic scenarios and occlusion phenomena based on the Markov property, allowing the model to extract motion laws and thus gain performance in dynamic and occluded areas; this sets our approach apart from existing methods, which do not consider temporal dynamics. In addition, we introduced a temporal dynamic model based on a spatial-temporal dual recurrent block, resulting in a lightweight model structure with fast inference speed. Assuming the temporal smoothness of optical flow, we used the prior motions of adjacent frames to supervise the occluded regions more reliably. Experiments on several optical flow benchmarks demonstrated the effectiveness of our method: its performance is comparable to that of several state-of-the-art methods, with advantages in memory and computational overhead.
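The two training signals mentioned in the abstract, view synthesis for visible pixels and a temporal-smoothness prior for occluded ones, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the L1 penalties, the `prior_weight` value, and the assumption that an occlusion mask and a prior flow from the preceding frame pair are already available are all illustrative choices, and plain Python lists are used only to keep the example dependency-free.

```python
from math import floor

def bilinear_sample(img, x, y):
    """Sample a grayscale image (list of rows) at real-valued (x, y), clamped to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(floor(x)), int(floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x1]
    bottom = (1 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1 - ay) * top + ay * bottom

def unsupervised_flow_loss(frame1, frame2, flow, prior_flow, occluded, prior_weight=0.1):
    """Hypothetical loss: photometric error where visible, prior-flow error where occluded.

    flow[y][x] = (u, v) is the displacement of a frame1 pixel into frame2.
    prior_flow is the flow of the preceding frame pair (temporal-smoothness assumption).
    occluded[y][x] is True where the frame1 pixel has no visible match in frame2.
    """
    photo_sum, photo_n, prior_sum, prior_n = 0.0, 0, 0.0, 0
    for y, row in enumerate(frame1):
        for x, value in enumerate(row):
            u, v = flow[y][x]
            if occluded[y][x]:
                # No reliable appearance match: fall back on the prior motion.
                pu, pv = prior_flow[y][x]
                prior_sum += abs(u - pu) + abs(v - pv)
                prior_n += 1
            else:
                # View synthesis: frame2 sampled at the displaced position
                # should reproduce the frame1 pixel.
                photo_sum += abs(value - bilinear_sample(frame2, x + u, y + v))
                photo_n += 1
    photo = photo_sum / photo_n if photo_n else 0.0
    prior = prior_sum / prior_n if prior_n else 0.0
    return photo + prior_weight * prior

# Toy scene: content moves one pixel to the right; the last column leaves the image.
frame1 = [[0, 10, 20, 30], [0, 10, 20, 30]]
frame2 = [[0, 0, 10, 20], [0, 0, 10, 20]]
occluded = [[False, False, False, True] for _ in range(2)]
prior_flow = [[(1.0, 0.0)] * 4 for _ in range(2)]  # motion seen in the previous frame pair
true_flow = [[(1.0, 0.0)] * 4 for _ in range(2)]
zero_flow = [[(0.0, 0.0)] * 4 for _ in range(2)]

true_flow_loss = unsupervised_flow_loss(frame1, frame2, true_flow, prior_flow, occluded)
zero_flow_loss = unsupervised_flow_loss(frame1, frame2, zero_flow, prior_flow, occluded)
```

In this toy scene the true flow gives a lower loss than the zero flow, and the occluded column is judged only against the prior motion, which is the role the abstract assigns to temporal smoothness.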

Source journal
Complex & Intelligent Systems (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
CiteScore: 9.60
Self-citation rate: 10.30%
Articles published: 297
About the journal: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.