{"title":"Unsupervised learning of optical flow in a multi-frame dynamic environment using temporal dynamic modeling","authors":"Zitang Sun, Zhengbo Luo, Shin’ya Nishida","doi":"10.1007/s40747-023-01266-2","DOIUrl":null,"url":null,"abstract":"<p>For visual estimation of optical flow, which is crucial for various vision analyses, unsupervised learning by view synthesis has emerged as a promising alternative to supervised methods because the ground-truth flow is not readily available in many cases. However, unsupervised learning is likely to be unstable when pixel tracking is lost via occlusion and motion blur, or pixel correspondence is impaired by variations in image content and spatial structure over time. Recognizing that dynamic occlusions and object variations usually exhibit a smooth temporal transition in natural settings, we shifted our focus to model unsupervised learning optical flow from multi-frame sequences of such dynamic scenes. Specifically, we simulated various dynamic scenarios and occlusion phenomena based on Markov property, allowing the model to extract motion laws and thus gain performance in dynamic and occluded areas, which diverges from existing methods without considering temporal dynamics. In addition, we introduced a temporal dynamic model based on a well-designed spatial-temporal dual recurrent block, resulting in a lightweight model structure with fast inference speed. Assuming the temporal smoothness of optical flow, we used the prior motions of adjacent frames to supervise the occluded regions more reliably. 
Experiments on several optical flow benchmarks demonstrated the effectiveness of our method, as the performance is comparable to several state-of-the-art methods with advantages in memory and computational overhead.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"5 2","pages":""},"PeriodicalIF":5.0000,"publicationDate":"2023-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Complex & Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s40747-023-01266-2","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
For visual estimation of optical flow, which is crucial for various vision analyses, unsupervised learning by view synthesis has emerged as a promising alternative to supervised methods because ground-truth flow is not readily available in many cases. However, unsupervised learning tends to be unstable when pixel tracking is lost due to occlusion or motion blur, or when pixel correspondence is impaired by variations in image content and spatial structure over time. Recognizing that dynamic occlusions and object variations usually exhibit smooth temporal transitions in natural settings, we shifted our focus to modeling unsupervised optical-flow learning from multi-frame sequences of such dynamic scenes. Specifically, we simulated various dynamic scenarios and occlusion phenomena based on the Markov property, allowing the model to extract motion laws and thus improve performance in dynamic and occluded areas, in contrast to existing methods that do not consider temporal dynamics. In addition, we introduced a temporal dynamic model built on a well-designed spatial-temporal dual recurrent block, resulting in a lightweight model structure with fast inference. Assuming the temporal smoothness of optical flow, we used the prior motions of adjacent frames to supervise the occluded regions more reliably. Experiments on several optical flow benchmarks demonstrated the effectiveness of our method: its performance is comparable to several state-of-the-art methods, with advantages in memory and computational overhead.
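The two loss ideas in the abstract — a view-synthesis (photometric) loss on visible pixels and a temporal-smoothness prior supervising occluded pixels from adjacent frames — can be sketched as follows. This is not the authors' implementation; it is a minimal NumPy illustration of the general unsupervised objectives, with nearest-neighbour warping and a precomputed occlusion mask assumed for simplicity (real systems use bilinear sampling and estimate occlusion, e.g. by forward-backward consistency).

```python
import numpy as np

def warp_backward(img, flow):
    """Warp `img` (H, W) toward the reference frame using per-pixel
    flow (H, W, 2), with nearest-neighbour sampling for simplicity."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xq = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    yq = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return img[yq, xq]

def photometric_loss(frame1, frame2, flow, occ_mask):
    """View-synthesis loss: mean absolute difference between frame1 and
    the warped frame2, evaluated only where pixels stay visible
    (occ_mask == 0), since photometric error is meaningless under occlusion."""
    warped = warp_backward(frame2, flow)
    visible = occ_mask == 0
    return np.abs(frame1 - warped)[visible].mean()

def temporal_prior_loss(flow_t, flow_prev, occ_mask):
    """Supervise occluded pixels (occ_mask == 1) with the flow of the
    previous frame pair, assuming optical flow varies smoothly over time."""
    occluded = occ_mask == 1
    if not occluded.any():
        return 0.0
    return np.abs(flow_t - flow_prev)[occluded].mean()

# Toy usage: frame2 is frame1 shifted one pixel right, so a constant
# flow of (1, 0) aligns them everywhere except the last column, which
# leaves the frame and is marked occluded.
f1 = np.random.rand(8, 8)
f2 = np.roll(f1, 1, axis=1)
flow = np.zeros((8, 8, 2))
flow[..., 0] = 1.0
occ = np.zeros((8, 8))
occ[:, -1] = 1.0
```

In a training loop, the photometric term drives the flow estimate on visible pixels, while the temporal prior fills in supervision exactly where the photometric term is masked out.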
Journal description:
Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.