Authors: Zitang Sun, Zhengbo Luo, Shin’ya Nishida
Journal: Complex & Intelligent Systems
Published: 2023-10-31
DOI: 10.1007/s40747-023-01266-2 (https://doi.org/10.1007/s40747-023-01266-2)
Unsupervised learning of optical flow in a multi-frame dynamic environment using temporal dynamic modeling
Optical flow estimation is crucial for many vision analyses. Because ground-truth flow is often not readily available, unsupervised learning by view synthesis has emerged as a promising alternative to supervised methods. However, unsupervised learning tends to be unstable when pixel tracking is lost through occlusion or motion blur, or when pixel correspondence is impaired by changes in image content and spatial structure over time. Recognizing that dynamic occlusions and object variations usually change smoothly over time in natural settings, we shifted our focus to unsupervised learning of optical flow from multi-frame sequences of such dynamic scenes. Specifically, we simulated various dynamic scenarios and occlusion phenomena based on the Markov property, which allowed the model to learn motion regularities and thus improve performance in dynamic and occluded areas; existing methods, by contrast, do not consider temporal dynamics. In addition, we introduced a temporal dynamic model built on a spatial-temporal dual recurrent block, yielding a lightweight model with fast inference. Assuming that optical flow is temporally smooth, we used the prior motion of adjacent frames to supervise occluded regions more reliably. Experiments on several optical flow benchmarks demonstrated the effectiveness of our method: its performance is comparable to several state-of-the-art methods, with lower memory and computational overhead.
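The two training signals named in the abstract, view synthesis and a temporal-smoothness prior for occluded regions, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the loss form, the occlusion estimate, or the network, so the L1 penalties, the precomputed `visible` mask, and every function name below are assumptions chosen for illustration. Frames are plain nested lists of grayscale values, and `flow[y][x]` is a hypothetical `(u, v)` displacement from frame 1 to frame 2.

```python
def bilinear_sample(img, x, y):
    """Sample a 2-D grayscale image at real-valued (x, y), clamping to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x1]
    bottom = (1 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1 - ay) * top + ay * bottom


def warp(frame2, flow):
    """Backward-warp frame2 toward frame1: out[y][x] = frame2[y + v][x + u]."""
    h, w = len(frame2), len(frame2[0])
    return [[bilinear_sample(frame2, x + flow[y][x][0], y + flow[y][x][1])
             for x in range(w)] for y in range(h)]


def photometric_loss(frame1, frame2, flow, visible):
    """View-synthesis loss: mean |frame1 - warped frame2| over visible pixels."""
    warped = warp(frame2, flow)
    total, count = 0.0, 0
    for y, row in enumerate(frame1):
        for x, value in enumerate(row):
            if visible[y][x]:
                total += abs(value - warped[y][x])
                count += 1
    return total / max(count, 1)


def temporal_prior_loss(flow, prev_flow, visible):
    """Assumed smoothness prior: on occluded pixels, pull the current flow
    toward the flow estimated for the previous frame pair."""
    total, count = 0.0, 0
    for y, row in enumerate(flow):
        for x, (u, v) in enumerate(row):
            if not visible[y][x]:
                pu, pv = prev_flow[y][x]
                total += abs(u - pu) + abs(v - pv)
                count += 1
    return total / max(count, 1)


def constant_flow(h, w, u, v):
    """Helper: an h x w flow field with the same (u, v) at every pixel."""
    return [[(u, v) for _ in range(w)] for _ in range(h)]


# Toy data: one bright pixel moves one step to the right between frames.
frame1 = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
frame2 = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
all_visible = [[True] * 4 for _ in range(4)]

loss_correct = photometric_loss(frame1, frame2, constant_flow(4, 4, 1, 0), all_visible)
loss_zero_flow = photometric_loss(frame1, frame2, constant_flow(4, 4, 0, 0), all_visible)

# Mark a single pixel as occluded; only that pixel contributes to the prior.
one_occluded = [[True] * 4 for _ in range(4)]
one_occluded[1][1] = False
prior_same = temporal_prior_loss(constant_flow(4, 4, 1, 0), constant_flow(4, 4, 1, 0), one_occluded)
prior_changed = temporal_prior_loss(constant_flow(4, 4, 1, 0), constant_flow(4, 4, 0, 0), one_occluded)
```

In this toy case the correct flow reconstructs frame 1 exactly, so its photometric loss is lower than that of the zero flow. The prior term is only active where the photometric term is masked out, which is one simple way to give occluded pixels a training signal.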
About the journal:
Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.