PSRUNet: a recurrent neural network for spatiotemporal sequence forecasting based on parallel simple recurrent unit
Authors: Wei Tian, Fan Luo, Kailing Shen
Journal: Machine Vision and Applications (impact factor 2.4; JCR Q3, Computer Science, Artificial Intelligence)
Publication date: 2024-04-20
DOI: https://doi.org/10.1007/s00138-024-01539-x
Unsupervised video prediction is widely applied in intelligent decision-making scenarios because it can model unknown scenes. Traditional video prediction models based on Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU) consume large amounts of computational resources while continually losing information from the original frames. This paper addresses these challenges and introduces PSRUNet, a novel model built on the lightweight ParallelSRU unit. By prioritizing global spatiotemporal features and minimizing redundancy, PSRUNet enhances the model’s early perception of complex spatiotemporal changes. An encoder-decoder architecture captures high-dimensional image information, and information recall is introduced to mitigate vanishing gradients during deep network training. We evaluated the performance of PSRUNet and analyzed the capabilities of ParallelSRU in real-world applications, including short-term precipitation forecasting, traffic flow prediction, and human behavior prediction. Experimental results across multiple video prediction benchmarks demonstrate that PSRUNet achieves efficient and cost-effective predictions, making it a promising solution for meeting the real-time and accuracy requirements of practical business scenarios.
About the journal:
Machine Vision and Applications publishes high-quality technical contributions in machine vision research and development. Specifically, the editors encourage submissions in all applications and engineering aspects of image-related computing. In particular, original contributions dealing with scientific, commercial, industrial, military, and biomedical applications of machine vision are all within the scope of the journal.
Particular emphasis is placed on engineering and technology aspects of image processing and computer vision.
The following aspects of machine vision applications are of interest: algorithms, architectures, VLSI implementations, AI techniques and expert systems for machine vision, front-end sensing, multidimensional and multisensor machine vision, real-time techniques, image databases, virtual reality and visualization. Papers must include a significant experimental validation component.