{"title":"用于视频压缩传感的运动感知动态图神经网络","authors":"Ruiying Lu, Ziheng Cheng, Bo Chen, Xin Yuan","doi":"10.1109/TPAMI.2024.3395804","DOIUrl":null,"url":null,"abstract":"<p><p>Video snapshot compressive imaging (SCI) utilizes a 2D detector to capture sequential video frames and compress them into a single measurement. Various reconstruction methods have been developed to recover the high-speed video frames from the snapshot measurement. However, most existing reconstruction methods are incapable of efficiently capturing long-range spatial and temporal dependencies, which are critical for video processing. In this paper, we propose a flexible and robust approach based on the graph neural network (GNN) to efficiently model non-local interactions between pixels in space and time regardless of the distance. Specifically, we develop a motion-aware dynamic GNN for better video representation, i.e., represent each node as the aggregation of relative neighbors under the guidance of frame-by-frame motions, which consists of motion-aware dynamic sampling, cross-scale node sampling, global knowledge integration, and graph aggregation. Extensive results on both simulation and real data demonstrate both the effectiveness and efficiency of the proposed approach, and the visualization illustrates the intrinsic dynamic sampling operations of our proposed model for boosting the video SCI reconstruction results. The code and model will be released.</p>","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Motion-Aware Dynamic Graph Neural Network for Video Compressive Sensing.\",\"authors\":\"Ruiying Lu, Ziheng Cheng, Bo Chen, Xin Yuan\",\"doi\":\"10.1109/TPAMI.2024.3395804\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Video snapshot compressive imaging (SCI) utilizes a 2D detector to capture sequential video frames and compress them into a single measurement. Various reconstruction methods have been developed to recover the high-speed video frames from the snapshot measurement. However, most existing reconstruction methods are incapable of efficiently capturing long-range spatial and temporal dependencies, which are critical for video processing. In this paper, we propose a flexible and robust approach based on the graph neural network (GNN) to efficiently model non-local interactions between pixels in space and time regardless of the distance. Specifically, we develop a motion-aware dynamic GNN for better video representation, i.e., represent each node as the aggregation of relative neighbors under the guidance of frame-by-frame motions, which consists of motion-aware dynamic sampling, cross-scale node sampling, global knowledge integration, and graph aggregation. Extensive results on both simulation and real data demonstrate both the effectiveness and efficiency of the proposed approach, and the visualization illustrates the intrinsic dynamic sampling operations of our proposed model for boosting the video SCI reconstruction results. 
The code and model will be released.</p>\",\"PeriodicalId\":94034,\"journal\":{\"name\":\"IEEE transactions on pattern analysis and machine intelligence\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on pattern analysis and machine intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TPAMI.2024.3395804\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/11/6 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TPAMI.2024.3395804","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/11/6 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract: Video snapshot compressive imaging (SCI) uses a 2D detector to capture sequential video frames and compress them into a single measurement. Various reconstruction methods have been developed to recover the high-speed video frames from this snapshot measurement, but most cannot efficiently capture long-range spatial and temporal dependencies, which are critical for video processing. In this paper, we propose a flexible and robust approach based on a graph neural network (GNN) that efficiently models non-local interactions between pixels in space and time, regardless of their distance. Specifically, we develop a motion-aware dynamic GNN for better video representation: each node is represented as an aggregation of its relative neighbors under the guidance of frame-by-frame motion. The network consists of motion-aware dynamic sampling, cross-scale node sampling, global knowledge integration, and graph aggregation. Extensive results on both simulated and real data demonstrate the effectiveness and efficiency of the proposed approach, and visualizations illustrate how the model's intrinsic dynamic sampling operations boost video SCI reconstruction. The code and model will be released.
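
For context on the sensing model the abstract summarizes: in video SCI, each high-speed frame is modulated by a distinct binary mask and the modulated frames are summed on the detector into one 2D snapshot. Below is a minimal NumPy sketch of this standard forward model; the array shapes and random masks are illustrative assumptions, not the paper's actual coding patterns.

```python
import numpy as np

def sci_forward(frames: np.ndarray, masks: np.ndarray) -> np.ndarray:
    """Standard video SCI forward model: B frames are modulated by
    per-frame binary masks and summed into a single 2D measurement.

    frames: (B, H, W) high-speed video frames
    masks:  (B, H, W) binary modulation masks
    returns: (H, W) snapshot measurement Y = sum_b M_b * X_b
    """
    assert frames.shape == masks.shape
    return (masks * frames).sum(axis=0)

# Toy example with illustrative sizes (B=8 frames of 64x64 pixels).
rng = np.random.default_rng(0)
frames = rng.random((8, 64, 64))                       # hypothetical video block
masks = rng.integers(0, 2, (8, 64, 64)).astype(float)  # random binary masks
measurement = sci_forward(frames, masks)
print(measurement.shape)  # (64, 64)
```

Reconstruction methods, including the GNN proposed here, aim to invert this many-to-one mapping and recover all B frames from the single measurement.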
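
The abstract names four modules (motion-aware dynamic sampling, cross-scale node sampling, global knowledge integration, graph aggregation) but gives no implementation details. The following PyTorch sketch is a hypothetical illustration of only the core idea, sampling each pixel-node's temporal neighbor at a motion-displaced location and aggregating the two features; the function name, tensor shapes, and the simple mean aggregation are my assumptions, not the authors' architecture.

```python
import torch
import torch.nn.functional as F

def motion_guided_aggregation(feat: torch.Tensor,
                              flow: torch.Tensor) -> torch.Tensor:
    """Hypothetical sketch of motion-guided neighbor aggregation.

    feat: (B, T, C, H, W) per-frame pixel-node features
    flow: (B, T-1, 2, H, W) motion offsets (x, y) from frame t-1 to t
    Returns (B, T, C, H, W): each frame's nodes averaged with the
    neighbor sampled from the previous frame at the motion-displaced
    location (a real model would learn sampling offsets and weights).
    """
    B, T, C, H, W = feat.shape
    ys, xs = torch.meshgrid(torch.arange(H, dtype=feat.dtype),
                            torch.arange(W, dtype=feat.dtype),
                            indexing="ij")
    base = torch.stack((xs, ys), dim=-1)  # (H, W, 2) pixel coordinates

    out = [feat[:, 0]]
    for t in range(1, T):
        # Location in frame t-1 that moved to each pixel of frame t.
        loc = base - flow[:, t - 1].permute(0, 2, 3, 1)  # (B, H, W, 2)
        # Normalize coordinates to [-1, 1] for grid_sample.
        grid = torch.stack((2 * loc[..., 0] / (W - 1) - 1,
                            2 * loc[..., 1] / (H - 1) - 1), dim=-1)
        neighbor = F.grid_sample(feat[:, t - 1], grid,
                                 align_corners=True, padding_mode="border")
        out.append(0.5 * (feat[:, t] + neighbor))
    return torch.stack(out, dim=1)
```

Per the abstract, the full model also draws neighbors across spatial scales and fuses them with global knowledge before graph aggregation; the sketch covers only the temporal, motion-guided sampling step.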