视频从一个单一的编码曝光照片使用学习过完整的字典

2011 International Conference on Computer Vision Pub Date : 2011-11-06 DOI:10.1109/ICCV.2011.6126254

Y. Hitomi, Jinwei Gu, Mohit Gupta, T. Mitsunaga, S. Nayar

{"title":"视频从一个单一的编码曝光照片使用学习过完整的字典","authors":"Y. Hitomi, Jinwei Gu, Mohit Gupta, T. Mitsunaga, S. Nayar","doi":"10.1109/ICCV.2011.6126254","DOIUrl":null,"url":null,"abstract":"Cameras face a fundamental tradeoff between the spatial and temporal resolution - digital still cameras can capture images with high spatial resolution, but most high-speed video cameras suffer from low spatial resolution. It is hard to overcome this tradeoff without incurring a significant increase in hardware costs. In this paper, we propose techniques for sampling, representing and reconstructing the space-time volume in order to overcome this tradeoff. Our approach has two important distinctions compared to previous works: (1) we achieve sparse representation of videos by learning an over-complete dictionary on video patches, and (2) we adhere to practical constraints on sampling scheme which is imposed by architectures of present image sensor devices. Consequently, our sampling scheme can be implemented on image sensors by making a straightforward modification to the control unit. To demonstrate the power of our approach, we have implemented a prototype imaging system with per-pixel coded exposure control using a liquid crystal on silicon (LCoS) device. Using both simulations and experiments on a wide range of scenes, we show that our method can effectively reconstruct a video from a single image maintaining high spatial resolution.","PeriodicalId":6391,"journal":{"name":"2011 International Conference on Computer Vision","volume":"35 1","pages":"287-294"},"PeriodicalIF":0.0000,"publicationDate":"2011-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"233","resultStr":"{\"title\":\"Video from a single coded exposure photograph using a learned over-complete dictionary\",\"authors\":\"Y. Hitomi, Jinwei Gu, Mohit Gupta, T. Mitsunaga, S. Nayar\",\"doi\":\"10.1109/ICCV.2011.6126254\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Cameras face a fundamental tradeoff between the spatial and temporal resolution - digital still cameras can capture images with high spatial resolution, but most high-speed video cameras suffer from low spatial resolution. It is hard to overcome this tradeoff without incurring a significant increase in hardware costs. In this paper, we propose techniques for sampling, representing and reconstructing the space-time volume in order to overcome this tradeoff. Our approach has two important distinctions compared to previous works: (1) we achieve sparse representation of videos by learning an over-complete dictionary on video patches, and (2) we adhere to practical constraints on sampling scheme which is imposed by architectures of present image sensor devices. Consequently, our sampling scheme can be implemented on image sensors by making a straightforward modification to the control unit. To demonstrate the power of our approach, we have implemented a prototype imaging system with per-pixel coded exposure control using a liquid crystal on silicon (LCoS) device. Using both simulations and experiments on a wide range of scenes, we show that our method can effectively reconstruct a video from a single image maintaining high spatial resolution.\",\"PeriodicalId\":6391,\"journal\":{\"name\":\"2011 International Conference on Computer Vision\",\"volume\":\"35 1\",\"pages\":\"287-294\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-11-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"233\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 International Conference on Computer Vision\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCV.2011.6126254\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 International Conference on Computer Vision","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCV.2011.6126254","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 233

摘要

相机面临着空间分辨率和时间分辨率之间的基本权衡——数码相机可以捕捉高空间分辨率的图像，但大多数高速摄像机的空间分辨率都很低。很难在不显著增加硬件成本的情况下克服这种权衡。在本文中，我们提出了采样，表示和重构时空体积的技术，以克服这种权衡。与以前的工作相比，我们的方法有两个重要的区别:(1)我们通过学习视频补丁上的过完备字典来实现视频的稀疏表示，(2)我们坚持由当前图像传感器设备架构施加的采样方案的实际约束。因此，通过对控制单元进行简单的修改，我们的采样方案可以在图像传感器上实现。为了证明我们方法的强大功能，我们已经实现了一个原型成像系统，该系统使用硅上液晶(LCoS)器件实现了每像素编码曝光控制。通过对各种场景的模拟和实验，我们证明了我们的方法可以有效地从单个图像重建视频，并保持高空间分辨率。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Video from a single coded exposure photograph using a learned over-complete dictionary

Cameras face a fundamental tradeoff between the spatial and temporal resolution - digital still cameras can capture images with high spatial resolution, but most high-speed video cameras suffer from low spatial resolution. It is hard to overcome this tradeoff without incurring a significant increase in hardware costs. In this paper, we propose techniques for sampling, representing and reconstructing the space-time volume in order to overcome this tradeoff. Our approach has two important distinctions compared to previous works: (1) we achieve sparse representation of videos by learning an over-complete dictionary on video patches, and (2) we adhere to practical constraints on sampling scheme which is imposed by architectures of present image sensor devices. Consequently, our sampling scheme can be implemented on image sensors by making a straightforward modification to the control unit. To demonstrate the power of our approach, we have implemented a prototype imaging system with per-pixel coded exposure control using a liquid crystal on silicon (LCoS) device. Using both simulations and experiments on a wide range of scenes, we show that our method can effectively reconstruct a video from a single image maintaining high spatial resolution.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2011 International Conference on Computer Vision

自引率

0.00%

发文量

期刊最新文献

Robust and efficient parametric face alignment Video parsing for abnormality detection From learning models of natural image patches to whole image restoration Discriminative figure-centric models for joint action localization and recognition A general preconditioning scheme for difference measures in deformable registration