Ehsan Ullah, Marius Pedersen, Kjartan Sebastian Waaseth, Bernt-Erik Baltzersen
Journal of Imaging Science and Technology, published 2023-09-01. DOI: 10.2352/j.imagingsci.technol.2023.67.5.050409 (https://doi.org/10.2352/j.imagingsci.technol.2023.67.5.050409)
Multi-Attention Guided SKFHDRNet For HDR Video Reconstruction
We propose a three-stage, learning-based approach for High Dynamic Range (HDR) video reconstruction with alternating exposures. The first stage aligns neighboring frames to the reference frame by estimating the flows between them. The second stage uses multi-attention modules and a pyramid cascading deformable alignment module to refine the aligned features. The final stage merges the features and estimates the final HDR scene using a series of dilated selective kernel fusion residual dense blocks (DSKFRDBs), which fill the over-exposed regions with details. On a dynamic dataset, the three proposed model variants give HDR-VDP-2 values of 79.12, 78.49, and 78.89, compared to 79.09 for Chen et al. [“HDR video reconstruction: A coarse-to-fine network and a real-world benchmark dataset,” Proc. IEEE/CVF Int’l. Conf. on Computer Vision (IEEE, Piscataway, NJ, 2021), pp. 2502–2511], 78.69 for Yan et al. [“Attention-guided network for ghost-free high dynamic range imaging,” Proc. IEEE/CVF Conf. on Computer Vision and Pattern Recognition (IEEE, Piscataway, NJ, 2019), pp. 1751–1760], 70.36 for Kalantari et al. [“Patch-based high dynamic range video,” ACM Trans. Graph. 32 (2013) 202–1], and 77.91 for Kalantari et al. [“Deep HDR video from sequences with alternating exposures,” Computer Graphics Forum (Wiley Online Library, 2019), Vol. 38, pp. 193–205]. Compared to state-of-the-art methods, we achieve better detail reproduction and alignment in over-exposed regions with a smaller number of parameters.
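The HDR-VDP-2 figures quoted in the abstract are easier to compare once they are sorted. The short Python sketch below ranks them and computes the gap between two methods. The scores are copied from the abstract; the labels "variant 1" to "variant 3" and the helper names `rank_scores` and `margin_over` are assumptions made for illustration, since the abstract does not name the variants.

```python
# HDR-VDP-2 scores on the dynamic dataset, copied from the abstract.
# The variant labels are hypothetical; the abstract does not name them.
reported = {
    "Proposed (variant 1)": 79.12,
    "Proposed (variant 2)": 78.49,
    "Proposed (variant 3)": 78.89,
    "Chen et al. 2021": 79.09,
    "Yan et al. 2019": 78.69,
    "Kalantari et al. 2013": 70.36,
    "Kalantari et al. 2019": 77.91,
}


def rank_scores(scores):
    """Return (method, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def margin_over(scores, method, baseline):
    """Return scores[method] - scores[baseline], rounded to two decimals."""
    return round(scores[method] - scores[baseline], 2)


ranking = rank_scores(reported)

if __name__ == "__main__":
    for place, (method, score) in enumerate(ranking, start=1):
        print(f"{place}. {method}: {score:.2f}")
    gap = margin_over(reported, "Proposed (variant 1)", "Chen et al. 2021")
    print(f"Gap between best variant and Chen et al.: {gap:.2f}")
```

Read this way, the best proposed variant leads Chen et al. by 0.03, while the other two variants fall on either side of Yan et al. (78.89 and 78.49 against 78.69).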
About the journal:
Typical issues include research papers and/or comprehensive reviews from a variety of topical areas. In the spirit of fostering constructive scientific dialog, the Journal accepts Letters to the Editor commenting on previously published articles. Periodically, the Journal features a Special Section containing a group of related (usually invited) papers introduced by a Guest Editor. Imaging research topics covered in JIST include:
Digital fabrication and biofabrication;
Digital printing technologies;
3D imaging: capture, display, and print;
Augmented and virtual reality systems;
Mobile imaging;
Computational and digital photography;
Machine vision and learning;
Data visualization and analysis;
Image and video quality evaluation;
Color image science;
Image archiving, permanence, and security;
Imaging applications including astronomy, medicine, sports, and autonomous vehicles.