Learning GAN-Based Foveated Reconstruction to Recover Perceptually Important Image Features

IF 1.9 | Zone 4, Computer Science | JCR Q3, Computer Science, Software Engineering | ACM Transactions on Applied Perception | Pub Date: 2021-08-07 | DOI: 10.1145/3583072
L. Surace, Marek Wernikowski, C. Tursun, K. Myszkowski, R. Mantiuk, P. Didyk

Abstract

A foveated image can be entirely reconstructed from a sparse set of samples distributed according to the retinal sensitivity of the human visual system, which rapidly decreases with increasing eccentricity. The use of generative adversarial networks (GANs) has recently been shown to be a promising solution for such a task, as they can successfully hallucinate missing image information. As in the case of other supervised learning approaches, the definition of the loss function and the training strategy heavily influence the quality of the output. In this work, we consider the problem of efficiently guiding the training of foveated reconstruction techniques such that they are more aware of the capabilities and limitations of the human visual system, and thus can reconstruct visually important image features. Our primary goal is to make the training procedure less sensitive to distortions that humans cannot detect and to focus it on penalizing perceptually important artifacts. Given the nature of GAN-based solutions, we focus on the sensitivity of human vision to hallucination in the case of input samples with different densities. We propose psychophysical experiments, a dataset, and a procedure for training foveated image reconstruction. The proposed strategy renders the generator network flexible by penalizing only perceptually important deviations in the output. As a result, the method emphasizes the recovery of perceptually important image features. We evaluated our strategy and compared it with alternative solutions by using a newly trained objective metric, a recent foveated video quality metric, and user experiments. Our evaluations revealed significant improvements in the perceived image reconstruction quality compared with the standard GAN-based training approach.
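The abstract's starting point, a sparse sample set whose density falls off with eccentricity, can be illustrated with a small sketch. This is not the authors' sampling scheme: the function name `foveated_sample_mask`, the exponential falloff, and all parameter values (`fovea_radius`, `min_density`, `falloff`) are assumptions chosen only to show the idea, and eccentricity is approximated by pixel distance from the gaze point rather than by visual angle.

```python
import numpy as np


def foveated_sample_mask(height, width, gaze_xy, fovea_radius=0.1,
                         min_density=0.02, falloff=10.0, seed=0):
    """Return a boolean mask of retained pixels.

    Sampling density is 1.0 near the gaze point and decays with distance.
    Every parameter here is an illustrative assumption, not a value
    taken from the paper.
    """
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:height, 0:width]
    gx, gy = gaze_xy

    # Eccentricity proxy: distance to the gaze point, normalized by the
    # image diagonal so that it lies in [0, 1].
    ecc = np.hypot(xs - gx, ys - gy) / np.hypot(height, width)

    # Full density inside the foveal radius, exponential decay outside it,
    # never dropping below min_density.
    excess = np.maximum(ecc - fovea_radius, 0.0)
    density = np.maximum(min_density, np.exp(-falloff * excess))

    # Keep each pixel independently with probability equal to its density.
    return rng.random((height, width)) < density


if __name__ == "__main__":
    mask = foveated_sample_mask(128, 128, gaze_xy=(64, 64))
    print("fraction of pixels kept:", mask.mean())
    print("kept near gaze:", mask[54:74, 54:74].mean())
    print("kept in corner:", mask[:20, :20].mean())
```

A reconstruction network of the kind the abstract describes would receive the image multiplied by such a mask and be trained to fill in the missing pixels; the loss and training strategy proposed in the paper are not reproduced here.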
Source journal
ACM Transactions on Applied Perception (Engineering and Technology: Computer Science, Software Engineering)
CiteScore: 3.70
Self-citation rate: 0.00%
Articles published: 22
Review time: 12 months
Journal description: ACM Transactions on Applied Perception (TAP) aims to strengthen the synergy between computer science and psychology/perception by publishing top-quality papers that help to unify research in these fields. The journal publishes interdisciplinary research of significant and lasting value in any topic area that spans both computer science and perceptual psychology. All papers must incorporate both perceptual and computer science components.