Learning GAN-Based Foveated Reconstruction to Recover Perceptually Important Image Features

ACM Transactions on Applied Perception (IF 1.9; Zone 4, Computer Science; JCR Q3, Computer Science, Software Engineering) Pub Date: 2023-04-21 DOI: https://dl.acm.org/doi/10.1145/3583072
Luca Surace, Marek Wernikowski, Cara Tursun, Karol Myszkowski, Radosław Mantiuk, Piotr Didyk

Abstract

A foveated image can be entirely reconstructed from a sparse set of samples distributed according to the retinal sensitivity of the human visual system, which rapidly decreases with increasing eccentricity. Generative adversarial networks (GANs) have recently been shown to be a promising solution for this task, as they can successfully hallucinate missing image information. As with other supervised learning approaches, the definition of the loss function and the training strategy heavily influence the quality of the output. In this work, we consider the problem of efficiently guiding the training of foveated reconstruction techniques so that they are more aware of the capabilities and limitations of the human visual system and can therefore reconstruct visually important image features. Our primary goal is to make the training procedure less sensitive to distortions that humans cannot detect and to focus it on penalizing perceptually important artifacts. Given the nature of GAN-based solutions, we focus on the sensitivity of human vision to hallucination in the case of input samples with different densities. We propose psychophysical experiments, a dataset, and a procedure for training foveated image reconstruction. The proposed strategy renders the generator network flexible by penalizing only perceptually important deviations in the output. As a result, the method emphasizes the recovery of perceptually important image features. We evaluated our strategy and compared it with alternative solutions by using a newly trained objective metric, a recent foveated video quality metric, and user experiments. Our evaluations revealed significant improvements in the perceived image reconstruction quality compared with the standard GAN-based training approach.
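To make the idea of "a sparse set of samples distributed according to the retinal sensitivity" concrete, the following is a minimal sketch of a sampling mask whose density falls with eccentricity. It is not the authors' sampling scheme: the function names, the exponential falloff, and the values of `fovea_deg`, `falloff`, `floor`, and `pixels_per_degree` are all illustrative assumptions.

```python
import math
import random


def sampling_probability(ecc_deg, fovea_deg=5.0, falloff=0.15, floor=0.01):
    """Hypothetical density model (an assumption, not taken from the paper):
    keep every pixel inside the fovea, then decay exponentially with
    eccentricity, never dropping below a small floor."""
    if ecc_deg <= fovea_deg:
        return 1.0
    return max(floor, math.exp(-falloff * (ecc_deg - fovea_deg)))


def foveated_mask(width, height, gaze_xy, pixels_per_degree=10.0, seed=0):
    """Return a height x width grid of booleans; True marks a kept sample."""
    rng = random.Random(seed)  # seeded so the mask is reproducible
    gx, gy = gaze_xy
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Eccentricity: distance from the gaze point in visual degrees.
            ecc = math.hypot(x - gx, y - gy) / pixels_per_degree
            row.append(rng.random() < sampling_probability(ecc))
        mask.append(row)
    return mask


def density(mask, x0, x1, y0, y1):
    """Fraction of kept samples in the window [x0, x1) x [y0, y1)."""
    kept = sum(mask[y][x] for y in range(y0, y1) for x in range(x0, x1))
    return kept / ((x1 - x0) * (y1 - y0))


if __name__ == "__main__":
    mask = foveated_mask(256, 256, gaze_xy=(128, 128))
    print("central density:", density(mask, 118, 138, 118, 138))
    print("corner density:", density(mask, 0, 20, 0, 20))
```

A reconstruction network of the kind described in the abstract would receive only the pixels where the mask is True and would have to fill in the rest; the mask here only illustrates how the input becomes sparser toward the periphery.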

Source journal
ACM Transactions on Applied Perception (Engineering and Technology; Computer Science: Software Engineering)
CiteScore: 3.70
Self-citation rate: 0.00%
Articles published: 22
Review time: 12 months
About the journal: ACM Transactions on Applied Perception (TAP) aims to strengthen the synergy between computer science and psychology/perception by publishing top-quality papers that help to unify research in these fields. The journal publishes interdisciplinary research of significant and lasting value in any topic area that spans both Computer Science and Perceptual Psychology. All papers must incorporate both perceptual and computer science components.