Human Latent Metrics: Perceptual and Cognitive Response Correlates to Distance in GAN Latent Space for Facial Images

ACM Symposium on Applied Perception 2022 Pub Date : 2022-09-22 DOI:10.1145/3548814.3551460

Kye Shimizu, Naoto Ienaga, Kazuma Takada, M. Sugimoto, Shunichi Kasahara


Citations: 2

Abstract

Generative adversarial networks (GANs) learn high-dimensional vector spaces (latent spaces) in which vectors and images can be represented interchangeably. Recent advances have extended their ability to generate images, such as faces, that are indistinguishable from real ones and, more importantly, to manipulate images through their inherent vector values in the latent space. This interchangeability of latent vectors makes it potentially possible to calculate not only the distance in the latent space, but also the human perceptual and cognitive distance between images, that is, how humans perceive and recognize images. However, it is still unclear how distance in the latent space correlates with human perception and cognition. Our studies investigated the relationship between latent vectors and human perception or cognition through psycho-visual experiments that manipulate the latent vectors of face images. In the perception study, a change perception task was used to examine whether participants could perceive visual changes in face images before and after moving an arbitrary distance in the latent space. In the cognition study, a face recognition task was used to examine whether participants could recognize a face as the same, even after moving an arbitrary distance in the latent space. Our experiments show that the distance between face images in the latent space correlates with human perception and cognition of visual changes in face imagery, and that this relationship can be modeled with a logistic function. With our methodology, it becomes possible to convert between distance in the latent space and a metric of human perception and cognition, potentially leading to image processing that better reflects human perception and cognition.
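The conversion described in the abstract can be illustrated with a minimal Python sketch. Only the use of a logistic function comes from the abstract. The distance metric (Euclidean here), the parameter names `midpoint` and `slope`, and the numeric values are hypothetical stand-ins chosen for illustration, not the authors' fitted model or procedure.

```python
import math


def latent_distance(z_a, z_b):
    """Euclidean distance between two latent vectors (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z_a, z_b)))


def response_probability(distance, midpoint, slope):
    """Logistic model: probability of a response at a given latent distance."""
    return 1.0 / (1.0 + math.exp(-slope * (distance - midpoint)))


def distance_for_probability(p, midpoint, slope):
    """Inverse logistic: latent distance at which the response probability is p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return midpoint + math.log(p / (1.0 - p)) / slope


# Hypothetical parameters, NOT values reported in the paper.
MIDPOINT = 2.0  # latent distance at which the response probability is 0.5
SLOPE = 1.5     # steepness of the transition

if __name__ == "__main__":
    d = latent_distance([0.0, 0.0], [3.0, 4.0])
    p = response_probability(d, MIDPOINT, SLOPE)
    print(f"distance={d:.2f}, modeled response probability={p:.3f}")
    print(f"distance for p=0.75: {distance_for_probability(0.75, MIDPOINT, SLOPE):.3f}")
```

With a positive `slope`, the curve rises with distance, as one would expect for the probability of noticing a change. For a task where the modeled response is "recognized as the same face", a negative `slope` would give a curve that falls as distance grows. In practice the two parameters would be estimated from participant responses rather than set by hand.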


