Probabilistic and semantic descriptions of image manifolds and their applications

Frontiers in Computer Science (IF 2.4, Q3, Computer Science, Interdisciplinary Applications) | Pub Date: 2023-11-02 | DOI: 10.3389/fcomp.2023.1253682
Peter Tu, Zhaoyuan Yang, Richard Hartley, Zhiwei Xu, Jing Zhang, Yiwei Fu, Dylan Campbell, Jaskirat Singh, Tianyu Wang

Abstract

This paper begins with a description of methods for estimating probability density functions for images. These methods reflect the observation that such data are usually constrained to lie in restricted regions of the high-dimensional image space: not every pattern of pixels is an image. It is common to say that images lie on a lower-dimensional manifold in the high-dimensional space. However, although images may lie on such lower-dimensional manifolds, it is not the case that all points on the manifold have an equal probability of being images. Images are unevenly distributed on the manifold, and our task is to devise ways to model this distribution as a probability distribution. In pursuing this goal, we consider generative models that are popular in the AI and computer vision communities. For our purposes, generative/probabilistic models should have two properties: (1) sample generation: it should be possible to sample from the distribution according to the modeled density function; and (2) probability computation: given a previously unseen sample from the dataset of interest, one should be able to compute the probability of the sample, at least up to a normalizing constant. To this end, we investigate the use of methods such as normalizing flows and diffusion models. We then show how semantic interpretations are used to describe points on the manifold. To achieve this, we consider an emergent language framework that makes use of variational encoders to produce a disentangled representation of points that reside on a given manifold. Trajectories between points on a manifold can then be described in terms of evolving semantic descriptions. In addition to describing the manifold in terms of density and semantic disentanglement, we also show that such (bounded) probabilistic descriptions can be used to improve semantic consistency by constructing defenses against adversarial attacks.
We evaluate our methods on CelebA and point samples for likelihood estimation, with improved semantic robustness and out-of-distribution detection capability; on MNIST and CelebA for semantic disentanglement, with explainable and editable semantic interpolation; and on CelebA and Fashion-MNIST for defending against patch attacks, with significantly improved classification accuracy. We also discuss the limitations of applying our likelihood estimation to 2D images in diffusion models.
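
The two properties the abstract asks of a generative model, sample generation and probability computation, can be illustrated with a toy sketch. The block below is not the paper's method: it assumes a single elementwise affine map applied to standard Gaussian noise, which is about the simplest possible normalizing flow, and the names `AffineFlow`, `scale`, and `shift` are hypothetical, chosen only for this illustration.

```python
import numpy as np


class AffineFlow:
    """Toy one-layer flow x = a * z + b with z ~ N(0, I).

    Hypothetical illustration only; not the model used in the paper.
    """

    def __init__(self, scale, shift):
        self.scale = np.asarray(scale, dtype=float)  # elementwise scale a (must be nonzero)
        self.shift = np.asarray(shift, dtype=float)  # elementwise shift b

    def sample(self, n, rng):
        # Property (1): draw base noise and push it through the invertible map.
        z = rng.standard_normal((n, self.scale.size))
        return z * self.scale + self.shift

    def log_prob(self, x):
        # Property (2): change of variables,
        # log p(x) = log N(z; 0, I) - sum(log|a|), where z = (x - b) / a.
        z = (np.asarray(x, dtype=float) - self.shift) / self.scale
        log_base = -0.5 * np.sum(z**2 + np.log(2.0 * np.pi), axis=-1)
        return log_base - np.sum(np.log(np.abs(self.scale)))


rng = np.random.default_rng(0)
flow = AffineFlow(scale=[2.0, 0.5], shift=[1.0, -1.0])

samples = flow.sample(1000, rng)        # sample generation
log_density = flow.log_prob(samples)    # exact log-density of each sample

# A point far from the modeled data receives a much lower log-density,
# which is the basic idea behind likelihood-based out-of-distribution scoring.
far_point = np.array([10.0, 10.0])
print(flow.log_prob(far_point) < log_density.min())
```

Real image models replace the single affine map with many learned invertible layers, but the same two operations, sampling through the forward map and evaluating density through the inverse map with a Jacobian correction, remain the interface.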