Beta-Sigma VAE: Separating beta and decoder variance in Gaussian variational autoencoder

Seunghwan Kim, Seungkyu Lee
arXiv - STAT - Machine Learning, published 2024-09-14. DOI: arxiv-2409.09361

Abstract

The variational autoencoder (VAE) is an established generative model, but it is notorious for producing blurry outputs. In this work, we investigate the blurry output problem of the VAE and resolve it by exploiting the variance of the Gaussian decoder and the $\beta$ of beta-VAE. Specifically, we show that the indistinguishability of the decoder variance and $\beta$ makes the likelihood value arbitrary, which hinders proper analysis of the model, and limits performance improvement by forgoing the gain from $\beta$. To address this problem, we propose Beta-Sigma VAE (BS-VAE), which explicitly separates $\beta$ and the decoder variance $\sigma^2_x$ in the model. Compared to the conventional VAE, our method demonstrates not only superior performance in natural image synthesis but also controllable parameters and predictable analysis. In our experimental evaluation, we analyze rate-distortion curves and proxy metrics on computer vision datasets. The code is available at https://github.com/overnap/BS-VAE
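
The indistinguishability mentioned in the abstract can be illustrated with the standard Gaussian-decoder beta-VAE objective. The following is a minimal sketch under stated assumptions, not the authors' BS-VAE implementation: it assumes a decoder with a fixed scalar variance `sigma2`, a diagonal Gaussian posterior, and a standard normal prior. All function names and the toy values are hypothetical and chosen only for illustration. Once the parameter-independent log-normalization constant is removed, two settings with the same product `beta * sigma2` give objectives that differ only by a constant scale factor, so they share the same minimizer even though their likelihood values differ.

```python
import math


def gaussian_nll(x, x_hat, sigma2):
    """Negative log-likelihood of x under N(x_hat, sigma2 * I).

    Returns the data-dependent term and the normalization constant separately.
    """
    d = len(x)
    sse = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    data_term = sse / (2.0 * sigma2)
    const = 0.5 * d * math.log(2.0 * math.pi * sigma2)
    return data_term, const


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


def beta_vae_loss(x, x_hat, mu, logvar, beta, sigma2):
    """Standard beta-VAE loss with a fixed-variance Gaussian decoder.

    Returns the loss and the normalization constant that does not depend
    on the network outputs.
    """
    data_term, const = gaussian_nll(x, x_hat, sigma2)
    loss = data_term + const + beta * kl_to_standard_normal(mu, logvar)
    return loss, const


# Toy values (hypothetical): a 3-dimensional input and a 2-dimensional latent.
x = [0.2, 0.7, 0.1]
x_hat = [0.25, 0.6, 0.3]
mu = [0.5, -1.0]
logvar = [-0.3, 0.1]

# Two settings with the same product beta * sigma2 = 1.0.
loss_a, const_a = beta_vae_loss(x, x_hat, mu, logvar, beta=4.0, sigma2=0.25)
loss_b, const_b = beta_vae_loss(x, x_hat, mu, logvar, beta=1.0, sigma2=1.0)

# After removing the constants, the objectives differ only by the scale
# factor sigma2_b / sigma2_a, so their gradients point in the same direction.
ratio = (loss_a - const_a) / (loss_b - const_b)
print(ratio)
```

In this sketch, `loss_a` and `loss_b` report different likelihood values for what is effectively the same optimization problem, which is one way to read the abstract's claim that the likelihood value becomes arbitrary when $\beta$ and $\sigma^2_x$ are not separated.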