Diffusion-Based Image-to-Image Translation by Noise Correction via Prompt Interpolation

Junsung Lee, Minsoo Kang, Bohyung Han
arXiv - CS - Computer Vision and Pattern Recognition, published 2024-09-12. DOI: arxiv-2409.08077 (https://doi.org/arxiv-2409.08077)
Citations: 0

Abstract

We propose a simple but effective training-free approach tailored to diffusion-based image-to-image translation. Our approach revises the original noise prediction network of a pretrained diffusion model by introducing a noise correction term. We formulate the noise correction term as the difference between two noise predictions: one is computed by the denoising network with a progressive interpolation of the source and target prompt embeddings, and the other is the noise prediction with the source prompt embedding. The final noise prediction network is a linear combination of the standard denoising term and the noise correction term, where the former is designed to reconstruct the regions that must be preserved and the latter aims to edit the regions of interest relevant to the target prompt. Our approach can be easily incorporated into existing image-to-image translation methods based on diffusion models. Extensive experiments verify that the proposed technique achieves outstanding performance with low latency and consistently improves existing frameworks when combined with them.
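
The combination described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract gives no formula, so the linear ramp in `beta_schedule`, the weight `gamma`, the choice to pass the standard denoising term in as `eps_standard`, and the stand-in network `toy_eps` are all hypothetical.

```python
from typing import Callable, List

Vector = List[float]
# Hypothetical predictor signature: (x_t, step, prompt_embedding) -> predicted noise
NoiseFn = Callable[[Vector, int, Vector], Vector]


def beta_schedule(step: int, num_steps: int) -> float:
    """Assumed linear ramp from 0 (source prompt) to 1 (target prompt)."""
    if num_steps < 2:
        return 1.0
    return step / (num_steps - 1)


def interpolate(e_src: Vector, e_tgt: Vector, beta: float) -> Vector:
    """Element-wise interpolation between source and target prompt embeddings."""
    return [(1.0 - beta) * s + beta * g for s, g in zip(e_src, e_tgt)]


def corrected_noise(eps_fn: NoiseFn, x_t: Vector, step: int,
                    e_src: Vector, e_tgt: Vector,
                    beta: float, gamma: float,
                    eps_standard: Vector) -> Vector:
    """Standard denoising term plus gamma times the noise correction term.

    The correction term is the difference between the prediction under the
    interpolated embedding and the prediction under the source embedding.
    """
    eps_mix = eps_fn(x_t, step, interpolate(e_src, e_tgt, beta))
    eps_src = eps_fn(x_t, step, e_src)
    return [b + gamma * (m - s)
            for b, m, s in zip(eps_standard, eps_mix, eps_src)]


def toy_eps(x_t: Vector, step: int, emb: Vector) -> Vector:
    """Stand-in for a pretrained denoising network; not a real model."""
    shift = sum(emb)
    return [0.5 * v + shift for v in x_t]


if __name__ == "__main__":
    x_t = [1.0, 1.0]
    e_src, e_tgt = [0.0] * 4, [1.0] * 4
    num_steps = 5
    for step in range(num_steps):
        beta = beta_schedule(step, num_steps)
        # Assumption: the source-conditioned prediction serves as the standard term here.
        eps_standard = toy_eps(x_t, step, e_src)
        print(step, beta, corrected_noise(toy_eps, x_t, step, e_src, e_tgt,
                                          beta, 2.0, eps_standard))
```

With `beta = 0` the interpolated embedding equals the source embedding, so the correction term vanishes and the output is the standard denoising term unchanged. As `beta` grows, the correction moves the prediction toward the target prompt.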