InverseMeetInsert:通过引导扩散模型中的几何累积反演进行稳健的真实图像编辑

Yan Zheng, Lemeng Wu
{"title":"InverseMeetInsert:通过引导扩散模型中的几何累积反演进行稳健的真实图像编辑","authors":"Yan Zheng, Lemeng Wu","doi":"arxiv-2409.11734","DOIUrl":null,"url":null,"abstract":"In this paper, we introduce Geometry-Inverse-Meet-Pixel-Insert, short for\nGEO, an exceptionally versatile image editing technique designed to cater to\ncustomized user requirements at both local and global scales. Our approach\nseamlessly integrates text prompts and image prompts to yield diverse and\nprecise editing outcomes. Notably, our method operates without the need for\ntraining and is driven by two key contributions: (i) a novel geometric\naccumulation loss that enhances DDIM inversion to faithfully preserve pixel\nspace geometry and layout, and (ii) an innovative boosted image prompt\ntechnique that combines pixel-level editing for text-only inversion with latent\nspace geometry guidance for standard classifier-free reversion. Leveraging the\npublicly available Stable Diffusion model, our approach undergoes extensive\nevaluation across various image types and challenging prompt editing scenarios,\nconsistently delivering high-fidelity editing results for real images.","PeriodicalId":501130,"journal":{"name":"arXiv - CS - Computer Vision and Pattern Recognition","volume":"3 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"InverseMeetInsert: Robust Real Image Editing via Geometric Accumulation Inversion in Guided Diffusion Models\",\"authors\":\"Yan Zheng, Lemeng Wu\",\"doi\":\"arxiv-2409.11734\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we introduce Geometry-Inverse-Meet-Pixel-Insert, short for\\nGEO, an exceptionally versatile image editing technique designed to cater to\\ncustomized user requirements at both local and global scales. Our approach\\nseamlessly integrates text prompts and image prompts to yield diverse and\\nprecise editing outcomes. Notably, our method operates without the need for\\ntraining and is driven by two key contributions: (i) a novel geometric\\naccumulation loss that enhances DDIM inversion to faithfully preserve pixel\\nspace geometry and layout, and (ii) an innovative boosted image prompt\\ntechnique that combines pixel-level editing for text-only inversion with latent\\nspace geometry guidance for standard classifier-free reversion. Leveraging the\\npublicly available Stable Diffusion model, our approach undergoes extensive\\nevaluation across various image types and challenging prompt editing scenarios,\\nconsistently delivering high-fidelity editing results for real images.\",\"PeriodicalId\":501130,\"journal\":{\"name\":\"arXiv - CS - Computer Vision and Pattern Recognition\",\"volume\":\"3 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Computer Vision and Pattern Recognition\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.11734\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Computer Vision and Pattern Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11734","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

在本文中,我们介绍了几何反转与像素插入(Geometry-Inverse-Meet-Pixel-Insert,简称GEO),这是一种非常灵活的图像编辑技术,旨在满足用户在局部和全局范围内的个性化需求。我们的方法将文本提示和图像提示完美地结合在一起,从而产生多样化的精确编辑结果。值得注意的是,我们的方法无需训练即可运行,并由两个关键贡献驱动:(i) 一种新颖的几何累积损失,可增强 DDIM 反演以忠实保留像素空间的几何和布局;(ii) 一种创新的增强图像提示技术,可将纯文本反演的像素级编辑与标准无分类器反演的潜空间几何引导相结合。利用公开的稳定扩散模型,我们的方法在各种图像类型和具有挑战性的提示编辑场景中进行了广泛的评估,始终如一地为真实图像提供高保真编辑结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
InverseMeetInsert: Robust Real Image Editing via Geometric Accumulation Inversion in Guided Diffusion Models
In this paper, we introduce Geometry-Inverse-Meet-Pixel-Insert, short for GEO, an exceptionally versatile image editing technique designed to cater to customized user requirements at both local and global scales. Our approach seamlessly integrates text prompts and image prompts to yield diverse and precise editing outcomes. Notably, our method operates without the need for training and is driven by two key contributions: (i) a novel geometric accumulation loss that enhances DDIM inversion to faithfully preserve pixel space geometry and layout, and (ii) an innovative boosted image prompt technique that combines pixel-level editing for text-only inversion with latent space geometry guidance for standard classifier-free reversion. Leveraging the publicly available Stable Diffusion model, our approach undergoes extensive evaluation across various image types and challenging prompt editing scenarios, consistently delivering high-fidelity editing results for real images.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Massively Multi-Person 3D Human Motion Forecasting with Scene Context Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Precise Forecasting of Sky Images Using Spatial Warping JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation Applications of Knowledge Distillation in Remote Sensing: A Survey
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1