LEMON: Localized Editing with Mesh Optimization and Neural Shaders

arXiv - CS - Computer Vision and Pattern Recognition | Pub Date: 2024-09-18 | DOI: arxiv-2409.12024

Furkan Mert Algan, Umut Yazgan, Driton Salihu, Cem Eteke, Eckehard Steinbach

Citations: 0

Abstract

In practice, editing an existing polygonal mesh can be faster than generating a new one, but it remains challenging and time-consuming for users. Existing solutions tend to focus on a single task, either geometry or novel view synthesis, which often leads to disjointed results between the mesh and its rendered views. In this work, we propose LEMON, a mesh editing pipeline that combines neural deferred shading with localized mesh optimization. Our approach begins by identifying the vertices that matter most for the edit, using a segmentation model to focus on these key regions. Given multi-view images of an object, we optimize a neural shader and a polygonal mesh while extracting the normal map and the rendered image from each view. Using these outputs as conditioning data, we edit the input images with a text-to-image diffusion model and iteratively update our dataset while deforming the mesh. The result is a polygonal mesh edited according to the given text instruction that preserves the geometric characteristics of the initial mesh while focusing on the most significant areas. We evaluate our pipeline on the DTU dataset, demonstrating that it generates finely edited meshes more rapidly than current state-of-the-art methods. We include our code and additional results in the supplementary material.
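
The idea of localized mesh optimization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the boolean `editable` mask (standing in for the vertex selection that the segmentation model would provide), and the toy squared-distance loss (standing in for the image-space loss driven by the neural shader and the diffusion-edited views) are all assumptions made for illustration. The sketch only shows that gradient steps are applied to the selected vertices while the rest of the mesh stays fixed.

```python
from typing import Callable, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def localized_update(
    vertices: Sequence[Vec3],
    editable: Sequence[bool],
    grad_fn: Callable[[Sequence[Vec3]], List[Vec3]],
    lr: float = 0.1,
    steps: int = 100,
) -> List[Vec3]:
    """Gradient descent that moves only the vertices flagged as editable."""
    verts = [tuple(v) for v in vertices]
    for _ in range(steps):
        grads = grad_fn(verts)
        verts = [
            # Editable vertices take a gradient step; all others are kept as-is.
            tuple(p - lr * g for p, g in zip(v, gv)) if movable else v
            for v, gv, movable in zip(verts, grads, editable)
        ]
    return verts


def make_target_grad(targets: Sequence[Vec3]) -> Callable[[Sequence[Vec3]], List[Vec3]]:
    """Gradient of 0.5 * sum ||v - t||^2, a toy stand-in for a rendering loss."""
    def grad(verts: Sequence[Vec3]) -> List[Vec3]:
        return [tuple(p - q for p, q in zip(v, t)) for v, t in zip(verts, targets)]
    return grad


# Hypothetical toy mesh: three vertices, with the "edit" pulling them up to z = 1.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
targets = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
editable = [True, False, True]  # assumed output of a vertex-selection step

edited = localized_update(vertices, editable, make_target_grad(targets))
```

In this toy run, the first and third vertices move toward their targets, while the second vertex, which is not flagged as editable, keeps its original position. Masking the update rather than the loss is one simple way to preserve geometry outside the edited region; the paper may use a different mechanism.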
