We present MatSwap, a method for realistically transferring materials onto designated surfaces in an image. This task is non-trivial due to the strong entanglement of material appearance, geometry, and lighting in a photograph. Existing material editing methods typically rely on either cumbersome text-prompt engineering or extensive manual annotations that require artist expertise and 3D scene properties which are impractical to obtain. In contrast, we directly learn the relationship between the input material, as observed on a flat surface, and its appearance within the scene, without the need for explicit UV mapping. To achieve this, we rely on a custom light- and geometry-aware diffusion model. We fine-tune a large-scale pre-trained text-to-image model for material transfer on our synthetic dataset, preserving its strong priors to ensure effective generalization to real images. As a result, our method seamlessly integrates the desired material at the target location in the photograph while preserving the identity of the scene. Evaluated on both synthetic and real images, MatSwap compares favorably to recent work. Our code and data are publicly available at https://github.com/astra-vision/MatSwap.
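To make the idea of light- and geometry-aware conditioning concrete, the minimal sketch below concatenates an encoded flat material exemplar, surface normals, a target-surface mask, and a lighting cue with the noisy scene latent before a small denoising network. The network, tensor shapes, and concatenation scheme are assumptions chosen for illustration; they are not MatSwap's actual architecture.

```python
# Minimal sketch (not the authors' code): one denoising step of a latent
# diffusion model conditioned on geometry, lighting, an edit mask, and a
# flat material exemplar via channel-wise concatenation. Shapes are assumed.
import torch
import torch.nn as nn

class CondDenoiser(nn.Module):
    """Stand-in for a fine-tuned text-to-image U-Net whose first layer
    was widened to accept extra conditioning channels."""
    def __init__(self, latent_ch=4, cond_ch=4 + 3 + 1 + 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(latent_ch + cond_ch, 64, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(64, latent_ch, 3, padding=1),
        )

    def forward(self, noisy_latent, material_latent, normals, mask, irradiance):
        # Concatenate the noisy scene latent with all conditioning signals.
        cond = torch.cat([material_latent, normals, mask, irradiance], dim=1)
        return self.net(torch.cat([noisy_latent, cond], dim=1))

if __name__ == "__main__":
    b, h, w = 1, 64, 64
    model = CondDenoiser()
    eps = model(
        torch.randn(b, 4, h, w),   # noisy scene latent
        torch.randn(b, 4, h, w),   # encoded flat material exemplar
        torch.randn(b, 3, h, w),   # surface normals (geometry cue)
        torch.rand(b, 1, h, w),    # mask of the surface to edit
        torch.randn(b, 3, h, w),   # irradiance / lighting cue
    )
    print(eps.shape)  # predicted noise, same shape as the latent
```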
We leverage fine-tuned video diffusion models, intrinsic decomposition of videos, and physically based differentiable rendering to generate high-quality materials for 3D models from a text prompt or a single image. First, we condition a video diffusion model on the input geometry and lighting, producing multiple views of the given 3D model with coherent material properties. Second, we use a recent intrinsic decomposition model to extract intrinsics (base color, roughness, metallic) from the generated video. Finally, we feed the intrinsics and the generated video into a differentiable path tracer to robustly extract PBR materials that are directly compatible with common content creation tools.
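The final, differentiable-rendering stage can be pictured with the toy sketch below: PBR texture maps are initialized from the extracted intrinsics and optimized so that a differentiable rendering matches the generated frames. The shading function here is a crude stand-in for a path tracer, and all names, shapes, and hyperparameters are illustrative assumptions rather than the paper's implementation.

```python
# Toy sketch (not the paper's code) of inverse-rendering material extraction:
# optimize base color / roughness / metallic maps against generated frames.
import torch
import torch.nn.functional as F

H = W = 128
n_frames = 4

# Inputs assumed to come from earlier pipeline stages: generated video frames
# plus per-frame normals and light directions (random stand-ins here).
frames = torch.rand(n_frames, 3, H, W)
normals = F.normalize(torch.randn(n_frames, 3, H, W), dim=1)
light_dir = F.normalize(torch.randn(n_frames, 3, 1, 1), dim=1)

# Optimizable PBR maps, in practice initialized from the intrinsic decomposition.
base_color = torch.rand(3, H, W, requires_grad=True)
roughness = torch.rand(1, H, W, requires_grad=True)
metallic = torch.rand(1, H, W, requires_grad=True)

def toy_render(bc, rough, metal, n, l):
    """Very simplified differentiable shading, standing in for a path tracer."""
    ndotl = (n * l).sum(dim=1, keepdim=True).clamp(min=0.0)
    diffuse = bc * (1.0 - metal) * ndotl
    specular = metal * (1.0 - rough) * ndotl  # crude specular proxy
    return (diffuse + specular).clamp(0.0, 1.0)

opt = torch.optim.Adam([base_color, roughness, metallic], lr=1e-2)
for step in range(100):
    pred = toy_render(base_color.clamp(0, 1), roughness.clamp(0, 1),
                      metallic.clamp(0, 1), normals, light_dir)
    loss = (pred - frames).abs().mean()  # photometric loss over all frames
    opt.zero_grad()
    loss.backward()
    opt.step()
```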
Realistic hair rendering remains a significant challenge in computer graphics due to the intricate microstructure of hair fibers and their anisotropic scattering properties, which make rendered hair highly sensitive to noise. Although recent advances in image-space and 3D-space denoising and antialiasing techniques have enabled real-time rendering in simple scenes, existing methods still suffer from excessive blurring and artifacts, particularly in fine hair details such as flyaway strands. These issues arise because current techniques often fail to preserve sub-pixel continuity and lack directional sensitivity in the filtering process. To address these limitations, we introduce a novel real-time hair filtering technique that reconstructs fine fiber details while suppressing noise. Our method maintains strand-level detail while remaining computationally efficient, making it well suited for real-time applications in video games and in virtual reality (VR) and augmented reality (AR) environments.
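To illustrate what directionally sensitive filtering means in image space, the minimal sketch below averages each pixel with samples taken along its strand tangent, so noise is smoothed along fibers rather than across them. The tangent map, kernel size, and Gaussian weights are illustrative assumptions and do not reproduce the paper's filter.

```python
# Minimal sketch (not the paper's algorithm): a direction-aware filter that
# samples along a per-pixel strand tangent to smooth noise along hair fibers.
import numpy as np

def directional_hair_filter(color, tangent, radius=3, sigma=1.5):
    """color: (H, W, 3) noisy shaded hair; tangent: (H, W, 2) unit 2D strand
    direction per pixel. Returns the filtered image."""
    H, W, _ = color.shape
    out = np.zeros_like(color)
    weight_sum = np.zeros((H, W, 1))
    ys, xs = np.mgrid[0:H, 0:W]
    for t in range(-radius, radius + 1):
        w = np.exp(-(t * t) / (2.0 * sigma * sigma))
        # Step t pixels along the per-pixel tangent direction.
        sx = np.clip(np.round(xs + t * tangent[..., 0]).astype(int), 0, W - 1)
        sy = np.clip(np.round(ys + t * tangent[..., 1]).astype(int), 0, H - 1)
        out += w * color[sy, sx]
        weight_sum += w
    return out / weight_sum

if __name__ == "__main__":
    H, W = 64, 64
    noisy = np.random.rand(H, W, 3)
    # Dummy tangent field: all strands roughly diagonal.
    tangent = np.tile(np.array([0.707, 0.707]), (H, W, 1))
    filtered = directional_hair_filter(noisy, tangent)
    print(filtered.shape)
```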