AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views
Lihan Jiang, Yucheng Mao, Linning Xu, Tao Lu, Kerui Ren, Yichen Jin, Xudong Xu, Mulin Yu, Jiangmiao Pang, Feng Zhao, Dahua Lin, Bo Dai
We introduce AnySplat, a feed-forward network for novel-view synthesis from uncalibrated image collections. In contrast to traditional neural-rendering pipelines that demand known camera poses and per-scene optimization, and to recent feed-forward methods that buckle under the computational weight of dense views, our model predicts everything in one shot. A single forward pass yields a set of 3D Gaussian primitives encoding both scene geometry and appearance, together with the corresponding camera intrinsics and extrinsics for each input image. This unified design scales effortlessly to casually captured multi-view datasets without any pose annotations. In extensive zero-shot evaluations, AnySplat matches the quality of pose-aware baselines in both sparse- and dense-view scenarios while surpassing existing pose-free approaches. Moreover, it greatly reduces rendering latency compared to optimization-based neural fields, bringing real-time novel-view synthesis within reach for unconstrained capture settings. Project page: https://city-super.github.io/anysplat/.
{"title":"AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views","authors":"Lihan Jiang, Yucheng Mao, Linning Xu, Tao Lu, Kerui Ren, Yichen Jin, Xudong Xu, Mulin Yu, Jiangmiao Pang, Feng Zhao, Dahua Lin, Bo Dai","doi":"10.1145/3763326","DOIUrl":"https://doi.org/10.1145/3763326","url":null,"abstract":"We introduce AnySplat, a feed-forward network for novel-view synthesis from uncalibrated image collections. In contrast to traditional neural-rendering pipelines that demand known camera poses and per-scene optimization, or recent feed-forward methods that buckle under the computational weight of dense views—our model predicts everything in one shot. A single forward pass yields a set of 3D Gaussian primitives encoding both scene geometry and appearance, and the corresponding camera intrinsics and extrinsics for each input image. This unified design scales effortlessly to casually captured, multi-view datasets without any pose annotations. In extensive zero-shot evaluations, AnySplat matches the quality of pose-aware baselines in both sparse- and dense-view scenarios while surpassing existing pose-free approaches. Moreover, it greatly reduces rendering latency compared to optimization-based neural fields, bringing real-time novel-view synthesis within reach for unconstrained capture settings. Project page: https://city-super.github.io/anysplat/.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"28 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Granule-In-Cell Method for Simulating Sand–Water Mixtures
Yizao Tang, Yuechen Zhu, Xingyu Ni, Baoquan Chen
The simulation of sand–water mixtures requires capturing the stochastic behavior of individual sand particles within a uniform, continuous fluid medium. However, most existing approaches, which treat sand particles merely as markers within fluid solvers, fail to account for both the forces acting on individual sand particles and the collective feedback of the particle assemblies on the fluid. This prevents faithful reproduction of characteristic phenomena including transport, deposition, and clogging. Building on the kinetic ensemble averaging technique, we propose a physically consistent coupling strategy and introduce a novel Granule-In-Cell (GIC) method for modeling such sand–water interactions. We employ the Discrete Element Method (DEM) to capture fine-scale granule dynamics and the Particle-In-Cell (PIC) method for continuous spatial representation and density projection. To bridge these two frameworks, we treat granules as a macroscopic transport flow rather than as solid boundaries within the fluid domain. This bidirectional coupling allows our model to incorporate a range of interphase forces using different discretization schemes, resulting in more realistic simulations that strictly adhere to the mass conservation law. Experimental results demonstrate the effectiveness of our method in simulating complex sand–water interactions, uniquely capturing intricate physical phenomena and ensuring exact volume preservation compared to existing approaches.
{"title":"The Granule-In-Cell Method for Simulating Sand–Water Mixtures","authors":"Yizao Tang, Yuechen Zhu, Xingyu Ni, Baoquan Chen","doi":"10.1145/3763279","DOIUrl":"https://doi.org/10.1145/3763279","url":null,"abstract":"The simulation of sand-water mixtures requires capturing the stochastic behavior of individual sand particles within a uniform, continuous fluid medium. However, most existing approaches, which only treat sand particles as markers within fluid solvers, fail to account for both the forces acting on individual sand particles and the collective feedback of the particle assemblies on the fluid. This prevents faithful reproduction of characteristic phenomena including transport, deposition, and clogging. Building upon kinetic ensemble averaging technique, we propose a physically consistent coupling strategy and introduce a novel Granule-In-Cell (GIC) method for modeling such sand-water interactions. We employ the Discrete Element Method (DEM) to capture fine-scale granule dynamics and the Particle-In-Cell (PIC) method for continuous spatial representation and density projection. To bridge these two frameworks, we treat granules as macroscopic transport flow rather than solid boundaries within the fluid domain. This bidirectional coupling allows our model to incorporate a range of interphase forces using different discretization schemes, resulting in more realistic simulations that strictly adhere to the mass conservation law. Experimental results demonstrate the effectiveness of our method in simulating complex sand-water interactions, uniquely capturing intricate physical phenomena and ensuring exact volume preservation compared to existing approaches.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"367 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shaping Strands with Neural Style Transfer
Beyzanur Coban, Pascal Chang, Guilherme Gomes Haetinger, Jingwei Tang, Vinicius C. Azevedo
The intricate geometric complexity of knots, tangles, dreads, and clumps requires sophisticated grooming systems that allow artists to both realistically model and artistically control fur and hair. Recent volumetric and 3D neural style transfer techniques have provided a new paradigm of art directability, allowing artists to modify assets drastically using a single style image. However, these previous 3D neural stylization approaches were limited to volumes and meshes. In this paper we propose the first stylization pipeline to support hair and fur. Through a carefully tailored fur/hair representation, our approach produces complex, 3D-consistent, and temporally coherent grooms stylized from style images.
{"title":"Shaping Strands with Neural Style Transfer","authors":"Beyzanur Coban, Pascal Chang, Guilherme Gomes Haetinger, Jingwei Tang, Vinicius C. Azevedo","doi":"10.1145/3763365","DOIUrl":"https://doi.org/10.1145/3763365","url":null,"abstract":"The intricate geometric complexity of knots, tangles, dreads and clumps require sophisticated grooming systems that allow artists to both realistically model and artistically control fur and hair systems. Recent volumetric and 3D neural style transfer techniques provided a new paradigm of art directability, allowing artists to modify assets drastically with the use of single style images. However, these previous 3D neural stylization approaches were limited to volumes and meshes. In this paper we propose the first stylization pipeline to support hair and fur. Through a carefully tailored fur/hair representation, our approach allows complex, 3D consistent and temporally coherent grooms that are stylized using style images.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"27 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Stack-Free Parallel h-Adaptation Algorithm for Dynamically Balanced Trees on GPUs
Lixin Ren, Xiaowei He, Shusen Liu, Yuzhong Guo, Enhua Wu
Prior research has demonstrated the efficacy of balanced trees as spatially adaptive grids for large-scale simulations. However, state-of-the-art methods for balanced tree construction are restricted by the iterative nature of the ripple effect, thus failing to fully leverage the massive parallelism offered by modern GPU architectures. We propose to reframe the construction of balanced trees as a process of merging N-balanced Minimum Spanning Trees (N-balanced MSTs) generated from a collection of seed points. To ensure optimal performance, we propose a stack-free parallel strategy for constructing all internal nodes of a specified N-balanced MST. This approach uses two 32-bit integer registers as buffers rather than relying on an integer array as a stack during construction, which helps maintain balanced workloads across different GPU threads. We then propose a dynamic update algorithm that uses refinement counters on all internal nodes to enable parallel insertion and deletion of N-balanced MSTs. This design achieves significant efficiency improvements compared to full reconstruction from scratch, thereby facilitating fluid simulations with dynamic moving boundaries. Our approach is fully compatible with GPU implementation and demonstrates up to an order-of-magnitude speedup over the state-of-the-art method [Wang et al. 2024]. The source code is publicly available at https://github.com/peridyno/peridyno.
{"title":"A Stack-Free Parallel h-Adaptation Algorithm for Dynamically Balanced Trees on GPUs","authors":"Lixin Ren, Xiaowei He, Shusen Liu, Yuzhong Guo, Enhua Wu","doi":"10.1145/3763349","DOIUrl":"https://doi.org/10.1145/3763349","url":null,"abstract":"Prior research has demonstrated the efficacy of balanced trees as spatially adaptive grids for large-scale simulations. However, state-of-the-art methods for balanced tree construction are restricted by the iterative nature of the ripple effect, thus failing to fully leverage the massive parallelism offered by modern GPU architectures. We propose to reframe the construction of balanced trees as a process to merge <jats:italic toggle=\"yes\">N</jats:italic> -balanced Minimum Spanning Trees ( <jats:italic toggle=\"yes\">N</jats:italic> -balanced MSTs) generated from a collection of seed points. To ensure optimal performance, we propose a stack-free parallel strategy for constructing all internal nodes of a specified <jats:italic toggle=\"yes\">N</jats:italic> -balanced MST. This approach leverages two 32-bit integer registers as buffers rather than relying on an integer array as a stack during construction, which helps maintain balanced workloads across different GPU threads. We then propose a dynamic update algorithm utilizing refinement counters for all internal nodes to enable parallel insertion and deletion operations of <jats:italic toggle=\"yes\">N</jats:italic> -balanced MSTs. This design achieves significant efficiency improvements compared to full reconstruction from scratch, thereby facilitating fluid simulations in handling dynamic moving boundaries. Our approach is fully compatible with GPU implementation and demonstrates up to an order-of-magnitude speedup compared to the state-of-the-art method [Wang et al. 2024]. The source code for the paper is publicly available at https://github.com/peridyno/peridyno.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"115 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Large-Area Fabrication-aware Computational Diffractive Optics
Kaixuan Wei, Hector Romero, Hadi Amata, Jipeng Sun, Qiang Fu, Felix Heide, Wolfgang Heidrich
Differentiable optics, an emerging paradigm that jointly optimizes optics and (optional) image processing algorithms, has made many innovative optical designs possible across a broad range of imaging and display applications. Many of these systems utilize diffractive optical components for holography, PSF engineering, or wavefront shaping. Existing approaches have, however, mostly remained limited to laboratory prototypes, owing to a large quality gap between simulation and manufactured devices. We aim to lift the fundamental technical barriers to the practical use of learned diffractive optical systems. To this end, we propose a fabrication-aware design pipeline for diffractive optics fabricated by direct-write grayscale lithography followed by replication with nano-imprinting, which is directly suited for inexpensive mass production of large-area designs. We propose a super-resolved neural lithography model that can accurately predict the 3D geometry generated by the fabrication process. This model can be seamlessly integrated into existing differentiable optics frameworks, enabling fabrication-aware, end-to-end optimization of computational optical systems. To tackle the computational challenges, we also devise a tensor-parallel compute framework centered on distributing large-scale FFT computation across many GPUs. As such, we demonstrate large-scale diffractive optics designs up to 32.16 mm × 21.44 mm, simulated on grids of up to 128,640 by 85,760 feature points. We find adequate agreement between simulation and fabricated prototypes for applications such as holography and PSF engineering. We also achieve high image quality from an imaging system composed only of a single diffractive optical element, with images processed only by a one-step inverse filter using the simulated PSF. We believe our findings lift the fabrication limitations for real-world applications of diffractive optics and differentiable optical design.
{"title":"Large-Area Fabrication-aware Computational Diffractive Optics","authors":"Kaixuan Wei, Hector Romero, Hadi Amata, Jipeng Sun, Qiang Fu, Felix Heide, Wolfgang Heidrich","doi":"10.1145/3763358","DOIUrl":"https://doi.org/10.1145/3763358","url":null,"abstract":"Differentiable optics, as an emerging paradigm that jointly optimizes optics and (optional) image processing algorithms, has made many innovative optical designs possible across a broad range of imaging and display applications. Many of these systems utilize diffractive optical components for holography, PSF engineering, or wavefront shaping. Existing approaches have, however, mostly remained limited to laboratory prototypes, owing to a large quality gap between simulation and manufactured devices. We aim at lifting the fundamental technical barriers to the practical use of learned diffractive optical systems. To this end, we propose a fabrication-aware design pipeline for diffractive optics fabricated by direct-write grayscale lithography followed by replication with nano-imprinting, which is directly suited for inexpensive mass-production of large area designs. We propose a super-resolved neural lithography model that can accurately predict the 3D geometry generated by the fabrication process. This model can be seamlessly integrated into existing differentiable optics frameworks, enabling fabrication-aware, end-to-end optimization of computational optical systems. To tackle the computational challenges, we also devise tensor-parallel compute framework centered on distributing large-scale FFT computation across many GPUs. As such, we demonstrate large scale diffractive optics designs up to 32.16 mm × 21.44 mm, simulated on grids of up to 128,640 by 85,760 feature points. We find adequate agreement between simulation and fabricated prototypes for applications such as holography and PSF engineering. We also achieve high image quality from an imaging system comprised only of a single diffractive optical element, with images processed only by a one-step inverse filter utilizing the simulation PSF. We believe our findings lift the fabrication limitations for real-world applications of diffractive optics and differentiable optical design.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"21 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Practical Gaussian Process Implicit Surfaces with Sparse Convolutions
Kehan Xu, Benedikt Bitterli, Eugene d'Eon, Wojciech Jarosz
A fundamental challenge in rendering has been the dichotomy between surface and volume models. Gaussian Process Implicit Surfaces (GPISes) recently provided a unified approach for surfaces, volumes, and the spectrum in between. However, this representation remains impractical due to its high computational cost and mathematical complexity. We address these limitations by reformulating GPISes as procedural noise, eliminating expensive linear system solves while maintaining control over spatial correlations. Our method enables efficient sampling of stochastic realizations and supports flexible conditioning of values and derivatives through pathwise updates. To further enable practical rendering, we derive analytic distributions for surface normals, allowing for variance-reduced light transport via next-event estimation and multiple importance sampling. Our framework achieves efficient, high-quality rendering of stochastic surfaces and volumes with significantly simplified implementations on both CPU and GPU, while preserving the generality of the original GPIS representation.
{"title":"Practical Gaussian Process Implicit Surfaces with Sparse Convolutions","authors":"Kehan Xu, Benedikt Bitterli, Eugene d'Eon, Wojciech Jarosz","doi":"10.1145/3763329","DOIUrl":"https://doi.org/10.1145/3763329","url":null,"abstract":"A fundamental challenge in rendering has been the dichotomy between surface and volume models. Gaussian Process Implicit Surfaces (GPISes) recently provided a unified approach for surfaces, volumes, and the spectrum in between. However, this representation remains impractical due to its high computational cost and mathematical complexity. We address these limitations by reformulating GPISes as procedural noise, eliminating expensive linear system solves while maintaining control over spatial correlations. Our method enables efficient sampling of stochastic realizations and supports flexible conditioning of values and derivatives through pathwise updates. To further enable practical rendering, we derive analytic distributions for surface normals, allowing for variance-reduced light transport via next-event estimation and multiple importance sampling. Our framework achieves efficient, high-quality rendering of stochastic surfaces and volumes with significantly simplified implementations on both CPU and GPU, while preserving the generality of the original GPIS representation.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"1 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Environment-aware Motion Matching
Jose Luis Ponton, Sheldon Andrews, Carlos Andujar, Nuria Pelechano
Interactive applications demand believable characters that respond naturally to dynamic environments. Traditional character animation techniques often struggle to handle arbitrary situations, leading to a growing trend of dynamically selecting motion-captured animations based on predefined features. While Motion Matching has proven effective for locomotion by aligning to target trajectories, animating environment interactions and crowd behaviors remains challenging due to the need to consider surrounding elements. Existing approaches often involve manual setup or lack the naturalism of motion capture. Furthermore, in crowd animation, body animation is frequently treated as a separate process from trajectory planning, leading to inconsistencies between body pose and root motion. To address these limitations, we present Environment-aware Motion Matching, a novel real-time system for full-body character animation that dynamically adapts to obstacles and other agents, emphasizing the bidirectional relationship between pose and trajectory. In a preprocessing step, we extract shape, pose, and trajectory features from a motion capture database. At runtime, we perform an efficient search that matches user input and current pose while penalizing collisions with a dynamic environment. Our method allows characters to naturally adjust their pose and trajectory to navigate crowded scenes.
{"title":"Environment-aware Motion Matching","authors":"Jose Luis Ponton, Sheldon Andrews, Carlos Andujar, Nuria Pelechano","doi":"10.1145/3763334","DOIUrl":"https://doi.org/10.1145/3763334","url":null,"abstract":"Interactive applications demand believable characters that respond naturally to dynamic environments. Traditional character animation techniques often struggle to handle arbitrary situations, leading to a growing trend of dynamically selecting motion-captured animations based on predefined features. While Motion Matching has proven effective for locomotion by aligning to target trajectories, animating environment interactions and crowd behaviors remains challenging due to the need to consider surrounding elements. Existing approaches often involve manual setup or lack the naturalism of motion capture. Furthermore, in crowd animation, body animation is frequently treated as a separate process from trajectory planning, leading to inconsistencies between body pose and root motion. To address these limitations, we present <jats:italic toggle=\"yes\">Environment-aware Motion Matching</jats:italic> , a novel real-time system for full-body character animation that dynamically adapts to obstacles and other agents, emphasizing the bidirectional relationship between pose and trajectory. In a preprocessing step, we extract shape, pose, and trajectory features from a motion capture database. At runtime, we perform an efficient search that matches user input and current pose while penalizing collisions with a dynamic environment. Our method allows characters to naturally adjust their pose and trajectory to navigate crowded scenes.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"155 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generative Head-Mounted Camera Captures for Photorealistic Avatars
Shaojie Bai, Seunghyeon Seo, Yida Wang, Chenghui Li, Owen Wang, Te-Li Wang, Tianyang Ma, Jason Saragih, Shih-En Wei, Nojun Kwak, Hyung Jun (John) Kim
Enabling photorealistic avatar animations in virtual and augmented reality (VR/AR) has been challenging because of the difficulty of obtaining the ground-truth state of faces. It is physically impossible to obtain synchronized images from head-mounted camera (HMC) sensing input, which provides partial observations in infrared (IR), and from an array of outside-in dome cameras, which provide full observations that match the avatars' appearance. Prior works relying on analysis-by-synthesis methods can generate accurate ground truth, but suffer from imperfect disentanglement between expression and style in their personalized training. The reliance on extensive paired captures (HMC and dome) of the same subject makes it operationally expensive to collect large-scale datasets, which cannot be reused for different HMC viewpoints and lighting. In this work, we propose a novel generative approach, Generative HMC (GenHMC), that leverages large unpaired HMC captures, which are much easier to collect, to directly generate high-quality synthetic HMC images given any conditioning avatar state from dome captures. We show that our method is able to properly disentangle the input conditioning signal, which specifies facial expression and viewpoint, from facial appearance, leading to more accurate ground truth. Furthermore, our method can generalize to unseen identities, removing the reliance on paired captures. We demonstrate these breakthroughs by evaluating both the synthetic HMC images and universal face encoders trained from these new HMC–avatar correspondences, which achieve better data efficiency and state-of-the-art accuracy.
{"title":"Generative Head-Mounted Camera Captures for Photorealistic Avatars","authors":"Shaojie Bai, Seunghyeon Seo, Yida Wang, Chenghui Li, Owen Wang, Te-Li Wang, Tianyang Ma, Jason Saragih, Shih-En Wei, Nojun Kwak, Hyung Jun(John) Kim","doi":"10.1145/3763300","DOIUrl":"https://doi.org/10.1145/3763300","url":null,"abstract":"Enabling photorealistic avatar animations in virtual and augmented reality (VR/AR) has been challenging because of the difficulty of obtaining ground truth state of faces. It is <jats:italic toggle=\"yes\">physically impossible</jats:italic> to obtain synchronized images from head-mounted cameras (HMC) sensing input, which has partial observations in infrared (IR), and an array of outside-in dome cameras, which have full observations that match avatars' appearance. Prior works relying on analysis-by-synthesis methods could generate accurate ground truth, but suffer from imperfect disentanglement between expression and style in their personalized training. The reliance of extensive paired captures (HMC and dome) for the <jats:italic toggle=\"yes\">same</jats:italic> subject makes it operationally expensive to collect large-scale datasets, which cannot be reused for different HMC viewpoints and lighting. In this work, we propose a novel generative approach, Generative HMC (GenHMC), that leverages <jats:italic toggle=\"yes\">large unpaired HMC captures</jats:italic> , which are much easier to collect, to directly generate high-quality <jats:italic toggle=\"yes\">synthetic</jats:italic> HMC images given any conditioning avatar state from dome captures. We show that our method is able to properly disentangle the input conditioning signal that specifies facial expression and viewpoint, from facial appearance, leading to more accurate ground truth. Furthermore, our method can generalize to unseen identities, removing the reliance on the paired captures. We demonstrate these breakthroughs by both evaluating synthetic HMC images and universal face encoders trained from these new HMC-avatar correspondences, which achieve better data efficiency and state-of-the-art accuracy.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"28 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fire-X: Extinguishing Fire with Stoichiometric Heat Release
Helge Wrede, Anton Wagner, Sarker Miraz Mahfuz, Wojtek Palubicki, Dominik Michels, Sören Pirk
We present a novel combustion simulation framework to model fire phenomena across solids, liquids, and gases. Our approach extends traditional fluid solvers by incorporating multi-species thermodynamics and reactive transport for fuel, oxygen, nitrogen, carbon dioxide, water vapor, and residuals. Combustion reactions are governed by stoichiometry-dependent heat release, allowing an accurate simulation of premixed and diffusive flames with varying intensity and composition. We support a wide range of scenarios including jet fires, water suppression (sprays and sprinklers), fuel evaporation, and starvation conditions. Our framework enables interactive heat sources, fire detectors, and realistic rendering of flames (e.g., laminar-to-turbulent transitions and blue-to-orange color shifts). Our key contributions include the tight coupling of species dynamics with thermodynamic feedback, evaporation modeling, and a hybrid SPH-grid representation for the efficient simulation of extinguishing fires. We validate our method through numerous experiments that demonstrate its versatility in both indoor and outdoor fire scenarios.
{"title":"Fire-X: Extinguishing Fire with Stoichiometric Heat Release","authors":"Helge Wrede, Anton Wagner, Sarker Miraz Mahfuz, Wojtek Palubicki, Dominik Michels, Sören Pirk","doi":"10.1145/3763338","DOIUrl":"https://doi.org/10.1145/3763338","url":null,"abstract":"We present a novel combustion simulation framework to model fire phenomena across solids, liquids, and gases. Our approach extends traditional fluid solvers by incorporating multi-species thermodynamics and reactive transport for fuel, oxygen, nitrogen, carbon dioxide, water vapor, and residuals. Combustion reactions are governed by stoichiometry-dependent heat release, allowing an accurate simulation of premixed and diffusive flames with varying intensity and composition. We support a wide range of scenarios including jet fires, water suppression (sprays and sprinklers), fuel evaporation, and starvation conditions. Our framework enables interactive heat sources, fire detectors, and realistic rendering of flames (e.g., laminar-to-turbulent transitions and blue-to-orange color shifts). Our key contributions include the tight coupling of species dynamics with thermodynamic feedback, evaporation modeling, and a hybrid SPH-grid representation for the efficient simulation of extinguishing fires. We validate our method through numerous experiments that demonstrate its versatility in both indoor and outdoor fire scenarios.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"1 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Split4D: Decomposed 4D Scene Reconstruction Without Video Segmentation
Yongzhen Hu, Yihui Yang, Haotong Lin, Yifan Wang, Junting Dong, Yifu Deng, Xinyu Zhu, Fan Jia, Hujun Bao, Xiaowei Zhou, Sida Peng
This paper addresses the problem of decomposed 4D scene reconstruction from multi-view videos. Recent methods achieve this by lifting video segmentation results to a 4D representation through differentiable rendering techniques. They therefore rely heavily on the quality of video segmentation maps, which are often unstable, leading to unreliable reconstruction results. To overcome this challenge, our key idea is to represent the decomposed 4D scene with Freetime FeatureGS and to design a streaming feature learning strategy that accurately recovers it from per-image segmentation maps, eliminating the need for video segmentation. Freetime FeatureGS models the dynamic scene as a set of Gaussian primitives with learnable features and linear motion, allowing them to move to neighboring regions over time. We apply a contrastive loss to Freetime FeatureGS, forcing primitive features to be close or far apart depending on whether their projections belong to the same instance in the 2D segmentation map. Because our Gaussian primitives can move across time, this naturally extends the feature learning to the temporal dimension, achieving 4D segmentation. Furthermore, we sample observations for training in a temporally ordered manner, enabling the streaming propagation of features over time and effectively avoiding local minima during optimization. Experimental results on several datasets show that the reconstruction quality of our method outperforms recent methods by a large margin.
{"title":"Split4D: Decomposed 4D Scene Reconstruction Without Video Segmentation","authors":"Yongzhen Hu, Yihui Yang, Haotong Lin, Yifan Wang, Junting Dong, Yifu Deng, Xinyu Zhu, Fan Jia, Hujun Bao, Xiaowei Zhou, Sida Peng","doi":"10.1145/3763343","DOIUrl":"https://doi.org/10.1145/3763343","url":null,"abstract":"This paper addresses the problem of decomposed 4D scene reconstruction from multi-view videos. Recent methods achieve this by lifting video segmentation results to a 4D representation through differentiable rendering techniques. Therefore, they heavily rely on the quality of video segmentation maps, which are often unstable, leading to unreliable reconstruction results. To overcome this challenge, our key idea is to represent the decomposed 4D scene with the Freetime FeatureGS and design a streaming feature learning strategy to accurately recover it from per-image segmentation maps, eliminating the need for video segmentation. Freetime FeatureGS models the dynamic scene as a set of Gaussian primitives with learnable features and linear motion ability, allowing them to move to neighboring regions over time. We apply a contrastive loss to Freetime FeatureGS, forcing primitive features to be close or far apart based on whether their projections belong to the same instance in the 2D segmentation map. As our Gaussian primitives can move across time, it naturally extends the feature learning to the temporal dimension, achieving 4D segmentation. Furthermore, we sample observations for training in a temporally ordered manner, enabling the streaming propagation of features over time and effectively avoiding local minima during the optimization process. Experimental results on several datasets show that the reconstruction quality of our method outperforms recent methods by a large margin.","PeriodicalId":50913,"journal":{"name":"ACM Transactions on Graphics","volume":"4 1","pages":""},"PeriodicalIF":6.2,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145673774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}