Pub Date: 2026-01-14, DOI: 10.1109/TVCG.2026.3653265
Antonia Saske, Laura Koesten, Torsten Möller, Judith Staudner, Sylvia Kritzinger
How audiences read, interpret, and critique data visualizations is mainly assessed through performance tests featuring tasks like value retrieval. Yet, other factors shown to shape visualization understanding, such as numeracy, graph familiarity, and aesthetic perception, remain underrepresented in existing instruments. To address this, we design and test a Multidimensional Assessment Method for Visualization Understanding (MdamV). This method integrates task-based measures with self-perceived ability ratings and open-ended critique, applied directly to the visualizations being read. Grounded in learning sciences frameworks that view understanding as a multifaceted process, MdamV spans six dimensions: Comprehending, Decoding, Aestheticizing, Critiquing, Reading, and Contextualizing. Validation was supported by a survey (N=438) representative of Austria's population (ages 18-74, male/female split), using a line chart and a bar chart on climate data. Findings show, for example, that about a quarter of respondents indicated deficits in comprehending simple data units, roughly one in five felt unfamiliar with each chart type, and self-assessed numeracy was significantly related to data reading performance (p=0.0004). Overall, the evaluation of MdamV demonstrates the value of assessing visualization understanding beyond performance, framing it as a situated process tied to particular visualizations.
Title: A Multidimensional Assessment Method for Visualization Understanding (MdamV).
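As a concrete illustration of the kind of relation reported in this abstract (self-assessed numeracy vs. data reading performance), the snippet below runs a rank correlation on synthetic ratings and scores. The data, variable names, and the choice of Spearman's test are our own assumptions for illustration; the paper's actual analysis may differ.

```python
import numpy as np
from scipy import stats

# Hypothetical illustration: relate a self-assessed ability rating to a task
# score, in the spirit of the reported numeracy/reading relation. The data
# and the test choice are ours, not the paper's analysis.
rng = np.random.default_rng(0)
self_numeracy = rng.integers(1, 6, size=438)                  # 1-5 Likert ratings
reading_score = 0.5 * self_numeracy + rng.normal(0, 1, 438)   # task performance

rho, p = stats.spearmanr(self_numeracy, reading_score)
print(f"Spearman rho={rho:.2f}, p={p:.4g}")
```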
Pub Date: 2026-01-14, DOI: 10.1109/TVCG.2026.3653317
Damien Rohmer, Karim Salem, Niranjan Kalyanasundaram, Victor Zordan
We present an extension to traditional rig skinning, such as Linear Blend Skinning (LBS), that produces secondary motions exhibiting the appearance of physical phenomena without the need for simulation. At the core of the technique, which we call dynamic skinning, is a set of deformers that offset the positions of individual vertices as a function of position derivatives and time. Examples of such deformers create effects such as oscillation in response to movement and the appearance of wave propagation, among others. Because the technique computes offsets directly and does not solve physics equations, it is extremely fast to compute. It also boasts a high degree of customizability, which supports a desirable artist workflow and a fine level of control. Finally, we showcase the technique in a number of scenarios and make comparisons with the state of the art.
Title: Dynamic Skinning: Kinematics-Driven Cartoon Effects for Articulated Characters.
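To make the idea of kinematics-driven deformers more concrete, here is a minimal sketch of one such effect: a per-vertex damped oscillator excited by the skinned motion's derivatives, added on top of LBS positions. The formulation and all parameter names are our own illustrative assumptions, not the paper's exact deformers.

```python
import numpy as np

def jiggle_offsets(lbs_positions, dt, stiffness=40.0, damping=4.0, gain=0.02):
    """Minimal sketch of a kinematics-driven secondary-motion deformer
    (illustrative only, not the paper's formulation).

    Each vertex carries a virtual offset x that behaves like a damped
    oscillator excited by that vertex's acceleration, so the mesh appears
    to overshoot and settle after fast movements -- without solving any
    physics system over the mesh.

    lbs_positions : (T, V, 3) array of skinned vertex positions per frame.
    Returns a (T, V, 3) array of positions with the secondary offset added.
    """
    T, V, _ = lbs_positions.shape
    vel = np.gradient(lbs_positions, dt, axis=0)   # first position derivative
    acc = np.gradient(vel, dt, axis=0)             # second position derivative
    x = np.zeros((V, 3))                           # oscillator displacement
    v = np.zeros((V, 3))                           # oscillator velocity
    out = lbs_positions.copy()
    for t in range(T):
        # Damped oscillator driven by the negative skinning acceleration,
        # integrated with semi-implicit Euler (cheap and stable enough here).
        a = -gain * acc[t] - stiffness * x - damping * v
        v += a * dt
        x += v * dt
        out[t] += x
    return out
```

Because the offsets are pure functions of the kinematic trajectory, the effect can be evaluated per frame in parallel with the rest of the rig, which is what makes this class of deformer cheap compared to a simulated counterpart.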
Pub Date: 2026-01-12, DOI: 10.1109/TVCG.2026.3651640
Chuhang Ma, Shuai Tan, Ye Pan, Jiaolong Yang, Xin Tong
Most current audio-driven facial animation research primarily focuses on generating videos with neutral emotions. While some studies have addressed the generation of facial videos driven by emotional audio, efficiently generating high-quality talking head videos that integrate both emotional expressions and style features remains a significant challenge. In this paper, we propose ESGaussianFace, an innovative framework for emotional and stylized audio-driven facial animation. Our approach leverages 3D Gaussian Splatting to reconstruct 3D scenes and render videos, ensuring efficient generation of 3D-consistent results. We propose an emotion-audio-guided spatial attention method that effectively integrates emotion features with audio content features. Through emotion-guided attention, the model is able to reconstruct facial details across different emotional states more accurately. To achieve emotional and stylized deformations of the 3D Gaussian points through emotion and style features, we introduce two 3D Gaussian deformation predictors. Furthermore, we propose a multi-stage training strategy, enabling the step-by-step learning of the character's lip movements, emotional variations, and style features. Our generated results exhibit high efficiency, high quality, and 3D consistency. Extensive experimental results demonstrate that our method outperforms existing state-of-the-art techniques in terms of lip movement accuracy, expression variation, and style feature expressiveness.
Title: ESGaussianFace: Emotional and Stylized Audio-Driven Facial Animation Via 3D Gaussian Splatting.
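The emotion-audio-guided attention idea can be pictured as a generic cross-attention fusion layer in which audio content features attend to emotion tokens. The module below is an illustrative stand-in under that assumption; the class name, dimensions, and residual design are ours, not the paper's architecture.

```python
import torch
import torch.nn as nn

class EmotionAudioAttention(nn.Module):
    """Illustrative sketch of an emotion-conditioned audio fusion block.

    Audio content features attend to an emotion embedding, and the attended
    emotion signal is added back to the audio stream, so downstream layers
    see lip-sync features modulated by the target emotion.
    """

    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, audio_feats, emotion_embed):
        # audio_feats:   (B, T, dim)  per-frame audio content features
        # emotion_embed: (B, E, dim)  one or more emotion tokens
        attended, _ = self.attn(query=audio_feats,
                                key=emotion_embed,
                                value=emotion_embed)
        return self.norm(audio_feats + attended)   # residual fusion

# Usage with random tensors:
fusion = EmotionAudioAttention()
audio = torch.randn(2, 100, 256)     # 100 audio frames per clip
emotion = torch.randn(2, 1, 256)     # a single emotion token per clip
out = fusion(audio, emotion)         # (2, 100, 256)
```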
Pub Date: 2026-01-12, DOI: 10.1109/TVCG.2025.3644697
Hanqing Jiang, Xiaojun Xiang, Han Sun, Hongjie Li, Liyang Zhou, Xiaoyu Zhang, Guofeng Zhang
3D Gaussian Splatting (3DGS) has recently attracted wide attention in various areas such as 3D navigation, Virtual Reality (VR), and 3D simulation, due to its photorealistic and efficient rendering performance. High-quality reconstruction with 3DGS relies on a sufficient number of splats and a reasonable distribution of these splats to fit real geometric surfaces and texture details, which turns out to be a challenging problem. We present GeoTexDensifier, a novel geometry-texture-aware densification strategy to reconstruct high-quality Gaussian splats that better comply with the geometric structure and texture richness of the scene. Specifically, our GeoTexDensifier framework carries out an auxiliary texture-aware densification method to produce a denser distribution of splats in fully textured areas, while keeping sparsity in low-texture regions to maintain the quality of the Gaussian point cloud. Meanwhile, a geometry-aware splitting strategy takes depth and normal priors to guide the splitting sampling and filters out noisy splats whose initial positions are far from the actual geometric surfaces they aim to fit, under a Validation of Depth Ratio Change check. With the help of a relative monocular depth prior, such geometry-aware validation can effectively reduce the influence of scattered Gaussians on the final rendering quality, especially in regions with weak textures or without sufficient training views. The texture-aware densification and geometry-aware splitting strategies are fully combined to obtain a set of high-quality Gaussian splats. We evaluate our GeoTexDensifier framework on various datasets and compare our Novel View Synthesis results to other state-of-the-art 3DGS approaches, with detailed quantitative and qualitative evaluations to demonstrate the effectiveness of our method in producing more photorealistic 3DGS models.
Title: GeoTexDensifier: Geometry-Texture-Aware Densification for High-Quality Photorealistic 3D Gaussian Splatting.
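A rough sketch of the two decisions described above, texture-aware densification and geometry-aware validation against a depth prior, is given below. The scoring, thresholds, and the median-ratio check are illustrative assumptions rather than the paper's exact Validation of Depth Ratio Change criterion.

```python
import numpy as np

def texture_aware_split_mask(texture_score, opacity, depth, prior_depth,
                             tex_thresh=0.15, ratio_tol=0.2):
    """Illustrative selection of splats to densify (made-up thresholds).

    texture_score : (N,) per-splat texture richness, e.g. local image
                    gradient magnitude sampled at the splat's projection.
    opacity       : (N,) per-splat opacity.
    depth         : (N,) rendered depth of each candidate splat.
    prior_depth   : (N,) relative monocular depth prior at the same pixels.
    Returns a boolean mask of splats selected for splitting.
    """
    # Texture-aware part: only densify where there is texture to explain.
    textured = texture_score > tex_thresh
    # Geometry-aware part: a crude depth-ratio style check that discards
    # splats floating far from the surface suggested by the depth prior.
    ratio = depth / np.maximum(prior_depth, 1e-6)
    consistent = np.abs(ratio / np.median(ratio) - 1.0) < ratio_tol
    return textured & consistent & (opacity > 0.1)
```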
Pub Date: 2026-01-12, DOI: 10.1109/TVCG.2026.3651382
Jia-Qi Zhang, Jia-Jun Wang, Fang-Lue Zhang, Miao Wang
The growing demand for diverse and realistic character animations in video games and films has driven the development of natural language-controlled motion generation systems. While recent advances in text-driven 3D human motion synthesis have made significant progress, generating realistic multi-person interactions remains a major challenge. Existing methods, such as denoising diffusion models and autoregressive frameworks, have explored interaction dynamics using attention mechanisms and causal modeling. However, they consistently overlook a critical physical constraint: the explicit spatial distance between interacting body parts, which is essential for producing semantically accurate and physically plausible interactions. To address this limitation, we propose InterDist, a novel masked generative Transformer model operating in a discrete state space. Our key idea is to decompose two-person motion into three components: two independent, interaction-agnostic single-person motion sequences and a separate interaction distance sequence. This formulation enables direct learning of both individual motion and dynamic spatial relationships from text prompts. We implement this via a VQ-VAE that jointly encodes independent motions and relative distances into discrete codebooks, followed by a bidirectional masked generative Transformer that models their joint distribution conditioned on text. To better align motion and language, we also introduce a cross-modal interaction module to enhance text-motion association. Our approach ensures that the generated motions both align semantically with their textual descriptions and preserve plausible inter-character distances, setting a new benchmark for text-driven multi-person interaction generation.
Title: Generating Distance-Aware Human-to-Human Interaction Motions From Text Guidance.
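The separate interaction distance sequence in the decomposition above can be illustrated as a per-frame table of distances between selected joint pairs of the two characters. The function below is a plausible sketch of that signal; the joint indices and the choice of pairs are hypothetical and depend on the skeleton definition.

```python
import numpy as np

def interaction_distance_sequence(motion_a, motion_b, joint_pairs):
    """Illustrative computation of an interaction-distance sequence
    (a plausible reading of the decomposition above, not the paper's code).

    motion_a, motion_b : (T, J, 3) joint positions of the two characters.
    joint_pairs        : list of (joint_in_a, joint_in_b) index pairs whose
                         spatial distance should be tracked, e.g. the hands.
    Returns a (T, len(joint_pairs)) array of per-frame distances that can be
    tokenized alongside the two single-person motion sequences.
    """
    dists = []
    for ja, jb in joint_pairs:
        diff = motion_a[:, ja, :] - motion_b[:, jb, :]
        dists.append(np.linalg.norm(diff, axis=-1))
    return np.stack(dists, axis=-1)

# Example: track two hand-to-hand distances (hypothetical joint indices).
T, J = 120, 22
a, b = np.random.rand(T, J, 3), np.random.rand(T, J, 3)
d = interaction_distance_sequence(a, b, [(20, 20), (20, 21)])  # (120, 2)
```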
Pub Date: 2026-01-12, DOI: 10.1109/TVCG.2025.3634845
Julio Rey Ramirez, Peter Rautek, Tobias Günther, Markus Hadwiger
The detection and analysis of features in fluid flow are important tasks in fluid mechanics and flow visualization. One recent class of methods to approach this problem is to first compute objective optimal reference frames, relative to which the input vector field becomes as steady as possible. However, existing methods either optimize locally over a fixed neighborhood, which might not match the extent of interesting features well, or perform global optimization, which is costly. We propose a novel objective method for the computation of optimal reference frames that automatically adapts to the flow field locally, without having to choose neighborhoods a priori. We enable adaptivity by formulating this problem as a moving least squares approximation, through which we determine a continuous field of reference frames. To incorporate fluid features into the computation of the reference frame field, we introduce the use of a scalar guidance field into the moving least squares approximation. The guidance field determines a curved manifold on which a regularly sampled input vector field becomes a set of irregularly spaced samples, which then forms the input to the moving least squares approximation. Although the guidance field can be any scalar field, by using a field that corresponds to flow features the resulting reference frame field will adapt accordingly. We show that using an FTLE field as the guidance field results in a reference frame field that adapts better to local features in the flow than prior work. However, our moving least squares framework is formulated in a very general way, and therefore other types of guidance fields could be used in the future to adapt to local fluid features.
Title: Locally Adapted Reference Frame Fields using Moving Least Squares.
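For readers unfamiliar with the building block, here is a textbook-style moving least squares approximation with Gaussian weights over irregularly spaced vector samples, the kind of local weighted fit the method builds on. It is not the authors' reference-frame solver; the basis, kernel width, and example field are our assumptions.

```python
import numpy as np

def mls_value(query, sample_pts, sample_vals, h=0.5):
    """Generic moving least squares approximation with Gaussian weights.

    Fits a local linear model  f(x) ~ c0 + C @ (x - query)  to irregularly
    spaced vector samples (sample_vals has shape (N, k)), weighting each
    sample by its distance to the query point, and returns the value of the
    fit at the query.
    """
    d = sample_pts - query                             # (N, dim) offsets
    w = np.exp(-np.sum(d * d, axis=1) / h**2)          # Gaussian weights
    basis = np.hstack([np.ones((len(d), 1)), d])       # [1, dx, dy, ...]
    sw = np.sqrt(w)[:, None]                           # weighted least squares
    coeffs, *_ = np.linalg.lstsq(basis * sw, sample_vals * sw, rcond=None)
    return coeffs[0]                                   # fitted value at query

# Example: approximate a 2D vector field from scattered samples.
pts = np.random.rand(200, 2)
vals = np.stack([-pts[:, 1], pts[:, 0]], axis=1)       # rigid rotation field
print(mls_value(np.array([0.5, 0.5]), pts, vals, h=0.2))  # ~ (-0.5, 0.5)
```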
Pub Date: 2026-01-12, DOI: 10.1109/TVCG.2026.3652452
Dong-Yang Li, Yi-Long Liu, Zi-Xian Liu, Yan-Pei Cao, Meng-Hao Guo, Shi-Min Hu
Creating detailed 3D characters from a single image remains challenging due to the difficulty in separating semantic components during generation. Existing methods often produce entangled meshes with poor topology, hindering downstream applications like rigging and animation. We introduce SeparateGen, a novel framework that generates high-quality 3D characters by explicitly reconstructing them as distinct semantic components (e.g., body, clothing, hair, shoes) from a single, arbitrary-pose image. SeparateGen first leverages a multi-view diffusion model to generate consistent multi-view images in a canonical A-pose. Then, a novel component-aware reconstruction model, SC-LRM, conditioned on these multi-view images, adaptively decomposes and reconstructs each component with high fidelity. To train and evaluate SeparateGen, we contribute SC-Anime, the first large-scale dataset of 7,580 anime-style 3D characters with detailed component-level annotations. Extensive experiments demonstrate that SeparateGen significantly outperforms state-of-the-art methods in both reconstruction quality and multi-view consistency. Furthermore, our component-based approach effectively resolves mesh entanglement issues, enabling seamless rigging and asset reuse. SeparateGen thus represents a step towards generating high-quality, application-ready 3D characters from a single image. The SC-Anime dataset and our code will be publicly released.
Title: SeparateGen: Semantic Component-based 3D Character Generation from Single Images.
Pub Date: 2026-01-12, DOI: 10.1109/TVCG.2026.3650881
Annan Zhou, Li Wang, Jian Li, Jing Huang, Li Li, Jian Yao
3D Gaussian Splatting (3DGS) has shown great promise in a variety of applications due to its exceptional real-time rendering quality and explicit representation, leading to numerous improvements across various fields. However, existing methods lack consideration of main objects and important structural information in their overall optimization strategies. This results in blurring of main objects in adaptive rendering and the loss of high-frequency details on targets that are insufficiently captured. In this work, we introduce a semantic-guided 3DGS method with adaptive rendering, which optimizes important structures through the guidance of boundary Gaussians, while leveraging semantic features to enhance the rendering of main objects. Experiments show that the proposed semantic-guided method can enhance important structures and high-frequency information in corner regions without significantly increasing the total number of Gaussians. This method also improves the separability between objects. At the same time, a semantic-guided Level-of-Detail (LoD) rendering approach enables the rapid display of main targets and the rendering of a complete scene. The proposed semantic-guided methodology is compatible with a range of existing techniques. The code, more experimental results, and an online demo will be available at https://zhouannan.github.io/SGGS/.
Title: SGGS: Semantic-Guided 3D Gaussian Splatting With Adaptive Rendering.
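One way to picture semantic-guided LoD rendering is a budgeted splat selection that draws main-object splats first and fills the remainder by importance. The toy function below illustrates that prioritization under our own assumptions; the paper's LoD scheme is more involved.

```python
import numpy as np

def semantic_lod_selection(labels, importance, main_classes, budget):
    """Toy sketch of semantic-guided LoD selection (our reading of the idea
    above, not the paper's actual scheme).

    Splats belonging to 'main object' semantic classes are drawn first; the
    remaining budget is filled with the most important background splats, so
    the main targets appear quickly while the full scene streams in.
    """
    labels = np.asarray(labels)
    importance = np.asarray(importance)        # e.g. opacity * projected size
    main = np.flatnonzero(np.isin(labels, list(main_classes)))
    rest = np.setdiff1d(np.arange(len(labels)), main)
    main = main[np.argsort(-importance[main])]  # sort each group by importance
    rest = rest[np.argsort(-importance[rest])]
    order = np.concatenate([main, rest])
    return order[:budget]                       # splat indices for this frame
```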
Pub Date: 2026-01-12, DOI: 10.1109/TVCG.2026.3652905
Rostyslav Hnatyshyn, Danny Perez, Gerik Scheuermann, Ross Maciejewski, Baldwin Nsonga
Contemporary materials science research is heavily conducted in silico, involving massive simulations of the atomic-scale evolution of materials. Cataloging basic patterns in the atomic displacements is key to understanding and predicting the evolution of physical properties. However, the combinatorial complexity of the space of possible transitions, coupled with the overwhelming amount of data being produced by high-throughput simulations, makes such an analysis extremely challenging and time-consuming for domain experts. The development of visual analytics systems that facilitate the exploration of simulation data is an active field of research. While these systems excel in identifying temporal regions of interest, they treat each timestep of a simulation as an independent event without considering the behavior of the atomic displacements between timesteps. We address this gap by introducing LAMDA, a visual analytics system that allows domain experts to quickly and systematically explore state-to-state transitions. In LAMDA, transitions are hierarchically categorized, providing a basis for cataloging displacement behavior as well as enabling the analysis of simulations at different resolutions, ranging from very broad qualitative classes of transitions to very narrow definitions of unit processes. LAMDA supports navigating the hierarchy of transitions, enabling scientists to visualize the commonalities between transitions in each class in terms of invariant features characterizing local atomic environments, and it simplifies the analysis by capturing user input through annotations. We evaluate our system through a case study and report on findings from our domain experts.
Title: LAMDA: Aiding Visual Exploration of Atomic Displacements in Molecular Dynamics Simulations.
Pub Date: 2025-12-31, DOI: 10.1109/TVCG.2025.3649986
Zheng Wang, Chang Li, Hua Wang, Dong Chen, Shuo He, Yingcai Wu, Mingliang Xu
Crowd simulation plays a crucial role in various domains, including entertainment, urban planning, and safety assessment. Data-driven methods offer significant advantages in simulating natural and diverse crowd behaviors, enabling highly realistic simulations. However, existing methods often face challenges due to incomplete trajectory data and limited generalization to unfamiliar scenarios. To address these limitations, we propose a novel crowd simulation framework based on the collaboration of a small model and a large model. Inspired by the dual-process decision-making mechanism in cognitive psychology, this framework enables efficient handling of familiar scenarios while leveraging the reasoning capabilities of large models in complex or unfamiliar environments. The small model, responsible for generating fast and reactive behaviors, is trained on real-world incomplete trajectory data to learn movement patterns. The large model, which performs simulation correction to refine failed behaviors, leverages past successful and failed experiences to enhance behavior generation in complex scenarios. Experimental results demonstrate that our framework significantly improves simulation accuracy in the presence of missing trajectory segments and enhances cross-scene generalization.
Title: Collaborative Small and Large Models for Crowd Simulation with Incomplete Trajectory Data.
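The dual-process collaboration can be pictured as a routing loop: a fast small-model proposal per agent, with a fall-back to the large model whenever the proposal fails a plausibility check. The skeleton below sketches that control flow; the function names, the Agent fields, and the plausibility check are placeholders we introduce, not the paper's interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Agent:
    position: Tuple[float, float]
    goal: Tuple[float, float]

def simulate_step(agents: List[Agent],
                  small_model: Callable[[Agent, List[Agent]], Tuple[float, float]],
                  large_model: Callable[[Agent, List[Agent]], Tuple[float, float]],
                  is_plausible: Callable[[Agent, Tuple[float, float]], bool]):
    """Sketch of the dual-process collaboration loop described above.
    `small_model`, `large_model`, and `is_plausible` are stand-ins supplied
    by the caller (e.g., a learned policy, an LLM-backed reasoner, and a
    collision/goal check); none of these names come from the paper.
    """
    actions = []
    for agent in agents:
        # Fast path: the small model proposes a reactive action.
        action = small_model(agent, agents)
        # Slow path: fall back to the large model when the proposal fails a
        # plausibility check (unfamiliar scenario, collision, dead end).
        if not is_plausible(agent, action):
            action = large_model(agent, agents)
        actions.append(action)
    return actions
```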