GNN101: Visual Learning of Graph Neural Networks in Your Web Browser
Pub Date: 2026-02-01. DOI: 10.1109/TVCG.2025.3634087
Yilin Lu, Chongwei Chen, Yuxin Chen, Kexin Huang, Marinka Zitnik, Qianwen Wang
Graph Neural Networks (GNNs) have achieved significant success across various applications. However, their complex structures and inner workings can be challenging for non-AI experts to understand. To address this issue, this study presents GNN101, an educational visualization tool for interactive learning of GNNs. GNN101 introduces a set of animated visualizations that seamlessly integrate mathematical formulas with visual representations across multiple levels of abstraction, including a model overview, layer operations, and detailed calculations. Users can easily switch between two complementary views: a node-link view that offers an intuitive understanding of the graph data, and a matrix view that provides a space-efficient, comprehensive overview of all features and their transformations across layers. GNN101 was designed and developed in close collaboration with four GNN experts and through deployment in three GNN-related courses. We demonstrated the usability and effectiveness of GNN101 via use cases and user studies with both GNN teaching assistants and students. To ensure broad educational access, GNN101 is built with modern web technologies and runs directly in web browsers without requiring any installation.
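For readers unfamiliar with the layer operations that tools like GNN101 animate, the sketch below shows a single GCN-style aggregation step in plain NumPy. It is a generic illustration of the kind of computation such a tool visualizes, not code from GNN101 itself; the adjacency, feature, and weight matrices are made-up toy data.

```python
import numpy as np

# Toy graph: 4 nodes, undirected edges 0-1, 1-2, 2-3, with 3-dim node features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.rand(4, 3)          # node feature matrix
W = np.random.rand(3, 2)          # layer weights (random here, learned in practice)

# GCN-style aggregation: add self-loops, symmetrically normalize, transform, ReLU.
A_hat = A + np.eye(4)                                   # self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))  # D^{-1/2}
X_next = np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0)

print(X_next)   # one layer's output: each node mixes its neighbors' features
```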
{"title":"GNN101: Visual Learning of Graph Neural Networks in Your Web Browser.","authors":"Yilin Lu, Chongwei Chen, Yuxin Chen, Kexin Huang, Marinka Zitnik, Qianwen Wang","doi":"10.1109/TVCG.2025.3634087","DOIUrl":"10.1109/TVCG.2025.3634087","url":null,"abstract":"<p><p>Graph Neural Networks (GNNs) have achieved significant success across various applications. However, their complex structures and inner workings can be challenging for non-AI experts to understand. To address this issue, this study presents GNN101, an educational visualization tool for interactive learning of GNNs. GNN101 introduces a set of animated visualizations that seamlessly integrates mathematical formulas with visualizations via multiple levels of abstraction, including a model overview, layer operations, and detailed calculations. Users can easily switch between two complementary views: a node-link view that offers an intuitive understanding of the graph data, and a matrix view that provides a space-efficient and comprehensive overview of all features and their transformations across layers. GNN101 was designed and developed based on close collaboration with four GNN experts and deployment in three GNN-related courses. We demonstrated the usability and effectiveness of GNN101 via use cases and user studies with both GNN teaching assistants and students. To ensure broad educational access, GNN101 is developed through modern web technologies and available directly in web browsers without requiring any installations.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":"1793-1805"},"PeriodicalIF":6.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145574537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Image Based Whole Sky Cloud Volume Generation
Pub Date: 2026-02-01. DOI: 10.1109/TVCG.2025.3641982
Pinar Satilmis, Kurt Debattista, Thomas Bashford-Rogers
Accurate illumination is crucial for many imaging and vision applications, and skies are the dominant source of lighting in many scenes. Most existing work on representing sky illumination has focused on clear skies or, more recently, on generative approaches for synthesizing clouds. However, these approaches are limited: they assume distant illumination and do not capture the 3D properties of clouds. This paper presents a novel and principled approach for extracting 3D whole-sky volumetric representations of clouds that can be used in imaging applications. Our approach extracts clouds from a single fisheye capture of the sky via an iterative optimization process. We achieve this by exploiting the physical properties of light scattering in clouds and using them to drive a domain-specific light transport simulation algorithm that renders the images required for optimization. Re-rendering with our reconstructed clouds closely matches real captures, and the method also enables novel uses of environment maps, such as including captured clouds in renderings, casting cloud shadows, and producing more accurate aerial perspective and lighting.
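The abstract describes recovering a cloud volume by iteratively optimizing it until a physically based rendering matches the captured image. The toy sketch below illustrates that inverse-rendering loop in one dimension only: it fits a per-cell extinction profile so that simulated Beer-Lambert transmittance matches an "observed" value. All quantities are assumed for illustration; this is not the paper's light-transport algorithm, which handles full 3D scattering.

```python
import numpy as np

# "Observed" transmittance along one view ray (made-up target value).
T_obs = 0.35
n_cells, dx = 16, 0.1                      # cells along the ray, cell length
sigma = np.full(n_cells, 0.5)              # initial extinction coefficients

lr = 0.5
for step in range(200):
    T = np.exp(-np.sum(sigma * dx))        # Beer-Lambert transmittance
    # Analytic gradient of the squared error: dT/dsigma_i = -dx * T
    grad = 2.0 * (T - T_obs) * (-dx * T)
    sigma = np.clip(sigma - lr * grad, 0.0, None)   # keep extinction non-negative

print(np.exp(-np.sum(sigma * dx)))         # converges toward T_obs (~0.35)
```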
{"title":"Image Based Whole Sky Cloud Volume Generation.","authors":"Pinar Satilmis, Kurt Debattista, Thomas Bashford-Rogers","doi":"10.1109/TVCG.2025.3641982","DOIUrl":"10.1109/TVCG.2025.3641982","url":null,"abstract":"<p><p>Accurate illumination is crucial for many imaging and vision applications, and skies are the dominant source of lighting in many scenes. Most existing work for representing sky illumination has focused on clear skies or more recently generative approaches for synthesizing clouds. However, these are very limited in that they assume distant illumination and do not capture the 3D properties of clouds. This paper presents a novel and principled approach to extract 3D whole-sky volumetric representations of clouds which can be used for imaging applications. Our approach extracts clouds from a single fisheye capture of the sky via an iterative optimization process. We achieve this by exploiting the physical properties of light scattering in clouds and use these to drive a domain-specific light transport simulation algorithm to render the images required for optimization. Results for this method provide high accuracy when re-rendering with our reconstructed clouds compared to real captures, and also enable novel uses of environment maps such as inclusion of captured clouds in renderings, cloud shadows, and more accurate aerial perspective and lighting.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":"1743-1753"},"PeriodicalIF":6.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145727800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning-Based Recommendations for Efficient Urban Visual Query
Pub Date: 2026-02-01. DOI: 10.1109/TVCG.2025.3625071
Ziliang Wu, Wei Chen, Xiangyang Wu, Zihan Zhou, Yingchaojie Feng, Junhua Lu, Zhiguang Zhou, Mingliang Xu
Urban visual querying leverages visual representations and interactions to depict the domain of interest and express related requests for exploring complex datasets, usually in an iterative process. One main challenge of this process is the vast search space involved in identifying querying conditions, observing querying results, and formulating subsequent queries. This paper proposes a novel acceleration scheme that intelligently recommends a small set of querying results conditioned on previous queries. Central to our approach is a reinforcement learning agent trained by simulating user behavior and characterizing the search space. We additionally propose a mixed-initiative urban visual query scheme to enhance the exploration process. We evaluate our approach with qualitative and quantitative experiments on a real-world scenario. The experimental results demonstrate its ability to reduce user workload, optimize querying, and improve analysis efficiency.
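The core idea is a reinforcement-learning agent that learns, from simulated user behavior, which query results to recommend next. The sketch below is a minimal tabular Q-learning loop over a made-up discrete space of query conditions with a hypothetical acceptance-probability simulator; it only illustrates the training pattern, not the paper's agent, state encoding, or reward design.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 8, 8        # toy query contexts / candidate results
Q = np.zeros((n_states, n_actions))
# Hypothetical simulator: probability that a user accepts result a in context s.
accept_prob = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

alpha, gamma, eps = 0.1, 0.9, 0.2
state = 0
for step in range(5000):
    # Epsilon-greedy recommendation.
    action = rng.integers(n_actions) if rng.random() < eps else int(Q[state].argmax())
    reward = float(rng.random() < accept_prob[state, action])   # simulated user feedback
    next_state = action                  # the accepted result defines the next query context
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

print(Q.argmax(axis=1))   # learned per-context recommendation policy
```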
{"title":"Learning-Based Recommendations for Efficient Urban Visual Query.","authors":"Ziliang Wu, Wei Chen, Xiangyang Wu, Zihan Zhou, Yingchaojie Feng, Junhua Lu, Zhiguang Zhou, Mingliang Xu","doi":"10.1109/TVCG.2025.3625071","DOIUrl":"10.1109/TVCG.2025.3625071","url":null,"abstract":"<p><p>Urban visual querying leverages visual representations and interactions to depict the domain of interest and express related requests for exploring complex datasets, which is usually an iterative process. One main challenge of this process is the vast search space in terms of identifying querying conditions, observing querying results, and making the subsequent queries. This paper proposes a novel acceleration scheme that intelligently recommends a small set of querying results subject to previous queries. Central to our approach is a reinforcement learning based approach that trains a recommendation agent by simulating user behavior and characterizing the search space. We propose a mixed-initiative urban visual query scheme to enhance the exploration process additionally. We evaluate our approach by performing qualitative and quantitative experiments on a real-world scenario. The experimental results demonstrate the capability of reducing user workload, achieving optimized querying, and improving analysis efficiency.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":"1963-1977"},"PeriodicalIF":6.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145357316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
OM4AnI: A Novel Overlap Measure for Anomaly Identification in Multi-Class Scatterplots
Pub Date: 2026-02-01. DOI: 10.1109/TVCG.2025.3642219
Liqun Liu, Leonid Bogachev, Mahdi Rezaei, Nishant Ravikumar, Arjun Khara, Mohsen Azarmi, Roy A Ruddle
Scatterplots are widely used across various domains to identify anomalies in datasets, particularly in multi-class settings, such as detecting misclassified or mislabeled data. However, scatterplot effectiveness often declines with large datasets due to limited display resolution. This paper introduces a novel Visual Quality Measure (VQM) - OM4AnI (Overlap Measure for Anomaly Identification) - which quantifies the degree of overlap relevant to identifying anomalies, helping users estimate how effectively anomalies can be observed in multi-class scatterplots. OM4AnI begins by computing an anomaly index based on each data point's position relative to its class cluster. The scatterplot is then discretized into a matrix representation by binning the display space into cell-level (pixel-level) grids and computing the coverage of each pixel. The coverage computation accounts for the anomaly index of the data points covering each pixel as well as visual features (marker shape, marker size, and rendering order). Building on this foundation, we sum the coverage information over all cells (pixels) of the matrix representation to obtain the final quality score with respect to anomaly identification. We evaluated the efficiency, effectiveness, and sensitivity of OM4AnI in comparison with six representative baseline methods operating at different computation granularity levels: data level, marker level, and pixel level. The results show that OM4AnI outperforms the baseline methods, exhibiting more monotonic trends against the ground truth and greater sensitivity to rendering order. This confirms that OM4AnI can inform users about how effectively their scatterplots support anomaly identification. Overall, OM4AnI shows strong potential as an evaluation metric and for optimizing scatterplots through automatic adjustment of visual parameters.
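To make the binning-and-summation pipeline concrete, the sketch below implements a heavily simplified version of the idea: an anomaly index from each point's distance to its class centroid, discretization of the plot area into a pixel grid, and a final score obtained by summing per-pixel coverage weighted by that index. The grid size, weighting, and index definition are assumptions for illustration only; the actual measure also models marker shape, marker size, and rendering order.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy 2-class scatterplot in normalized display coordinates [0, 1) x [0, 1).
points = rng.random((500, 2))
labels = rng.integers(0, 2, size=500)

# 1) Anomaly index: normalized distance of each point to its class centroid.
anomaly = np.empty(len(points))
for c in np.unique(labels):
    mask = labels == c
    d = np.linalg.norm(points[mask] - points[mask].mean(axis=0), axis=1)
    anomaly[mask] = d / d.max()

# 2) Discretize the display space into a pixel grid and accumulate coverage,
#    weighting each covered pixel by the covering point's anomaly index.
grid = 64
coverage = np.zeros((grid, grid))
ix = np.minimum((points[:, 0] * grid).astype(int), grid - 1)
iy = np.minimum((points[:, 1] * grid).astype(int), grid - 1)
np.add.at(coverage, (iy, ix), anomaly)

# 3) Final quality score: sum of the per-pixel coverage information.
print(coverage.sum())
```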
{"title":"OM4AnI: A Novel Overlap Measure for Anomaly Identification in Multi-Class Scatterplots.","authors":"Liqun Liu, Leonid Bogachev, Mahdi Rezaei, Nishant Ravikumar, Arjun Khara, Mohsen Azarmi, Roy A Ruddle","doi":"10.1109/TVCG.2025.3642219","DOIUrl":"10.1109/TVCG.2025.3642219","url":null,"abstract":"<p><p>Scatterplots are widely used across various domains to identify anomalies in datasets, particularly in multi-class settings, such as detecting misclassified or mislabeled data. However, scatterplot effectiveness often declines with large datasets due to limited display resolution. This paper introduces a novel Visual Quality Measure (VQM) - OM4AnI (Overlap Measure for Anomaly Identification) - which quantifies the degree of overlap for identifying anomalies, helping users estimate how effectively anomalies can be observed in multi-class scatterplots. OM4AnI begins by computing anomaly index based on each data point's position relative to its class cluster. The scatterplot is then discretized into a matrix representation by binning the display space into cell-level (pixel-level) grids and computing the coverage for each pixel. It takes into account the anomaly index of data points covering these pixels and visual features (marker shapes, marker sizes, and rendering orders). Building on this foundation, we sum all the coverage information in each cell (pixel) of matrix representation to obtain the final quality score with respect to anomaly identification. We conducted an evaluation to analyze the efficiency, effectiveness, sensitivity of OM4AnI in comparison with six representative baseline methods that are based on different computation granularity levels: data level, marker level, and pixel level. The results show that OM4AnI outperforms baseline methods by exhibiting more monotonic trends against the ground truth and greater sensitivity to rendering order, unlike the baseline methods. It confirms that OM4AnI can inform users about how effectively their scatterplots support anomaly identification. Overall, OM4AnI shows strong potential as an evaluation metric and for optimizing scatterplots through automatic adjustment of visual parameters.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":"1850-1863"},"PeriodicalIF":6.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145727824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Toward More Explainable Nonlinear Dimensionality Reduction: A Feature-Driven Interaction Approach
Pub Date: 2026-02-01. DOI: 10.1109/TVCG.2025.3622114
Aeri Cho, Hyeon Jeon, Kiroong Choe, Seokhyeon Park, Jinwook Seo
Nonlinear dimensionality reduction (NDR) techniques are widely used to visualize high-dimensional data. However, they often lack explainability, making it challenging for analysts to relate patterns in projections to original high-dimensional features. Existing interactive methods typically separate user interactions from the feature space, treating them primarily as post-hoc explanations rather than integrating them into the exploration process. This separation limits insight generation by restricting users' understanding of how features dynamically influence projections. To address this limitation, we propose a bidirectional interaction method that directly bridges the feature space and the projections. By allowing users to adjust feature weights, our approach enables intuitive exploration of how different features shape the embedding. We also define visual semantics to quantify projection changes, enabling structured pattern discovery through automated query-based interaction. To ensure responsiveness despite the computational complexity of NDR, we employ a neural network to approximate the projection process, enhancing scalability while maintaining accuracy. We evaluated our approach through quantitative analysis, assessing accuracy and scalability. A user study with a comprehensive visual interface and case studies demonstrated its effectiveness in supporting hypothesis generation and exploratory tasks with real-world data. The results confirmed that our approach supports diverse analytical scenarios and enhances users' ability to explore and interpret high-dimensional data through interactive exploration grounded in the feature space.
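The bidirectional idea (adjust feature weights, then observe how the projection responds) can be sketched in a few lines: scale each feature column by a user-chosen weight before running an off-the-shelf NDR method. The sketch below uses scikit-learn's t-SNE purely as a stand-in projector on made-up data; the paper additionally approximates the projection with a neural network for responsiveness and defines visual semantics to quantify projection changes, neither of which is shown here.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 10))             # toy high-dimensional data

def project(X, feature_weights, seed=0):
    """Re-project after scaling each feature by a user-chosen weight."""
    Xw = X * np.asarray(feature_weights)   # emphasize / de-emphasize features
    return TSNE(n_components=2, random_state=seed, init="pca").fit_transform(Xw)

baseline = project(X, np.ones(10))
emphasized = project(X, [3, 1, 1, 1, 1, 1, 1, 1, 1, 1])   # boost feature 0

# A crude change proxy; t-SNE embeddings are not aligned across runs, so this
# is only indicative of how strongly the weighting reshapes the layout.
print(np.linalg.norm(baseline - emphasized) / len(X))
```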
{"title":"Toward More Explainable Nonlinear Dimensionality Reduction: A Feature-Driven Interaction Approach.","authors":"Aeri Cho, Hyeon Jeon, Kiroong Choe, Seokhyeon Park, Jinwook Seo","doi":"10.1109/TVCG.2025.3622114","DOIUrl":"10.1109/TVCG.2025.3622114","url":null,"abstract":"<p><p>Nonlinear dimensionality reduction (NDR) techniques are widely used to visualize high-dimensional data. However, they often lack explainability, making it challenging for analysts to relate patterns in projections to original high-dimensional features. Existing interactive methods typically separate user interactions from the feature space, treating them primarily as post-hoc explanations rather than integrating them into the exploration process. This separation limits insight generation by restricting users' understanding of how features dynamically influence projections. To address this limitation, we propose a bidirectional interaction method that directly bridges the feature space and the projections. By allowing users to adjust feature weights, our approach enables intuitive exploration of how different features shape the embedding. We also define visual semantics to quantify projection changes, enabling structured pattern discovery through automated query-based interaction. To ensure responsiveness despite the computational complexity of NDR, we employ a neural network to approximate the projection process, enhancing scalability while maintaining accuracy. We evaluated our approach through quantitative analysis, assessing accuracy and scalability. A user study with a comprehensive visual interface and case studies demonstrated its effectiveness in supporting hypothesis generation and exploratory tasks with real-world data. The results confirmed that our approach supports diverse analytical scenarios and enhances users' ability to explore and interpret high-dimensional data through interactive exploration grounded in the feature space.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":"1835-1849"},"PeriodicalIF":6.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145310353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Toward More Intuitive VR Locomotion Techniques: How Locomotion Metaphors Shape Users' Mental Models
Pub Date: 2026-02-01. DOI: 10.1109/TVCG.2025.3630826
Lisa Marie Prinz, Tintu Mathew, Benjamin Weyers
Interface metaphors are thought to enable intuitive and effective interaction with a user interface by allowing users to draw on existing knowledge and reducing the need for instructions. This makes metaphors a promising candidate for driving the development of locomotion interfaces for virtual reality (VR). Since creating metaphoric interfaces can be difficult, it is important to analyze how typical locomotion metaphors can support intuitive interaction. We performed a qualitative online study to observe the effect of typical metaphors (Walking, Steering, Flying, and Teleportation) and the influence of perceived affordances and user background. Our analysis shows that users adapt the interface expectations induced by metaphors to fit the perceived affordances instead of changing them. Users' interests, age, education, and gender influenced their expectations regarding the VR locomotion interface. Our findings contribute to a better understanding of users' mental models of VR locomotion metaphors, which seems necessary for designing more intuitive locomotion.
{"title":"Toward More Intuitive VR Locomotion Techniques: How Locomotion Metaphors Shape Users' Mental Models.","authors":"Lisa Marie Prinz, Tintu Mathew, Benjamin Weyers","doi":"10.1109/TVCG.2025.3630826","DOIUrl":"10.1109/TVCG.2025.3630826","url":null,"abstract":"<p><p>Interface metaphors are thought to enable an intuitive and effective interaction with a user interface by allowing users to draw on existing knowledge and reducing the need for instructions. This makes metaphors a promising candidate to drive the development of locomotion interfaces for virtual reality (VR). Since creating metaphoric interfaces can be difficult, it is important to analyze how typical locomotion metaphors can support an intuitive interaction. We performed a qualitative online study to observe the effect of typical metaphors (Walking, Steering, Flying, and Teleportation) and the influence of perceived affordances and user background. Our analysis shows that users adapt the interface expectations induced by metaphors to fit the perceived affordances instead of changing them. Several interests, the age, education, and gender influenced the expectations regarding the VR locomotion interface. Our findings contribute to a better understanding of users' mental models of VR locomotion metaphors, which seem necessary for designing more intuitive locomotion.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":"1605-1621"},"PeriodicalIF":6.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145508585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Variational Mesh Offsetting by Smoothed Winding Number
Pub Date: 2026-02-01. DOI: 10.1109/TVCG.2025.3637845
Haoran Sun, Shuang Wu, Hujun Bao, Jin Huang
Surface mesh offsetting is a fundamental operation in various applications (e.g., shape modeling). Implicit methods that contour a volumetric distance field are robust at handling intersection defects, but it is challenging for them to apply shape control (e.g., preserving sharp features in the input shape) and to avoid undesired topology changes. Explicit methods, which move vertices towards the offset surface (with possible adaptivity), can address the above issues, but intersections are hard to avoid. To combine the advantages of both, we propose a variational framework that takes mesh vertex locations as variables while simultaneously involving a smooth winding-number field associated with the mesh. Under various shape regularizations (e.g., sharp feature preservation) formulated on the mesh, the objective function mainly requires that the input mesh lie on the offset contour of the field induced by the resulting mesh. Such a combination inherits the flexible shape regularization of explicit methods and, thanks to the field, significantly alleviates intersection issues. Moreover, the optimization problem is numerically friendly by virtue of the differentiability of the field with respect to the mesh vertices. Results show that we can offset a mesh while preserving sharp features of the original surface, restricting selected parts to quadric surfaces, and penalizing intersections.
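For reference, the standard (unsmoothed) generalized winding number of a query point $p$ with respect to a triangle mesh is the sum of signed solid angles subtended by its triangles; the abstract's smoothed field can be read as a regularized variant of this quantity, though the abstract does not detail how the smoothing is done. A common formulation, with the Van Oosterom-Strackee solid-angle formula, is:

```latex
w(p) \;=\; \frac{1}{4\pi} \sum_{t \in \mathcal{T}} \Omega_t(p),
\qquad
\tan\!\left(\frac{\Omega_t(p)}{2}\right)
  = \frac{a \cdot (b \times c)}
         {\lVert a\rVert\,\lVert b\rVert\,\lVert c\rVert
          + (a \cdot b)\,\lVert c\rVert
          + (a \cdot c)\,\lVert b\rVert
          + (b \cdot c)\,\lVert a\rVert},
```

where $a$, $b$, $c$ are the triangle's vertices expressed relative to $p$. Differentiability of such a field with respect to the mesh vertices is what makes the vertex positions convenient optimization variables.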
{"title":"Variational Mesh Offsetting by Smoothed Winding Number.","authors":"Haoran Sun, Shuang Wu, Hujun Bao, Jin Huang","doi":"10.1109/TVCG.2025.3637845","DOIUrl":"10.1109/TVCG.2025.3637845","url":null,"abstract":"<p><p>Surface mesh offsetting is a fundamental operation in various applications (e.g., shape modeling). Implicit methods that contour a volumetric distance field are robust at handling intersection defects, but it is challenging to apply shape control (e.g., preserving sharp features in the input shape) and to avoid undesired topology changes. Explicit methods, which move vertices towards the offset surface (with possible adaptivity), can address the above issues, but it is hard to avoid intersection issues. To combine the advantages of both, we propose a variational framework that takes mesh vertex locations as variables while simultaneously involving a smooth winding-number field associated with the mesh. Under various shape regularizations (e.g., sharp feature preservation) formulated on the mesh, the objective function mainly requires that the input mesh lie on the offset contour of the field induced by the resulting mesh. Such a combination inherits the ability to apply flexible shape regularizations from explicit methods and significantly alleviates intersection issues because of the field. Moreover, the optimization problem is numerically friendly by virtue of the differentiability of the field w.r.t. the mesh vertices. Results show that we can offset a mesh while preserving sharp features of the original surface, restricting selected parts to quadric surfaces and penalizing intersections.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":"1668-1681"},"PeriodicalIF":6.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145644100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning a Domain-Specialized Network for Light Field Spatial-Angular Super-Resolution
Pub Date: 2026-02-01. DOI: 10.1109/TVCG.2025.3644930
Yifan Mao, Xinpeng Huang, Yilei Chen, Deyang Liu, Ping An, Sanghoon Lee
Light field (LF) imaging is inherently constrained by the trade-off between spatial resolution and angular sampling density. To overcome this obstacle, spatial-angular super-resolution (SR) methods have been developed to achieve concurrent enhancement in both dimensions. Traditional spatial-angular SR methods treat spatial and angular SR as separate tasks, resulting in parameter redundancy and error accumulation. While recent end-to-end approaches attempt joint processing, their uniform treatment of these distinct problems overlooks critical domain-specific requirements. To address these challenges, we propose a domain-specialized framework that deploys stage-tailored strategies to satisfy domain-specific demands. Specifically, in the angular SR stage, we introduce a cross-view consistency modulation module that enhances inter-view coherence through long-range dependency modeling of angular features. In the spatial SR stage, we propose a detail-aware state space model to reconstruct fine-grained detail. Finally, we develop a cross-domain integration module that explores spatial-angular correlations by fusing multi-representational features from both domains to foster synergistic optimization. Experimental results on public LF datasets demonstrate substantial improvements over state-of-the-art methods in both qualitative and quantitative comparisons, with approximately 50% fewer model parameters compared to competing methods.
{"title":"Learning a Domain-Specialized Network for Light Field Spatial-Angular Super-Resolution.","authors":"Yifan Mao, Xinpeng Huang, Yilei Chen, Deyang Liu, Ping An, Sanghoon Lee","doi":"10.1109/TVCG.2025.3644930","DOIUrl":"10.1109/TVCG.2025.3644930","url":null,"abstract":"<p><p>Light field (LF) imaging is inherently constrained by the trade-off between spatial resolution and angular sampling density. To overcome this obstacle, spatial-angular super-resolution (SR) methods have been developed to achieve concurrent enhancement in both dimensions. Traditional spatial-angular SR methods treat spatial and angular SR as separate tasks, resulting in parameter redundancy and error accumulation. While recent end-to-end approaches attempt joint processing, their uniform treatment of these distinct problems overlooks critical domain-specific requirements. To address these challenges, we propose a domain-specialized framework that deploys stage-tailored strategies to satisfy domain-specific demands. Specifically, in the angular SR stage, we introduce a cross-view consistency modulation module that enhances inter-view coherence through long-range dependency modeling of angular features. In the spatial SR stage, we propose a detail-aware state space model to reconstruct fine-grained detail. Finally, we develop a cross-domain integration module that explores spatial-angular correlations by fusing multi-representational features from both domains to foster synergistic optimization. Experimental results on public LF datasets demonstrate substantial improvements over state-of-the-art methods in both qualitative and quantitative comparisons, with approximately 50% fewer model parameters compared to competing methods.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":"2127-2140"},"PeriodicalIF":6.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145776536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DEGS: Deformable Event-Based 3D Gaussian Splatting From RGB and Event Stream
Pub Date: 2026-02-01. DOI: 10.1109/TVCG.2025.3618768
Junhao He, Jiaxu Wang, Jia Li, Mingyuan Sun, Qiang Zhang, Jiahang Cao, Ziyi Zhang, Yi Gu, Jingkai Sun, Renjing Xu
Reconstructing Dynamic 3D Gaussian Splatting (3DGS) from low-framerate RGB videos is challenging because large inter-frame motions increase the uncertainty of the solution space: a pixel in the first frame has many more candidate correspondences in the second frame. Event cameras can asynchronously capture rapid visual changes and are robust to motion blur, but they do not provide color information. Intuitively, event trajectories can provide deterministic constraints on large inter-frame motion. Hence, combining low-temporal-resolution images with high-framerate event streams can address this challenge. However, jointly optimizing Dynamic 3DGS from both RGB and event modalities is difficult due to the significant discrepancy between the two data modalities. This paper introduces a novel framework that jointly optimizes dynamic 3DGS from the two modalities. The key idea is to adopt event motion priors to guide the optimization of the deformation fields. First, we extract the motion priors encoded in event streams with the proposed LoCM unsupervised fine-tuning framework, which adapts an event flow estimator to an unseen scene. Then, we present a geometry-aware data association method to build the event-Gaussian motion correspondence, the primary foundation of the pipeline, accompanied by two useful strategies: motion decomposition and inter-frame pseudo-labels. Extensive experiments show that our method outperforms existing image- and event-based approaches across synthetic and real scenes and demonstrate that it can effectively optimize dynamic 3DGS with the help of event data.
{"title":"DEGS: Deformable Event-Based 3D Gaussian Splatting From RGB and Event Stream.","authors":"Junhao He, Jiaxu Wang, Jia Li, Mingyuan Sun, Qiang Zhang, Jiahang Cao, Ziyi Zhang, Yi Gu, Jingkai Sun, Renjing Xu","doi":"10.1109/TVCG.2025.3618768","DOIUrl":"10.1109/TVCG.2025.3618768","url":null,"abstract":"<p><p>Reconstructing Dynamic 3D Gaussian Splatting (3DGS) from low-framerate RGB videos is challenging. This is because large inter-frame motions will increase the uncertainty of the solution space. For example, one pixel in the first frame might have more choices to reach the corresponding pixel in the second frame. Event cameras can asynchronously capture rapid visual changes and are robust to motion blur, but they do not provide color information. Intuitively, the event stream can provide deterministic constraints for the inter-frame large motion by the event trajectories. Hence, combining low-temporal resolution images with high-framerate event streams can address this challenge. However, it is challenging to jointly optimize Dynamic 3DGS using both RGB and event modalities due to the significant discrepancy between these two data modalities. This paper introduces a novel framework that jointly optimizes dynamic 3DGS from the two modalities. The key idea is to adopt event motion priors to guide the optimization of the deformation fields. First, we extract the motion priors encoded in event streams by using the proposed LoCM unsupervised fine-tuning framework to adapt an event flow estimator to a certain unseen scene. Then, we present the geometry-aware data association method to build the event-Gaussian motion correspondence, which is the primary foundation of the pipeline, accompanied by two useful strategies, namely motion decomposition and inter-frame pseudo-label. Extensive experiments show that our method outperforms existing image and event-based approaches across synthetic and real scenes and prove that our method can effectively optimize dynamic 3DGS with the help of event data.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":"1698-1712"},"PeriodicalIF":6.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145276920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weakly-Supervised Shape Multi-Completion of Point Clouds by Structural Decomposition
Pub Date: 2026-02-01. DOI: 10.1109/TVCG.2025.3636413
Changfeng Ma, Pengxiao Guo, Shuangyu Yang, Yuanqi Li, Jie Guo, Chongjun Wang, Yanwen Guo
Transforming partial point clouds into complete meshes remains challenging: current methods face issues such as data accessibility constraints, failure to preserve shape, and poor robustness on real-scan data. Drawing inspiration from the structural information of objects, we introduce an innovative weakly-supervised shape completion method that leverages structural decomposition and does not require SDFs during training. Representing objects as abstract structural frameworks plus part details, our method first predicts the structure of the input partial point cloud and then restores each component individually through part-decomposition completion and generation. The extracted part details are represented as images, which are porous and incomplete; we therefore employ a completion network to complete them. To generate multiple results, a diffusion-based generation network synthesizes a variety of details for the missing areas. The predicted structure and details are subsequently converted back into meshes, yielding the complete results. Since the details are depicted as images, our approach eliminates the need for SDFs during the training phase, achieving weak supervision. Extensive comparisons on both artificial and real-scan datasets demonstrate an average improvement of over 38.1% compared to the prior method and state-of-the-art performance.
{"title":"Weakly-Supervised Shape Multi-Completion of Point Clouds by Structural Decomposition.","authors":"Changfeng Ma, Pengxiao Guo, Shuangyu Yang, Yuanqi Li, Jie Guo, Chongjun Wang, Yanwen Guo","doi":"10.1109/TVCG.2025.3636413","DOIUrl":"10.1109/TVCG.2025.3636413","url":null,"abstract":"<p><p>The challenge of transforming partial point clouds into complete meshes still persists, with current methods facing issues like data accessibility constraint, shape preservation failure and poor robustness on real-scan data. Drawing inspiration from the structural information of objects to enhance the completion, we introduce an innovative weakly-supervised shape completion method leveraging structural decomposition without the necessity of SDFs during training. By representing objects as abstract structural frameworks and part details, our method initiates by forecasting the structure of the input partial point clouds, and individually restore each component through part decomposition completion and generation. Extracted part details are represented in images, which are porous and incomplete. Hence, we utilize a completion network to complete such details. For multiple results generation, a diffusion-based generation network is employed to generate a variety of details for the missing areas. The predicted structure and details are subsequently converted back into meshes, yielding the complete results. Since the details are depicted in images, our approach eliminates the need for SDFs during the training phase, achieving weakly-supervision. We conduct extensive comparisons on both artificial and real-scan datasets, demonstrating an average improvement of over 38.1% compared to the prior method, and achieving SOTA performance.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":"2114-2126"},"PeriodicalIF":6.5,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145598409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}