In this paper, we propose a novel rendering framework based on neural radiance fields (NeRF), named HH-NeRF, that can generate high-resolution audio-driven talking portrait videos with high fidelity and fast rendering. Specifically, our framework includes a detail-aware NeRF module and an efficient conditional super-resolution module. First, a detail-aware NeRF is proposed to efficiently generate a high-fidelity low-resolution talking head, using encoded volume density estimation and audio-eye-aware color calculation. This module captures natural eye blinks and high-frequency details while maintaining a rendering time comparable to previous fast methods. Second, we present an efficient conditional super-resolution module on the dynamic scene to directly generate the high-resolution portrait from our low-resolution head. By incorporating prior information such as the depth map and audio features, the proposed efficient conditional super-resolution module adopts a lightweight network to efficiently generate realistic and distinct high-resolution videos. Extensive experiments demonstrate that our method generates more distinct and higher-fidelity talking portraits in high-resolution (900 × 900) videos than state-of-the-art methods. Our code is available at https://github.com/muyuWang/HHNeRF.
{"title":"High-Fidelity and High-Efficiency Talking Portrait Synthesis With Detail-Aware Neural Radiance Fields.","authors":"Muyu Wang, Sanyuan Zhao, Xingping Dong, Jianbing Shen","doi":"10.1109/TVCG.2024.3488960","DOIUrl":"10.1109/TVCG.2024.3488960","url":null,"abstract":"<p><p>In this paper, we propose a novel rendering framework based on neural radiance fields (NeRF) named HH-NeRF that can generate high-resolution audio-driven talking portrait videos with high fidelity and fast rendering. Specifically, our framework includes a detail-aware NeRF module and an efficient conditional super-resolution module. Firstly, a detail-aware NeRF is proposed to efficiently generate a high-fidelity low-resolution talking head, by using the encoded volume density estimation and audio-eye-aware color calculation. This module can capture natural eye blinks and high-frequency details, and maintain a similar rendering time as previous fast methods. Secondly, we present an efficient conditional super-resolution module on the dynamic scene to directly generate the high-resolution portrait with our low-resolution head. Incorporated with the prior information, such as depth map and audio features, our new proposed efficient conditional super resolution module can adopt a lightweight network to efficiently generate realistic and distinct high-resolution videos. Extensive experiments demonstrate that our method can generate more distinct and fidelity talking portraits on high resolution (900 × 900) videos compared to state-of-the-art methods. Our code is available at https://github.com/muyuWang/HHNeRF.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142559855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nowadays, 3D scenes are not merely static arrangements of objects. With the development of transformable modules, furniture objects can be translated, rotated, and even reshaped to achieve scenes with different functions (e.g., from a bedroom to a living room). Transformable domestic space, therefore, studies how a layout can change its function by reshaping and rearranging transformable modules, resulting in various transformable layouts. In practice, a rearrangement is conducted dynamically by reshaping/translating/rotating furniture objects with proper schedules, which can cost designers more time than static scene design. Because objects can change their functions, the space of potential transformable layouts can also be extensive, making it hard to explore for desired layouts. We present a system for exploring transformable layouts. Given a single input scene consisting of transformable modules, our system first attempts to derive more layouts by reshaping and rearranging the modules. The derived scenes are organized into a graph-like hierarchy according to their functions, where edges represent functional evolutions (e.g., a living room can be reshaped into a bedroom), and nodes represent layouts that are dynamically transformable through translating/rotating/reshaping modules. The resulting hierarchy lets scene designers interactively explore possible scene variants and preview the animated rearrangement process. Experiments show that our system is efficient at generating transformable layouts, sensible in organizing functional hierarchies, and inspiring in providing ideas during interactions.
{"title":"SceneExplorer: An Interactive System for Expanding, Scheduling, and Organizing Transformable Layouts.","authors":"Shao-Kui Zhang, Jia-Hong Liu, Junkai Huang, Zi-Wei Chi, Hou Tam, Yong-Liang Yang, Song-Hai Zhang","doi":"10.1109/TVCG.2024.3488744","DOIUrl":"10.1109/TVCG.2024.3488744","url":null,"abstract":"<p><p>Nowadays, 3D scenes are not merely static arrangements of objects. With the development of transformable modules, furniture objects can be translated, rotated, and even reshaped to achieve scenes with different functions (e.g., from a bedroom to a living room). Transformable domestic space, therefore, studies how a layout can change its function by reshaping and rearranging transformable modules, resulting in various transformable layouts. In practice, a rearrangement is dynamically conducted by reshaping/translating/rotating furniture objects with proper schedules, which can consume more time for designers than static scene design. Due to changes in objects' functions, potential transformable layouts may also be extensive, making it hard to explore desired layouts. We present a system for exploring transformable layouts. Given a single input scene consisting of transformable modules, our system first attempts to derive more layouts by reshaping and rearranging the modules. The derived scenes are organized into a graph-like hierarchy according to their functions, where edges represent functional evolutions (e.g., a living room can be reshaped to a bedroom), and nodes represent layouts that are dynamically transformable through translating/rotating/reshaping modules. The resulting hierarchy lets scene designers interactively explore possible scene variants and preview the animated rearrangement process. Experiments show that our system is efficient for generating transformable layouts, sensible for organizing functional hierarchies, and inspiring for providing ideas during interactions.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142549777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-29 DOI: 10.1109/TVCG.2024.3487974
Xiyuan Wang, Laixin Xie, He Wang, Xingxing Xing, Wei Wan, Ziming Wu, Xiaojuan Ma, Quan Li
The burgeoning online video game industry has sparked intense competition among providers to both expand their user base and retain existing players, particularly within social interaction genres. To anticipate player churn, there is an increasing reliance on machine learning (ML) models that focus on social interaction dynamics. However, the prevalent opacity of most ML algorithms poses a significant hurdle to their acceptance among domain experts, who often view them as "black boxes". Despite the availability of eXplainable Artificial Intelligence (XAI) techniques capable of elucidating model decisions, their adoption in the gaming industry remains limited. This is primarily because non-technical domain experts, such as product managers and game designers, encounter substantial challenges in deciphering the "explicit" and "implicit" features embedded within computational models. This study proposes a reliable, interpretable, and actionable solution for predicting player churn by restructuring model inputs into explicit and implicit features. It explores how establishing a connection between explicit and implicit features can assist experts in understanding the underlying implicit features. Moreover, it emphasizes the necessity for XAI techniques that not only offer implementable interventions but also pinpoint the most crucial features for those interventions. Two case studies, including expert feedback and a within-subject user study, demonstrate the efficacy of our approach.
{"title":"Deciphering Explicit and Implicit Features for Reliable, Interpretable, and Actionable User Churn Prediction in Online Video Games.","authors":"Xiyuan Wang, Laixin Xie, He Wang, Xingxing Xing, Wei Wan, Ziming Wu, Xiaojuan Ma, Quan Li","doi":"10.1109/TVCG.2024.3487974","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3487974","url":null,"abstract":"<p><p>The burgeoning online video game industry has sparked intense competition among providers to both expand their user base and retain existing players, particularly within social interaction genres. To anticipate player churn, there is an increasing reliance on machine learning (ML) models that focus on social interaction dynamics. However, the prevalent opacity of most ML algorithms poses a significant hurdle to their acceptance among domain experts, who often view them as \"black boxes\". Despite the availability of eXplainable Artificial Intelligence (XAI) techniques capable of elucidating model decisions, their adoption in the gaming industry remains limited. This is primarily because non-technical domain experts, such as product managers and game designers, encounter substantial challenges in deciphering the \"explicit\" and \"implicit\" features embedded within computational models. This study proposes a reliable, interpretable, and actionable solution for predicting player churn by restructuring model inputs into explicit and implicit features. It explores how establishing a connection between explicit and implicit features can assist experts in understanding the underlying implicit features. Moreover, it emphasizes the necessity for XAI techniques that not only offer implementable interventions but also pinpoint the most crucial features for those interventions. Two case studies, including expert feedback and a within-subject user study, demonstrate the efficacy of our approach.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142549776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Incorporating automatic style extraction and transfer from existing well-designed graph visualizations can significantly alleviate the designer's workload. There are many types of graph visualizations. In this paper, our work focuses on node-link diagrams. We present a novel approach to streamline the design process of graph visualizations by automatically extracting visual styles from well-designed examples and applying them to other graphs. Our formative study identifies the key styles that designers consider when crafting visualizations, categorizing them into global and local styles. Leveraging deep learning techniques such as saliency detection models and multi-label classification models, we develop end-to-end pipelines for extracting both global and local styles. Global styles focus on aspects such as color scheme and layout, while local styles are concerned with the finer details of node and edge representations. Through a user study and evaluation experiment, we demonstrate the efficacy and time-saving benefits of our method, highlighting its potential to enhance the graph visualization design process.
{"title":"GVVST: Image-Driven Style Extraction From Graph Visualizations for Visual Style Transfer.","authors":"Sicheng Song, Yipeng Zhang, Yanna Lin, Huamin Qu, Changbo Wang, Chenhui Li","doi":"10.1109/TVCG.2024.3485701","DOIUrl":"10.1109/TVCG.2024.3485701","url":null,"abstract":"<p><p>Incorporating automatic style extraction and transfer from existing well-designed graph visualizations can significantly alleviate the designer's workload. There are many types of graph visualizations. In this paper, our work focuses on node-link diagrams. We present a novel approach to streamline the design process of graph visualizations by automatically extracting visual styles from well-designed examples and applying them to other graphs. Our formative study identifies the key styles that designers consider when crafting visualizations, categorizing them into global and local styles. Leveraging deep learning techniques such as saliency detection models and multi-label classification models, we develop end-to-end pipelines for extracting both global and local styles. Global styles focus on aspects such as color scheme and layout, while local styles are concerned with the finer details of node and edge representations. Through a user study and evaluation experiment, we demonstrate the efficacy and time-saving benefits of our method, highlighting its potential to enhance the graph visualization design process.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142515306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-22 DOI: 10.1109/TVCG.2024.3484654
Zhuo Su, Lang Zhou, Yudi Tan, Boliang Guan, Fan Zhou
Accurate segmentation of 3D point clouds in indoor scenes remains a challenging task, often hindered by the labor-intensive nature of data annotation. While weakly supervised learning approaches have shown promise in leveraging partial annotations, they frequently struggle with imbalanced performance between foreground and background elements due to the complex structures and proximity of objects in indoor environments. To address this issue, we propose a novel foreground-aware label enhancement method utilizing visual boundary priors. Our approach projects 3D point clouds onto 2D planes and applies 2D image segmentation to generate pseudo-labels for foreground objects. These labels are subsequently back-projected into 3D space and used to train an initial segmentation model. We further refine this process by incorporating prior knowledge from projected images to filter the predicted labels, followed by model retraining. We introduce this technique as the Foreground Boundary Prior (FBP), a versatile, plug-and-play module designed to enhance various weakly supervised point cloud segmentation methods. We demonstrate the efficacy of our approach on the widely-used 2D-3D-Semantic dataset, employing both random-sample and bounding-box based weak labeling strategies. Our experimental results show significant improvements in segmentation performance across different architectural backbones, highlighting the method's effectiveness and portability.
{"title":"Visual Boundary-Guided Pseudo-Labeling for Weakly Supervised 3D Point Cloud Segmentation in Indoor Environments.","authors":"Zhuo Su, Lang Zhou, Yudi Tan, Boliang Guan, Fan Zhou","doi":"10.1109/TVCG.2024.3484654","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3484654","url":null,"abstract":"<p><p>Accurate segmentation of 3D point clouds in indoor scenes remains a challenging task, often hindered by the labor-intensive nature of data annotation. While weakly supervised learning approaches have shown promise in leveraging partial annotations, they frequently struggle with imbalanced performance between foreground and background elements due to the complex structures and proximity of objects in indoor environments. To address this issue, we propose a novel foreground-aware label enhancement method utilizing visual boundary priors. Our approach projects 3D point clouds onto 2D planes and applies 2D image segmentation to generate pseudo-labels for foreground objects. These labels are subsequently back-projected into 3D space and used to train an initial segmentation model. We further refine this process by incorporating prior knowledge from projected images to filter the predicted labels, followed by model retraining. We introduce this technique as the Foreground Boundary Prior (FBP), a versatile, plug-and-play module designed to enhance various weakly supervised point cloud segmentation methods. We demonstrate the efficacy of our approach on the widely-used 2D-3D-Semantic dataset, employing both random-sample and bounding-box based weak labeling strategies. Our experimental results show significant improvements in segmentation performance across different architectural backbones, highlighting the method's effectiveness and portability.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142515308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-21 DOI: 10.1109/TVCG.2024.3484471
Sangbong Yoo, Seokyeon Kim, Yun Jang
The transfer function (TF) design is crucial for enhancing the visualization quality and understanding of volume data in volume rendering. Recent research has proposed various multidimensional TFs that utilize diverse attributes extracted from volume data to control the rendering of individual voxels. Although multidimensional TFs enhance the ability to segregate data, manipulating the various attributes for rendering is cumbersome. In contrast, low-dimensional TFs are easier to manage, but separating volume data during rendering is problematic. This paper proposes a novel approach, a two-level transfer function, for rendering volume data by reducing TF dimensions. The proposed technique extracts multidimensional TF attributes from volume data and applies t-distributed Stochastic Neighbor Embedding (t-SNE) to the TF attributes for dimensionality reduction. The two-level transfer function combines the classical 2D TF and the t-SNE TF in the conventional direct volume rendering pipeline. The proposed approach is evaluated by comparing segments in the t-SNE TF and rendered images across various volume datasets. The results demonstrate that the proposed approach allows us to manipulate multidimensional attributes easily while maintaining high visualization quality in volume rendering.
{"title":"Two-Level Transfer Functions Using t-SNE for Data Segmentation in Direct Volume Rendering.","authors":"Sangbong Yoo, Seokyeon Kim, Yun Jang","doi":"10.1109/TVCG.2024.3484471","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3484471","url":null,"abstract":"<p><p>The transfer function (TF) design is crucial for enhancing the visualization quality and understanding of volume data in volume rendering. Recent research has proposed various multidimensional TFs to utilize diverse attributes extracted from volume data for controlling individual voxel rendering. Although multidimensional TFs enhance the ability to segregate data, manipulating various attributes for the rendering is cumbersome. In contrast, low-dimensional TFs are more beneficial as they are easier to manage, but separating volume data during rendering is problematic. This paper proposes a novel approach, a two-level transfer function, for rendering volume data by reducing TF dimensions. The proposed technique involves extracting multidimensional TF attributes from volume data and applying t-Stochastic Neighbor Embedding (t-SNE) to the TF attributes for dimensionality reduction. The two-level transfer function combines the classical 2D TF and t-SNE TF in the conventional direct volume rendering pipeline. The proposed approach is evaluated by comparing segments in t-SNE TF and rendering images using various volume datasets. The results of this study demonstrate that the proposed approach can effectively allow us to manipulate multidimensional attributes easily while maintaining high visualization quality in volume rendering.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142515307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-18 DOI: 10.1109/TVCG.2024.3477413
Frank Billy Djupkep Dizeu, Michel Picard, Marc-Antoine Drouin, Jonathan Boisvert
Measuring the 3D shape of semi-transparent surfaces with projector-camera 3D scanners is a difficult task because these surfaces reflect light only weakly in a diffuse manner and transmit a large part of the incident light. The task is even harder in the presence of participating background surfaces. The two methods proposed in this paper use sinusoidal patterns, each with a frequency chosen within the range allowed by the projection optics of the projector-camera system. They differ in the way the camera-projector correspondence map is established, as well as in the number of patterns and the processing time required. The first method uses the discrete Fourier transform, performed on the intensity signal measured at a camera pixel, to inventory the projector columns that directly and indirectly illuminate the scene point imaged by that pixel. The second method goes beyond the discrete Fourier transform and achieves the same goal by fitting a proposed analytical model to the measured intensity signal. Once the one-to-many correspondence (one camera pixel to many projector columns) is established, a surface continuity constraint is applied to extract the one-to-one correspondence map linked to the semi-transparent surface. This map is used to determine the 3D point cloud of the surface by triangulation. Experimental results demonstrate the accuracy and reliability achieved by the proposed methods.
{"title":"Multi-Frequency Nonlinear Methods for 3D Shape Measurement of Semi-Transparent Surfaces Using Projector-Camera Systems.","authors":"Frank Billy Djupkep Dizeu, Michel Picard, Marc-Antoine Drouin, Jonathan Boisvert","doi":"10.1109/TVCG.2024.3477413","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3477413","url":null,"abstract":"<p><p>Measuring the 3D shape of semi-transparent surfaces with projector-camera 3D scanners is a difficult task because these surfaces weakly reflect light in a diffuse manner, and transmit a large part of the incident light. The task is even harder in the presence of participating background surfaces. The two methods proposed in this paper use sinusoidal patterns, each with a frequency chosen in the frequency range allowed by the projection optics of the projector-camera system. They differ in the way in which the camera-projector correspondence map is established, as well as in the number of patterns and the processing time required. The first method utilizes the discrete Fourier transform, performed on the intensity signal measured at a camera pixel, to inventory projector columns illuminating directly and indirectly the scene point imaged by that pixel. The second method goes beyond discrete Fourier transform and achieves the same goal by fitting a proposed analytical model to the measured intensity signal. Once the one (camera pixel) to many (projector columns) correspondence is established, a surface continuity constraint is applied to extract the one to one correspondence map linked to the semi-transparent surface. This map is used to determine the 3D point cloud of the surface by triangulation. Experimental results demonstrate the performance (accuracy, reliability) achieved by the proposed methods.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142484147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-18 DOI: 10.1109/TVCG.2024.3478852
Xipeng Chen, Guangrun Wang, Xiaogang Xu, Philip Torr, Liang Lin
We present a novel data-driven Parametric Linear Blend Skinning (PLBS) model meticulously crafted for generalized 3D garment dressing and animation. Previous data-driven methods are impeded by challenges including overreliance on human body modeling and limited adaptability across different garment shapes. Our method resolves these challenges via two goals: 1) develop a model based on garment modeling rather than human body modeling; 2) separately construct low-dimensional sub-spaces for modeling in-plane deformation (such as variation in garment shape and size) and out-of-plane deformation (such as deformation due to varied body size and motion). We therefore formulate garment deformation as a PLBS model controlled by a canonical 3D garment mesh, vertex-based skinning weights, and associated local patch transformations. Unlike traditional LBS models specialized for individual objects, the PLBS model can uniformly express varied garments and bodies: the in-plane deformation is encoded on the canonical 3D garment and the out-of-plane deformation is controlled by the local patch transformations. In addition, we propose novel 3D garment registration and skinning weight decomposition strategies to obtain adequate data for building the PLBS model across different garment categories. Furthermore, we employ dynamic fine-tuning to complement high-frequency signals missing from LBS for unseen testing data. Experiments illustrate that our method can model dynamics for loose-fitting garments, outperforming previous data-driven methods that use different sub-space modeling strategies. We showcase that our method can factorize and generalize across varied body sizes, garment shapes, garment sizes, and human motions under different garment categories.
{"title":"Parametric Linear Blend Skinning Model for Multiple-Shape 3D Garments.","authors":"Xipeng Chen, Guangrun Wang, Xiaogang Xu, Philip Torr, Liang Lin","doi":"10.1109/TVCG.2024.3478852","DOIUrl":"10.1109/TVCG.2024.3478852","url":null,"abstract":"<p><p>We present a novel data-driven Parametric Linear Blend Skinning (PLBS) model meticulously crafted for generalized 3D garment dressing and animation. Previous data-driven methods are impeded by certain challenges including overreliance on human body modeling and limited adaptability across different garment shapes. Our method resolves these challenges via two goals: 1) Develop a model based on garment modeling rather than human body modeling. 2) Separately construct low-dimensional sub-spaces for modeling in-plane deformation (such as variation in garment shape and size) and out-of-plane deformation (such as deformation due to varied body size and motion). Therefore, we formulate garment deformation as a PLBS model controlled by canonical 3D garment mesh, vertex-based skinning weights and associated local patch transformation. Unlike traditional LBS models specialized for individual objects, PLBS model is capable of uniformly expressing varied garments and bodies, the in-plane deformation is encoded on the canonical 3D garment and the out-of-plane deformation is controlled by the local patch transformation. Besides, we propose novel 3D garment registration and skinning weight decomposition strategies to obtain adequate data to build PLBS model under different garment categories. Furthermore, we employ dynamic fine-tuning to complement high-frequency signals missing from LBS for unseen testing data. Experiments illustrate that our method is capable of modeling dynamics for loose-fitting garments, outperforming previous data-driven modeling methods using different sub-space modeling strategies. We showcase that our method can factorize and be generalized for varied body sizes, garment shapes, garment sizes and human motions under different garment categories.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142484157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-17 DOI: 10.1109/TVCG.2024.3483070
Taylor A Doty, Jonathan W Kelly, Stephen B Gilbert, Michael C Dorneich
Cybersickness, or sickness induced by virtual reality (VR), negatively impacts the enjoyment and adoption of the technology. One method that has been used to reduce sickness is repeated exposure to VR, herein Cybersickness Abatement from Repeated Exposure (CARE). However, high sickness levels during repeated exposure may discourage some users from returning. Field of view (FOV) restriction reduces cybersickness by minimizing visual motion in the periphery, but also negatively affects the user's visual experience. This study explored whether CARE that occurs with FOV restriction generalizes to a full FOV experience. Participants played a VR game for up to 20 minutes. Those in the Repeated Exposure Condition played the same VR game on four separate days, experiencing FOV restriction during the first three days and no FOV restriction on the fourth day. Results indicated significant CARE with FOV restriction (Days 1-3). Further, cybersickness on Day 4, without FOV restriction, was significantly lower than that of participants in the Single Exposure Condition, who experienced the game without FOV restriction only on one day. The current findings show that significant CARE can occur while experiencing minimal cybersickness. Results are considered in the context of multiple theoretical explanations for CARE, including sensory rearrangement, adaptation, habituation, and postural control.
{"title":"Cybersickness Abatement from Repeated Exposure to VR with Reduced Discomfort.","authors":"Taylor A Doty, Jonathan W Kelly, Stephen B Gilbert, Michael C Dorneich","doi":"10.1109/TVCG.2024.3483070","DOIUrl":"10.1109/TVCG.2024.3483070","url":null,"abstract":"<p><p>Cybersickness, or sickness induced by virtual reality (VR), negatively impacts the enjoyment and adoption of the technology. One method that has been used to reduce sickness is repeated exposure to VR, herein Cybersickness Abatement from Repeated Exposure (CARE). However, high sickness levels during repeated exposure may discourage some users from returning. Field of view (FOV) restriction reduces cybersickness by minimizing visual motion in the periphery, but also negatively affects the user's visual experience. This study explored whether CARE that occurs with FOV restriction generalizes to a full FOV experience. Participants played a VR game for up to 20 minutes. Those in the Repeated Exposure Condition played the same VR game on four separate days, experiencing FOV restriction during the first three days and no FOV restriction on the fourth day. Results indicated significant CARE with FOV restriction (Days 1-3). Further, cybersickness on Day 4, without FOV restriction, was significantly lower than that of participants in the Single Exposure Condition, who experienced the game without FOV restriction only on one day. The current findings show that significant CARE can occur while experiencing minimal cybersickness. Results are considered in the context of multiple theoretical explanations for CARE, including sensory rearrangement, adaptation, habituation, and postural control.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142484143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-16 DOI: 10.1109/TVCG.2024.3481354
Abdullah-Al-Raihan Nayeem, Dongyun Han, William J Tolone, Isaac Cho
Contour maps are an essential tool for exploring spatial features of terrain, such as distance, direction, and surface gradient among contour areas. User interactions in contour-based visualizations create approaches to visual analysis that are noticeably different from the perspective of human cognition. As such, various interactive approaches have been introduced to improve system usability and enhance human cognition for complex and large-scale spatial data exploration. However, what user interaction means for contour maps, its purpose, when to leverage it, and its design primitives have yet to be investigated in the context of analysis tasks. Therefore, further research is needed to better understand and quantify the potential and benefits offered by user interactions in contour-based geospatial visualizations designed to support analytical tasks. In this paper, we present a contour-based interactive geospatial visualization designed for analytical tasks. We conducted a crowd-sourced user study (N=62) to examine the impact of interactive features on analysis using contour-based geospatial visualizations. Our results show that the interactive features aid participants' data analysis and understanding in terms of spatial data extent, map layout, task complexity, and user expertise. Finally, we discuss our findings in depth; they serve as guidelines for the future design and implementation of interactive features in support of case-specific analytical tasks on contour-based geospatial views.
{"title":"Evaluating Effectiveness of Interactivity in Contour-Based Geospatial Visualizations.","authors":"Abdullah-Al-Raihan Nayeem, Dongyun Han, William J Tolone, Isaac Cho","doi":"10.1109/TVCG.2024.3481354","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3481354","url":null,"abstract":"<p><p>Contour maps are an essential tool for exploring spatial features of the terrain, such as distance, directions, and surface gradient among the contour areas. User interactions in contour-based visualizations create approaches to visual analysis that are noticeably different from the perspective of human cognition. As such, various interactive approaches have been introduced to improve system usability and enhance human cognition for complex and large-scale spatial data exploration. However, what user interaction means for contour maps, its purpose, when to leverage, and design primitives have yet to be investigated in the context of analysis tasks. Therefore, further research is needed to better understand and quantify the potentials and benefits offered by user interactions in contour-based geospatial visualizations designed to support analytical tasks. In this paper, we present a contour-based interactive geospatial visualization designed for analytical tasks. We conducted a crowd-sourced user study (N=62) to examine the impact of interactive features on analysis using contour-based geospatial visualizations. Our results show that the interactive features aid in their data analysis and understanding in terms of spatial data extent, map layout, task complexity, and user expertise. Finally, we discuss our findings in-depth, which will serve as guidelines for future design and implementation of interactive features in support of case-specific analytical tasks on contour-based geospatial views.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142484145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}