We present an authoring tool, called CAST+ (Canis Studio Plus), that enables the interactive creation of chart animations through the direct manipulation of keyframes. It introduces the visual specification of chart animations consisting of keyframes that can be played sequentially or simultaneously, and animation parameters (e.g., duration, delay). Building on Canis [1], a declarative chart animation grammar that leverages data-enriched SVG charts, CAST+ supports auto-completion for constructing both keyframes and keyframe sequences. It also enables users to refine the animation specification (e.g., aligning keyframes across tracks to play them together, adjusting delay) with direct manipulation. We report a user study conducted to assess the visual specification and system usability with its initial version. We enhanced the system's expressiveness and usability: CAST+ now supports the animation of multiple types of visual marks in the same keyframe group with new auto-completion algorithms based on generalized selection. This enables the creation of more expressive animations, while reducing the number of interactions needed to create comparable animations. We present a gallery of examples and four usage scenarios to demonstrate the expressiveness of CAST+. Finally, we discuss the limitations, comparison, and potentials of CAST+ as well as directions for future research.
{"title":"Authoring Data-Driven Chart Animations.","authors":"Yuancheng Shen, Yue Zhao, Yunhai Wang, Tong Ge, Haoyan Shi, Bongshin Lee","doi":"10.1109/TVCG.2024.3491504","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3491504","url":null,"abstract":"<p><p>We present an authoring tool, called CAST+ (Canis Studio Plus), that enables the interactive creation of chart animations through the direct manipulation of keyframes. It introduces the visual specification of chart animations consisting of keyframes that can be played sequentially or simultaneously, and animation parameters (e.g., duration, delay). Building on Canis [1], a declarative chart animation grammar that leverages data-enriched SVG charts, CAST+ supports auto-completion for constructing both keyframes and keyframe sequences. It also enables users to refine the animation specification (e.g., aligning keyframes across tracks to play them together, adjusting delay) with direct manipulation. We report a user study conducted to assess the visual specification and system usability with its initial version. We enhanced the system's expressiveness and usability: CAST+ now supports the animation of multiple types of visual marks in the same keyframe group with new auto-completion algorithms based on generalized selection. This enables the creation of more expressive animations, while reducing the number of interactions needed to create comparable animations. We present a gallery of examples and four usage scenarios to demonstrate the expressiveness of CAST+. Finally, we discuss the limitations, comparison, and potentials of CAST+ as well as directions for future research.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142585415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-11-04DOI: 10.1109/TVCG.2024.3486613
Charles Berret, Tamara Munzner
We offer a new model of the sensemaking process for data analysis and visualization. Whereas past sensemaking models have been grounded in positivist assumptions about the nature of knowledge, we reframe data sensemaking in critical, humanistic terms by approaching it through an interpretivist lens. Our three-phase process model uses the analogy of an iceberg, where data is the visible tip of underlying schemas. In the Add phase, the analyst acquires data, incorporates explicit schemas from the data, and absorbs the tacit schemas of both data and people. In the Check phase, the analyst interprets the data with respect to the current schemas and evaluates whether the schemas match the data. In the Refine phase, the analyst considers the role of power, articulates what was tacit into explicitly stated schemas, updates data, and formulates findings. Our model has four important distinguishing features: Tacit and Explicit Schemas, Schemas First and Always, Data as a Schematic Artifact, and Schematic Multiplicity. We compare the roles of schemas in past sensemaking models and draw conceptual distinctions based on a historical review of schemas in different academic traditions. We validate the descriptive and prescriptive power of our model through four analysis scenarios: noticing uncollected data, learning to wrangle data, downplaying inconvenient data, and measuring with sensors. We conclude by discussing the value of interpretivism, the virtue of epistemic humility, and the pluralism this sensemaking model can foster.
{"title":"Iceberg Sensemaking: A Process Model for Critical Data Analysis.","authors":"Charles Berret, Tamara Munzner","doi":"10.1109/TVCG.2024.3486613","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3486613","url":null,"abstract":"<p><p>We offer a new model of the sensemaking process for data analysis and visualization. Whereas past sensemaking models have been grounded in positivist assumptions about the nature of knowledge, we reframe data sensemaking in critical, humanistic terms by approaching it through an interpretivist lens. Our three-phase process model uses the analogy of an iceberg, where data is the visible tip of underlying schemas. In the Add phase, the analyst acquires data, incorporates explicit schemas from the data, and absorbs the tacit schemas of both data and people. In the Check phase, the analyst interprets the data with respect to the current schemas and evaluates whether the schemas match the data. In the Refine phase, the analyst considers the role of power, articulates what was tacit into explicitly stated schemas, updates data, and formulates findings. Our model has four important distinguishing features: Tacit and Explicit Schemas, Schemas First and Always, Data as a Schematic Artifact, and Schematic Multiplicity. We compare the roles of schemas in past sensemaking models and draw conceptual distinctions based on a historical review of schemas in different academic traditions. We validate the descriptive and prescriptive power of our model through four analysis scenarios: noticing uncollected data, learning to wrangle data, downplaying inconvenient data, and measuring with sensors. We conclude by discussing the value of interpretivism, the virtue of epistemic humility, and the pluralism this sensemaking model can foster.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142577356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-11-04DOI: 10.1109/TVCG.2024.3490840
Yuqi Han, Tao Yu, Xiaohang Yu, Di Xu, Binge Zheng, Zonghong Dai, Changpeng Yang, Yuwang Wang, Qionghai Dai
The neural radiance field (NeRF) achieved remarkable success in modeling 3D scenes and synthesizing high-fidelity novel views. However, existing NeRF-based methods focus more on making full use of high-resolution images to generate high-resolution novel views, but less considering the generation of high-resolution details given only low-resolution images. In analogy to the extensive usage of image super-resolution, NeRF super-resolution is an effective way to generate low-resolution-guided high-resolution 3D scenes and holds great potential applications. Up to now, such an important topic is still under-explored. In this paper, we propose a NeRF super-resolution method, named Super-NeRF, to generate high-resolution NeRF from only low-resolution inputs. Given multi-view low-resolution images, Super-NeRF constructs a multi-view consistency-controlling super-resolution module to generate various view-consistent high-resolution details for NeRF. Specifically, an optimizable latent code is introduced for each input view to control the generated reasonable high-resolution 2D images satisfying view consistency. The latent codes of each low-resolution image are optimized synergistically with the target Super-NeRF representation to utilize the view consistency constraint inherent in NeRF construction. We verify the effectiveness of Super-NeRF on synthetic, real-world, and even AI-generated NeRFs. Super-NeRF achieves state-of-the-art NeRF super-resolution performance on high-resolution detail generation and cross-view consistency.
{"title":"Super-NeRF: View-consistent Detail Generation for NeRF Super-resolution.","authors":"Yuqi Han, Tao Yu, Xiaohang Yu, Di Xu, Binge Zheng, Zonghong Dai, Changpeng Yang, Yuwang Wang, Qionghai Dai","doi":"10.1109/TVCG.2024.3490840","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3490840","url":null,"abstract":"<p><p>The neural radiance field (NeRF) achieved remarkable success in modeling 3D scenes and synthesizing high-fidelity novel views. However, existing NeRF-based methods focus more on making full use of high-resolution images to generate high-resolution novel views, but less considering the generation of high-resolution details given only low-resolution images. In analogy to the extensive usage of image super-resolution, NeRF super-resolution is an effective way to generate low-resolution-guided high-resolution 3D scenes and holds great potential applications. Up to now, such an important topic is still under-explored. In this paper, we propose a NeRF super-resolution method, named Super-NeRF, to generate high-resolution NeRF from only low-resolution inputs. Given multi-view low-resolution images, Super-NeRF constructs a multi-view consistency-controlling super-resolution module to generate various view-consistent high-resolution details for NeRF. Specifically, an optimizable latent code is introduced for each input view to control the generated reasonable high-resolution 2D images satisfying view consistency. The latent codes of each low-resolution image are optimized synergistically with the target Super-NeRF representation to utilize the view consistency constraint inherent in NeRF construction. We verify the effectiveness of Super-NeRF on synthetic, real-world, and even AI-generated NeRFs. Super-NeRF achieves state-of-the-art NeRF super-resolution performance on high-resolution detail generation and cross-view consistency.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142574886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-31DOI: 10.1109/TVCG.2024.3489676
Chanyoung Jung, Soobin Yim, Giwoong Park, Simon Oh, Yun Jang
The transportation network is an important element in an urban system that supports daily activities, enabling people to travel from one place to another. One of the key challenges is the network complexity, which is composed of many node pairs distributed over the area. This spatial characteristic results in the high dimensional network problem in understanding the 'cause' of problems such as traffic congestion. Recent studies have proposed visual analytics systems aimed at understanding these underlying causes. Despite these efforts, the analysis of such causes is limited to identified patterns. However, given the intricate distribution of roads and their mutual influence, new patterns continuously emerge across all roads within urban transportation. At this stage, a well-defined visual analytics system can be a good solution for transportation practitioners. In this paper, we propose a system, CATOM (Causal Topology Map), for the cause-effect analysis of traffic patterns based on Granger causality for extracting causal topology maps. CATOM discovers causal relationships between roads through the Granger causality test and quantifies these relationships through the causal density. During the design process, the system was developed to fully utilize spatial information with visualization techniques to overcome the previous problems in the literature. We also evaluate the usability of our approach by conducting a SUS(System Usability Scale) test and traffic cause analysis with the real-world data from two study sites in collaboration with domain experts.
{"title":"CATOM : Causal Topology Map for Spatiotemporal Traffic Analysis with Granger Causality in Urban Areas.","authors":"Chanyoung Jung, Soobin Yim, Giwoong Park, Simon Oh, Yun Jang","doi":"10.1109/TVCG.2024.3489676","DOIUrl":"10.1109/TVCG.2024.3489676","url":null,"abstract":"<p><p>The transportation network is an important element in an urban system that supports daily activities, enabling people to travel from one place to another. One of the key challenges is the network complexity, which is composed of many node pairs distributed over the area. This spatial characteristic results in the high dimensional network problem in understanding the 'cause' of problems such as traffic congestion. Recent studies have proposed visual analytics systems aimed at understanding these underlying causes. Despite these efforts, the analysis of such causes is limited to identified patterns. However, given the intricate distribution of roads and their mutual influence, new patterns continuously emerge across all roads within urban transportation. At this stage, a well-defined visual analytics system can be a good solution for transportation practitioners. In this paper, we propose a system, CATOM (Causal Topology Map), for the cause-effect analysis of traffic patterns based on Granger causality for extracting causal topology maps. CATOM discovers causal relationships between roads through the Granger causality test and quantifies these relationships through the causal density. During the design process, the system was developed to fully utilize spatial information with visualization techniques to overcome the previous problems in the literature. We also evaluate the usability of our approach by conducting a SUS(System Usability Scale) test and traffic cause analysis with the real-world data from two study sites in collaboration with domain experts.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142559854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we propose a novel rendering framework based on neural radiance fields (NeRF) named HH-NeRF that can generate high-resolution audio-driven talking portrait videos with high fidelity and fast rendering. Specifically, our framework includes a detail-aware NeRF module and an efficient conditional super-resolution module. Firstly, a detail-aware NeRF is proposed to efficiently generate a high-fidelity low-resolution talking head, by using the encoded volume density estimation and audio-eye-aware color calculation. This module can capture natural eye blinks and high-frequency details, and maintain a similar rendering time as previous fast methods. Secondly, we present an efficient conditional super-resolution module on the dynamic scene to directly generate the high-resolution portrait with our low-resolution head. Incorporated with the prior information, such as depth map and audio features, our new proposed efficient conditional super resolution module can adopt a lightweight network to efficiently generate realistic and distinct high-resolution videos. Extensive experiments demonstrate that our method can generate more distinct and fidelity talking portraits on high resolution (900 × 900) videos compared to state-of-the-art methods. Our code is available at https://github.com/muyuWang/HHNeRF.
{"title":"High-Fidelity and High-Efficiency Talking Portrait Synthesis With Detail-Aware Neural Radiance Fields.","authors":"Muyu Wang, Sanyuan Zhao, Xingping Dong, Jianbing Shen","doi":"10.1109/TVCG.2024.3488960","DOIUrl":"10.1109/TVCG.2024.3488960","url":null,"abstract":"<p><p>In this paper, we propose a novel rendering framework based on neural radiance fields (NeRF) named HH-NeRF that can generate high-resolution audio-driven talking portrait videos with high fidelity and fast rendering. Specifically, our framework includes a detail-aware NeRF module and an efficient conditional super-resolution module. Firstly, a detail-aware NeRF is proposed to efficiently generate a high-fidelity low-resolution talking head, by using the encoded volume density estimation and audio-eye-aware color calculation. This module can capture natural eye blinks and high-frequency details, and maintain a similar rendering time as previous fast methods. Secondly, we present an efficient conditional super-resolution module on the dynamic scene to directly generate the high-resolution portrait with our low-resolution head. Incorporated with the prior information, such as depth map and audio features, our new proposed efficient conditional super resolution module can adopt a lightweight network to efficiently generate realistic and distinct high-resolution videos. Extensive experiments demonstrate that our method can generate more distinct and fidelity talking portraits on high resolution (900 × 900) videos compared to state-of-the-art methods. Our code is available at https://github.com/muyuWang/HHNeRF.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142559855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nowadays, 3D scenes are not merely static arrangements of objects. With the development of transformable modules, furniture objects can be translated, rotated, and even reshaped to achieve scenes with different functions (e.g., from a bedroom to a living room). Transformable domestic space, therefore, studies how a layout can change its function by reshaping and rearranging transformable modules, resulting in various transformable layouts. In practice, a rearrangement is dynamically conducted by reshaping/translating/rotating furniture objects with proper schedules, which can consume more time for designers than static scene design. Due to changes in objects' functions, potential transformable layouts may also be extensive, making it hard to explore desired layouts. We present a system for exploring transformable layouts. Given a single input scene consisting of transformable modules, our system first attempts to derive more layouts by reshaping and rearranging the modules. The derived scenes are organized into a graph-like hierarchy according to their functions, where edges represent functional evolutions (e.g., a living room can be reshaped to a bedroom), and nodes represent layouts that are dynamically transformable through translating/rotating/reshaping modules. The resulting hierarchy lets scene designers interactively explore possible scene variants and preview the animated rearrangement process. Experiments show that our system is efficient for generating transformable layouts, sensible for organizing functional hierarchies, and inspiring for providing ideas during interactions.
{"title":"SceneExplorer: An Interactive System for Expanding, Scheduling, and Organizing Transformable Layouts.","authors":"Shao-Kui Zhang, Jia-Hong Liu, Junkai Huang, Zi-Wei Chi, Hou Tam, Yong-Liang Yang, Song-Hai Zhang","doi":"10.1109/TVCG.2024.3488744","DOIUrl":"10.1109/TVCG.2024.3488744","url":null,"abstract":"<p><p>Nowadays, 3D scenes are not merely static arrangements of objects. With the development of transformable modules, furniture objects can be translated, rotated, and even reshaped to achieve scenes with different functions (e.g., from a bedroom to a living room). Transformable domestic space, therefore, studies how a layout can change its function by reshaping and rearranging transformable modules, resulting in various transformable layouts. In practice, a rearrangement is dynamically conducted by reshaping/translating/rotating furniture objects with proper schedules, which can consume more time for designers than static scene design. Due to changes in objects' functions, potential transformable layouts may also be extensive, making it hard to explore desired layouts. We present a system for exploring transformable layouts. Given a single input scene consisting of transformable modules, our system first attempts to derive more layouts by reshaping and rearranging the modules. The derived scenes are organized into a graph-like hierarchy according to their functions, where edges represent functional evolutions (e.g., a living room can be reshaped to a bedroom), and nodes represent layouts that are dynamically transformable through translating/rotating/reshaping modules. The resulting hierarchy lets scene designers interactively explore possible scene variants and preview the animated rearrangement process. Experiments show that our system is efficient for generating transformable layouts, sensible for organizing functional hierarchies, and inspiring for providing ideas during interactions.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142549777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-29DOI: 10.1109/TVCG.2024.3487974
Xiyuan Wang, Laixin Xie, He Wang, Xingxing Xing, Wei Wan, Ziming Wu, Xiaojuan Ma, Quan Li
The burgeoning online video game industry has sparked intense competition among providers to both expand their user base and retain existing players, particularly within social interaction genres. To anticipate player churn, there is an increasing reliance on machine learning (ML) models that focus on social interaction dynamics. However, the prevalent opacity of most ML algorithms poses a significant hurdle to their acceptance among domain experts, who often view them as "black boxes". Despite the availability of eXplainable Artificial Intelligence (XAI) techniques capable of elucidating model decisions, their adoption in the gaming industry remains limited. This is primarily because non-technical domain experts, such as product managers and game designers, encounter substantial challenges in deciphering the "explicit" and "implicit" features embedded within computational models. This study proposes a reliable, interpretable, and actionable solution for predicting player churn by restructuring model inputs into explicit and implicit features. It explores how establishing a connection between explicit and implicit features can assist experts in understanding the underlying implicit features. Moreover, it emphasizes the necessity for XAI techniques that not only offer implementable interventions but also pinpoint the most crucial features for those interventions. Two case studies, including expert feedback and a within-subject user study, demonstrate the efficacy of our approach.
蓬勃发展的在线视频游戏行业引发了供应商之间的激烈竞争,他们既要扩大用户群,又要留住现有玩家,尤其是社交互动类型的游戏。为了预测玩家流失率,人们越来越依赖于关注社交互动动态的机器学习(ML)模型。然而,大多数 ML 算法普遍不透明,这严重阻碍了该领域专家对它们的接受,他们通常将这些算法视为 "黑盒子"。尽管可解释人工智能(XAI)技术能够阐明模型决策,但其在游戏行业的应用仍然有限。这主要是因为非技术领域专家(如产品经理和游戏设计师)在解读蕴含在计算模型中的 "显性 "和 "隐性 "特征时遇到了巨大挑战。本研究通过将模型输入重组为显性和隐性特征,为预测玩家流失率提出了一种可靠、可解释和可操作的解决方案。它探讨了在显性特征和隐性特征之间建立联系如何有助于专家理解潜在的隐性特征。此外,它还强调了 XAI 技术的必要性,这些技术不仅能提供可实施的干预措施,还能为这些干预措施指出最关键的特征。包括专家反馈和主体内用户研究在内的两个案例研究证明了我们方法的有效性。
{"title":"Deciphering Explicit and Implicit Features for Reliable, Interpretable, and Actionable User Churn Prediction in Online Video Games.","authors":"Xiyuan Wang, Laixin Xie, He Wang, Xingxing Xing, Wei Wan, Ziming Wu, Xiaojuan Ma, Quan Li","doi":"10.1109/TVCG.2024.3487974","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3487974","url":null,"abstract":"<p><p>The burgeoning online video game industry has sparked intense competition among providers to both expand their user base and retain existing players, particularly within social interaction genres. To anticipate player churn, there is an increasing reliance on machine learning (ML) models that focus on social interaction dynamics. However, the prevalent opacity of most ML algorithms poses a significant hurdle to their acceptance among domain experts, who often view them as \"black boxes\". Despite the availability of eXplainable Artificial Intelligence (XAI) techniques capable of elucidating model decisions, their adoption in the gaming industry remains limited. This is primarily because non-technical domain experts, such as product managers and game designers, encounter substantial challenges in deciphering the \"explicit\" and \"implicit\" features embedded within computational models. This study proposes a reliable, interpretable, and actionable solution for predicting player churn by restructuring model inputs into explicit and implicit features. It explores how establishing a connection between explicit and implicit features can assist experts in understanding the underlying implicit features. Moreover, it emphasizes the necessity for XAI techniques that not only offer implementable interventions but also pinpoint the most crucial features for those interventions. Two case studies, including expert feedback and a within-subject user study, demonstrate the efficacy of our approach.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142549776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Incorporating automatic style extraction and transfer from existing well-designed graph visualizations can significantly alleviate the designer's workload. There are many types of graph visualizations. In this paper, our work focuses on node-link diagrams. We present a novel approach to streamline the design process of graph visualizations by automatically extracting visual styles from well-designed examples and applying them to other graphs. Our formative study identifies the key styles that designers consider when crafting visualizations, categorizing them into global and local styles. Leveraging deep learning techniques such as saliency detection models and multi-label classification models, we develop end-to-end pipelines for extracting both global and local styles. Global styles focus on aspects such as color scheme and layout, while local styles are concerned with the finer details of node and edge representations. Through a user study and evaluation experiment, we demonstrate the efficacy and time-saving benefits of our method, highlighting its potential to enhance the graph visualization design process.
{"title":"GVVST: Image-Driven Style Extraction From Graph Visualizations for Visual Style Transfer.","authors":"Sicheng Song, Yipeng Zhang, Yanna Lin, Huamin Qu, Changbo Wang, Chenhui Li","doi":"10.1109/TVCG.2024.3485701","DOIUrl":"10.1109/TVCG.2024.3485701","url":null,"abstract":"<p><p>Incorporating automatic style extraction and transfer from existing well-designed graph visualizations can significantly alleviate the designer's workload. There are many types of graph visualizations. In this paper, our work focuses on node-link diagrams. We present a novel approach to streamline the design process of graph visualizations by automatically extracting visual styles from well-designed examples and applying them to other graphs. Our formative study identifies the key styles that designers consider when crafting visualizations, categorizing them into global and local styles. Leveraging deep learning techniques such as saliency detection models and multi-label classification models, we develop end-to-end pipelines for extracting both global and local styles. Global styles focus on aspects such as color scheme and layout, while local styles are concerned with the finer details of node and edge representations. Through a user study and evaluation experiment, we demonstrate the efficacy and time-saving benefits of our method, highlighting its potential to enhance the graph visualization design process.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142515306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-22DOI: 10.1109/TVCG.2024.3484654
Zhuo Su, Lang Zhou, Yudi Tan, Boliang Guan, Fan Zhou
Accurate segmentation of 3D point clouds in indoor scenes remains a challenging task, often hindered by the labor-intensive nature of data annotation. While weakly supervised learning approaches have shown promise in leveraging partial annotations, they frequently struggle with imbalanced performance between foreground and background elements due to the complex structures and proximity of objects in indoor environments. To address this issue, we propose a novel foreground-aware label enhancement method utilizing visual boundary priors. Our approach projects 3D point clouds onto 2D planes and applies 2D image segmentation to generate pseudo-labels for foreground objects. These labels are subsequently back-projected into 3D space and used to train an initial segmentation model. We further refine this process by incorporating prior knowledge from projected images to filter the predicted labels, followed by model retraining. We introduce this technique as the Foreground Boundary Prior (FBP), a versatile, plug-and-play module designed to enhance various weakly supervised point cloud segmentation methods. We demonstrate the efficacy of our approach on the widely-used 2D-3D-Semantic dataset, employing both random-sample and bounding-box based weak labeling strategies. Our experimental results show significant improvements in segmentation performance across different architectural backbones, highlighting the method's effectiveness and portability.
{"title":"Visual Boundary-Guided Pseudo-Labeling for Weakly Supervised 3D Point Cloud Segmentation in Indoor Environments.","authors":"Zhuo Su, Lang Zhou, Yudi Tan, Boliang Guan, Fan Zhou","doi":"10.1109/TVCG.2024.3484654","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3484654","url":null,"abstract":"<p><p>Accurate segmentation of 3D point clouds in indoor scenes remains a challenging task, often hindered by the labor-intensive nature of data annotation. While weakly supervised learning approaches have shown promise in leveraging partial annotations, they frequently struggle with imbalanced performance between foreground and background elements due to the complex structures and proximity of objects in indoor environments. To address this issue, we propose a novel foreground-aware label enhancement method utilizing visual boundary priors. Our approach projects 3D point clouds onto 2D planes and applies 2D image segmentation to generate pseudo-labels for foreground objects. These labels are subsequently back-projected into 3D space and used to train an initial segmentation model. We further refine this process by incorporating prior knowledge from projected images to filter the predicted labels, followed by model retraining. We introduce this technique as the Foreground Boundary Prior (FBP), a versatile, plug-and-play module designed to enhance various weakly supervised point cloud segmentation methods. We demonstrate the efficacy of our approach on the widely-used 2D-3D-Semantic dataset, employing both random-sample and bounding-box based weak labeling strategies. Our experimental results show significant improvements in segmentation performance across different architectural backbones, highlighting the method's effectiveness and portability.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142515308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-21DOI: 10.1109/TVCG.2024.3484471
Sangbong Yoo, Seokyeon Kim, Yun Jang
The transfer function (TF) design is crucial for enhancing the visualization quality and understanding of volume data in volume rendering. Recent research has proposed various multidimensional TFs to utilize diverse attributes extracted from volume data for controlling individual voxel rendering. Although multidimensional TFs enhance the ability to segregate data, manipulating various attributes for the rendering is cumbersome. In contrast, low-dimensional TFs are more beneficial as they are easier to manage, but separating volume data during rendering is problematic. This paper proposes a novel approach, a two-level transfer function, for rendering volume data by reducing TF dimensions. The proposed technique involves extracting multidimensional TF attributes from volume data and applying t-Stochastic Neighbor Embedding (t-SNE) to the TF attributes for dimensionality reduction. The two-level transfer function combines the classical 2D TF and t-SNE TF in the conventional direct volume rendering pipeline. The proposed approach is evaluated by comparing segments in t-SNE TF and rendering images using various volume datasets. The results of this study demonstrate that the proposed approach can effectively allow us to manipulate multidimensional attributes easily while maintaining high visualization quality in volume rendering.
{"title":"Two-Level Transfer Functions Using t-SNE for Data Segmentation in Direct Volume Rendering.","authors":"Sangbong Yoo, Seokyeon Kim, Yun Jang","doi":"10.1109/TVCG.2024.3484471","DOIUrl":"https://doi.org/10.1109/TVCG.2024.3484471","url":null,"abstract":"<p><p>The transfer function (TF) design is crucial for enhancing the visualization quality and understanding of volume data in volume rendering. Recent research has proposed various multidimensional TFs to utilize diverse attributes extracted from volume data for controlling individual voxel rendering. Although multidimensional TFs enhance the ability to segregate data, manipulating various attributes for the rendering is cumbersome. In contrast, low-dimensional TFs are more beneficial as they are easier to manage, but separating volume data during rendering is problematic. This paper proposes a novel approach, a two-level transfer function, for rendering volume data by reducing TF dimensions. The proposed technique involves extracting multidimensional TF attributes from volume data and applying t-Stochastic Neighbor Embedding (t-SNE) to the TF attributes for dimensionality reduction. The two-level transfer function combines the classical 2D TF and t-SNE TF in the conventional direct volume rendering pipeline. The proposed approach is evaluated by comparing segments in t-SNE TF and rendering images using various volume datasets. The results of this study demonstrate that the proposed approach can effectively allow us to manipulate multidimensional attributes easily while maintaining high visualization quality in volume rendering.</p>","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142515307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}