Title: The Language of Infographics: Toward Understanding Conceptual Metaphor Use in Scientific Storytelling
Pub Date: 2024-09-17 | DOI: 10.1109/TVCG.2024.3456327
Hana Pokojná;Tobias Isenberg;Stefan Bruckner;Barbora Kozlíková;Laura Garrison
We apply an approach from cognitive linguistics by mapping Conceptual Metaphor Theory (CMT) to the visualization domain to address patterns of visual conceptual metaphors that are often used in science infographics. Metaphors play an essential part in visual communication and are frequently employed to explain complex concepts. However, their use is often based on intuition, rather than following a formal process. At present, we lack tools and language for understanding and describing metaphor use in visualization to the extent that a taxonomy and grammar could guide the creation of visual components, e.g., infographics. Our classification of the visual conceptual mappings within scientific representations is based on the breakdown of visual components in existing scientific infographics. We demonstrate the development of this mapping through a detailed analysis of data collected from four domains (biomedicine, climate, space, and anthropology) that represent a diverse range of visual conceptual metaphors used in the visual communication of science. This work allows us to identify patterns of visual conceptual metaphor use within the domains, resolve ambiguities about why specific conceptual metaphors are used, and develop a better overall understanding of visual metaphor use in scientific infographics. Our analysis shows that ontological and orientational conceptual metaphors are the most widely applied to translate complex scientific concepts. To support our findings, we developed a visual exploratory tool based on the collected database that places the individual infographics on a spatio-temporal scale and illustrates the breakdown of visual conceptual metaphors.
{"title":"The Language of Infographics: Toward Understanding Conceptual Metaphor Use in Scientific Storytelling","authors":"Hana Pokojná;Tobias Isenberg;Stefan Bruckner;Barbora Kozlíková;Laura Garrison","doi":"10.1109/TVCG.2024.3456327","DOIUrl":"10.1109/TVCG.2024.3456327","url":null,"abstract":"We apply an approach from cognitive linguistics by mapping Conceptual Metaphor Theory (CMT) to the visualization domain to address patterns of visual conceptual metaphors that are often used in science infographics. Metaphors play an essential part in visual communication and are frequently employed to explain complex concepts. However, their use is often based on intuition, rather than following a formal process. At present, we lack tools and language for understanding and describing metaphor use in visualization to the extent where taxonomy and grammar could guide the creation of visual components, e.g., infographics. Our classification of the visual conceptual mappings within scientific representations is based on the breakdown of visual components in existing scientific infographics. We demonstrate the development of this mapping through a detailed analysis of data collected from four domains (biomedicine, climate, space, and anthropology) that represent a diverse range of visual conceptual metaphors used in the visual communication of science. This work allows us to identify patterns of visual conceptual metaphor use within the domains, resolve ambiguities about why specific conceptual metaphors are used, and develop a better overall understanding of visual metaphor use in scientific infographics. Our analysis shows that ontological and orientational conceptual metaphors are the most widely applied to translate complex scientific concepts. To support our findings we developed a visual exploratory tool based on the collected database that places the individual infographics on a spatio-temporal scale and illustrates the breakdown of visual conceptual metaphors.","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"31 1","pages":"371-381"},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142262078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: A Large-Scale Sensitivity Analysis on Latent Embeddings and Dimensionality Reductions for Text Spatializations
Pub Date: 2024-09-17 | DOI: 10.1109/TVCG.2024.3456308
Daniel Atzberger;Tim Cech;Willy Scheibel;Jürgen Döllner;Michael Behrisch;Tobias Schreck
The semantic similarity between documents of a text corpus can be visualized using map-like metaphors based on two-dimensional scatterplot layouts. These layouts result from a dimensionality reduction on the document-term matrix or a representation within a latent embedding, including topic models. The resulting layout therefore depends on the input data and the hyperparameters of the dimensionality reduction and is affected by changes to either. However, such changes to the layout require additional cognitive effort from the user. In this work, we present a sensitivity study that analyzes the stability of these layouts concerning (1) changes in the text corpora, (2) changes in the hyperparameters, and (3) randomness in the initialization. Our approach has two stages: data measurement and data analysis. First, we derived layouts for the combinations of three text corpora, six text embeddings, and a grid-search-inspired hyperparameter selection of the dimensionality reductions. Afterward, we quantified the similarity of the layouts through ten metrics concerning local and global structures and class separation. Second, we analyzed the resulting 42 817 tabular data points in a descriptive statistical analysis. From this, we derived guidelines for informed decisions on the layout algorithm and highlight specific hyperparameter settings. We provide our implementation as a Git repository at hpicgs/Topic-Models-and-Dimensionality-Reduction-Sensitivity-Study and the results as a Zenodo archive at DOI: 10.5281/zenodo.12772898.
{"title":"A Large-Scale Sensitivity Analysis on Latent Embeddings and Dimensionality Reductions for Text Spatializations","authors":"Daniel Atzberger;Tim Cech;Willy Scheibel;Jürgen Döllner;Michael Behrisch;Tobias Schreck","doi":"10.1109/TVCG.2024.3456308","DOIUrl":"10.1109/TVCG.2024.3456308","url":null,"abstract":"The semantic similarity between documents of a text corpus can be visualized using map-like metaphors based on two-dimensional scatterplot layouts. These layouts result from a dimensionality reduction on the document-term matrix or a representation within a latent embedding, including topic models. Thereby, the resulting layout depends on the input data and hyperparameters of the dimensionality reduction and is therefore affected by changes in them. Furthermore, the resulting layout is affected by changes in the input data and hyperparameters of the dimensionality reduction. However, such changes to the layout require additional cognitive efforts from the user. In this work, we present a sensitivity study that analyzes the stability of these layouts concerning (1) changes in the text corpora, (2) changes in the hyperparameter, and (3) randomness in the initialization. Our approach has two stages: data measurement and data analysis. First, we derived layouts for the combination of three text corpora and six text embeddings and a grid-search-inspired hyperparameter selection of the dimensionality reductions. Afterward, we quantified the similarity of the layouts through ten metrics, concerning local and global structures and class separation. Second, we analyzed the resulting 42 817 tabular data points in a descriptive statistical analysis. From this, we derived guidelines for informed decisions on the layout algorithm and highlight specific hyperparameter settings. We provide our implementation as a Git repository at hpicgs/Topic-Models-and-Dimensionality-Reduction-Sensitivity-Study and results as Zenodo archive at DOI:10.5281/zenodo.12772898.","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"31 1","pages":"305-315"},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142262080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Unmasking Dunning-Kruger Effect in Visual Reasoning & Judgment
Pub Date: 2024-09-17 | DOI: 10.1109/TVCG.2024.3456326
Mengyu Chen;Yijun Liu;Emily Wall
The Dunning-Kruger Effect (DKE) is a metacognitive phenomenon where low-skilled individuals tend to overestimate their competence while high-skilled individuals tend to underestimate their competence. This effect has been observed in a number of domains including humor, grammar, and logic. In this paper, we explore if and how DKE manifests in visual reasoning and judgment tasks. Across two online user studies involving (1) a sliding puzzle game and (2) a scatterplot-based categorization task, we demonstrate that individuals are susceptible to DKE in visual reasoning and judgment tasks: those who performed best underestimated their performance, while bottom performers overestimated their performance. In addition, we contribute novel analyses that correlate susceptibility to DKE with personality traits and user interactions. Our findings pave the way for novel modes of bias detection via interaction patterns and establish promising directions towards interventions tailored to an individual's personality traits. All materials and analyses are in supplemental materials: https://github.com/CAV-Lab/DKE_supplemental.git.
{"title":"Unmasking Dunning-Kruger Effect in Visual Reasoning & Judgment","authors":"Mengyu Chen;Yijun Liu;Emily Wall","doi":"10.1109/TVCG.2024.3456326","DOIUrl":"10.1109/TVCG.2024.3456326","url":null,"abstract":"The Dunning-Kruger Effect (DKE) is a metacognitive phenomenon where low-skilled individuals tend to overestimate their competence while high-skilled individuals tend to underestimate their competence. This effect has been observed in a number of domains including humor, grammar, and logic. In this paper, we explore if and how DKE manifests in visual reasoning and judgment tasks. Across two online user studies involving (1) a sliding puzzle game and (2) a scatterplot-based categorization task, we demonstrate that individuals are susceptible to DKE in visual reasoning and judgment tasks: those who performed best underestimated their performance, while bottom performers overestimated their performance. In addition, we contribute novel analyses that correlate susceptibility of DKE with personality traits and user interactions. Our findings pave the way for novel modes of bias detection via interaction patterns and establish promising directions towards interventions tailored to an individual's personality traits. All materials and analyses are in supplemental materials: https://github.com/CAV-Lab/DKE_supplemental.git.","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"31 1","pages":"743-753"},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142262084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Understanding Visualization Authoring Techniques for Genomics Data in the Context of Personas and Tasks
Pub Date: 2024-09-17 | DOI: 10.1109/TVCG.2024.3456298
Astrid van den Brandt;Sehi L'Yi;Huyen N. Nguyen;Anna Vilanova;Nils Gehlenborg
Genomics experts rely on visualization to extract and share insights from complex and large-scale datasets. Beyond off-the-shelf tools for data exploration, there is an increasing need for platforms that aid experts in authoring customized visualizations for both exploration and communication of insights. A variety of interactive techniques have been proposed for authoring data visualizations, such as template editing, shelf configuration, natural language input, and code editors. However, it remains unclear how genomics experts create visualizations and which techniques best support their visualization tasks and needs. To address this gap, we conducted two user studies with genomics researchers: (1) semi-structured interviews (n=20) to identify the tasks, user contexts, and current visualization authoring techniques and (2) an exploratory study (n=13) using visual probes to elicit users' intents and desired techniques when creating visualizations. Our contributions include (1) a characterization of how visualization authoring is currently utilized in genomics visualization, identifying limitations and benefits in light of common criteria for authoring tools, and (2) generalizable design implications for genomics visualization authoring tools based on our findings on task- and user-specific usefulness of authoring techniques. All supplemental materials are available at https://osf.io/bdj4v/.
{"title":"Understanding Visualization Authoring Techniques for Genomics Data in the Context of Personas and Tasks","authors":"Astrid van den Brandt;Sehi L'Yi;Huyen N. Nguyen;Anna Vilanova;Nils Gehlenborg","doi":"10.1109/TVCG.2024.3456298","DOIUrl":"10.1109/TVCG.2024.3456298","url":null,"abstract":"Genomics experts rely on visualization to extract and share insights from complex and large-scale datasets. Beyond off-the-shelf tools for data exploration, there is an increasing need for platforms that aid experts in authoring customized visualizations for both exploration and communication of insights. A variety of interactive techniques have been proposed for authoring data visualizations, such as template editing, shelf configuration, natural language input, and code editors. However, it remains unclear how genomics experts create visualizations and which techniques best support their visualization tasks and needs. To address this gap, we conducted two user studies with genomics researchers: (1) semi-structured interviews (n=20) to identify the tasks, user contexts, and current visualization authoring techniques and (2) an exploratory study (n=13) using visual probes to elicit users' intents and desired techniques when creating visualizations. Our contributions include (1) a characterization of how visualization authoring is currently utilized in genomics visualization, identifying limitations and benefits in light of common criteria for authoring tools, and (2) generalizable design implications for genomics visualization authoring tools based on our findings on task- and user-specific usefulness of authoring techniques. All supplemental materials are available at https://osf.io/bdj4v/","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"31 1","pages":"1180-1190"},"PeriodicalIF":0.0,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10681582","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142262079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Gesture2Text: A Generalizable Decoder for Word-Gesture Keyboards in XR Through Trajectory Coarse Discretization and Pre-Training
Pub Date: 2024-09-16 | DOI: 10.1109/TVCG.2024.3456198
Junxiao Shen;Khadija Khaldi;Enmin Zhou;Hemant Bhaskar Surale;Amy Karlson
Text entry with word-gesture keyboards (WGK) is emerging as a popular method and becoming a key interaction for Extended Reality (XR). However, the diversity of interaction modes, keyboard sizes, and visual feedback in these environments introduces divergent word-gesture trajectory data patterns, thus leading to complexity in decoding trajectories into text. Template-matching decoding methods, such as SHARK2 [32], are commonly used for these WGK systems because they are easy to implement and configure. However, these methods are susceptible to decoding inaccuracies for noisy trajectories. While conventional neural-network-based decoders (neural decoders) trained on word-gesture trajectory data have been proposed to improve accuracy, they have their own limitations: they require extensive data for training and deep-learning expertise for implementation. To address these challenges, we propose a novel solution that combines ease of implementation with high decoding accuracy: a generalizable neural decoder enabled by pre-training on large-scale coarsely discretized word-gesture trajectories. This approach produces a ready-to-use WGK decoder that is generalizable across mid-air and on-surface WGK systems in augmented reality (AR) and virtual reality (VR), as evidenced by a robust average Top-4 accuracy of 90.4% on four diverse datasets. It significantly outperforms SHARK2 with a 37.2% enhancement and surpasses the conventional neural decoder by 7.4%. Moreover, the Pre-trained Neural Decoder's size is only 4 MB after quantization, without sacrificing accuracy, and it can operate in real-time, executing in just 97 milliseconds on Quest 3.
IEEE Transactions on Visualization and Computer Graphics, vol. 30, no. 11, pp. 7118-7128.
Title: CataAnno: An Ancient Catalog Annotator for Annotation Cleaning by Recommendation
Pub Date: 2024-09-16 | DOI: 10.1109/TVCG.2024.3456379
Hanning Shao;Xiaoru Yuan
Classical bibliography, by researching preserved catalogs from both official archives and personal collections of accumulated books, examines books throughout history, thereby revealing cultural development across historical periods. In this work, we collaborate with domain experts to accomplish the task of data annotation concerning Chinese ancient catalogs. We introduce the CataAnno system that facilitates users in completing annotations more efficiently through cross-linked views, recommendation methods, and convenient annotation interactions. The recommendation method can learn the background knowledge and annotation patterns that experts subconsciously integrate into the data during prior annotation processes. CataAnno searches for the most relevant previously annotated examples and recommends them to the user. Meanwhile, the cross-linked views assist users in comprehending the correlations between entries and offer explanations for these recommendations. Evaluation and expert feedback confirm that the CataAnno system, by offering high-quality recommendations and visualizing the relationships between entries, can mitigate the necessity for specialized knowledge during the annotation process. This results in enhanced accuracy and consistency in annotations, thereby improving overall efficiency.
{"title":"CataAnno: An Ancient Catalog Annotator for Annotation Cleaning by Recommendation","authors":"Hanning Shao;Xiaoru Yuan","doi":"10.1109/TVCG.2024.3456379","DOIUrl":"10.1109/TVCG.2024.3456379","url":null,"abstract":"Classical bibliography, by researching preserved catalogs from both official archives and personal collections of accumulated books, examines the books throughout history, thereby revealing cultural development across historical periods. In this work, we collaborate with domain experts to accomplish the task of data annotation concerning Chinese ancient catalogs. We introduce the CataAnno system that facilitates users in completing annotations more efficiently through cross-linked views, recommendation methods and convenient annotation interactions. The recommendation method can learn the background knowledge and annotation patterns that experts subconsciously integrate into the data during prior annotation processes. CataAnno searches for the most relevant examples previously annotated and recommends to the user. Meanwhile, the cross-linked views assist users in comprehending the correlations between entries and offer explanations for these recommendations. Evaluation and expert feedback confirm that the CataAnno system, by offering high-quality recommendations and visualizing the relationships between entries, can mitigate the necessity for specialized knowledge during the annotation process. This results in enhanced accuracy and consistency in annotations, thereby enhancing the overall efficiency.","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"31 1","pages":"404-414"},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142262085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Shape It Up: An Empirically Grounded Approach for Designing Shape Palettes
Pub Date: 2024-09-16 | DOI: 10.1109/TVCG.2024.3456385
Chin Tseng;Arran Zeyu Wang;Ghulam Jilani Quadri;Danielle Albers Szafir
Shape is commonly used to distinguish between categories in multi-class scatterplots. However, existing guidelines for choosing effective shape palettes rely largely on intuition and do not consider how these needs may change as the number of categories increases. Unlike color, shapes cannot be represented by a numerical space, making it difficult to propose general guidelines or design heuristics for using shape effectively. This paper presents a series of four experiments evaluating the efficiency of 39 shapes across three tasks: relative mean judgment, expert preference, and correlation estimation. Our results show that conventional means for reasoning about shapes, such as filled versus unfilled, are insufficient to inform effective palette design. Further, even expert palettes vary significantly in their use of shape and corresponding effectiveness. To support effective shape palette design, we developed a model based on pairwise relations between shapes in our experiments and the number of shapes required for a given design. We embed this model in a palette design tool to give designers agency over shape selection while incorporating empirical elements of perceptual performance captured in our study. Our model advances understanding of shape perception in visualization contexts and provides practical design guidelines that can help improve categorical data encodings.
IEEE Transactions on Visualization and Computer Graphics, vol. 31, no. 1, pp. 349-359.
Title: DracoGPT: Extracting Visualization Design Preferences from Large Language Models
Pub Date: 2024-09-16 | DOI: 10.1109/TVCG.2024.3456350
Huichen Will Wang;Mitchell Gordon;Leilani Battle;Jeffrey Heer
Trained on vast corpora, Large Language Models (LLMs) have the potential to encode visualization design knowledge and best practices. However, if they fail to do so, they might provide unreliable visualization recommendations. What visualization design preferences, then, have LLMs learned? We contribute DracoGPT, a method for extracting, modeling, and assessing visualization design preferences from LLMs. To assess varied tasks, we develop two pipelines—DracoGPT-Rank and DracoGPT-Recommend—to model LLMs prompted to either rank or recommend visual encoding specifications. We use Draco as a shared knowledge base in which to represent LLM design preferences and compare them to best practices from empirical research. We demonstrate that DracoGPT can accurately model the preferences expressed by LLMs, enabling analysis in terms of Draco design constraints. Across a suite of backing LLMs, we find that DracoGPT-Rank and DracoGPT-Recommend moderately agree with each other, but both substantially diverge from guidelines drawn from human subjects experiments. Future work can build on our approach to expand Draco's knowledge base to model a richer set of preferences and to provide a robust and cost-effective stand-in for LLMs.
{"title":"DracoGPT: Extracting Visualization Design Preferences from Large Language Models","authors":"Huichen Will Wang;Mitchell Gordon;Leilani Battle;Jeffrey Heer","doi":"10.1109/TVCG.2024.3456350","DOIUrl":"10.1109/TVCG.2024.3456350","url":null,"abstract":"Trained on vast corpora, Large Language Models (LLMs) have the potential to encode visualization design knowledge and best practices. However, if they fail to do so, they might provide unreliable visualization recommendations. What visualization design preferences, then, have LLMs learned? We contribute DracoGPT, a method for extracting, modeling, and assessing visualization design preferences from LLMs. To assess varied tasks, we develop two pipelines—DracoGPT-Rank and DracoGPT-Recommend—to model LLMs prompted to either rank or recommend visual encoding specifications. We use Draco as a shared knowledge base in which to represent LLM design preferences and compare them to best practices from empirical research. We demonstrate that DracoGPT can accurately model the preferences expressed by LLMs, enabling analysis in terms of Draco design constraints. Across a suite of backing LLMs, we find that DracoGPT-Rank and DracoGPT-Recommend moderately agree with each other, but both substantially diverge from guidelines drawn from human subjects experiments. Future work can build on our approach to expand Draco's knowledge base to model a richer set of preferences and to provide a robust and cost-effective stand-in for LLMs.","PeriodicalId":94035,"journal":{"name":"IEEE transactions on visualization and computer graphics","volume":"31 1","pages":"710-720"},"PeriodicalIF":0.0,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142262087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: HaptoFloater: Visuo-Haptic Augmented Reality by Embedding Imperceptible Color Vibration Signals for Tactile Display Control in a Mid-Air Image
Pub Date: 2024-09-16 | DOI: 10.1109/TVCG.2024.3456175
Rina Nagano;Takahiro Kinoshita;Shingo Hattori;Yuichi Hiroi;Yuta Itoh;Takefumi Hiraki
We propose HaptoFloater, a low-latency mid-air visuo-haptic augmented reality (VHAR) system that utilizes imperceptible color vibrations. When adding tactile stimuli to the visual information of a mid-air image, the user should not perceive the latency between the tactile and visual information. However, conventional tactile presentation methods for mid-air images, based on camera-detected fingertip positioning, introduce latency due to image processing and communication. To mitigate this latency, we use a color vibration technique; humans cannot perceive the vibration when the display alternates between two different color stimuli at a frequency of 25 Hz or higher. In our system, we embed this imperceptible color vibration into the mid-air image formed by a micromirror array plate, and a photodiode on the fingertip device directly detects this color vibration to provide tactile stimulation. Thus, our system allows for the tactile perception of multiple patterns on a mid-air image in 59.5 ms. In addition, we evaluate the visual-haptic delay tolerance on a mid-air display using our VHAR system and a tactile actuator with a single pattern and faster response time. The results of our user study indicate a visual-haptic delay tolerance of 110.6 ms, which is considerably larger than the latency associated with systems using multiple tactile patterns.
IEEE Transactions on Visualization and Computer Graphics, vol. 30, no. 11, pp. 7463-7472.
Title: PREVis: Perceived Readability Evaluation for Visualizations
Pub Date: 2024-09-16 | DOI: 10.1109/TVCG.2024.3456318
Anne-Flore Cabouat;Tingying He;Petra Isenberg;Tobias Isenberg
We developed and validated an instrument to measure perceived readability in data visualization: PREVis. Researchers and practitioners can easily use this instrument as part of their evaluations to compare the perceived readability of different visual data representations. Our instrument can complement results from controlled experiments on user task performance or provide additional data during in-depth qualitative work such as design iterations when developing a new technique. Although readability is recognized as an essential quality of data visualizations, so far there has not been a unified definition of the construct in the context of visual representations. As a result, researchers often lack guidance for determining how to ask people to rate their perceived readability of a visualization. To address this issue, we engaged in a rigorous process to develop the first validated instrument targeted at the subjective readability of visual data representations. Our final instrument consists of 11 items across 4 dimensions: understandability, layout clarity, readability of data values, and readability of data patterns. We provide the questionnaire as a document with implementation guidelines on osf.io/9cg8j. Beyond this instrument, we contribute a discussion of how researchers have previously assessed visualization readability, and an analysis of the factors underlying perceived readability in visual data representations.
IEEE Transactions on Visualization and Computer Graphics, vol. 31, no. 1, pp. 1083-1093.