Pub Date: 2024-09-18 DOI: 10.1177/14738716241277559
Qi Zhang, Nikhil Maram
Title: Multidimensional data visualization and synchronization for revealing hidden pandemic information. Journal: Information Visualization.
Visualization is integral to uncovering hidden information in data and providing users with intuitive feedback for decision-making. Data visualization is crucial for transforming complex data into actionable insights across various domains. In recent years, coronavirus disease vaccines have become increasingly available to much of the population. However, the CDC (Centers for Disease Control and Prevention) often fails to consider multidimensional coronavirus pandemic data from a side-by-side perspective, limiting the ability of medical professionals and individuals to compare and interact with comprehensive data visualizations. Effectively displaying coronavirus and vaccination data collected from multiple sources is essential for interpreting pandemic transmission patterns and vaccine efficiency. This paper presents a new platform for innovative data visualizations that offers users intuitive feedback and a complete data story. We designed algorithms to seamlessly combine multiple parameters, synchronize attributes, and dynamically visualize data over time on a single webpage. Instead of integrating all attributes into a single plot, which can be overwhelming due to space limitations and make it difficult to extract crucial information from overcrowded display components, we developed algorithms to classify, enhance, and group all parameters based on their relationships and similarities. Furthermore, a side-by-side visualization method was created to dynamically link all parameters in multiple images for data exploration, trend comparison, hidden information detection, and correspondence analysis. Our platform provides real-time performance, enabling healthcare professionals to make informed decisions, communicate findings effectively, and uncover patterns that might not be apparent in raw data. The proposed multidimensional data visualization algorithms have broad applications in general data exploration and revealing hidden information.
Pub Date: 2024-09-14 DOI: 10.1177/14738716241270288
Adrian Derstroff, Simon Leistikow, Ali Nahardani, Katja Gruen, Marcus Franz, Verena Hoerr, Lars Linsen
Title: Interactive visual formula composition of multidimensional data classifiers. Journal: Information Visualization.
Understanding how a classification result is generated and what role individual features play in the classification is crucial in many applications and, in particular, in medical contexts such as the translation of diagnosis biomarkers into clinical practice. The goal is to find (ideally simple) relationships between the features in multi-dimensional data and the classification for an explanation of the underlying phenomenon. Mathematical formulas allow for the expression of these relationships and can serve as classifiers. However, there are infinitely many mathematical formulas for the given features and they bear an inherent trade-off between complexity and accuracy. We present an interactive visual approach that supports domain experts in mitigating this trade-off. Core to our approach is a novel feature selection method, from which formulas are composed using symbolic regression and where state-of-the-art classifiers serve as a reference. To evaluate our approach and compare the achieved classification performance to the performance achieved by other state-of-the-art feature selection techniques, we test our methods with well-known machine learning data sets. Our evaluation shows that our feature selection method performs better than randomly selecting features for data sets with many features or when a low number of generations in the symbolic regression is required. It also consistently matches or outperforms state-of-the-art methods. Moreover, we apply our approach in a case study to a hemodynamic cohort data set, where we report our findings and domain expert feedback. Our approach was able to find formulas containing features that are in agreement with the literature. We could also find formulas that performed better in the micro-averaged F1 score than established histological indices.
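The micro-averaged F1 score mentioned in the abstract above pools true positives, false positives, and false negatives over all classes before computing a single F1 value. The following is a minimal sketch of that standard definition in plain Python; the function name `micro_f1` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then compute F1 once."""
    tp = fp = fn = 0
    labels = set(y_true) | set(y_pred)
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1  # predicted this class and it was correct
            elif p == label and t != label:
                fp += 1  # predicted this class but the truth differs
            elif p != label and t == label:
                fn += 1  # missed an instance of this class
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy labels: three of four predictions are correct.
y_true = ["a", "a", "b", "c"]
y_pred = ["a", "b", "b", "c"]
print(micro_f1(y_true, y_pred))  # 0.75
```

For single-label multi-class problems such as this toy case, the micro-averaged F1 equals plain accuracy, because every misclassification counts once as a false positive and once as a false negative.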
Pub Date: 2024-09-07 DOI: 10.1177/14738716241270247
Md Dilshadur Rahman, Ghulam Jilani Quadri, Danielle Albers Szafir, Paul Rosen
Title: Exploring annotation taxonomy in grouped bar charts: A qualitative classroom study. Journal: Information Visualization.
Annotations are an essential part of data analysis and communication in visualizations; they focus a reader's attention on critical visual elements (e.g. an arrow that emphasizes a downward trend in a bar chart). Annotations enhance comprehension, mental organization, memorability, user engagement, and interaction and are crucial for data externalization and exploration, collaborative data analysis, and narrative storytelling in visualizations. However, we have identified a general lack of understanding of how people annotate visualizations to support effective communication. In this study, we evaluate how visualization students annotate grouped bar charts when answering high-level questions about the data. The resulting annotations were qualitatively coded to generate a taxonomy of how they leverage different visual elements to communicate critical information. We found that the annotations used varied significantly by the task they were supporting and that, whereas several annotation types supported many tasks, others were usable only in special cases. We also found that some tasks were so challenging that ensembles of annotations were necessary to support them sufficiently. The resulting taxonomy of approaches provides a foundation for understanding the usage of annotations in broader contexts to help visualizations achieve their desired message.
Pub Date: 2024-08-26 DOI: 10.1177/14738716241270341
Nicola Cerioli, Rupesh Vyas, Mary Pat Reeve, Masood Masoodian
Title: Designing complex network visualisations using the wayfinding map metaphor. Journal: Information Visualization.
Despite their widespread use, network visualisations can be rather challenging to design and use. This is because such visualisations are generally used to represent highly complex underlying data sets. As such, the resulting charts often include a very large number of visual elements and many non-linear relations between them that must be displayed. More effective design-oriented approaches are therefore needed to better support designers in creating network visualisations for complex data sets that are more understandable and usable for their users. The use of visual metaphors seems to offer such an approach to designing visualisations of complex data. In this article, we propose the use of the wayfinding map metaphor in network diagrams to support both the designers and users of this type of data visualisation. We also provide a mapping of the three common map wayfinding tasks – orientation, exploration, and navigation – to three categories of network diagram user interactions. To demonstrate the potential of our proposed approach, we provide an example case study using a prototype network diagram visualisation tool – Colocalisation Network Explorer – which we have developed to support the exploration of relationships between various diseases and the portion of the human genome involved in their onset.
Pub Date: 2024-08-06 DOI: 10.1177/14738716241265120
Arran Zeyu Wang, David Borland, David Gotz
Title: A framework to improve causal inferences from visualizations using counterfactual operators. Journal: Information Visualization.
Exploratory data analysis of high-dimensional datasets is a crucial task for which visual analytics can be especially useful. However, the ad hoc nature of exploratory analysis can also lead users to draw incorrect causal inferences. Previous studies have demonstrated this risk and shown that integrating counterfactual concepts within visual analytics systems can improve users’ understanding of visualized data. However, effectively leveraging counterfactual concepts can be challenging, with only bespoke implementations found in prior work. Moreover, it can require expertise in both counterfactual subset analysis and visualization to implement the functionalities practically. This paper aims to help address these challenges in two ways. First, we propose an operator-based conceptual model for the use of counterfactuals that is informed by prior work in visualization research. Second, we contribute the Co-op library, an open and extensible reference implementation of this model that can support the integration of counterfactual-based subset computation with visualization systems. To evaluate the effectiveness and generalizability of Co-op, the library was used to construct two different visual analytics systems each supporting a distinct user workflow. In addition, expert interviews were conducted with professional visual analytics researchers and engineers to gain more insights regarding how Co-op could be leveraged. Finally, informed in part by these evaluation results, we distil a set of key design implications for effectively leveraging counterfactuals in future visualization systems.
Pub Date: 2024-08-03 DOI: 10.1177/14738716241265110
Qi Huang, Mao Lin Huang, Yi-Na Li
Title: Two-layer visual analytics of truckers’ risk-coping social network. Journal: Information Visualization.
Within organizations, managers’ specific responsibilities and domain expertise shape their interests in the output of social network analysis. Our proposed visualization approach is tailored to meet the operation-directed needs and preferences for visual analysis of specific tasks. This method prioritizes an overall geographical map with focal-contextual dynamics within the network. To enable a comprehensive and in-depth understanding of pinpointed focal areas, we customize an analytical framework for analyzing inter-community networks. We extract focal sub-networks from specific nodes to create graph visualization for detailed analysis, represent rich types of domain-specific graphic properties, and provide direct zoom+filtering interactions to allow easy pattern recognition and knowledge discovery. We applied our approach to visualizing the data from interactions among 300 city-based truck communities on the largest occupational platform for truckers in China. We also conducted a case study to demonstrate that our approach is effective in supporting managers’ network analysis and knowledge discovery.
Pub Date: 2024-06-25 DOI: 10.1177/14738716241259432
Andrew Hill
Title: Are pie charts evil? An assessment of the value of pie and donut charts compared to bar charts. Journal: Information Visualization.
Many data visualization experts recommend the use of bar charts over pie charts because they consider comparing the area or angle of segments to be less accurate than comparing bars on a bar chart. However, many studies show that when the pie chart is used to estimate proportions (arguably its main function) it is as accurate as the bar chart. A major issue is that most previous studies have only looked at one method of extracting information from pie charts, for example either by comparing the segment to the circle (the part-whole relationship) or one segment to another (relative magnitude estimation). Therefore, in this study I test multiple metrics to provide a more holistic assessment of the pie and donut chart against the bar chart. I also measured cognitive load through pupillometry. In summary, bar charts were more precise than pie and donut charts for ranking elements, but all charts were equally accurate for extracting the part-whole relationship. There was little difference in cognitive load between chart types, although bar charts were consistently faster to use on average. Overall, the bar chart was more flexible, but where there were statistically significant differences between charts, the effect sizes were often small, and unlikely to prevent effective extraction of quantitative information. That is, as long as they were used appropriately, all chart types were arguably acceptable for displaying simple, categorical data.
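The two reading tasks distinguished in the abstract above, the part-whole relationship and relative magnitude estimation, correspond to two different ratios over the same data. The sketch below only illustrates that arithmetic; the function names and the segment values are hypothetical and do not come from the study.

```python
def part_whole(values, i):
    """Share of segment i relative to the whole chart (segment vs. circle)."""
    return values[i] / sum(values)


def relative_magnitude(values, i, j):
    """Size of segment i relative to segment j (segment vs. segment)."""
    return values[i] / values[j]


# Hypothetical chart with three categories.
segments = [50, 30, 20]
print(part_whole(segments, 1))             # second segment as a share of the whole
print(relative_magnitude(segments, 2, 0))  # third segment relative to the first
```

A participant's estimation error for either task could then be measured as the difference between the reported value and the corresponding ratio.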
Pub Date: 2024-06-25 DOI: 10.1177/14738716241255598
Michael Burch, Marco Lemmenmeier, Robert Ospelt, Zilka Bajraktarevic
Title: Visually encoding orthogonal planar graph drawings as graph mazes: An eye tracking study. Journal: Information Visualization.
There are various styles to visually represent relational data in the form of node-link diagrams. In particular, for planar graphs, orthogonal node-link diagrams, in which links bend only at ninety degrees, are a successful and prominent variant. One benefit of such drawings is that longer paths through a network are easier to track with the eyes because of the limited number of link orientations, changes, and variations; on the negative side, the links can have arbitrary bending shapes. In this article, we develop a novel way to visualize such orthogonal planar drawings by making use of mazes, which look more natural to the human eye due to the street-like visual metaphor that many people are familiar with. Tracking paths is one of the major tasks in such graph visualizations, as it is in orthogonal node-link diagrams; however, we argue that mazes are a more natural way to find paths. To gain insights into the visual scanning behavior when reading graph mazes, we conducted a comparative eye tracking study with 26 participants, comparing male and female participants of different experience levels, while also alternating between orthogonal node-link drawings and graph mazes as well as different graph size levels. The major result of this comparative study is that the participants can track paths in both representation styles, including a geodesic path tendency in their visual search behavior, but typically have a longer fixation duration at branching nodes and at locations in the mazes that lead in directions opposite to the geodesic path tendency; possibly the viewers had to start a reorientation phase in their visual scanning behavior. We also found that the size, that is, the number of graph vertices, has an impact on the visual scanning behavior for both orthogonal node-link diagrams and street-like maze representations, but for the mazes we found this impact to be weaker (in terms of the eye movement metrics fixation duration and saccade length) than for the node-link diagrams. To conclude the article, we discuss limitations and scalability issues of our approach and give an outlook on future work and possible extensions.
Pub Date : 2024-06-25 DOI: 10.1177/14738716241260745
Catarina Maçãs, João R Campos, Nuno Lourenço, Penousal Machado
Decision Trees (DTs) are a prevalent choice among supervised Machine Learning algorithms. They form binary structures that divide data into smaller segments based on distinct rules. DTs thus serve as a learning mechanism for identifying optimal rules to separate and classify the elements of a dataset. Because they resemble rule-based decisions, DTs are easy to interpret. Additionally, their minimal need for data pre-processing and their versatility in handling various data types make DTs practical and user-friendly across diverse domains. Nevertheless, with extensive datasets or ensembles of multiple trees, such as Random Forests, their analysis can become challenging. To facilitate the examination and validation of these models, we have developed a visual tool that incorporates a range of visualisations providing both an overview of and detailed insights into a set of DTs. Our tool is designed to offer diverse perspectives on the same data, enabling a deeper understanding of the decision-making process. This article outlines our design approach, introduces various visualisation models, and details the iterative validation process. We validate our methodology through a telecommunications use case, employing the visual tool to decipher how a DT-based model determines the optimal communication channel (i.e. phone call, email, SMS) for a telecommunication operator to use when contacting a client.
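The rule-based structure described above can be made concrete with a minimal sketch: a binary tree of threshold rules, plus a majority vote over several trees in the style of a Random Forest. This is not the authors' model or tool. The feature names (`age`, `monthly_calls`), the thresholds, and the hand-built trees are hypothetical, and no training is shown; only the three channel labels come from the abstract.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    label: str  # predicted class, e.g. a communication channel

@dataclass
class Split:
    feature: str      # hypothetical feature name
    threshold: float  # rule: go left if value <= threshold, else right
    left: "Node"
    right: "Node"

Node = Union[Leaf, Split]

def predict(node, sample):
    """Follow the binary rules from the root down to a leaf."""
    while isinstance(node, Split):
        if sample[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.label

def forest_predict(trees, sample):
    """Majority vote over the predictions of all trees."""
    votes = Counter(predict(tree, sample) for tree in trees)
    return votes.most_common(1)[0][0]

# Hand-built trees with made-up rules, for illustration only.
tree_a = Split("age", 40, Leaf("SMS"), Leaf("phone call"))
tree_b = Split("monthly_calls", 10, Leaf("email"), Leaf("phone call"))
tree_c = Split("age", 30, Leaf("SMS"),
               Split("monthly_calls", 5, Leaf("email"), Leaf("phone call")))

client = {"age": 52, "monthly_calls": 3}
channel = forest_predict([tree_a, tree_b, tree_c], client)  # "email" (2 of 3 votes)
```

A single tree can be read as one chain of rules, but the ensemble's answer depends on how the individual votes combine, which is the part that becomes hard to inspect as the number of trees grows.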
Title: Visualisation of Random Forest classification. Journal: Information Visualization.