Decision trees are simple and powerful decision support tools, and their graphical nature can be very useful for visual analysis tasks. However, decision trees tend to be large and hard to display when they are built from complex real-world data. This paper proposes an original solution to optimize the visual representation of decision trees obtained from data. The solution combines clustering and feature construction, and introduces a new clustering algorithm that takes into account both the visual properties and the accuracy of decision trees. A prototype has been implemented, and the benefits of the proposed method are shown through several experiments performed on UCI datasets.
Using Clustering to Improve Decision Trees Visualization. O. Parisot, Y. Didry, T. Tamisier, B. Otjacques. 2013 17th International Conference on Information Visualisation, 2013-07-16. DOI: https://doi.org/10.1109/IV.2013.24
We investigate the potential of modern nonlinear dimensionality reduction techniques for interactive cluster detection in bioinformatics applications. We demonstrate that recent non-parametric techniques such as t-distributed stochastic neighbor embedding (t-SNE) allow cluster identification that is superior to direct clustering of the original data or to cluster detection based on classical parametric dimensionality reduction approaches. Non-parametric approaches, however, have quadratic complexity, which makes them unsuitable for interactive use. As a speedup, we propose kernel-t-SNE, a fast parametric counterpart based on t-SNE.
Nonlinear Dimensionality Reduction for Cluster Identification in Metagenomic Samples. A. Gisbrecht, B. Hammer, B. Mokbel, A. Sczyrba. 2013 17th International Conference on Information Visualisation, 2013-07-16. DOI: https://doi.org/10.1109/IV.2013.22
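The idea of a fast parametric counterpart mentioned in the abstract above can be illustrated with a normalized Gaussian kernel interpolation over a small set of landmark points. This is only a sketch under stated assumptions, not the authors' kernel-t-SNE implementation: the names `parametric_map`, `landmarks`, `coefficients` and `sigma` are hypothetical, and the coefficients are supplied by hand rather than fitted from a t-SNE embedding of the landmarks. It shows why such a map is cheap at query time: projecting one new point costs time linear in the number of landmarks.

```python
import math


def gaussian_kernel(x, landmark, sigma):
    """Gaussian similarity between a point and a landmark (both tuples)."""
    return math.exp(-math.dist(x, landmark) ** 2 / (2 * sigma ** 2))


def parametric_map(x, landmarks, coefficients, sigma=1.0):
    """Map a high-dimensional point x to low-dimensional coordinates.

    The result is a kernel-weighted average of one coefficient vector per
    landmark. In a real system the coefficients would be fitted so that the
    landmarks reproduce a precomputed embedding; here they are simply given
    (an assumption made for illustration).
    """
    weights = [gaussian_kernel(x, lm, sigma) for lm in landmarks]
    total = sum(weights)  # becomes 0.0 if x is far from every landmark
    dim = len(coefficients[0])
    return tuple(
        sum(w * c[d] for w, c in zip(weights, coefficients)) / total
        for d in range(dim)
    )


# Two landmarks in 3-D, each paired with a hypothetical 2-D coefficient vector.
landmarks = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
coefficients = [(0.0, 0.0), (10.0, 0.0)]

midpoint = parametric_map((1.0, 0.0, 0.0), landmarks, coefficients)
near_first = parametric_map((0.0, 0.0, 0.0), landmarks, coefficients, sigma=0.1)
```

A point halfway between the two landmarks is mapped halfway between their coefficient vectors, and with a narrow kernel a point sitting on a landmark is mapped almost exactly to that landmark's coefficients.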
Traditionally, edges are drawn together with nodes. However, there are situations where the positions of the nodes are fixed, e.g., when the positions are defined by the user or by a separate algorithm. An example is a database schema editor, where the user positions the nodes (i.e., visual representations of the definitions of individual database tables) according to their meaning, for example grouping them according to subdomains of the problem. In this case, we only need to draw the edges, but we must do so in such a way that the lines representing the edges do not cross the rectangles representing the nodes: we need to perform some kind of edge routing. This paper describes an algorithm that performs edge routing in such a way that the lengths of the polylines it produces are minimal. We also describe several ways of improving the performance of the basic algorithm so that it can be used even for interactive graph visualization and manipulation, which is necessary in our scenario. Then, we show several post-processing steps that are used to turn the results of the algorithm into a usable visualization.
Shortest Path Approach to Edge Routing. J. Dokulil, J. Katreniaková, D. Bednárek. 2013 17th International Conference on Information Visualisation, 2013-07-16. DOI: https://doi.org/10.1109/IV.2013.97
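Routing an edge as a minimal-length polyline around fixed rectangular nodes can be sketched with a textbook visibility graph and Dijkstra's algorithm. The abstract above does not describe the authors' algorithm in detail, so this is a generic illustration rather than their method. The names `crosses_interior`, `route_edge` and `polyline_length` are hypothetical, rectangles are given as `(xmin, ymin, xmax, ymax)`, and none of the paper's performance improvements or post-processing steps are attempted.

```python
import heapq
import math


def crosses_interior(p, q, rect, eps=1e-9):
    """True if segment pq passes through the interior of rect.

    Uses Liang-Barsky clipping against the rectangle shrunk by eps, so
    segments that only touch a corner or run along a side are allowed.
    """
    xmin, ymin, xmax, ymax = rect
    bounds = ((q[0] - p[0], xmin + eps, xmax - eps, p[0]),
              (q[1] - p[1], ymin + eps, ymax - eps, p[1]))
    t0, t1 = 0.0, 1.0
    for delta, lo, hi, start in bounds:
        if delta == 0:
            if start < lo or start > hi:
                return False  # parallel to this slab and outside it
        else:
            ta, tb = (lo - start) / delta, (hi - start) / delta
            if ta > tb:
                ta, tb = tb, ta
            t0, t1 = max(t0, ta), min(t1, tb)
            if t0 > t1:
                return False
    return True


def route_edge(start, goal, rects):
    """Shortest polyline from start to goal that avoids rectangle interiors."""
    points = [start, goal]
    for xmin, ymin, xmax, ymax in rects:
        points += [(xmin, ymin), (xmin, ymax), (xmax, ymin), (xmax, ymax)]

    def visible(a, b):
        return not any(crosses_interior(a, b, r) for r in rects)

    # Dijkstra over the implicit visibility graph (index 0 = start, 1 = goal).
    dist, prev, done = {0: 0.0}, {}, set()
    heap = [(0.0, 0)]
    while heap:
        d, i = heapq.heappop(heap)
        if i in done:
            continue
        done.add(i)
        if i == 1:
            break
        for j in range(len(points)):
            if j in done or not visible(points[i], points[j]):
                continue
            nd = d + math.dist(points[i], points[j])
            if nd < dist.get(j, math.inf):
                dist[j], prev[j] = nd, i
                heapq.heappush(heap, (nd, j))
    if 1 not in dist:
        return None  # goal is not reachable
    path = [1]
    while path[-1] != 0:
        path.append(prev[path[-1]])
    return [points[k] for k in reversed(path)]


def polyline_length(path):
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


# One node rectangle sits between the two endpoints of the edge.
path = route_edge((0, 2), (6, 2), [(2, -1, 4, 4)])
```

In this example the direct segment is blocked, so the route bends around the two upper corners of the rectangle, which is the shorter of the two detours. Testing visibility for every pair of points is the expensive part, which is presumably where performance work matters for interactive use.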
Clustering is a well-established data exploration and analysis method. It allows interactive discovery and interpretation of groups of entities that have similar properties and characteristics. However, deriving meaningful insights from clusters is often challenging in large sets of structurally complex data. Large-scale commercial enterprises hold an increasing volume of complex, high-dimensional data. To analyze this data effectively and create meaningful clusters from it, pre-processing the data prior to clustering is essential. Once clusters are created, their interpretation and representation are equally essential to capture insights that can aid corporate decision making. In this paper, we present a generic approach to data preparation and cluster interpretation implemented on a large-scale enterprise database.
Visual Clustering for Large Scale Commercial Enterprises. Masoud Charkhabi, Tarundeep Dhot. 2013 17th International Conference on Information Visualisation, 2013-07-16. DOI: https://doi.org/10.1109/IV.2013.94
Zohra Ben Said, F. Guillet, Paul Richard, Fabien Picarougne, Julien Blanchard
To extract interesting knowledge from the large numbers of rules produced by data mining algorithms, visual representations of association rules are increasingly used. These representations can help users to find and validate interesting knowledge. The techniques proposed so far for the visualisation of rules represent an association rule as a whole, without paying attention to the relations among the items that make up the antecedent and the consequent or to the contribution of each item to the rule. In this paper, we propose a new visual representation for association rules that shows the items which make up the antecedent and the consequent, the contribution of each one to the rule, and the correlations between each pair of antecedent items and each pair of consequent items.
Visualisation of Association Rules Based on a Molecular Representation. Zohra Ben Said, F. Guillet, Paul Richard, Fabien Picarougne, Julien Blanchard. 2013 17th International Conference on Information Visualisation, 2013-07-16. DOI: https://doi.org/10.1109/IV.2013.98
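The quantities such a representation might draw on can be sketched with the standard support, confidence and lift measures. The abstract above does not say how the contribution of an item is computed, so the leave-one-out measure in `item_contributions` is a hypothetical stand-in chosen for illustration, and the transactions are made-up toy data.

```python
from fractions import Fraction


def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def confidence(antecedent, consequent, transactions):
    """Standard confidence of the rule antecedent -> consequent."""
    joint = support(set(antecedent) | set(consequent), transactions)
    # Raises ZeroDivisionError if the antecedent never occurs.
    return joint / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Standard lift: confidence relative to the consequent's support."""
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))


def item_contributions(antecedent, consequent, transactions):
    """Hypothetical measure: change in confidence when one item is dropped."""
    full = confidence(antecedent, consequent, transactions)
    result = {}
    for item in antecedent:
        rest = set(antecedent) - {item}
        result[item] = full - confidence(rest, consequent, transactions)
    return result


# Toy market-basket data, invented for this example.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "butter", "milk"},
]
rule_confidence = confidence({"bread", "butter"}, {"milk"}, transactions)
rule_lift = lift({"bread", "butter"}, {"milk"}, transactions)
contributions = item_contributions({"bread", "butter"}, {"milk"}, transactions)
```

Calling `lift` on two single items gives one possible pairwise association score, which could play the role of the pairwise correlations mentioned in the abstract, although the paper may define them differently.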
Being able to produce a wide variety of layouts for the same graph may prove useful when users have no preferred visual encoding for their data. The first contribution of this paper is an enhanced force-directed layout capable of producing different layouts of the same graph. We turn a well-known force-directed algorithm (GEM) into a highly parametrizable layout and control it from a genetic algorithm framework. The genetic algorithm makes it possible to explore the parameter space of this layout efficiently. The search process relies on the capability of the system to evaluate the similarity between two drawings. The second contribution of this paper is a similarity metric used as a fitness function for the genetic algorithm. Its main features are its low computational cost and its insensitivity to planar homotheties.
One Graph, Multiple Drawings. M. Nadal, G. Melançon. 2013 17th International Conference on Information Visualisation, 2013-07-16. DOI: https://doi.org/10.1109/IV.2013.55
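What insensitivity to planar homotheties means for a drawing comparison can be shown with a simple scale-normalized measure. This is not the metric from the paper, which the abstract above does not define. It is a minimal sketch in which `normalized_distances` and `layout_dissimilarity` are hypothetical names, a layout is a dictionary from node name to `(x, y)`, and both layouts are assumed to contain the same nodes. Because it only uses pairwise distances, it also ignores translations and rotations, which the paper's metric may or may not do.

```python
import math
from itertools import combinations


def normalized_distances(layout):
    """Pairwise node distances divided by their mean (removes scale)."""
    nodes = sorted(layout)
    dists = [math.dist(layout[a], layout[b]) for a, b in combinations(nodes, 2)]
    mean = sum(dists) / len(dists)  # zero only if all nodes coincide
    return [d / mean for d in dists]


def layout_dissimilarity(layout_a, layout_b):
    """Mean absolute difference between the normalized distance profiles."""
    if set(layout_a) != set(layout_b):
        raise ValueError("both layouts must place the same nodes")
    da = normalized_distances(layout_a)
    db = normalized_distances(layout_b)
    return sum(abs(x - y) for x, y in zip(da, db)) / len(da)


def homothety(layout, center, ratio):
    """Scale every position about center by the given ratio."""
    cx, cy = center
    return {n: (cx + ratio * (x - cx), cy + ratio * (y - cy))
            for n, (x, y) in layout.items()}


square = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
line = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (3, 0)}

same_shape = layout_dissimilarity(square, homothety(square, (1, 1), 3.0))
different_shape = layout_dissimilarity(square, line)
```

The scaled square scores essentially zero against the original, while the line layout scores clearly higher. A genetic algorithm could use such a score as a fitness function to favour candidate layouts that differ from those already found.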
M. Masoodian, Birgit Lugrin, René Bühling, Pavel Ermolin, E. André
In recent years a growing number of information visualization systems have been developed to assist users with monitoring their energy consumption, with the hope of reducing energy use through more effective user awareness. Most of these visualizations can be categorized as some form of either a time-series chart or a pie chart, each with its own limitations. These visualization systems also often fail to incorporate contextual (e.g. weather, environmental) information, which could help users to better interpret their energy use information. In this paper we introduce the time-pie visualization technique, which combines the concepts of time-series and pie charts, and allows the addition of contextual information to energy consumption data.
Time-Pie visualization: Providing Contextual Information for Energy Consumption Data. M. Masoodian, Birgit Lugrin, René Bühling, Pavel Ermolin, E. André. 2013 17th International Conference on Information Visualisation, 2013-07-16. DOI: https://doi.org/10.1109/IV.2013.12
The large and ever-increasing amounts of multivariate, multi-source, time-varying and geospatial digital information represent a major challenge for the analyst. The need to analyze and make decisions based on these information streams, often in time-critical situations, demands efficient, integrated and interactive visualization tools that help the user to explore, present, collaborate on and visually communicate large information spaces. This approach has been encapsulated in the idea of Geovisual Analytics, an interdisciplinary field that facilitates analytical reasoning through highly interactive visual interfaces and creative visualization of complex and dynamic data integrated with storytelling. Collaborative mapping is exemplified in this paper through telling stories about the development of public statistics over time that could shape, for example, economic growth and well-being. Discoveries are made that leave lasting impressions by stimulating the readers' curiosity, making them want to learn more, and conveying a deeper meaning. In addition, the user can interactively participate in this web-based process, which is important to the education and dissemination of public statistics. The storytelling mechanism helps the author to improve a reader's visual knowledge through reflections such as how life is lived, using a variety of indicators covering demographics, healthcare, environment, education and the economy. Integrated snapshots can be captured at any time during the explorative data analysis process and thus become an important component of a storytelling reasoning process. The public can access Geovisual Analytics applications and explore statistical data relations on their own, guided by the stories prepared by the experts.
With the associated science of perception and cognition in relation to the use of multivariate spatial-temporal statistical data, this article contributes to the growing interest in visual storytelling engaging the public with new experiences.
Geovisual Analytics and Storytelling Using HTML5. P. Lundblad, M. Jern. 2013 17th International Conference on Information Visualisation, 2013-07-16. DOI: https://doi.org/10.1109/IV.2013.35
Owing to a limited display resolution, it may be difficult to obtain an overview of high-dimensional data in the display area used for visualization. In this paper, we aimed to obtain an overview of high-dimensional data in a limited screen area. We developed Colored Mosaic Matrix as a method to obtain a data overview. Colored Mosaic Matrix is a visualization method for high-dimensional categorical data that uses a color representation of the features. By representing quantitative data in category units, the proposed method enables the visualization of data containing a large number of records. As a result of an experimental investigation of its readability, we found our method to be useful in obtaining a data overview.
Colored Mosaic Matrix: Visualization Technique for High-Dimensional Data. Hiroaki Kobayashi, Kazuo Misue, J. Tanaka. 2013 17th International Conference on Information Visualisation, 2013-07-16. DOI: https://doi.org/10.1109/IV.2013.50
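Representing quantitative data in category units, as mentioned in the abstract above, can be illustrated with equal-width binning followed by a per-category summary. The abstract does not specify how the method bins values or assigns colors, so this is only a sketch under assumptions: `to_categories` and `category_shares` are hypothetical names, and equal-width bins are one choice among several.

```python
from collections import Counter


def to_categories(values, n_bins):
    """Assign each numeric value to one of n_bins equal-width categories."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0 for _ in values]  # all values identical: a single category
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def category_shares(values, n_bins):
    """Share of records per category; independent of the record count."""
    counts = Counter(to_categories(values, n_bins))
    return [counts[b] / len(values) for b in range(n_bins)]


column = list(range(10))             # ten records of one quantitative feature
bins = to_categories(column, 3)      # category index per record
shares = category_shares(column, 3)  # one number per category
```

Summarising a feature by the share of records in each category gives a fixed number of cells to draw, whatever the number of records. This is one plausible reason why category units help with large data sets on a limited screen area.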
N. Adamo-Villani, Thomas Haley-Hermiz, Robb Cutler
In this paper we describe the design, development, formative evaluation and initial findings of one level of a serious game whose objective is to teach Information Assurance concepts to undergraduate students in introductory programming courses. The game level focuses on the concept of 'operator precedence'. The player travels through a multilevel 3-dimensional maze, and at each junction in the maze he/she is required to solve a mathematical problem that involves the application of operator precedence rules. A correct answer allows the player to move closer to the maze exit; an incorrect solution moves the player farther from the end of the maze. Initial findings from a formative study with a group of 14 undergraduate students show that the game level is usable, engaging and useful for learning/reviewing the intended programming concept.
Using a Serious Game Approach to Teach 'Operator Precedence' to Introductory Programming Students. N. Adamo-Villani, Thomas Haley-Hermiz, Robb Cutler. 2013 17th International Conference on Information Visualisation, 2013-07-16. DOI: https://doi.org/10.1109/IV.2013.70
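The kind of problem a player meets at a maze junction can be illustrated by evaluating one expression twice, once with operator precedence and once naively from left to right. This is a sketch rather than code from the game: the function names, the token-list format and the one-step movement rule in `next_position` are assumptions made for the example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
PRECEDENCE = {"+": 1, "-": 1, "*": 2}


def evaluate_with_precedence(tokens):
    """Evaluate tokens such as [2, "+", 3, "*", 4] using two stacks."""
    values, ops = [], []

    def apply_top():
        op = ops.pop()
        right, left = values.pop(), values.pop()
        values.append(OPS[op](left, right))

    for tok in tokens:
        if tok in OPS:
            # Apply pending operators that bind at least as tightly.
            while ops and PRECEDENCE[ops[-1]] >= PRECEDENCE[tok]:
                apply_top()
            ops.append(tok)
        else:
            values.append(tok)
    while ops:
        apply_top()
    return values[0]


def evaluate_left_to_right(tokens):
    """The typical mistake: ignore precedence and apply operators in order."""
    result = tokens[0]
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        result = OPS[op](result, operand)
    return result


def next_position(position, answer, tokens):
    """Move one step toward the exit on a correct answer, one step back otherwise."""
    return position + 1 if answer == evaluate_with_precedence(tokens) else position - 1


question = [2, "+", 3, "*", 4]
correct = evaluate_with_precedence(question)  # multiplication binds first
mistaken = evaluate_left_to_right(question)   # addition applied first
```

For `2 + 3 * 4` the two evaluations differ, so a player who answers with the left-to-right result is moved away from the exit.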