Pub Date: 2021-04-01 DOI: 10.1109/PacificVis52677.2021.00014
Dominik Vietinghoff, Christian Heine, M. Böttinger, G. Scheuermann
To assess the reliability of weather forecasts and climate simulations, common practice is to generate large ensembles of numerical simulations. Analyzing such data is challenging and requires pattern and feature detection. For single time-dependent scalar fields, empirical orthogonal functions (EOFs) are a proven means to identify the main variation. In this paper, we present an extension of that concept to time-dependent ensemble data. We applied our methods to two ensemble data sets from climate research in order to investigate the North Atlantic Oscillation (NAO) and East Atlantic (EA) pattern.
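As a point of reference for the single-field case mentioned above, the following is a minimal sketch of classic EOF analysis, not the ensemble extension the paper proposes. The function name `leading_eof`, the list-of-snapshots layout, and the use of power iteration are assumptions made for illustration; the leading EOF is the eigenvector of the grid-point covariance matrix with the largest eigenvalue.

```python
import math

def leading_eof(fields, iterations=200):
    """Leading EOF of one time-dependent scalar field.

    fields: list of T snapshots, each a flat list of N grid values
    (hypothetical layout chosen for this sketch).
    Returns (eof, explained_variance_fraction, principal_components).
    """
    t, n = len(fields), len(fields[0])
    # Remove the time mean at every grid point to obtain anomalies.
    means = [sum(row[j] for row in fields) / t for j in range(n)]
    anomalies = [[row[j] - means[j] for j in range(n)] for row in fields]
    # Sample covariance between grid points (N x N).
    cov = [[sum(a[i] * a[j] for a in anomalies) / (t - 1) for j in range(n)]
           for i in range(n)]
    # Power iteration: converges to the dominant eigenvector unless the
    # start vector happens to be orthogonal to it.
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # start vector carries no variance
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue; the trace is the total variance.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(n))
                     for i in range(n))
    total = sum(cov[i][i] for i in range(n))
    explained = eigenvalue / total if total > 0 else 0.0
    # Principal component: projection of each anomaly snapshot onto the EOF.
    pcs = [sum(a[j] * v[j] for j in range(n)) for a in anomalies]
    return v, explained, pcs

# Toy field with two grid points; all variation sits at the first one.
toy = [[1.0, 0.0], [-1.0, 0.0], [2.0, 0.0], [-2.0, 0.0]]
eof, explained, pcs = leading_eof(toy)
print(eof, explained, pcs)
```

For real gridded data, a singular value decomposition of the anomaly matrix is the more common route; power iteration is used here only to keep the sketch dependency-free.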
Title: An Extension of Empirical Orthogonal Functions for the Analysis of Time-Dependent 2D Scalar Field Ensembles. Venue: 2021 IEEE 14th Pacific Visualization Symposium (PacificVis).
Pub Date: 2021-04-01 DOI: 10.1109/PacificVis52677.2021.00032
Junpeng Wang, Wei Zhang, Liang Wang, Hao Yang
Tree boosting models are widely adopted predictive models and have demonstrated performance superior to that of other conventional and even deep learning models, especially since the recent release of their parallel and distributed implementations, e.g., XGBoost, LightGBM, and CatBoost. Tree boosting uses a group of sequentially generated weak learners (i.e., decision trees), each of which learns from the mistakes of its predecessor, to push the model’s decision boundary towards the true boundary. As the number of trees keeps increasing over training, it is important to reveal how the newly added trees change the predictions of individual data instances, and how the impacts of different data features evolve. To accomplish these goals, in this paper, we introduce a new design of the temporal confusion matrix, providing users with an effective interface to track data instances’ predictions across the tree boosting process. Also, we present an improved visualization to better illustrate and compare the impacts of individual data features (based on their SHAP values) across training iterations. Integrating these components with a tree structure visualization component, we propose a visual analytics system for tree boosting models. Through case studies with domain experts using real-world datasets, we validated the system’s effectiveness.
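To make the idea of tracking per-instance predictions across boosting iterations concrete, here is a minimal sketch under simplifying assumptions: a single numeric feature, decision stumps as weak learners, and squared-error boosting on 0/1 labels (libraries such as XGBoost typically use a logistic loss for classification). The names `fit_stump`, `boost`, and `confusion` are hypothetical and do not come from the paper; the per-iteration confusion counts are the kind of raw data a temporal confusion matrix could display.

```python
def fit_stump(xs, residuals):
    """Least-squares decision stump on one numeric feature."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, threshold, lmean, rmean)
    if best is None:  # all feature values identical: predict the mean
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean

def boost(xs, ys, n_trees=5, learning_rate=0.5):
    """Return history[k][i]: prediction for instance i after k stumps."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    history = [list(preds)]
    for _ in range(n_trees):
        # Each new stump is fitted to the mistakes of the current model.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
        history.append(list(preds))
    return history

def confusion(ys, preds, cutoff=0.5):
    """Counts (tp, fp, fn, tn) after thresholding scores at cutoff."""
    tp = fp = fn = tn = 0
    for y, p in zip(ys, preds):
        label = 1 if p >= cutoff else 0
        if label == 1 and y == 1:
            tp += 1
        elif label == 1 and y == 0:
            fp += 1
        elif label == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

# Toy data: one confusion tuple per boosting iteration.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 1, 0, 1, 1]
for k, preds in enumerate(boost(xs, ys, n_trees=4)):
    print(k, confusion(ys, preds))
```

Keeping the full `history` rather than only the final predictions is what allows an instance to be followed from cell to cell as trees are added.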
Title: Investigating the Evolution of Tree Boosting Models with Visual Analytics
Pub Date: 2021-04-01 DOI: 10.1109/PacificVis52677.2021.00033
Xiaoyu Zhang, Takanori Fujiwara, Senthil K. Chandrasegaran, Michael P. Brundage, Thurston Sexton, A. Dima, K. Ma
Analysis of large, high-dimensional, and heterogeneous datasets is challenging as no single technique is suitable for visualizing and clustering such data in order to make sense of the underlying information. For instance, heterogeneous logs detailing machine repair and maintenance in an organization often need to be analyzed to diagnose errors and identify abnormal patterns, formalize root-cause analyses, and plan preventive maintenance. Such real-world datasets are also beset by issues such as inconsistent and/or missing entries. To conduct an effective diagnosis, it is important to extract and understand patterns from the data with support from analytic algorithms (e.g., finding that certain kinds of machine complaints occur more in the summer) while involving the human-in-the-loop. To address these challenges, we adopt existing techniques for dimensionality reduction (DR) and clustering of numerical, categorical, and text data dimensions, and introduce a visual analytics approach that uses multiple coordinated views to connect DR + clustering results across each of these kinds of data dimension. To help analysts label the clusters, each clustering view is supplemented with techniques and visualizations that contrast a cluster of interest with the rest of the dataset. Our approach helps analysts make sense of machine maintenance logs and their errors. The insights gained then help them carry out preventive maintenance. We illustrate our approach through use cases, evaluate it through expert studies, and discuss generalization of the approach to other heterogeneous data.
Title: A Visual Analytics Approach for the Diagnosis of Heterogeneous and Multidimensional Machine Maintenance Data
Pub Date: 2021-04-01 DOI: 10.1109/PacificVis52677.2021.00015
B. D. Nguyen, R. Hewett, Tommy Dang
The ability to capture common characteristics among complex multi-variate time series variables can profoundly impact big data analytics in uncovering valuable insights into the relationships among them and enabling a dimensionality reduction technique. Two widely used data displays include time series and scatter plots. While the former focuses on the dynamics over time, the latter deals with static relationships among variables. Motivated by these distinctive perspectives, our research aims to maximally utilize the information captured by both at the same time. This paper presents NetScatter, a visual analytic approach to characterizing changes of pairwise relationships in a high-dimensional time series. Unlike most traditional techniques that employ a single perspective of the visual display, our approach combines static perspectives of two variables in multi-variate time series into a single representation by comparing all data instances over two different time steps. The paper also introduces a list of visual features of the representation to capture how overall data evolve. We have implemented a web-based prototype that supports a full range of operations, such as ranking, filtering, and details on demand. The paper illustrates the proposed approach on data of various sizes in different domains to demonstrate its benefits.
Title: NetScatter: Visual analytics of multivariate time series with a hybrid of dynamic and static variable relationships
Pub Date: 2021-04-01 DOI: 10.1109/PacificVis52677.2021.00029
Seok-Hee Hong, P. Eades, Marnijati Torkel
This paper presents a new visual representation of graphs, inspired by the dot painting style of Central Australia. This painting style is established as a powerful medium for communicating information with abstraction, and has a long history of supporting storytelling. We propose a general framework GDot to visually represent information as dot paintings. We describe computational techniques as well as the rendering effects to produce painterly representations of graphs and networks. We present visualization examples with various networks from diverse domains, from pure mathematics to social systems. Further, we briefly describe the extension of our dot painting visualization style to multi-dimensional data, dynamic data and geo-referenced data.
Title: GDot: Drawing Graphs with Dots and Circles
Exploring campus lifestyle is conducive to innovating education management, optimizing campus resource allocation, and providing personalized services, but little attention has been paid to it. A novel interactive system based on behavioral data of campus cards is presented in this paper to provide new ideas and technical support for campus management. Interactive visualization techniques are utilized to help users analyze campus lifestyle via intelligible diagrams. The system contains three functional modules: providing a decision-making reference to educators on students’ poverty subsidies, predicting students’ academic performance by quantitative analysis, and scheduling cafeteria meals with a scheduling model during the outbreak of COVID-19. Finally, three exploratory case studies are presented to demonstrate the effectiveness of the system.
Title: Visual Analytics Methods for Interactively Exploring the Campus Lifestyle. Authors: Liang Liu, Song Wang, Ting Cai, Hanglin Li, Weixin Zhao, Yadong Wu. Pub Date: 2021-04-01 DOI: 10.1109/PacificVis52677.2021.00031
Pub Date: 2021-04-01 DOI: 10.1109/PacificVis52677.2021.00011
Junhua Lu, Wei Chen, Hui Ye, Jie Wang, Honghui Mei, Yuhui Gu, Yingcai Wu, X. Zhang, K. Ma
Data-driven scrollytelling has become a prevalent way of visual communication because of its comprehensive delivery of perspectives derived from the data. However, creating an expressive scrollytelling story requires both data and design literacy and is time-consuming. As a result, scrollytelling has been mainly used only by professional journalists to disseminate opinions. In this paper, we present an automatic method to generate expressive scrollytelling visualization, which can present easy-to-understand data facts through a carefully arranged sequence of views. The method first enumerates data facts of a given dataset, and scores and organizes them. The facts are further assembled, sequenced into a story, with reader input taken into consideration. Finally, visual graphs, transitions, and text descriptions are generated to synthesize the scrollytelling visualization. In this way, non-professionals can easily explore and share interesting perspectives from selected data attributes and fact types. We demonstrate the effectiveness and usability of our method through both use cases and an in-lab user study.
Title: Automatic Generation of Unit Visualization-based Scrollytelling for Impromptu Data Facts Delivery
Pub Date: 2021-04-01 DOI: 10.1109/PacificVis52677.2021.00030
A. Meidiana, James Wood, Seok-Hee Hong
We present algorithms reducing the runtime of the stress minimization iteration of stress-based layouts to sublinear in the number of vertices and edges. Specifically, we use vertex sampling to further reduce the number of vertex pairs considered in stress minimization iterations. Moreover, we use spectral sparsification to reduce the number of edges considered in stress minimization computations to sublinear in the number of edges, especially for dense graphs. In particular, we present new pivot selection methods using importance-based sampling. Then, we present two variations of our sublinear-time stress minimization method on two popular stress-based layouts, Stress Majorization and Stochastic Gradient Descent. Experimental results demonstrate that our sublinear-time algorithms run, on average, about 35% faster than the state-of-the-art linear-time algorithms, while obtaining drawings of similar quality based on stress and shape-based metrics.
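For orientation, the sketch below shows the standard stress function that such layouts minimize, summing (||x_i - x_j|| - d_ij)^2 / d_ij^2 over vertex pairs, and how restricting the sum to pairs that involve a few pivot vertices shrinks the work per evaluation. It assumes a connected, unweighted graph, and the uniform random pivot choice is only a placeholder for the importance-based sampling described in the abstract; all function names are hypothetical, and spectral sparsification is not shown.

```python
import math
import random
from collections import deque

def bfs_distances(adj, source):
    """Graph-theoretic distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def all_pairs(vertices):
    """Every unordered vertex pair: quadratic in the number of vertices."""
    return [(u, v) for i, u in enumerate(vertices) for v in vertices[i + 1:]]

def pivot_pairs(vertices, pivots):
    """Only pairs that involve at least one pivot vertex."""
    return sorted({(min(p, v), max(p, v))
                   for p in pivots for v in vertices if v != p})

def stress(pos, dist_from, pairs):
    """Weighted stress over the given pairs, with weights d_ij ** -2."""
    total = 0.0
    for i, j in pairs:
        d = dist_from[i][j]
        total += (math.dist(pos[i], pos[j]) - d) ** 2 / d ** 2
    return total

# Path graph 0-1-2-3 with vertex 3 drawn off the straight line.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
vertices = list(adj)
dist_from = {v: bfs_distances(adj, v) for v in vertices}
pos = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (2.0, 1.0)}

pivots = random.Random(0).sample(vertices, 2)  # placeholder for importance sampling
print(stress(pos, dist_from, all_pairs(vertices)))
print(stress(pos, dist_from, pivot_pairs(vertices, pivots)))
```

A full implementation would also compute distances only from the pivots rather than from every vertex, which is where much of the saving on large graphs comes from.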
Title: Sublinear-time Algorithms for Stress Minimization in Graph Drawing
Pub Date: 2021-04-01 DOI: 10.1109/PacificVis52677.2021.00019
Jian Zhao, Maoyuan Sun, Patrick Chiu, Francine Chen, Bee Liew
This paper proposes a novel approach for analyzing search results of a document collection, which can help support know-what and know-who information-seeking questions. Search results are grouped by topics, and each topic is represented by a two-mode network composed of related documents and authors (i.e., biclusters). We visualize these biclusters in a 2D layout to support interactive visual exploration of the analyzed search results, which highlights a novel way of organizing entities of biclusters. We evaluated our approach using a large academic publication corpus, by testing the distribution of the relevant documents and of lead and prolific authors. The results indicate the effectiveness of our approach compared to traditional 1D ranked lists. Moreover, a user study with 12 participants was conducted to compare our proposed visualization, a simplified variation without topics, and a text-based interface. We report on participants’ task performance, their preferences among the three interfaces, and the different strategies used in information seeking.
Title: Know-What and Know-Who: Document Searching and Exploration using Topic-Based Two-Mode Networks
Pub Date: 2021-04-01 DOI: 10.1109/PacificVis52677.2021.00035
Tiandong Xiao, Yosuke Onoue
Social networking services (SNSs) have become the main avenue through which people voice their thoughts. Accordingly, we can explore people’s thoughts by analyzing topics in SNSs. When do topics change? Do they ever come back? What do people mainly talk about? In this study, we design and propose a novel visual analytics system to answer these questions. We abstract the topic per unit time as a point in a two-dimensional space through document embedding and dimensionality reduction techniques, and provide supplementary charts that represent the words appearing at a certain time and the time-series change of word occurrence over the entire period. We employ a novel text visualization technique, called semantic preserving word bubbles, to visualize words at a certain time. In addition, we demonstrate the effectiveness of our system using Twitter data about early COVID-19 trends in Japan. Our system is intended to help users explore and understand transitions in the content posted on SNSs.
Title: Visualization of Topic Transitions in SNSs Using Document Embedding and Dimensionality Reduction