Finding appropriate adslots in which to display ads is an important step toward achieving high conversion rates in online display advertising. Previous work on ad recommendation and conversion prediction often focuses on matching adslots, users, and ads simultaneously for each impression at the micro level. Such methods require rich attributes of users, ads, and adslots, which are not always available, especially for ad-adslot pairs that have never been displayed. In this research, we propose a macro approach that mines new adslots for each ad by recommending appropriate adslots to it. The proposed method does not require any user information and can be pre-calculated offline, even when there are no impressions of the ad on the target adslots. It applies matrix factorization techniques to the ad-adslot performance history matrix to predict the performance of the target adslots. Experiments show that the proposed method achieves a small root-mean-square error (RMSE) on offline test data and yields high conversion rates in online tests with real-world ad campaigns.
"Adslot Mining for Online Display Ads" — Kazuki Taniguchi, Yuki Harada, Nguyen Tuan Duc. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14. DOI: 10.1109/ICDMW.2015.82
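The matrix-factorization step this abstract describes can be sketched on synthetic data. This is only an illustration of the technique, not the paper's implementation; the factor rank, learning rate, regularization constant, and the toy performance matrix are my own assumptions:

```python
import numpy as np

def factorize(R, mask, rank=2, lr=0.01, reg=0.02, epochs=5000, seed=0):
    """Predict missing ad-adslot performance via a low-rank model R ~ P @ Q.T.

    R: (ads x adslots) performance-history matrix.
    mask: 1 where a pair has been displayed (observed), 0 where unknown.
    """
    rng = np.random.default_rng(seed)
    n_ads, n_slots = R.shape
    P = 0.1 * rng.standard_normal((n_ads, rank))
    Q = 0.1 * rng.standard_normal((n_slots, rank))
    for _ in range(epochs):
        E = mask * (R - P @ Q.T)          # error on observed cells only
        P += lr * (E @ Q - reg * P)       # gradient step with L2 regularization
        Q += lr * (E.T @ P - reg * Q)
    return P @ Q.T                        # dense prediction, including unseen pairs

# toy conversion-rate matrix; masked-out cells are ad-adslot pairs never displayed
R = np.array([[0.9, 0.8, 0.0],
              [0.8, 0.7, 0.3],
              [0.1, 0.0, 0.9]])
mask = np.array([[1, 1, 0],
                 [1, 1, 1],
                 [1, 0, 1]])
pred = factorize(R, mask)
```

Because everything above depends only on the history matrix, the whole prediction can be computed offline, which matches the macro-level framing of the abstract.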
Long-lead prediction of heavy precipitation events has significant impact, since it can provide early warning of disasters such as floods. However, the performance of existing prediction models has been constrained by the high-dimensional feature space and the non-linear relationships among variables. In this study, we approach the prediction problem from the perspective of machine learning. In our machine-learning framework for forecasting heavy precipitation events, we use global hydro-meteorological variables with spatial and temporal influences as features, and formulate the target weather events, which last several days, as weather clusters. Our study has three phases: 1) identify weather clusters of different sizes, 2) handle the class-imbalance problem within the data, and 3) select the most relevant features from the large feature space. We plan to evaluate our methods on several real-world data sets for predicting heavy precipitation events.
"Prediction of Long-Lead Heavy Precipitation Events Aided by Machine Learning" — Yahui Di. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14. DOI: 10.1109/ICDMW.2015.218
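One common way to carry out phase 2 above, handling class imbalance, is random oversampling of the rare heavy-precipitation class. A minimal sketch under that assumption (the paper does not specify its balancing method, and the class counts here are invented):

```python
import numpy as np

def oversample_minority(X, y, seed=0):
    """Balance a binary data set by resampling the minority class with replacement."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_needed = counts.max() - counts.min()
    idx = np.flatnonzero(y == minority)
    extra = rng.choice(idx, size=n_needed, replace=True)  # duplicated minority rows
    keep = np.concatenate([np.arange(len(y)), extra])
    return X[keep], y[keep]

# toy sample: 90 "normal" days vs. 10 "heavy precipitation" days
X = np.arange(200, dtype=float).reshape(100, 2)
y = np.array([0] * 90 + [1] * 10)
Xb, yb = oversample_minority(X, y)
```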
Zhiquan Qi, Ying-jie Tian, Lingfeng Niu, Fan Meng, Limeng Cui, Yong Shi
Balancing speed and quality is a long-standing challenge in pedestrian detection. In this paper, we introduce Learning Using Privileged Information (LUPI), which can accelerate the convergence rate of learning and effectively improve quality without sacrificing speed. In more detail, we give a clear definition of privileged information for the pedestrian detection problem, i.e., information that is available only at the training stage and never for the test set, and show how much the privileged information helps the detector improve quality. All experimental results show the robustness and effectiveness of the proposed method, and at the same time show that the privileged information offers a significant improvement.
"Pedestrian Detection Using Privileged Information" — Zhiquan Qi, Ying-jie Tian, Lingfeng Niu, Fan Meng, Limeng Cui, Yong Shi. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14. DOI: 10.1109/ICDMW.2015.70
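The core LUPI idea, a feature seen only in training, can be illustrated with a simple teacher-student (distillation-style) sketch. This is not the paper's detector or Vapnik's SVM+ formulation, just a toy demonstration of how a privileged feature can guide a model that never sees it at test time; all data and constants are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic data: x is the regular feature; p is a privileged feature giving a
# cleaner view of the label, available at training time only
n = 400
p = rng.standard_normal(n)                      # privileged: low-noise signal
x = p + 1.5 * rng.standard_normal(n)            # regular: noisy version of it
y = (p > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(feats, targets, lr=0.5, epochs=500):
    """One-feature logistic regression by gradient descent; accepts soft targets."""
    w = b = 0.0
    for _ in range(epochs):
        g = sigmoid(w * feats + b) - targets    # d(cross-entropy)/d(logit)
        w -= lr * float(np.mean(g * feats))
        b -= lr * float(np.mean(g))
    return w, b

# teacher sees the privileged feature and emits soft labels
wt, bt = fit_logistic(p, y)
soft = sigmoid(wt * p + bt)

# student sees only x; its targets blend true labels with the teacher's soft labels
lam = 0.5
ws, bs = fit_logistic(x, lam * y + (1 - lam) * soft)

# at test time no privileged information exists
p_test = rng.standard_normal(200)
x_test = p_test + 1.5 * rng.standard_normal(200)
y_test = (p_test > 0).astype(float)
acc = float(np.mean((sigmoid(ws * x_test + bs) > 0.5) == y_test))
```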
Publicly available social data has been widely adopted to explore the language of crowds and leverage it in real-world prediction problems. In microblogs, users extensively share information about their moods, topics of interest, and social events, which provides an ideal data resource for many applications. We study the footprints of social problems in Twitter data: hidden topics identified from Twitter content are utilized to predict crime trends. Since our problem has a sequential order, extracting meaningful patterns involves temporal analysis. The prediction model must address information evolution, in which data are more related when they are close in time rather than further apart. The study proceeds in two steps. First, a temporal topic detection model is introduced to infer predictive hidden topics; the model builds a dynamic vocabulary to detect emerging topics, and topics are compared over time to ensure diversity and novelty in each time slice. Second, a predictive model is proposed that utilizes the identified temporal topics to predict crime trends in a prospective timeframe. The model does not suffer from a lack of available learning examples, since learning examples are annotated with knowledge inferred from the trend. The experiments reveal that temporal topic detection outperforms static topic modeling when dealing with sequential data, and that topics are more diverse when inferred in different time slices. In general, the results indicate that temporal topics have a strong correlation with crime index changes. Predictability is high for some specific crime types and can vary depending on the incidents. The study provides insight into the correlation between language and real-world problems and the impact of social data in providing predictive indicators.
"Temporal Topic Inference for Trend Prediction" — S. Aghababaei, M. Makrehchi. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14. DOI: 10.1109/ICDMW.2015.214
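The dynamic-vocabulary and novelty ideas above can be illustrated with a toy sketch: count terms per time slice, grow the vocabulary as new terms appear, and score each slice's novelty against the previous slice's top terms. The documents and the count-based scoring are invented for illustration; the paper's model is a full temporal topic model, not this:

```python
from collections import Counter

slices = [
    ["robbery downtown tonight", "downtown robbery reported"],
    ["robbery suspect arrested", "new theft wave downtown"],
    ["theft wave spreads", "citywide theft alerts tonight"],
]

vocab = set()            # dynamic vocabulary, grown slice by slice
top_terms = []
for docs in slices:
    counts = Counter(w for d in docs for w in d.split())
    vocab |= counts.keys()
    top_terms.append({w for w, _ in counts.most_common(3)})

# novelty: fraction of a slice's top terms unseen among the previous slice's top terms
novelty = [1.0] + [
    len(cur - prev) / len(cur) for prev, cur in zip(top_terms, top_terms[1:])
]
```

A high novelty score flags a slice whose dominant terms have shifted, which is the kind of emerging-topic signal the abstract correlates with trend changes.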
Recently, a novel nonparallel support vector machine (NPSVM) was proposed by Tian et al., which has several attractive advantages over its predecessors. A sequential minimal optimization (SMO) algorithm has already been provided to solve the dual form of NPSVM. Unlike existing work, in this paper we present a new strategy for solving the primal form of NPSVM. Our algorithm is designed in the framework of the alternating direction method of multipliers (ADMM), which is well suited to distributed convex optimization. Although the closed-form solution of each step can be written out directly, in order to handle problems with a very large number of features or training examples, we propose to solve the underlying linear systems inexactly by the conjugate gradient method. Experiments are carried out on several data sets, and the numerical results demonstrate the effectiveness of our method.
"Alternating Direction Method of Multipliers for Nonparallel Support Vector Machines" — Xin Shen, Lingfeng Niu, Ying-jie Tian, Yong Shi. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14. DOI: 10.1109/ICDMW.2015.77
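The ADMM splitting pattern the abstract relies on can be shown on a simpler problem. The sketch below applies ADMM to the lasso rather than the NPSVM objective (which the abstract does not spell out), but the x-update is exactly the kind of linear system the authors propose to solve with conjugate gradients on large problems; here it is solved directly:

```python
import numpy as np

def admm_lasso(A, b, lam=0.1, rho=1.0, iters=200):
    """ADMM for: min 0.5*||Ax - b||^2 + lam*||z||_1  subject to  x = z."""
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    AtA = A.T @ A + rho * np.eye(n)     # fixed system matrix for the x-update
    Atb = A.T @ b
    for _ in range(iters):
        # x-update: a linear solve (conjugate gradients would go here at scale)
        x = np.linalg.solve(AtA, Atb + rho * (z - u))
        # z-update: soft-thresholding, the proximal operator of the l1 norm
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)
        # scaled dual update
        u += x - z
    return z

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[:3] = [1.0, -2.0, 0.5]           # sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(50)
x_hat = admm_lasso(A, b)
```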
Entity resolution (ER) is a common data-cleaning and data-integration task that aims to determine which records in one or more data sets refer to the same real-world entities. In most cases no training data exists, and the ER process involves considerable trial and error, with an often time-consuming manual evaluation required to determine whether the obtained results are good enough. We propose a method that makes use of transitive closure within triples of records to provide an early, unsupervised indication of inconsistency in an ER result. We test our approach on three real-world data sets with different similarity calculations and blocking approaches, and show that it can detect problems with ER results early on, without a manual evaluation.
"Unsupervised Measuring of Entity Resolution Consistency" — Jeffrey Fisher, Qing Wang. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14. DOI: 10.1109/ICDMW.2015.162
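The triple-based signal can be sketched directly: if record a matches b and b matches c but a does not match c, transitivity is violated. The records and the ratio-based score below are my illustration, not necessarily the paper's exact measure:

```python
from itertools import combinations

def inconsistency_ratio(records, matches):
    """Fraction of connected triples that violate transitivity.

    `matches` is a set of frozenset pairs that the ER result judged to be
    the same entity.
    """
    def m(a, b):
        return frozenset((a, b)) in matches

    violated = connected = 0
    for a, b, c in combinations(records, 3):
        links = [m(a, b), m(b, c), m(a, c)]
        if sum(links) >= 2:          # at least two links: transitivity applies
            connected += 1
            if sum(links) == 2:      # exactly two links: the third is missing
                violated += 1
    return violated / connected if connected else 0.0

records = ["r1", "r2", "r3", "r4"]
matches = {frozenset(p) for p in [("r1", "r2"), ("r2", "r3"), ("r1", "r3"), ("r3", "r4")]}
ratio = inconsistency_ratio(records, matches)
```

Here ("r1", "r3", "r4") and ("r2", "r3", "r4") each have two links but not the third, so two of the three connected triples are inconsistent, a sign that "r4" may be a false match.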
Feng Wang, Yongquan Zhang, Hang Xiao, Li Kuang, Yi-Chang Lai
In this paper, we focus on how to design a methodology that improves prediction accuracy and speeds up the prediction process for stock market prediction. As market news and stock prices are commonly believed to be two important sources of market data, we design our stock price prediction model on both data sources concurrently. First, in order to obtain the most significant features of the market news documents, we propose a new feature selection algorithm (NRDC) as well as a new feature weighting algorithm (N-TF-IDF) to help improve prediction accuracy. We then employ a fast learning model named the Extreme Learning Machine (ELM) and use the kernel-based ELM (K-ELM) to improve prediction speed. Comprehensive experimental comparisons between our hybrid proposal, K-ELM with NRDC and N-TF-IDF (N-N-K-ELM), and state-of-the-art learning algorithms, including the Support Vector Machine (SVM) and the Back-Propagation Neural Network (BP-NN), have been undertaken on intra-day tick-by-tick data from the H-share market and contemporaneous news archives. Experimental results show that our N-N-K-ELM model achieves better performance in terms of both prediction accuracy and prediction speed in most cases.
"Enhancing Stock Price Prediction with a Hybrid Approach Based Extreme Learning Machine" — Feng Wang, Yongquan Zhang, Hang Xiao, Li Kuang, Yi-Chang Lai. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14. DOI: 10.1109/ICDMW.2015.74
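The speed the abstract attributes to the ELM comes from its training procedure: hidden-layer weights are random and fixed, so only the output layer is learned, in closed form. A minimal basic (non-kernel) ELM sketch on a toy regression task; the hidden size, regularizer, and data are my assumptions, and the paper's K-ELM replaces the random feature map with a kernel:

```python
import numpy as np

class ELM:
    """Basic Extreme Learning Machine: random hidden layer, solved output layer."""

    def __init__(self, n_hidden=50, reg=1e-3, seed=0):
        self.n_hidden, self.reg, self.seed = n_hidden, reg, seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.W = rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = rng.standard_normal(self.n_hidden)
        H = np.tanh(X @ self.W + self.b)             # random feature map
        # ridge-regularized least squares: the only trained parameters
        self.beta = np.linalg.solve(
            H.T @ H + self.reg * np.eye(self.n_hidden), H.T @ y
        )
        return self

    def predict(self, X):
        return np.tanh(X @ self.W + self.b) @ self.beta

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])                              # smooth toy target
model = ELM().fit(X, y)
err = float(np.abs(model.predict(X) - y).max())
```

Training cost is a single linear solve in the hidden dimension, which is why ELM variants are attractive when predictions must keep up with tick-by-tick data.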
We proposed a deep Convolutional Neural Network (CNN) approach and a Multi-View Stacking Ensemble (MVSE) method for Seasons 1 and 2 of the Ali Mobile Recommendation Algorithm competition, respectively. Specifically, we treat the recommendation task as a classical binary classification problem. We designed a large number of indicative features based on the logic of mobile business, and grouped them into ten clusters according to their properties. In Season 1, a two-dimensional (2D) feature map covering both a time axis and a feature-cluster axis was created from the original features. This design made it possible for the CNN to make predictions based on both the short-term actions and the long-term behavioral habits of mobile users. Combined with some traditional ensemble methods, the CNN achieved good results, ranking No. 2 in Season 1. In Season 2, we proposed the Multi-View Stacking Ensemble (MVSE) method, which uses the stacking technique to efficiently combine different views of the features. A classifier was first trained on each of the ten feature clusters, and the predictions of the ten classifiers were then used as additional features. Based on the augmented features, an ensemble classifier was trained to generate the final prediction. We continuously updated our model by appending the new stacking features, and finally achieved an F-1 score of 8.78%, ranking No. 1 in Season 2 among over 7,000 teams in total.
"Deep Convolutional Neural Network and Multi-view Stacking Ensemble in Ali Mobile Recommendation Algorithm Competition: The Solution to the Winning of Ali Mobile Recommendation Algorithm" — Xiang Li, Suchi Qian, Furong Peng, Jian Yang, Xiaolin Hu, Rui Xia. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14. DOI: 10.1109/ICDMW.2015.26
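The MVSE recipe above, per-view classifiers whose predictions become extra features for a final model, can be sketched with scikit-learn. Three synthetic views stand in for the paper's ten feature clusters, and out-of-fold predictions are used at level 0 to avoid label leakage (a standard stacking precaution; the abstract does not state the authors' exact protocol):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# synthetic stand-in: 12 features split into 3 "views" (the paper uses 10 clusters)
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           random_state=0)
views = [X[:, 0:4], X[:, 4:8], X[:, 8:12]]

# level 0: one classifier per view; out-of-fold probabilities as meta-features
meta = [
    cross_val_predict(LogisticRegression(max_iter=1000), V, y,
                      cv=5, method="predict_proba")[:, 1]
    for V in views
]

# level 1: original features augmented with the per-view predictions
X_aug = np.column_stack([X] + meta)
stacker = LogisticRegression(max_iter=1000).fit(X_aug, y)
acc = stacker.score(X_aug, y)
```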
With the computational power available today, machine learning is becoming a very active field, finding applications in our everyday life. One of its biggest challenges is the classification task involving data representation (the preprocessing part of a machine learning algorithm). Classification of linearly separable data is easy; the aim of preprocessing is to obtain well-represented data by mapping raw data into a "feature space" where simple classifiers can be used efficiently. For example, until now almost everything in audio/bioacoustics has used MFCC features. We present here a toolbox, written in C++, that provides the basic tools for audio representation by implementing the Scattering Network, which brings a new and powerful solution to these tasks. We focused our implementation on massive data sets and server applications. The reference toolkit for scattering analysis is SCATNET from Mallat et al. (http://www.di.ens.fr/data/software/scatnet/); this tool is an attempt to make some of the SCATNET features more tractable for Big Data challenges. Furthermore, the use of this toolbox is not limited to machine-learning preprocessing: it can also be used for more advanced biological analyses, such as studies of animal communication behaviour or any biological study related to signal analysis. The implementation provides out-of-the-box executables that can be run with simple commands, without a graphical interface, and is thus suited to server applications. As we will review in the next part, we will need to perform data manipulation on huge data sets, so fast and efficient implementations are essential in this new "Big Data" era.
"Scattering Decomposition for Massive Signal Classification: From Theory to Fast Algorithm and Implementation with Validation on International Bioacoustic Benchmark" — Randall Balestriero, H. Glotin. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14. DOI: 10.1109/ICDMW.2015.127
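The operation the toolbox implements, a scattering layer, is conceptually simple: band-pass filter the signal at several scales, take the complex modulus, then average. A numpy sketch of a first-order layer (not the toolbox's C++ code; the filter shapes, lengths, and subsampling factor are illustrative assumptions):

```python
import numpy as np

def morlet(n, xi, sigma):
    """Complex Morlet-like band-pass filter of length n (no admissibility correction)."""
    t = np.arange(n) - n // 2
    return np.exp(1j * xi * t) * np.exp(-t**2 / (2 * sigma**2))

def scatter1(x, scales=(1, 2, 4)):
    """First-order scattering: band-pass filter, complex modulus, low-pass average."""
    feats = []
    lowpass = np.ones(32) / 32                        # crude averaging window
    for j in scales:
        psi = morlet(64, xi=np.pi / j, sigma=4.0 * j)
        u = np.abs(np.convolve(x, psi, mode="same"))  # translation-covariant envelope
        s = np.convolve(u, lowpass, mode="same")      # local averaging -> stability
        feats.append(s[::16])                         # subsample the smooth output
    return np.concatenate(feats)

# toy "call": a chirp-like tone in noise
t = np.linspace(0, 1, 512)
x = np.sin(2 * np.pi * 40 * t * (1 + t)) \
    + 0.1 * np.random.default_rng(0).standard_normal(512)
features = scatter1(x)
```

The modulus-then-average structure is what makes scattering features stable to small deformations, which is the property MFCC-style pipelines approximate; cascading the same operation gives higher orders.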
Many networks nowadays contain both positive and negative relationships, such as ratings and conflicts, which are often mixed together in network visualizations, whether in the layout of a node-link diagram or in the node ordering of a matrix representation. In this work, we present a visual analysis framework for visualizing signed networks that emphasizes the different effects of signed edges on network topology. The theoretical foundation of the framework comes from spectral analysis of data patterns in a high-dimensional spectral space. Based on the spectral analysis results, we present a block-organized visualization approach, a hybrid of matrix, node-link, and arc diagrams, focused on revealing the topological structure of signed networks. We demonstrate with a detailed case study that block-organized visualization and spectral-space exploration can be combined to analyze the topology of signed networks effectively.
"Block-Organized Topology Visualization for Visual Exploration of Signed Networks" — Xianlin Hu, Leting Wu, Aidong Lu, Xintao Wu. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), 2015-11-14. DOI: 10.1109/ICDMW.2015.117
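The spectral foundation mentioned above can be illustrated on a toy signed network: with two factions joined by positive edges internally and negative edges across, the leading eigenvector of the signed adjacency matrix separates the blocks by sign, which is exactly the grouping a block-organized layout would display. The six-node example is my own construction:

```python
import numpy as np

# toy signed network: two factions, positive edges inside, negative across
s = np.array([1, 1, 1, -1, -1, -1])
A = np.outer(s, s).astype(float)    # +1 within a faction, -1 between factions
np.fill_diagonal(A, 0.0)            # no self-loops

# spectral space: eigendecomposition of the symmetric signed adjacency matrix
vals, vecs = np.linalg.eigh(A)
lead = vecs[:, np.argmax(vals)]     # eigenvector of the largest eigenvalue

# nodes separate by sign along the leading spectral coordinate
blocks = np.sign(lead)
```

For this A = s s^T - I the spectrum is {5, -1, ..., -1}, and the leading eigenvector is proportional to s, so the sign pattern recovers the two factions exactly.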