Marwan Hassani, Pascal Spaus, A. Cuzzocrea, T. Seidl
{"title":"I-HASTREAM:基于密度的大数据流分层聚类及其在大图分析工具中的应用","authors":"Marwan Hassani, Pascal Spaus, A. Cuzzocrea, T. Seidl","doi":"10.1109/CCGrid.2016.102","DOIUrl":null,"url":null,"abstract":"Big Data Streams are very popular at now, as stirred-up by a plethora of modern applications such as sensor networks, scientific computing tools, Web intelligence, social network analysis and mining tools, and so forth. Here, the main research issue consists in how to effectively and efficiently extract useful knowledge from (streaming) big data, in order to support innovative big data analytics platforms. To this end, clustering analysis is a well-known tool for extracting knowledge from big data streams, as also confirmed by recent trends in active literature. A special applicative case is represented by so-called graph-shaped data (big) streams, which are produced by graph sources providing both structure-and content-oriented knowledge. On top of such sources, big graph analytics is a leading scientific area to be considered. At the convergence of these emerging topics, in this paper we provide the following contributions: (i) I-HASTREAM, a novel density-based hierarchical clustering algorithm for evolving big data streams that founds on it predecessor, namely HASTREAM, (ii) the architecture of a big graph analytics engine that embeds I-HASTREAM in its core layer.","PeriodicalId":103641,"journal":{"name":"2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"I-HASTREAM: Density-Based Hierarchical Clustering of Big Data Streams and Its Application to Big Graph Analytics Tools\",\"authors\":\"Marwan Hassani, Pascal Spaus, A. Cuzzocrea, T. Seidl\",\"doi\":\"10.1109/CCGrid.2016.102\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Big Data Streams are very popular at now, as stirred-up by a plethora of modern applications such as sensor networks, scientific computing tools, Web intelligence, social network analysis and mining tools, and so forth. Here, the main research issue consists in how to effectively and efficiently extract useful knowledge from (streaming) big data, in order to support innovative big data analytics platforms. To this end, clustering analysis is a well-known tool for extracting knowledge from big data streams, as also confirmed by recent trends in active literature. A special applicative case is represented by so-called graph-shaped data (big) streams, which are produced by graph sources providing both structure-and content-oriented knowledge. On top of such sources, big graph analytics is a leading scientific area to be considered. At the convergence of these emerging topics, in this paper we provide the following contributions: (i) I-HASTREAM, a novel density-based hierarchical clustering algorithm for evolving big data streams that founds on it predecessor, namely HASTREAM, (ii) the architecture of a big graph analytics engine that embeds I-HASTREAM in its core layer.\",\"PeriodicalId\":103641,\"journal\":{\"name\":\"2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)\",\"volume\":\"64 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-05-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CCGrid.2016.102\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCGrid.2016.102","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14
摘要
随着传感器网络、科学计算工具、Web智能、社交网络分析和挖掘工具等大量现代应用的兴起,大数据流现在非常流行。在这里,主要的研究问题在于如何有效和高效地从(流)大数据中提取有用的知识,以支持创新的大数据分析平台。为此,聚类分析是一种众所周知的从大数据流中提取知识的工具,最近的活跃文献趋势也证实了这一点。一个特殊的应用案例是所谓的图形数据(大)流,它是由提供面向结构和面向内容知识的图源产生的。在这些资源之上,大图形分析是一个需要考虑的领先科学领域。在这些新兴主题的融合中,本文提供了以下贡献:(i) i -HASTREAM,一种新的基于密度的分层聚类算法,用于发展大数据流,该算法建立在其前身HASTREAM之上,即(ii)将i -HASTREAM嵌入其核心层的大图形分析引擎架构。
I-HASTREAM: Density-Based Hierarchical Clustering of Big Data Streams and Its Application to Big Graph Analytics Tools
Big Data Streams are very popular at now, as stirred-up by a plethora of modern applications such as sensor networks, scientific computing tools, Web intelligence, social network analysis and mining tools, and so forth. Here, the main research issue consists in how to effectively and efficiently extract useful knowledge from (streaming) big data, in order to support innovative big data analytics platforms. To this end, clustering analysis is a well-known tool for extracting knowledge from big data streams, as also confirmed by recent trends in active literature. A special applicative case is represented by so-called graph-shaped data (big) streams, which are produced by graph sources providing both structure-and content-oriented knowledge. On top of such sources, big graph analytics is a leading scientific area to be considered. At the convergence of these emerging topics, in this paper we provide the following contributions: (i) I-HASTREAM, a novel density-based hierarchical clustering algorithm for evolving big data streams that founds on it predecessor, namely HASTREAM, (ii) the architecture of a big graph analytics engine that embeds I-HASTREAM in its core layer.