
Proceedings of the 1st ACM SIGMOD Workshop on Network Data Analytics: Latest Papers

Analyzing extended property graphs with Apache Flink
Pub Date : 2016-07-01 DOI: 10.1145/2980523.2980527
Martin Junghanns, André Petermann, Niklas Teichmann, Kevin Gómez, E. Rahm
Graphs are an intuitive way to model complex relationships between real-world data objects. Thus, graph analytics plays an important role in research and industry. As graphs often reflect heterogeneous domain data, their representation requires an expressive data model including the abstraction of graph collections, for example, to analyze communities inside a social network. Furthermore, answering complex analytical questions about such graphs entails combining multiple analytical operations. To satisfy these requirements, we propose the Extended Property Graph Model, which is semantically rich, schema-free and supports multiple distinct graphs. Based on this representation, the model provides declarative and combinable operators to analyze both single graphs and graph collections. Our current implementation is based on the distributed dataflow framework Apache Flink. We present the results of a first experimental study showing the scalability of our implementation on social network data with up to 11 billion edges.
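The data model described in this abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' Flink implementation. It assumes that vertices and edges each carry a label, a schema-free property map, and the set of logical graphs they belong to, and that a binary operator such as an overlap keeps the elements shared by two logical graphs. All names (`Vertex`, `Edge`, `GraphDB`, `overlap`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    vid: str
    label: str
    props: dict = field(default_factory=dict)   # schema-free key/value pairs
    graphs: set = field(default_factory=set)    # ids of logical graphs containing this vertex


@dataclass
class Edge:
    eid: str
    label: str
    src: str
    dst: str
    props: dict = field(default_factory=dict)
    graphs: set = field(default_factory=set)


class GraphDB:
    """Holds all vertices and edges; a logical graph is an id that elements reference."""

    def __init__(self):
        self.vertices = {}
        self.edges = {}

    def add_vertex(self, v):
        self.vertices[v.vid] = v

    def add_edge(self, e):
        self.edges[e.eid] = e

    def logical_graph(self, gid):
        """Return (vertex ids, edge ids) of one logical graph."""
        vs = {v.vid for v in self.vertices.values() if gid in v.graphs}
        es = {e.eid for e in self.edges.values() if gid in e.graphs}
        return vs, es

    def overlap(self, gid_a, gid_b):
        """Hypothetical binary operator: elements contained in both logical graphs."""
        va, ea = self.logical_graph(gid_a)
        vb, eb = self.logical_graph(gid_b)
        return va & vb, ea & eb


db = GraphDB()
db.add_vertex(Vertex("alice", "Person", {"city": "Leipzig"}, {"g1", "g2"}))
db.add_vertex(Vertex("bob", "Person", {"age": 31}, {"g1", "g2"}))
db.add_vertex(Vertex("carol", "Person", {}, {"g2"}))
db.add_edge(Edge("e1", "knows", "alice", "bob", {"since": 2014}, {"g1", "g2"}))
db.add_edge(Edge("e2", "knows", "bob", "carol", {}, {"g2"}))

shared_vertices, shared_edges = db.overlap("g1", "g2")
```

Because the operator returns plain sets of element ids, its result could be fed into another operator, which is one simple way to read the abstract's phrase "combinable operators".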
Citations: 47
Integer programming approach for directed minimum spanning tree problem on temporal graphs
Pub Date : 2016-07-01 DOI: 10.1145/2980523.2980528
Takuto Ikuta, Takuya Akiba
Considerable effort has been devoted to establishing concepts and designing algorithms that are useful for graph data management. While most work so far has focused on static graphs, there are many networks with time information, i.e., temporal graphs, such as social network messages, phone calls, public transportation, and neural networks. Even the most fundamental problems for static graphs become non-trivial for temporal graphs. In this paper, we explore the minimum-weight spanning tree problem on temporal graphs, which was recently proposed by Huang et al. [SIGMOD 2015]. Even though this problem is proven to be NP-hard, we design practically efficient exact algorithms using integer programming. Experimental results confirm that the proposed algorithms can produce better solutions than a previously proposed approximation algorithm.
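To show why timing makes the problem harder than its static counterpart, here is a small brute-force sketch in Python. It is not the integer programming formulation from the paper, which the abstract does not spell out. It assumes each temporal edge is a tuple `(u, v, start, arrival, weight)` and that a valid tree rooted at `root` may only use an edge that departs no earlier than the arrival time at its tail vertex. The function name and the edge format are hypothetical.

```python
from itertools import product


def min_weight_temporal_tree(vertices, edges, root, t0=0):
    """Exhaustively search for the cheapest time-respecting spanning tree.

    edges: list of (u, v, start, arrival, weight) tuples (assumed format).
    Returns (total_weight, chosen_edges), or None if no valid tree exists.
    Exponential in the number of vertices; only meant for tiny inputs.
    """
    others = [v for v in vertices if v != root]
    incoming = {v: [e for e in edges if e[1] == v and e[0] != v] for v in others}
    if any(not incoming[v] for v in others):
        return None

    best = None
    # A rooted spanning tree picks exactly one incoming edge per non-root vertex.
    for choice in product(*(incoming[v] for v in others)):
        arrival = {root: t0}
        pending = list(choice)
        progress = True
        # Repeatedly attach edges whose tail is already reached in time.
        while pending and progress:
            progress = False
            for e in list(pending):
                u, v, start, arr, _ = e
                if u in arrival and start >= arrival[u]:
                    arrival[v] = arr
                    pending.remove(e)
                    progress = True
        if pending:
            continue  # a cycle or a time violation: not a valid tree
        total = sum(e[4] for e in choice)
        if best is None or total < best[0]:
            best = (total, list(choice))
    return best


vertices = ["r", "a", "b"]
edges = [
    ("r", "a", 1, 2, 5),
    ("r", "b", 1, 3, 4),
    ("a", "b", 3, 4, 1),
    ("b", "a", 2, 3, 1),  # departs at time 2, before b can be reached at time 3
]
result = min_weight_temporal_tree(vertices, edges, "r")
```

In this toy instance the cheapest tree that ignores time would use `r -> b` and `b -> a` for a total weight of 5, but the second edge departs before `b` is reached, so the search returns the tree `r -> a`, `a -> b` with weight 6.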
Citations: 3
Beyond nodes and edges: multiresolution algorithms for network data
Pub Date : 2016-07-01 DOI: 10.1145/2980523.2980525
J. Leskovec
Networks are a fundamental tool for understanding and modeling complex systems in physics, biology, neuroscience, engineering, and social science. Many networks are known to exhibit rich, lower-order connectivity patterns that can be captured at the level of individual nodes and edges. However, higher-order organization of complex networks, at the level of small network subgraphs, remains largely unknown. Here, we develop a generalized framework for clustering networks on the basis of higher-order connectivity patterns. This framework provides mathematical guarantees on the optimality of obtained clusters and scales to networks with billions of edges. The framework reveals higher-order organization in a number of networks, including information propagation units in neuronal networks and hub structure in transportation networks. Results show that networks exhibit rich higher-order organizational structures that are exposed by clustering based on higher-order connectivity patterns.
Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.
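The biased random walk mentioned in the abstract can be sketched in a few lines of Python. The abstract does not name the walk's parameters; the return parameter `p` and in-out parameter `q` used here follow the published node2vec paper, and the sketch assumes an undirected, unweighted graph stored as a dictionary of neighbour sets. The function name `biased_walk` is hypothetical, and the embedding-learning step that consumes the walks is omitted.

```python
import random


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Second-order random walk in the style of node2vec.

    adj: dict mapping each node to a set of neighbours (undirected, unweighted).
    p: return parameter; q: in-out parameter.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])
        if not nbrs:
            break
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for x in nbrs:
            if x == prev:
                weights.append(1.0 / p)   # step back to the previous node
            elif x in adj[prev]:
                weights.append(1.0)       # stay close to the previous node
            else:
                weights.append(1.0 / q)   # move further away
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
walk = biased_walk(adj, start=0, length=10, p=0.5, q=2.0, rng=random.Random(0))
```

With `q` greater than 1 the walk tends to stay near its starting region, while `q` smaller than 1 pushes it outward; this is the "flexible notion of a node's network neighborhood" in concrete form.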
Citations: 2
Nepal: a path query language for communication networks
Pub Date : 2016-07-01 DOI: 10.1145/2980523.2980530
T. Johnson, Y. Kanza, L. Lakshmanan, Vladislav Shkapenyuk
Communication networks are typically large, dynamic and extremely complicated. To deploy, maintain, and troubleshoot such networks, it is essential to understand how network elements (such as servers, switches, virtual machines, and virtual network functions) are connected to one another, and to be able to discover communication paths between them. It is also essential to understand how connections change over time, and to be able to pose time-travel queries to retrieve information about past network states. This problem is becoming more acute with the advent of software defined networks, where network functions are virtualized and managed in a cloud infrastructure. We represent a communication network inventory as a graph where the nodes are network entities and edges represent relationships between them, e.g., hosted-on or communicates-with. Querying such a graph, e.g., for troubleshooting, using existing graph query languages is too cumbersome for network analysts. Thus, in this paper we present Nepal, a network path query language that is designed to effectively retrieve desired paths from a network graph. The main novelty of Nepal is to consider paths as first-class citizens of the language, which achieves closure under composition while maintaining simplicity. We demonstrate the capabilities of Nepal by examples and discuss query evaluation. We illustrate how path queries can simplify the extraction of information from a dynamic inventory of a multi-layer network and can be used for troubleshooting.
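The idea of retrieving paths from an inventory graph can be illustrated with a short Python sketch. This is not Nepal syntax, which the abstract does not show. It assumes the inventory is a list of `(source, label, target)` triples and that a query is a sequence of edge labels to follow from a start node. The function `find_paths` and the `connected-to` label are hypothetical; `hosted-on` is taken from the abstract.

```python
def find_paths(edges, start, pattern):
    """Return all paths from `start` whose edge labels match `pattern` in order.

    edges: list of (source, label, target) triples describing the inventory graph.
    pattern: sequence of edge labels, e.g. ["hosted-on", "connected-to"].
    Each path is returned as a list of nodes, so paths are ordinary values
    that can be passed to further queries.
    """
    out = {}
    for src, label, dst in edges:
        out.setdefault((src, label), []).append(dst)

    paths = [[start]]
    for label in pattern:
        paths = [path + [nxt]
                 for path in paths
                 for nxt in out.get((path[-1], label), [])
                 if nxt not in path]   # keep paths simple (no repeated node)
    return paths


inventory = [
    ("vm1", "hosted-on", "server1"),
    ("vm2", "hosted-on", "server1"),
    ("server1", "connected-to", "switch1"),
    ("server1", "connected-to", "switch2"),
]
paths = find_paths(inventory, "vm1", ["hosted-on", "connected-to"])
```

Returning paths as values, rather than only their end nodes, is a simplified stand-in for treating paths as first-class citizens: the result of one query can serve as the input to the next.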
Citations: 9
NScaleSpark: subgraph-centric graph analytics on Apache Spark
Pub Date : 2016-07-01 DOI: 10.1145/2980523.2980529
A. Quamar, A. Deshpande
In this paper, we describe NScaleSpark, a framework for executing large-scale distributed graph analysis tasks on the Apache Spark platform. NScaleSpark is motivated by the increasing interest in executing rich and complex analysis tasks over large graph datasets. There is much recent work on vertex-centric graph programming frameworks for executing such analysis tasks; these systems espouse a "think-like-a-vertex" (TLV) paradigm, with some example systems being Pregel, Apache Giraph, GPS, Grace, and GraphX (built on top of Apache Spark). However, the TLV paradigm is not suitable for many complex graph analysis tasks that typically require processing of information aggregated over neighborhoods or subgraphs in the underlying graph. Instead, NScaleSpark is based on a "think-like-a-subgraph" paradigm (also recently called "think-like-an-embedding" [23]). Here, the users specify computations to be executed against a large number of multi-hop neighborhoods or subgraphs of the data graph. NScaleSpark builds upon our prior work on the NScale system [18], which was built on top of the Hadoop MapReduce system. We describe how we reimplemented NScale on the Apache Spark platform, the key challenges therein, and the design decisions we made. NScaleSpark uses a series of RDD transformations to extract and hold the relevant subgraphs in distributed memory with minimal footprint using a cost-based optimizer. Our in-memory graph data structure enables efficient graph computations over large-scale graphs. Our experimental results over several real-world data sets and applications show orders-of-magnitude improvement in performance and total cost over GraphX and other vertex-centric approaches.
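The "think-like-a-subgraph" paradigm can be sketched on a single machine in plain Python. This sketch does not use Spark or the NScaleSpark API, neither of which the abstract details. It assumes an undirected graph stored as a dictionary of neighbour sets and a user function that receives the vertex and edge sets of one k-hop neighbourhood. The names `k_hop_subgraph`, `run_on_subgraphs`, and `edge_count` are hypothetical.

```python
from collections import deque


def k_hop_subgraph(adj, center, k):
    """Return the vertex set and edge set of the k-hop neighbourhood of `center`."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond k hops
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    nodes = set(dist)
    # Undirected edges are stored as frozensets so (u, v) and (v, u) coincide.
    edges = {frozenset((u, v)) for u in nodes for v in adj[u] if v in nodes}
    return nodes, edges


def run_on_subgraphs(adj, k, compute):
    """Apply a user function to the k-hop neighbourhood of every vertex."""
    return {v: compute(*k_hop_subgraph(adj, v, k)) for v in adj}


def edge_count(nodes, edges):
    """Example user computation: number of edges inside the subgraph."""
    return len(edges)


adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
result = run_on_subgraphs(adj, 1, edge_count)
```

The user function sees a whole neighbourhood at once, which is the contrast with vertex-centric systems where each vertex only sees messages from its direct neighbours.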
Citations: 6
Proceedings of the 1st ACM SIGMOD Workshop on Network Data Analytics
Akhil Arora, Shourya Roy, S. Mehta
Networks are prevalent in today's electronic world in a wide variety of domains ranging from Engineering to Social Sciences, Life Sciences to Physical Sciences, and so on. Researchers and practitioners have studied networks in multiple ways like defining network metrics, providing theoretical results and examining problems like pattern mining, link prediction etc. The NDA workshop is a forum for exchanging ideas and methods for mining, querying and learning with real-world networks, developing new common understandings of the problems at hand, sharing of data sets where applicable, and leveraging existing knowledge from different disciplines. The purpose of this workshop is to bring together researchers from academia, industry, and government, to create a forum for discussing recent advances in (large-scale) graph analysis, as well as propose and discuss novel methods and techniques towards addressing domain specific challenges.
Pub Date : 2016-07-01 DOI: 10.1145/2980523
Citations: 0
Viral marketing 2.0
Pub Date : 2016-07-01 DOI: 10.1145/2980523.2980526
L. Lakshmanan
Over the last decade, there has been considerable excitement and research on the study and exploitation of the spread of information and influence over networks. Tremendous advances have been made on the prototypical problem of selecting a small number of seed users to activate over a social network such that the number of activated nodes in an expected sense is maximized, under several standard information diffusion models. Scalable heuristics, but more notably scalable approximation algorithms, have been developed in recent years. Unfortunately, the state of the art has several shortcomings. Firstly, most of the research has focused on a simplistic setting where one marketing campaign is active at a time. While there has been some work on modeling and optimizing for competing diffusions, the key role played by the network owner in a campaign has been overlooked. Secondly, the relationship and contract needed between the network owner and the advertisers are not captured. Thirdly, in real life, relationships between multiple campaigns may be more complex than just pure competition. Finally, most of the studies assume that the seeds must be chosen all at once before the campaign starts with no opportunity to observe the performance of seeds chosen earlier and course-correct as needed. We make a call to arms for opening up the framework of viral marketing to allow for more expressive business models and seed selection strategies, and present some recent research from our group that addresses the modeling and computational challenges.
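The prototypical seed selection problem described in this abstract can be sketched in Python as follows. The sketch assumes the independent cascade model as the diffusion model and uses the classic greedy strategy with Monte Carlo estimation of the expected spread; it covers only the single-campaign, choose-all-seeds-up-front setting that the abstract criticizes, not the richer models the talk proposes. The function names and the graph format are hypothetical.

```python
import random


def simulate_ic(graph, seeds, rng):
    """One run of the independent cascade model; returns the set of activated nodes.

    graph: dict node -> list of (neighbour, activation probability).
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v, prob in graph.get(u, []):
                # Each newly active node gets one chance to activate each neighbour.
                if v not in active and rng.random() < prob:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return active


def expected_spread(graph, seeds, runs, rng):
    """Monte Carlo estimate of the expected number of activated nodes."""
    return sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs)) / runs


def greedy_seeds(graph, k, runs=200, seed=0):
    """Pick k seeds one at a time, each maximising the estimated marginal gain."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v, _ in nbrs}
    chosen = []
    for _ in range(k):
        best = max(sorted(nodes - set(chosen)),
                   key=lambda c: expected_spread(graph, chosen + [c], runs, rng))
        chosen.append(best)
    return chosen


# Probabilities of 1.0 make this toy example deterministic.
graph = {"a": [("b", 1.0), ("c", 1.0)], "b": [], "c": [], "d": [("e", 1.0)], "e": []}
seeds = greedy_seeds(graph, k=2, runs=5)
```

Each greedy step re-estimates the spread for every candidate, which is what makes the plain version expensive on large networks and motivates the scalable heuristics and approximation algorithms the abstract refers to.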
Citations: 0