Proceedings of the 5th ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA): latest articles
Renzo Angles, A. Hogan, O. Lassila, Carlos Rojas, D. Schwabe, Pedro A. Szekely, D. Vrgoc
In this short position paper, we argue that there is a need for a unifying data model that can support popular graph formats such as RDF, RDF* and property graphs, while at the same time being powerful enough to naturally store information from complex knowledge graphs, such as Wikidata, without the need for a complex reification scheme. Our proposal, called the multilayer graph model, presents a simple and flexible data model for graphs that can naturally support all of the above, and more. We also observe that the idea of multilayer graphs has appeared in existing graph systems from different vendors and research groups, illustrating its versatility.
{"title":"Multilayer graphs: a unified data model for graph databases","authors":"Renzo Angles, A. Hogan, O. Lassila, Carlos Rojas, D. Schwabe, Pedro A. Szekely, D. Vrgoc","doi":"10.1145/3534540.3534696","DOIUrl":"https://doi.org/10.1145/3534540.3534696","publicationDate":"2022-06-12"}
We establish a translation between a formalism for dynamic programming over hypergraphs and the computation of semiring-based provenance for Datalog programs. The benefit of this translation is a new method for computing the provenance of Datalog programs for specific classes of semirings, which we apply to provenance-aware querying of graph databases. Theoretical results and practical optimizations lead to an efficient implementation using Soufflé, a state-of-the-art Datalog interpreter. Experimental results on real-world data suggest this approach to be efficient in practical contexts, competing with dedicated solutions for graphs.
{"title":"Efficient provenance-aware querying of graph databases with datalog","authors":"Yann Ramusat, S. Maniu, P. Senellart","doi":"10.1145/3534540.3534689","DOIUrl":"https://doi.org/10.1145/3534540.3534689","publicationDate":"2022-06-12"}
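To make the idea of semiring-based provenance for a Datalog program concrete, here is a minimal sketch for the transitive-closure program `path(x,y) :- edge(x,y).` and `path(x,z) :- path(x,y), edge(y,z).` It uses plain naive fixpoint evaluation, which is not the hypergraph dynamic-programming method of the paper and does not involve Soufflé. The names `Semiring`, `TROPICAL`, `BOOLEAN` and `path_provenance` are hypothetical, and termination is only assumed for semirings such as the two shown (with non-negative weights in the tropical case).

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    zero: Any                       # neutral element of plus
    one: Any                        # neutral element of times
    plus: Callable[[Any, Any], Any]   # combines alternative derivations
    times: Callable[[Any, Any], Any]  # combines facts used jointly


# Tropical semiring (min, +): provenance of path(x, y) is the shortest distance.
TROPICAL = Semiring(float("inf"), 0.0, min, lambda a, b: a + b)
# Boolean semiring (or, and): provenance of path(x, y) is plain reachability.
BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)


def path_provenance(edges, sr):
    """edges maps (u, v) to a semiring annotation.

    Returns a dict mapping (u, v) to the provenance of path(u, v),
    computed by iterating the two Datalog rules until nothing changes.
    """
    path = dict(edges)
    while True:
        new = dict(edges)  # rule 1: path(x, y) :- edge(x, y).
        for (x, y), p in path.items():
            for (y2, z), e in edges.items():
                if y == y2:  # rule 2: path(x, z) :- path(x, y), edge(y, z).
                    joint = sr.times(p, e)
                    new[(x, z)] = sr.plus(new.get((x, z), sr.zero), joint)
        if new == path:
            return path
        path = new


# Hypothetical weighted graph used only for illustration.
EDGES = {("a", "b"): 1.0, ("b", "c"): 2.0, ("a", "c"): 5.0, ("c", "a"): 1.0}
shortest = path_provenance(EDGES, TROPICAL)
reachable = path_provenance({e: True for e in EDGES}, BOOLEAN)
```

Swapping the semiring changes what the same program computes, which is the property a provenance-aware query engine exploits.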
Interest in processing data with an underlying graph structure has grown in recent years, which has led to the development of many distributed graph processing systems. However, due to the rapidly growing amount of data, e.g., web graphs and social graphs, even these distributed frameworks can require several minutes or even several hours to execute popular graph algorithms. This leads to the question: do we always need to know the exact answer for a large graph? Modern distributed graph processing frameworks execute graph algorithms by exchanging messages between vertices. This paper introduces a novel message-dropping approach for approximation in these frameworks. Because dropping messages degrades the quality of the result, our objective is to drop the messages that have the least adverse impact on quality. More specifically, we propose an application-aware approach that dynamically drops messages at runtime. We evaluate the effects of our approach for the PageRank algorithm on several representative real-world web graphs and compare its performance to that of existing approximation techniques for modern frameworks.
{"title":"Flexible application-aware approximation for modern distributed graph processing frameworks","authors":"Michael Schramm, Sukanya Bhowmik, K. Rothermel","doi":"10.1145/3534540.3534693","DOIUrl":"https://doi.org/10.1145/3534540.3534693","publicationDate":"2022-06-12"}
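As a rough illustration of message dropping in a vertex-centric PageRank, the sketch below simulates the message exchange on a single machine and skips any message whose value falls under a fixed threshold. The threshold rule, the function name `pagerank_with_dropping` and the parameter `drop_below` are assumptions made for this example; the abstract does not state the criterion the authors actually use, only that it is application-aware and applied at runtime.

```python
def pagerank_with_dropping(out_edges, iterations=20, damping=0.85, drop_below=0.0):
    """Simulate message-passing PageRank.

    out_edges maps each vertex to a list of its out-neighbours.
    A message is dropped when its rank contribution is below drop_below
    (hypothetical rule). Returns (ranks, messages_sent, messages_dropped).
    """
    vertices = set(out_edges) | {w for ws in out_edges.values() for w in ws}
    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}
    sent = dropped = 0
    for _ in range(iterations):
        inbox = {v: 0.0 for v in vertices}
        for v, targets in out_edges.items():
            if not targets:
                continue  # dangling vertices send nothing in this sketch
            share = rank[v] / len(targets)
            for w in targets:
                if share < drop_below:
                    dropped += 1  # message never leaves the sender
                    continue
                inbox[w] += share
                sent += 1
        rank = {v: (1 - damping) / n + damping * inbox[v] for v in vertices}
    return rank, sent, dropped


# Hypothetical four-page web graph; "d" has no in-links, so its messages are small.
WEB = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a", "b", "c"]}
exact, sent_exact, dropped_exact = pagerank_with_dropping(WEB)
approx, sent_approx, dropped_approx = pagerank_with_dropping(WEB, drop_below=0.02)
```

In this toy graph the dropped messages remove rank mass from the system, so the approximate ranks no longer sum to one. That is one simple way to see the quality degradation the abstract refers to.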
As AI technologies mature in natural language processing, speech recognition and computer vision, "intelligent" user interfaces emerge to handle complex and diverse tasks that require human-like knowledge and reasoning capability. In Part 1, I will present our recent work on knowledge graph representation learning using Graph Neural Networks (GNNs): the first approach is called orthogonal transform embedding (OTE), which integrates graph context into the embedding distance scoring function and improves prediction accuracy on complex relations such as the difficult N-to-1, 1-to-N and N-to-N cases; the second approach is called multi-hop attention GNN (MAGNA), a principled way to incorporate multi-hop context information into every layer of attention computation. MAGNA uses a diffusion prior on attention values to efficiently account for all paths between a pair of disconnected nodes. Experimental results on knowledge graph completion as well as node classification benchmarks show that MAGNA achieves state-of-the-art results. In Part 2, I will present how we take advantage of GNNs for language understanding and reasoning tasks. We show that, combined with large pre-trained language models and knowledge graph embeddings, GNNs prove effective for multi-hop reading comprehension across documents, for improving time sensitivity in question answering over temporal knowledge graphs, and for constructing robust syntactic information for aspect-level sentiment analysis.
{"title":"Knowledge graph representation learning and graph neural networks for language understanding","authors":"Jing Huang","doi":"10.1145/3534540.3534710","DOIUrl":"https://doi.org/10.1145/3534540.3534710","publicationDate":"2022-06-12"}
This paper focuses on subgraph queries where constraints are present in the neighborhood of the explored subgraphs. We describe anti-vertex, a declarative construct that indicates absence of a vertex, i.e., the resulting subgraph should not have a vertex in its specified neighborhood that matches the anti-vertex. We formalize the semantics of anti-vertex to benefit from automatic reasoning and optimization, and to enable standardized implementation across query languages and runtimes. The semantics are defined for various matching semantics that are commonly employed in subgraph querying (isomorphism, homomorphism, and no-repeated-edge matching) and for the widely adopted property graph model. We give several examples where anti-vertices can be employed, to help readers become familiar with the anti-vertex concept. We further showcase how anti-vertex support can be added to existing graph query languages by developing prototype extensions of the Cypher language. Finally, we study how anti-vertices interact with the symmetry breaking technique in subgraph matching frameworks so that their meaning remains consistent with the expected outcome of constrained neighborhoods to connected vertices.
{"title":"Anti-vertex for neighborhood constraints in subgraph queries","authors":"Kasra Jamshidi, Mugilan Mariappan, Keval Vora","doi":"10.1145/3534540.3534690","DOIUrl":"https://doi.org/10.1145/3534540.3534690","publicationDate":"2022-06-12"}
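The anti-vertex idea described in the abstract above can be sketched on a tiny example: match every edge (a, b) and reject the match if some other vertex is adjacent to both endpoints, that is, if an anti-vertex attached to a and b finds a match. The sketch assumes an undirected simple graph, ignores property-graph labels, and uses isomorphism-style matching in which the anti-vertex candidate must differ from the vertices already matched. The helper names are hypothetical and this is not the paper's formal semantics or its Cypher extension.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Adjacency sets for an undirected simple graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def anti_vertex_matches(adj, match, anti_neighbors):
    """True if some vertex outside the match is adjacent to every matched
    vertex listed in anti_neighbors, i.e. the anti-vertex constraint is violated."""
    used = set(match.values())
    candidates = set.intersection(*(adj[match[p]] for p in anti_neighbors))
    return bool(candidates - used)


def edges_without_common_neighbor(edges):
    """Edges (a, b) for which the anti-vertex attached to a and b has no match."""
    adj = build_adjacency(edges)
    result = []
    for u, v in edges:
        match = {"a": u, "b": v}
        if not anti_vertex_matches(adj, match, ["a", "b"]):
            result.append((u, v))
    return result


# Hypothetical graph: a triangle 1-2-3 with a pendant edge 3-4.
GRAPH = [(1, 2), (2, 3), (1, 3), (3, 4)]
open_edges = edges_without_common_neighbor(GRAPH)
```

Every triangle edge is rejected because the third triangle vertex matches the anti-vertex, so only the pendant edge survives.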
Oh dear, there's that word again - "semantics!" Isn't that what doomed that Semantic Web thing and led to knowledge graphs instead? In fact, many of the same problems, and particularly problems with interoperability, arise again for KGs, and thus we must explore the old problem in this new area. This is even more important when we start to explore the "personal knowledge graph (PKG)," that is, the ability to have private and public information combined in KG technology. In this talk, I discuss how knowledge graphs, PKGs, linked data and, yes, semantics are all critically linked and why the latter is still relevant to the growth and scaling of knowledge graphs into the future - and specifically to the ability to extract better data from them.
{"title":"Knowledge graph semantics","authors":"J. Hendler","doi":"10.1145/3534540.3534709","DOIUrl":"https://doi.org/10.1145/3534540.3534709","publicationDate":"2022-06-12"}
Shahrzad Khayatbashi, Sebastián Ferrada, O. Hartig
Today's space of graph database solutions is characterized by two main technology stacks that have evolved separately from one another: on the one hand, there are systems that focus on supporting the RDF family of standards; on the other hand, there is the Property Graph category of systems. As a basis for bringing these stacks together and, in particular, to facilitate data exchange between the different types of systems, different direct mappings between the underlying graph data models have been introduced in the literature. While fundamental properties are well-documented for most of these mappings, the same cannot be said about the practical implications of choosing one mapping over another. Our research aims to contribute towards closing this gap. In this paper, we report on a preliminary study for which we have selected two direct mappings from (Labeled) Property Graphs to RDF, where one of them uses features of the RDF-star extension to RDF. We compare these mappings in terms of the query performance achieved by two popular commercial RDF stores, GraphDB and Stardog, into which the converted data is imported. While we find that, for both of these systems, neither of the mappings is a clear winner in terms of guaranteeing better query performance, we also identify types of queries that are problematic for the systems when using one mapping but not the other.
{"title":"Converting property graphs to RDF: a preliminary study of the practical impact of different mappings","authors":"Shahrzad Khayatbashi, Sebastián Ferrada, O. Hartig","doi":"10.1145/3534540.3534695","DOIUrl":"https://doi.org/10.1145/3534540.3534695","publicationDate":"2022-06-12"}
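To give a feel for what two different property-graph-to-RDF mappings can look like, the sketch below turns one property-graph edge with edge properties into Turtle-style lines in two ways: once with an RDF-star quoted triple, and once with standard RDF reification. These are generic illustrations and are not claimed to be the two mappings evaluated in the study. The function names, the `:` prefix for all identifiers and the omitted prefix declarations are assumptions of this example.

```python
import json


def literal(value):
    """Render a Python value as a Turtle literal (strings quoted, numbers bare)."""
    return json.dumps(value) if isinstance(value, str) else str(value)


def edge_to_rdf_star(src, label, dst, props):
    """Edge as an asserted triple plus RDF-star annotations on the quoted triple."""
    triple = f":{src} :{label} :{dst}"
    lines = [f"{triple} ."]
    for key, value in sorted(props.items()):
        lines.append(f"<< {triple} >> :{key} {literal(value)} .")
    return lines


def edge_to_reified(edge_id, src, label, dst, props):
    """Edge as an asserted triple plus a standard rdf:Statement resource
    that carries the edge properties."""
    lines = [
        f":{src} :{label} :{dst} .",
        f":{edge_id} a rdf:Statement .",
        f":{edge_id} rdf:subject :{src} .",
        f":{edge_id} rdf:predicate :{label} .",
        f":{edge_id} rdf:object :{dst} .",
    ]
    for key, value in sorted(props.items()):
        lines.append(f":{edge_id} :{key} {literal(value)} .")
    return lines


# Hypothetical edge: alice -[knows {since: 2020}]-> bob
star_lines = edge_to_rdf_star("alice", "knows", "bob", {"since": 2020})
reified_lines = edge_to_reified("e1", "alice", "knows", "bob", {"since": 2020})
```

The RDF-star form needs two lines here while the reified form needs six, and a query over edge properties has to traverse a different shape in each case. Differences of this kind are one plausible reason why query performance can depend on the chosen mapping.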
Finding k-cores in graphs is a valuable and effective strategy for extracting dense regions of otherwise sparse graphs. We focus on the important problem of maintaining cores on rapidly changing dynamic graphs, where batches of edge changes need to be processed quickly. Many prior dynamic algorithms focus on the problem of maintaining a core decomposition. This finds vertices that are dense in some subgraph, but the subgraph itself is not returned. We develop a new dynamic batch algorithm to maintain cores, with their connected subgraphs, that improves efficiency over processing edge-by-edge. We implement our algorithm and experimentally show that, with it, core queries can be answered on rapidly changing graphs quickly enough for interactive applications. For batches of 1 million edges, on many graphs we run over 100x faster than processing edge-by-edge while remaining faster than re-computing from scratch.
{"title":"Batch dynamic algorithm to find k-core hierarchies","authors":"Kasimir Gabert, Ali Pinar, Ümit V. Çatalyürek","doi":"10.1145/3534540.3534694","DOIUrl":"https://doi.org/10.1145/3534540.3534694","publicationDate":"2022-06-12"}
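For readers unfamiliar with k-cores, the following sketch shows the static baseline that the k-core abstract above compares against: recomputing from scratch by repeatedly peeling a vertex of minimum remaining degree. It also returns the connected subgraphs of a k-core, which is the distinction the abstract draws between a core decomposition and the cores themselves. This is the textbook peeling procedure in a deliberately simple quadratic form, not the batch dynamic algorithm of the paper, and the function names are hypothetical.

```python
from collections import defaultdict


def core_numbers(edges):
    """Map each vertex of an undirected simple graph to its core number."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    degree = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        # Peel a vertex of minimum remaining degree (linear scan for clarity).
        v = min(remaining, key=lambda x: degree[x])
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


def k_core_components(edges, k):
    """Connected components (as vertex sets) of the k-core."""
    core = core_numbers(edges)
    keep = {v for v, c in core.items() if c >= k}
    adj = defaultdict(set)
    for u, v in edges:
        if u != v and u in keep and v in keep:
            adj[u].add(v)
            adj[v].add(u)
    seen, components = set(), []
    for start in keep:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            x = stack.pop()
            component.add(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        components.append(component)
    return components


# Hypothetical graph: triangle a-b-c with pendant d, plus a separate 4-clique.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("w", "x"), ("w", "y"), ("w", "z"), ("x", "y"), ("x", "z"), ("y", "z")]
cores = core_numbers(EDGES)
two_core = k_core_components(EDGES, 2)
```

A dynamic algorithm aims to avoid rerunning this whole procedure after every edge change, which is the cost the batch approach in the abstract is measured against.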
{"title":"Proceedings of the 5th ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA)","doi":"10.1145/3534540","DOIUrl":"https://doi.org/10.1145/3534540"}