Guillermo de Bernardo, N. Brisaboa, Diego Caro, Michael A. Rodriguez
Compact Data Structures for Temporal Graphs
2013 Data Compression Conference, published 2013-03-20
DOI: 10.1109/DCC.2013.59 (https://doi.org/10.1109/DCC.2013.59)
Citation count: 20
Abstract
Summary form only given. In this paper we propose three compact data structures to answer queries on temporal graphs. We define a temporal graph as a graph whose edges appear or disappear over time. Typical queries concern adjacency over time, for example, retrieving the neighbors of a node at a given time point or during a time interval. A naive representation consists of a time-ordered sequence of graphs, each valid at a particular time instant. The main drawback of this representation is that it wastes space when many nodes and their connections remain unchanged for long periods. We propose instead to store only what changes at each time instant. The ttk2-tree is conceptually a dynamic k2-tree in which each leaf and internal node holds a change list of the time instants at which its bit value changed. All the change lists are stored consecutively in a dynamic sequence. During query processing, the change lists are used to expand only the valid regions of the dynamic k2-tree. The structure supports updates of both the current and past states of the graph. The ltg-index is a set of snapshots plus logs of the changes between consecutive snapshots. It keeps a log for each node, storing the edge and the time at which each change occurred. To retrieve the direct neighbors of a node, the previous snapshot is queried, and then the log is traversed, adding edges to or removing them from the result. The differential k2-tree stores snapshots of some time instants in k2-trees. For the remaining time instants a k2-tree is also built, but it is differential: it stores the edges that differ from the last snapshot. To answer a query, it accesses the k2-tree of the given time and the previous full snapshot; the edges that appear in exactly one of these two k2-trees are the final result. We test our proposals using synthetic and real datasets. Our results show that the ltg-index generally uses the least space.
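The two snapshot-based query strategies described above (replaying a per-node log on top of the previous snapshot, and keeping the edges that appear in exactly one of two structures) can be illustrated with a minimal sketch. It is not the authors' implementation: plain Python sets stand in for the compact k2-trees, the function names and the dictionary layout are hypothetical, and the log is assumed to record toggle events (an edge appears if absent and disappears if present).

```python
# Toy model (assumption): a snapshot maps each node to the set of its direct
# neighbors. Python sets replace the compact k2-trees used in the paper.

def neighbors_from_log(snapshot, log, node, t):
    """Snapshot-plus-log query in the style of the ltg-index.

    `log` maps node -> list of (time, neighbor) events sorted by time.
    Assumption: every event toggles the edge node -> neighbor.
    """
    result = set(snapshot.get(node, ()))
    for time, neighbor in log.get(node, ()):
        if time > t:
            break  # events after the query time are ignored
        result ^= {neighbor}  # add the edge if absent, remove it if present
    return result


def neighbors_from_differential(snapshot, differential, node):
    """Differential query: keep edges found in exactly one of the two structures."""
    return snapshot.get(node, set()) ^ differential.get(node, set())


# Hypothetical data: snapshot taken at time 0, changes logged afterwards.
snapshot = {"a": {"b", "c"}}
log = {"a": [(1, "d"), (2, "b"), (4, "b")]}
# Edges of "a" that differ from the snapshot at time 3.
differential_t3 = {"a": {"b", "d"}}

at_t3_log = neighbors_from_log(snapshot, log, "a", 3)
at_t3_diff = neighbors_from_differential(snapshot, differential_t3, "a")
```

In this toy data both strategies return the same neighbor set for node "a" at time 3. The log-based query does work proportional to the number of logged changes since the snapshot, whereas the differential query always combines exactly two structures.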
We also measure times for direct and reverse neighbor queries at a time instant or over a time interval. For all these queries, the times of our best proposal range from tens of μs to several ms, depending on the size of the dataset and the number of results returned. The ltg-index is the fastest for direct queries (almost as fast as accessing a snapshot), but it is 5-20 times slower for reverse queries. The differential k2-tree is very fast for time-instant queries, but slower for time-interval queries. The ttk2-tree obtains similar times for direct and reverse queries and for different time intervals, and it is the fastest in some reverse interval queries. It also has the advantage of being dynamic.
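The change lists that the abstract attributes to the ttk2-tree can be sketched in the same spirit. The snippet below is an illustrative assumption, not the published structure: it models a single tree node whose bit starts at 0 and flips at every instant in its sorted change list, so the bit at time t is the parity of the number of changes up to t. The helper names `bit_at` and `active_during` are hypothetical.

```python
from bisect import bisect_right


def bit_at(change_list, t):
    """Bit value of one node at time t.

    Assumption: the bit starts at 0 and flips at each instant in the
    sorted `change_list`, so its value is the parity of changes <= t.
    """
    return bisect_right(change_list, t) % 2


def active_during(change_list, t_start, t_end):
    """True if the bit is 1 at some instant in [t_start, t_end]."""
    if bit_at(change_list, t_start) == 1:
        return True
    # The bit is 0 at t_start, so any later change in the interval sets it to 1.
    return bisect_right(change_list, t_end) > bisect_right(change_list, t_start)


# Hypothetical node whose bit changes at instants 2, 5 and 9.
changes = [2, 5, 9]
values = [bit_at(changes, t) for t in range(10)]
```

Under this model, a query would descend only into children for which `bit_at` (for a time instant) or `active_during` (for a time interval) holds, which is one way to read "expand only the valid regions".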