Shaden Smith, Xing Liu, Nesreen Ahmed, A. Tom, F. Petrini, G. Karypis
{"title":"共享内存并行系统的桁架分解","authors":"Shaden Smith, Xing Liu, Nesreen Ahmed, A. Tom, F. Petrini, G. Karypis","doi":"10.1109/HPEC.2017.8091049","DOIUrl":null,"url":null,"abstract":"The scale of data used in graph analytics grows at an unprecedented rate. More than ever, domain experts require efficient and parallel algorithms for tasks in graph analytics. One such task is the truss decomposition, which is a hierarchical decomposition of the edges of a graph and is closely related to the task of triangle enumeration. As evidenced by the recent GraphChallenge, existing algorithms and implementations for truss decomposition are insufficient for the scale of modern datasets. In this work, we propose a parallel algorithm for computing the truss decomposition of massive graphs on a shared-memory system. Our algorithm breaks a computation-efficient serial algorithm into several bulk-synchronous parallel steps which do not rely on atomics or other fine-grained synchronization. We evaluate our algorithm across a variety of synthetic and real-world datasets on a 56-core Intel Xeon system. Our serial implementation achieves over 1400 × speedup over the provided GraphChallenge serial benchmark implementation and is up to 28 × faster than the state-of-the-art shared-memory parallel algorithm.","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"116 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"45","resultStr":"{\"title\":\"Truss decomposition on shared-memory parallel systems\",\"authors\":\"Shaden Smith, Xing Liu, Nesreen Ahmed, A. Tom, F. Petrini, G. Karypis\",\"doi\":\"10.1109/HPEC.2017.8091049\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The scale of data used in graph analytics grows at an unprecedented rate. More than ever, domain experts require efficient and parallel algorithms for tasks in graph analytics. One such task is the truss decomposition, which is a hierarchical decomposition of the edges of a graph and is closely related to the task of triangle enumeration. As evidenced by the recent GraphChallenge, existing algorithms and implementations for truss decomposition are insufficient for the scale of modern datasets. In this work, we propose a parallel algorithm for computing the truss decomposition of massive graphs on a shared-memory system. Our algorithm breaks a computation-efficient serial algorithm into several bulk-synchronous parallel steps which do not rely on atomics or other fine-grained synchronization. We evaluate our algorithm across a variety of synthetic and real-world datasets on a 56-core Intel Xeon system. Our serial implementation achieves over 1400 × speedup over the provided GraphChallenge serial benchmark implementation and is up to 28 × faster than the state-of-the-art shared-memory parallel algorithm.\",\"PeriodicalId\":364903,\"journal\":{\"name\":\"2017 IEEE High Performance Extreme Computing Conference (HPEC)\",\"volume\":\"116 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"45\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE High Performance Extreme Computing Conference (HPEC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPEC.2017.8091049\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC.2017.8091049","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Truss decomposition on shared-memory parallel systems
The scale of data used in graph analytics grows at an unprecedented rate. More than ever, domain experts require efficient and parallel algorithms for tasks in graph analytics. One such task is the truss decomposition, which is a hierarchical decomposition of the edges of a graph and is closely related to the task of triangle enumeration. As evidenced by the recent GraphChallenge, existing algorithms and implementations for truss decomposition are insufficient for the scale of modern datasets. In this work, we propose a parallel algorithm for computing the truss decomposition of massive graphs on a shared-memory system. Our algorithm breaks a computation-efficient serial algorithm into several bulk-synchronous parallel steps which do not rely on atomics or other fine-grained synchronization. We evaluate our algorithm across a variety of synthetic and real-world datasets on a 56-core Intel Xeon system. Our serial implementation achieves over 1400 × speedup over the provided GraphChallenge serial benchmark implementation and is up to 28 × faster than the state-of-the-art shared-memory parallel algorithm.