{"title":"疏浚","authors":"Andrew McCrabb, Eric Winsor, V. Bertacco","doi":"10.1145/3316781.3317804","DOIUrl":null,"url":null,"abstract":"Graph-based algorithms have gained significant interest in several application domains. Solutions addressing the computational efficiency of such algorithms have mostly relied on many-core architectures. Cleverly laying out input graphs in storage, by placing adjacent vertices in a same storage unit (memory bank or cache unit), enables fast access during graph traversal. Dynamic graphs, however, must be continuously repartitioned to leverage this benefit. Yet software repartitioning solutions rely on costly, cross-vault communication to query and optimize the graph layout between algorithm iterations. In this work, we propose DREDGE, a novel hardware solution to provide heuristic repartitioning optimizations in the background without extra communication. Our evaluation indicates that we achieve a $1.9 x$ speedup, on average, over several graph algorithms and datasets, executing on a 24x24-core architecture, when compared against a baseline solution that does not repartition the dynamic graph. We estimated that DREDGE incurs only 1.5% area and 2.1% power overheads over an ARM A5 processor core. CCS CONCEPTS • Hardware $\\rightarrow$ Hardware accelerators; Application specific processors; • Mathematics of computing $\\rightarrow$ Graph theory;","PeriodicalId":391209,"journal":{"name":"Proceedings of the 56th Annual Design Automation Conference 2019","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"DREDGE\",\"authors\":\"Andrew McCrabb, Eric Winsor, V. Bertacco\",\"doi\":\"10.1145/3316781.3317804\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph-based algorithms have gained significant interest in several application domains. Solutions addressing the computational efficiency of such algorithms have mostly relied on many-core architectures. Cleverly laying out input graphs in storage, by placing adjacent vertices in a same storage unit (memory bank or cache unit), enables fast access during graph traversal. Dynamic graphs, however, must be continuously repartitioned to leverage this benefit. Yet software repartitioning solutions rely on costly, cross-vault communication to query and optimize the graph layout between algorithm iterations. In this work, we propose DREDGE, a novel hardware solution to provide heuristic repartitioning optimizations in the background without extra communication. Our evaluation indicates that we achieve a $1.9 x$ speedup, on average, over several graph algorithms and datasets, executing on a 24x24-core architecture, when compared against a baseline solution that does not repartition the dynamic graph. We estimated that DREDGE incurs only 1.5% area and 2.1% power overheads over an ARM A5 processor core. CCS CONCEPTS • Hardware $\\\\rightarrow$ Hardware accelerators; Application specific processors; • Mathematics of computing $\\\\rightarrow$ Graph theory;\",\"PeriodicalId\":391209,\"journal\":{\"name\":\"Proceedings of the 56th Annual Design Automation Conference 2019\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-06-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 56th Annual Design Automation Conference 2019\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3316781.3317804\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 56th Annual Design Automation Conference 2019","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3316781.3317804","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Graph-based algorithms have gained significant interest in several application domains. Solutions addressing the computational efficiency of such algorithms have mostly relied on many-core architectures. Cleverly laying out input graphs in storage, by placing adjacent vertices in a same storage unit (memory bank or cache unit), enables fast access during graph traversal. Dynamic graphs, however, must be continuously repartitioned to leverage this benefit. Yet software repartitioning solutions rely on costly, cross-vault communication to query and optimize the graph layout between algorithm iterations. In this work, we propose DREDGE, a novel hardware solution to provide heuristic repartitioning optimizations in the background without extra communication. Our evaluation indicates that we achieve a $1.9 x$ speedup, on average, over several graph algorithms and datasets, executing on a 24x24-core architecture, when compared against a baseline solution that does not repartition the dynamic graph. We estimated that DREDGE incurs only 1.5% area and 2.1% power overheads over an ARM A5 processor core. CCS CONCEPTS • Hardware $\rightarrow$ Hardware accelerators; Application specific processors; • Mathematics of computing $\rightarrow$ Graph theory;