{"title":"Highly Fault-tolerant NoC Routing with Application-aware Congestion Management","authors":"Doowon Lee, Ritesh Parikh, V. Bertacco","doi":"10.1145/2786572.2786590","DOIUrl":null,"url":null,"abstract":"Silicon devices are becoming less and less reliable as technology moves to smaller feature sizes. As a result, digital systems are increasingly likely to experience permanent failures during their life-time. To overcome this problem, networks-on-chip (NoCs) should be designed to, not only fulfill performance requirements, but also be robust to many fault occurrences. This paper proposes a fault- and application-aware routing framework called FATE: it leverages the diversity of communication patterns in applications for highly faulty NoCs to reduce congestion during execution. To this end, FATE estimates routing demands in applications to balance traffic load among the available resources. We propose a set of novel route-enabling rules that greatly reduce the search for deadlock-free, maximally-connected routes for any faulty 2D mesh topology, by preventing early on the exploration of routing configuration options that lead eventually to unviable solutions. Our experimental results show a 33% improvement on average saturation throughput for synthetic traffic patterns, and a 59% improvement on average packet latency for SPLASH-2 benchmarks, over state-of-the-art fault-tolerant solutions. The FATE approach is also beneficial in the complete absence of faults: indeed, it outperforms prior fully-adaptive routing techniques by improving the saturation throughput by up to 33%.","PeriodicalId":228605,"journal":{"name":"Proceedings of the 9th International Symposium on Networks-on-Chip","volume":"102 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 9th International Symposium on Networks-on-Chip","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2786572.2786590","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15
Abstract
Silicon devices are becoming less and less reliable as technology moves to smaller feature sizes. As a result, digital systems are increasingly likely to experience permanent failures during their life-time. To overcome this problem, networks-on-chip (NoCs) should be designed to, not only fulfill performance requirements, but also be robust to many fault occurrences. This paper proposes a fault- and application-aware routing framework called FATE: it leverages the diversity of communication patterns in applications for highly faulty NoCs to reduce congestion during execution. To this end, FATE estimates routing demands in applications to balance traffic load among the available resources. We propose a set of novel route-enabling rules that greatly reduce the search for deadlock-free, maximally-connected routes for any faulty 2D mesh topology, by preventing early on the exploration of routing configuration options that lead eventually to unviable solutions. Our experimental results show a 33% improvement on average saturation throughput for synthetic traffic patterns, and a 59% improvement on average packet latency for SPLASH-2 benchmarks, over state-of-the-art fault-tolerant solutions. The FATE approach is also beneficial in the complete absence of faults: indeed, it outperforms prior fully-adaptive routing techniques by improving the saturation throughput by up to 33%.