Proactive deadlock prevention based on traffic classification sub-graphs for triplet-based NoC TriBA-cNoC

Microprocessors and Microsystems (IF 2.6; Zone 4, Computer Science; JCR Q3, Computer Science, Hardware & Architecture). Pub Date: 2024-10-01. Epub Date: 2024-08-31. DOI: 10.1016/j.micpro.2024.105091

Karim Soliman, Shi Feng, Ruan Shengqiang, Chunfeng Li

Volume 110, Article 105091 (Journal Article). https://www.sciencedirect.com/science/article/pii/S0141933124000863

Citations: 0

Abstract

Network topology and routing algorithms are pivotal design decisions that strongly affect the performance of Network-on-Chip (NoC) systems. As core counts rise, so does competition for shared resources, which makes routing algorithms that avoid deadlock essential to network efficiency. This research builds on the Triplet-Based Architecture (TriBA) and its Distributed Minimal Routing Algorithm (DM4T) to overcome the limitations of previous approaches. While DM4T outperforms earlier routing algorithms, its deterministic nature and its potential for circular dependencies during routing can lead to deadlocks and congestion. This work addresses these vulnerabilities while retaining the performance benefits of TriBA and DM4T. It introduces an approach that merges a proactive deadlock prevention mechanism with Intermediate Adjacent Shortest Path Routing (IASPR). The combination guarantees both deadlock-free and livelock-free routing, ensuring reliable communication within the network. The key to this integration is a flow-model-based technique for categorizing data transfers, which prevents circular dependencies from forming and reduces redundant distance calculations during routing. As a result, the proposed approach improves both routing latency and throughput. To assess TriBA network topologies under varying configurations, extensive simulations were run on TriBA networks of 9 and 27 nodes using DM4T, IASPR, and the proactive deadlock prevention method. The simulations used the gem5 simulator with the Garnet 3.0 network model and a standalone protocol for synthetic traffic, at high injection rates, across diverse synthetic traffic patterns and PARSEC benchmark suite applications. They show reductions in average latency of 40.17% and 34.05% compared to the lookup table and DM4T, respectively, together with increases in average throughput of 7.48% and 5.66%.
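The notion of a circular dependency can be made concrete with a small sketch. The code below is a generic illustration and not the authors' algorithm: it models channel dependencies as a directed graph and checks it for cycles, since an acyclic channel dependency graph is the classic sufficient condition for deadlock-free deterministic routing. The channel names (`c01`, `c12`, `c20`) and the three-node ring are hypothetical, chosen only to resemble one triangle of links.

```python
from collections import defaultdict


def has_cycle(dependencies):
    """Return True if the directed graph given as (src, dst) edges has a cycle."""
    graph = defaultdict(list)
    for src, dst in dependencies:
        graph[src].append(dst)

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = defaultdict(int)      # every node starts WHITE (0)

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:  # back edge: a circular dependency exists
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    # Snapshot the keys: visit() may add entries for nodes with no outgoing edges.
    return any(color[n] == WHITE and visit(n) for n in list(graph))


# Hypothetical example: three channels around a triangle, each waiting on the next.
ring = [("c01", "c12"), ("c12", "c20"), ("c20", "c01")]

# Same channels with one dependency disallowed, e.g. by restricting a traffic
# class to a sub-graph (an assumption made for illustration only).
restricted = [("c01", "c12"), ("c12", "c20")]

print(has_cycle(ring))        # True: packets can block each other indefinitely
print(has_cycle(restricted))  # False: no circular wait is possible
```

A prevention scheme in this spirit ensures by construction that the set of permitted dependencies never closes a loop, so no run-time detection or recovery is required.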


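Source journal

Microprocessors and Microsystems (Engineering & Technology: Electrical and Electronic Engineering)

CiteScore: 6.90
Self-citation rate: 3.80%
Articles published: 204
Review time: 172 days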

Journal description: Microprocessors and Microsystems: Embedded Hardware Design (MICPRO) is a journal covering all design and architectural aspects related to embedded systems hardware. This includes different embedded system hardware platforms, ranging from custom hardware via reconfigurable systems and application-specific processors to general-purpose embedded processors. Special emphasis is put on novel complex embedded architectures, such as systems on chip (SoC), systems on a programmable/reconfigurable chip (SoPC), and multi-processor systems on a chip (MPSoC), as well as their memory and communication methods and structures, such as network-on-chip (NoC). Design automation of such systems, including methodologies, techniques, flows, and tools for their design, as well as novel designs of hardware components, falls within the scope of this journal. Novel cyber-physical applications that use embedded systems are also central to this journal. While software is not the main focus of this journal, methods of hardware/software co-design, as well as application restructuring and mapping to embedded hardware platforms that consider the interplay between software and hardware components with emphasis on hardware, are also within the journal's scope.
