Proceedings of the 2nd International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems: Latest Articles

Bringing minimal routing back to HPC through silicon photonics: a study of "flexfly" architectures with the structural simulation toolkit (SST)
Jeremiah J. Wilke
In system-level interconnects for high-performance computing (HPC), low-diameter hierarchical topologies like dragonfly are gaining in popularity. The topologies require adaptive routing schemes for high performance, but using non-minimal paths can stress the long-distance inter-group links that are the most expensive and scarce network resource. We introduce "Flexfly", a network design incorporating optical switches that steers bandwidth onto minimal paths instead of diverting packets, alleviating contention. Performance results and the simulation methodology using the Structural Simulation Toolkit (SST) are introduced.
DOI: https://doi.org/10.1145/3073763.3073775
Published: 2017-01-25
Citations: 1
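The trade-off named in the Flexfly abstract above (non-minimal routing relieves hot links but consumes more of the scarce inter-group links) can be illustrated with a toy model. The sketch below is not taken from the paper or from SST; it assumes a hypothetical dragonfly with exactly one undirected global link per pair of groups, and the function name `global_link_load` is made up for illustration. Minimal routing uses the direct global link, while the non-minimal case is modeled as Valiant-style routing that splits each demand evenly over all intermediate groups.

```python
def global_link_load(traffic, num_groups, valiant=False):
    """Return {global link: carried load} for a toy dragonfly.

    traffic: dict mapping (src_group, dst_group) -> demand.
    Assumption: one undirected global link between every pair of groups.
    """
    load = {}
    for (src, dst), demand in traffic.items():
        if src == dst:
            continue  # intra-group traffic never touches a global link
        mids = [m for m in range(num_groups) if m not in (src, dst)]
        if valiant and mids:
            # Non-minimal: two global hops via each intermediate group.
            paths = [[(src, m), (m, dst)] for m in mids]
        else:
            # Minimal: a single global hop on the direct link.
            paths = [[(src, dst)]]
        share = demand / len(paths)
        for path in paths:
            for a, b in path:
                link = (min(a, b), max(a, b))
                load[link] = load.get(link, 0.0) + share
    return load


# Adversarial pattern: all traffic goes from group 0 to group 1.
demand = {(0, 1): 6.0}
minimal = global_link_load(demand, num_groups=4)
nonminimal = global_link_load(demand, num_groups=4, valiant=True)
print(max(minimal.values()), sum(minimal.values()))
print(max(nonminimal.values()), sum(nonminimal.values()))
```

In this toy case the non-minimal scheme halves the peak link load but doubles the total global-link bandwidth consumed, which is the pressure that steering bandwidth onto minimal paths is meant to avoid.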
A survey of low power NoC design techniques
Emmanuel Ofori-Attah, Michael Opoku Agyeman
As we enter the billion-transistor era, the NoC, once deemed the solution, is falling short because of the high power consumption of its components. Several techniques have been proposed over the years to improve NoC performance at the expense of power efficiency. However, low-power design is an essential requirement of future NoC-based SoC applications. Power dissipation can be reduced through efficient routers, architecture-level saving techniques and communication links. This paper presents recent contributions and efficient power-saving techniques at the router, NoC architecture and communication link levels.
DOI: https://doi.org/10.1145/3073763.3073767
Published: 2017-01-25
Citations: 14
Network-on-chip service guarantees on the Kalray MPPA-256 Bostan processor
B. Dinechin, Amaury Graillat
The Kalray MPPA-256 Bostan manycore processor implements a clustered architecture, where clusters of cores share a local memory, and a DMA-capable network-on-chip (NoC) connects the clusters. The NoC implements wormhole switching without virtual channels, with source routing, and can be configured for maximum flow rate and burstiness at ingress. We describe and illustrate the techniques used to configure the MPPA NoC for guaranteed services. Our approach is based on three steps: global selection of routes between end-points and computation of flow rates, by solving the max-min fairness problem with unsplittable paths; configuration of the flow burstiness parameters at ingress, by solving an acyclic set of linear inequalities; and end-to-end latency upper bound computation, based on the principles of separated flow analysis (SFA). In this paper, we develop the last two steps, taking advantage of the effects of NoC link shaping on the leaky-bucket arrival curves of flows.
DOI: https://doi.org/10.1145/3073763.3073770
Published: 2017-01-25
Citations: 23
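The leaky-bucket arrival curves mentioned in the Kalray abstract above come from network calculus. As a minimal sketch (not the paper's separated flow analysis), the block below applies the textbook single-node result: a flow with arrival curve sigma + rho * t crossing a rate-latency server with rate R and latency T has a delay bound of T + sigma / R and a backlog bound of sigma + rho * T, provided rho <= R. The class and function names, and the units (flits and cycles), are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeakyBucket:
    sigma: float  # burstiness, in flits (assumed unit)
    rho: float    # sustained rate, in flits per cycle (assumed unit)

    def arrival(self, t: float) -> float:
        """Upper bound on the traffic injected in any window of length t."""
        return self.sigma + self.rho * t if t > 0 else 0.0


def delay_bound(flow: LeakyBucket, rate: float, latency: float) -> float:
    """Worst-case delay through a rate-latency server (rate R, latency T)."""
    if flow.rho > rate:
        raise ValueError("sustained rate exceeds service rate: no finite bound")
    return latency + flow.sigma / rate


def backlog_bound(flow: LeakyBucket, rate: float, latency: float) -> float:
    """Worst-case buffer occupancy at the same server."""
    if flow.rho > rate:
        raise ValueError("sustained rate exceeds service rate: no finite bound")
    return flow.sigma + flow.rho * latency


flow = LeakyBucket(sigma=8.0, rho=0.25)
print(delay_bound(flow, rate=1.0, latency=4.0))    # 12.0 cycles
print(backlog_bound(flow, rate=1.0, latency=4.0))  # 9.0 flits
```

Lowering the burstiness configured at ingress reduces both bounds, which is why the ingress burstiness parameters are part of the configuration problem.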
Model-based framework for networks-on-chip design space exploration
YongTing Hu, Daniel Mueller-Gritschneder, Ulf Schlichtmann
With increasing circuit density, more cores are integrated. Networks-on-chip (NoCs) have emerged as an interconnect solution. Many router architectures, NoC topologies and routing algorithms have been developed to improve NoC design, which opens a large design space to explore. The exploration requires various models and tools to evaluate NoCs. This paper therefore proposes a model-based framework that integrates different evaluations. Each NoC design is processed as one model using the Eclipse Modeling Framework (EMF). Models can be used in code generation to produce different evaluation models, including ORION, SystemC and LISNoC Verilog descriptions. Execution support is further developed to compile, execute and synthesize the models. The framework is evaluated with both a real multimedia application and random traffic tests. Various aspects of the evaluation are reported, including latency, throughput, buffer utilization, area and power.
DOI: https://doi.org/10.1145/3073763.3073769
Published: 2017-01-25
Citations: 2
Interconnects for next generation SoC designs
S. Shah
As the number of functional IP blocks connected on a die increases, SoC development becomes constrained by the capabilities of the on-chip interconnect that connects these IP blocks together. And as the use of commercial IP increases to encompass 80% or more of a commercial SoC's functionality, innovation and differentiation between competing designs can only be expressed in how the IP is connected, as implemented by the on-chip interconnect. To keep up with the demands of the SoC, the interconnects have also become fairly complex and sophisticated. The desire to satisfy the needs of next generation SoCs, while optimizing area, processing efficiency and power consumption, is driving innovation in switch designs, routing algorithms, transport mechanisms, Quality of Service and coherency schemes. The problem space is big and perhaps more complex in certain ways than that of data networks. The changing application requirements are also changing how we look at Service Level Agreements (SLAs) within the SoC. The SLAs for next generation interconnects have to go beyond delay and bandwidth considerations to also include resiliency, fault tolerance, and security. In this talk, I will discuss the challenges in building next generation interconnects, the innovation taking place to address these challenges and how SoC interconnects differ from the interconnects in data networks.
DOI: https://doi.org/10.1145/3073763.3073771
Published: 2017-01-25
Citations: 0
Software-defined board- and chip-level optical interconnects for multi-socket communication and disaggregated computing
N. Pleros, N. Terzenidis, T. Alexoudi, K. Vyrsokinos, G. Kanellos, D. Syrivelis
The vast amount of new data being generated is outpacing the development of infrastructures and continues to grow at much higher rates than Moore's law, a problem commonly referred to as the "data deluge problem". This leaves current computational machines struggling to reach exascale processing power by 2020, and this is where the energy boundary raises a second alarm: a reasonable power envelope for future supercomputers has been projected to be 20 MW, while the world's current No. 1 supercomputer, Sunway TaihuLight, provides 93 Pflops and already requires 15.37 MW. This simply means that we have so far reached less than 10% of the exascale target but already consume more than 75% of the targeted energy limit! The current way out follows the paradigm of disaggregating and disintegrating resources while massively introducing optical technologies for interconnect purposes. Disaggregating computing from memory and storage modules allows for flexible and modular settings in which hardware requirements can be tailored to the energy and performance metrics targeted per application. At the same time, optical interconnect and photonic integration technologies are rapidly replacing electrical interconnects and continuously penetrating deeper hierarchy levels: silicon photonics has brought optical technology into the computing environment, starting from rack-to-rack and gradually shifting towards board-level communications. In this article, we present our recent work towards implementing on-board single-mode optical interconnects that can support Software Defined Networking (SDN), allowing for programmable and flexible computational settings that can quickly adapt to application requirements. We present a programmable 4×4 silicon photonic switch that supports SDN through the use of Bloom filter (BF) labeled router ports. Our scheme significantly simplifies packet forwarding as it negates the need for large forwarding tables, while supporting network size and topology changes through simple modifications of the assigned BF labels. We demonstrate 1×4 switch operation, controlling the Si-Pho switch with a Stratix V FPGA board that is responsible for processing the packet ID and correlating its destination with the appropriate BF-labeled switch output port. Moving towards high-capacity board-level settings, we discuss the architecture and technology currently promoted by the recently started H2020 project ICT-STREAMS, where single-mode optical PCBs hosting Si-based routing modules and mid-board transceiver optics are expected to enable a massive any-to-any, buffer-less, collision-less and extremely low-latency routing platform with 25.6 Tb/s aggregate throughput. This architecture and technology are also extended to support resource disaggregation in data centers, as currently pursued in the H2020 project dREDBox, where the any-to-any collisionless routing scheme is proposed for interconnecting disaggregated compute and memory blocks in order to minimize remote memory access latency and energy consumption.
DOI: https://doi.org/10.1145/3073763.3073776
Published: 2017-01-25
Citations: 0
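The Bloom-filter port labels described in the abstract above replace a forwarding table with a membership test per output port. The following is a software sketch of that idea only, assuming each port's label encodes the set of destination IDs reachable through it; the filter size, hash construction, class name and destination names are all hypothetical and are not taken from the paper or its FPGA implementation.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter stored as an integer bit mask."""

    def __init__(self, num_bits: int = 64, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item: str):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def matching_ports(port_labels, dest_id):
    """Return every output port whose label claims to reach dest_id."""
    return [port for port, label in port_labels.items() if dest_id in label]


# Label the four output ports of a hypothetical 1x4 switch.
port_labels = {port: BloomFilter() for port in range(4)}
port_labels[2].add("node-5")
port_labels[0].add("node-1")

# A topology change only requires rewriting the affected label.
port_labels[3] = BloomFilter()
port_labels[3].add("node-9")

print(matching_ports(port_labels, "node-5"))
```

A Bloom filter never misses an inserted element, so the correct port always appears in the result, but it can report false positives; the label size and number of hash functions therefore have to be chosen so that spurious matches stay rare.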
Optimal application mapping to 2D-mesh NoCs by using a tabu-based particle swarm methodology
Muhammad Obaidullah, G. Khan
A hybrid optimization scheme is presented in this paper that combines Tabu-search, Force Directed Swapping and Discrete Particle Swarm Optimization for the Network-on-Chip (NoC) mapping problem. The main goal of the optimization is to map an application core graph such that the overall communication latency and energy consumption of the NoC are minimal. Discrete Particle Swarm Optimization is used as the main optimization scheme, where each particle move is influenced by a force derived from the network traffic matrix. We also employ a Tabu-list to discourage swarm particles from revisiting the explored search space. This is done through particle reflection, which proposes an alternative route towards the intended move direction. The methodology is tested on some multimedia application core graphs as well as on randomly generated large networks of synthetic cores. It was found that, on average, this hybrid algorithm required fewer iterations to reach an optimal solution than other existing algorithms, without losing NoC mapping quality.
DOI: https://doi.org/10.1145/3073763.3073766
Published: 2017-01-25
Citations: 3
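The mapping problem in the abstract above needs a cost function over core placements. A commonly used proxy, assumed here rather than taken from the paper, is traffic volume multiplied by Manhattan hop distance on the mesh. The sketch below evaluates that cost and performs one tabu-restricted swap step; it is a plain swap search, not the paper's hybrid particle swarm method, and all names (`hops`, `comm_cost`, `best_swap`) are illustrative.

```python
from itertools import combinations


def hops(tile_a: int, tile_b: int, width: int) -> int:
    """Manhattan distance between two tiles of a row-major 2D mesh."""
    ax, ay = tile_a % width, tile_a // width
    bx, by = tile_b % width, tile_b // width
    return abs(ax - bx) + abs(ay - by)


def comm_cost(mapping, traffic, width):
    """Sum of volume x hop count over all communicating core pairs.

    mapping: core -> tile index; traffic: (src_core, dst_core) -> volume.
    """
    return sum(vol * hops(mapping[s], mapping[d], width)
               for (s, d), vol in traffic.items())


def best_swap(mapping, traffic, width, tabu):
    """Try every core swap not in the tabu set and return the cheapest.

    Returns (cost, swapped_pair, new_mapping), or None if all are tabu.
    """
    best = None
    for a, b in combinations(sorted(mapping), 2):
        if (a, b) in tabu:
            continue  # skip moves that were already explored
        trial = dict(mapping)
        trial[a], trial[b] = trial[b], trial[a]
        cost = comm_cost(trial, traffic, width)
        if best is None or cost < best[0]:
            best = (cost, (a, b), trial)
    return best


# Four cores on a 2x2 mesh; A and B communicate heavily but sit diagonally.
mapping = {"A": 0, "B": 3, "C": 1, "D": 2}
traffic = {("A", "B"): 10, ("C", "D"): 1}
print(comm_cost(mapping, traffic, width=2))        # 22
cost, pair, mapping = best_swap(mapping, traffic, 2, tabu=set())
print(cost, pair)                                  # 11 ('A', 'C')
```

Adding each accepted pair to `tabu` before the next step keeps the search from undoing its own moves, which is the role the Tabu-list plays in the abstract.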
Transparent lifetime built-in self-testing of networks-on-chip through the selective non-concurrent testing of their communication channels
Marco Balboni, D. Bertozzi
In some application domains (e.g., mission-critical systems), proactive detection of reliability threats or prompt fault containment are mandatory in order to avoid or limit the malfunctioning of electronic systems as an effect of the onset of permanent faults at runtime. As an essential milestone for the design of these systems, this paper presents a distributed and lightweight control framework for the built-in self-testing of networks-on-chip (NoCs) in the background while applications are running. The main idea of this concurrent online testing framework consists of modularizing the NoC into communication channels, of selectively taking such channels offline for non-concurrent testing, and of reconfiguring the NoC routing function to route packets around the temporary blockages to preserve network availability.
DOI: https://doi.org/10.1145/3073763.3073765
Published: 2017-01-25
Citations: 0
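The core idea in the abstract above, taking a channel offline for testing and routing packets around it, can be pictured with a small graph search. The sketch below is an illustration under simplifying assumptions, not the paper's distributed control framework: it models a 2D mesh with undirected channels, uses breadth-first search to find a shortest path that avoids channels under test, and ignores deadlock freedom, which a real routing reconfiguration would have to preserve. The function names are hypothetical.

```python
from collections import deque


def mesh_links(width: int, height: int):
    """Return the set of undirected channels of a width x height mesh."""
    links = set()
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                links.add(((x, y), (x + 1, y)))
            if y + 1 < height:
                links.add(((x, y), (x, y + 1)))
    return links


def route(src, dst, links, offline=frozenset()):
    """Shortest path from src to dst that avoids channels under test."""
    adjacency = {}
    for a, b in links:
        if (a, b) in offline or (b, a) in offline:
            continue  # channel is temporarily blocked for testing
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adjacency.get(node, [])):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # destination unreachable with these channels offline


links = mesh_links(3, 3)
print(route((0, 0), (2, 0), links))
print(route((0, 0), (2, 0), links, offline={((0, 0), (1, 0))}))
```

With the channel between (0, 0) and (1, 0) under test, the path grows from two hops to four, which shows the availability cost of testing one channel at a time.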
Secure communications in wireless network-on-chips
F. Pereñíguez-Garcia, José L. Abellán
Wireless on-chip communication is an emerging technology that is currently being adopted to reduce the latency and energy consumption of network transactions in many-core systems. The reason is that the multi-hop nature of conventional electrical networks-on-chip has led to the point of diminishing returns, which worsens as the number of hops increases to meet the ever-increasing core count in many-core systems. A Wireless NoC (WNoC) can broadcast network messages more efficiently, so current research is exploring hybrid NoC designs composed of an electrical NoC and a WNoC to reach the desired performance improvement. Nonetheless, so far, nobody has addressed the problem of network attacks when using a WNoC. In this work, we propose a security mechanism for a 64-core system with a hybrid NoC implementing ECONO cache coherence. Our experimental evaluation using multi-threaded applications from state-of-the-art benchmark suites reveals that the most lightweight technology designed to secure broadcast messages through hash-based functions can lead to more than 30% performance degradation. In addition, based on our study, we also propose tolerable latencies that must be achieved in future designs to guarantee truly lightweight secure WNoCs.
DOI: https://doi.org/10.1145/3073763.3073768; published 2017-01-25
Citations: 11
BXI: designing a network for eXascale
Jean-Pierre Panziera
BXI, the Bull eXascale Interconnect, is the new interconnection network for High Performance Computing developed by Bull, now an Atos company. First, an overview of the BXI network is presented. It is designed and optimized for HPC workloads at very large scale. The BXI network is based on the Portals 4 protocol and permits a complete offload of communication primitives to hardware, thus enabling independent progress of computation and communication. We then describe the two BXI ASIC components, the network interface and the switch, and the BXI software environment. The fabric management integrates features for monitoring, performance analysis, quick traffic re-routing, and job isolation for performance and security. Finally, we explain how the Bull eXascale platform integrates BXI to build a large-scale parallel system, and we present some results obtained on the first BXI systems.
DOI: https://doi.org/10.1145/3073763.3073774; published 2017-01-25
Citations: 0