There is a growing need for high-radix switches in data centres and high-performance computing. Current computing systems are interconnected using large numbers of relatively low-radix (32-48 port) switches that restrict scalability and performance while increasing cost and management complexity. In parallel, there is growing interest in dense rack-scale computing, in which a single rack can contain several thousand network nodes. To meet these demands, we recently demonstrated a flexible optical switch architecture using fast tuneable lasers and coherent receivers that scales to over 1000 ports. However, using traditional clock and data recovery (CDR) circuits in this, or any, optical packet switch incurs large latency and throughput penalties, because the receiver must resynchronise on each new connection. In this talk, we will address the challenges of building a fully synchronous optical switch network, at rack scale or beyond, in which a reference clock is distributed to every node to reduce resynchronisation overhead. We will first present results from preliminary FPGA-based experiments demonstrating the viability of synchronising a rack-scale network. We will then discuss the constraints on port count, range and bit rate that would limit the ability to build larger synchronous systems in this way.
{"title":"Enabling high performance rack-scale optical switching through global synchronisation","authors":"Kari A. Clark, Phillip Watt","doi":"10.1145/3073763.3073773","DOIUrl":"https://doi.org/10.1145/3073763.3073773","url":null,"abstract":"There is a growing need for high radix switches in data centres and high performance computing. Current computing systems are interconnected using large numbers of relatively low radix (32--48 port) switches that restrict scalability and performance, while increasing cost and management complexity. In parallel, there is a growing interest in dense rack scale computing in which a single rack can contain several thousand network nodes. To meet these demands, we recently demonstrated a flexible optical switch architecture using fast tuneable lasers and coherent receivers which scales to over 1000 ports. However, using traditional clock data recovery circuits in this or any optical packet switch results in large latency and throughput penalties due to resynchronisation on each new connection. In this talk, we will address the challenges of building a fully synchronous optical switch network, of rack-scale or greater, in which a reference clock is distributed to every node to reduce resynchronisation overhead. We will firstly present results from preliminary FPGA-based experiments demonstrating the viability of synchronising a rack scale network. We will then discuss the limitations on port count, range and bit rate which would limit the ability to build larger synchronous systems in this way.","PeriodicalId":20560,"journal":{"name":"Proceedings of the 2nd International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems","volume":"101 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85813182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Many computational workloads from commercial and scientific fields place high demands on total throughput and energy efficiency. For example, the largest radio telescope, to be built in South Africa and Australia, combines cost, performance and power targets that cannot be met by the technological development expected before its installation. In processor architecture, there is a design trade-off between cost and power efficiency on the one hand and single-thread performance on the other. Hence, to achieve high system power efficiency, large-scale parallelism has to be employed. In order to keep wire lengths short, and with them network delays, energy losses and cost low, the volume of the compute nodes and network switches has to be reduced to a minimum, hence the term microserver. Our DOME microserver compute card measures 130 by 7.5 by 65 mm. The presented switch module is confined to the same area (140 by 55 mm), yet is deeper (40 mm) because of its 630-pin high-speed connector. For 64 ports of 10 Gbit Ethernet (10GBASE-KR), our switch consumes at most about 150 W. In addition to the switch ASIC (Intel FM6000 series), the power converters, clock generation, configuration memory and management processor are integrated on a second PCB. The switch management ("Control Point") is implemented on a separate compute node. In the talk, options for integrating the management into the switch, within the same volume as now, will be discussed. Another topic covered is the cooling of the microserver, and of the switch in particular, using (warm) water in the infrastructure and heat pipes on the module.
{"title":"Microserver + micro-switch = micro-datacenter","authors":"F. Abel, A. Doering","doi":"10.1145/3073763.3073772","DOIUrl":"https://doi.org/10.1145/3073763.3073772","url":null,"abstract":"Many computational workloads from commercial and scientific fields have high demands in total throughput, and energy efficiency. For example the largest radio telescope, to be built in South Africa and Australia combines cost, performance and power targets that cannot be met by the technological development until its installation. In processor architecture a design tradeoff between cost and power efficiency against single-thread performance is observed. Hence, to achieve a high system power efficiency, large-scale parallelism has to be employed. In order to maintain wire length, and hence network delays, energy losses, and cost, the volume of compute nodes and network switches has to be reduced to a minimum âĂŞ hence the term microserver. Our DOME microserver compute card measures 130 by 7.5 by 65 mm3. The presented switch module is confined to the same area (140 by 55 mm2), yet is deeper (40mm) because of the 630-pin high-speed connector. For 64 ports of 10Gbit Ethernet (10Gbase-KR) our switch consumes about 150W maximal. In addition to the switch ASIC (Intel FM6000 series), the power converters, clock generation, configuration memory and management processor is integrated on a second PCB. The switch management (âĂIJControl PointâĂİ) is implemented in a separate compute node. In the talk options to integrate the management into the switch (same volume as now) will be discussed. Another topic covered is the cooling of the microserver, and of the switch in particular, using (warm) water in the infrastructure and heat pipes on the module.","PeriodicalId":20560,"journal":{"name":"Proceedings of the 2nd International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems","volume":"94 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79613038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Monobrata Debnath, Dimitris Konstantinou, C. Nicopoulos, G. Dimitrakopoulos, Wei-Ming Lin, Junghee Lee
Implementing cost-effective congestion control within a Network-on-Chip (NoC) is a major design challenge. Whenever congestion awareness and/or mitigation is desired, architects typically rely on adaptive routing algorithms, which aim to (intelligently) balance the traffic load throughout the NoC. Nevertheless, the hardware cost incurred by such solutions is considerable, since it entails collecting and propagating traffic-related information and guaranteeing deadlock freedom. In this paper, we explore the potential of simultaneous edge and in-network traffic throttling as a low-cost alternative to adaptive routing techniques. Without any reliance on adaptivity in the routing algorithm, combined throttling is demonstrated to yield throughput improvements that, in most cases, exceed those of state-of-the-art adaptive routing algorithms, at a significantly lower cost.
{"title":"Low-cost congestion management in networks-on-chip using edge and in-network traffic throttling","authors":"Monobrata Debnath, Dimitris Konstantinou, C. Nicopoulos, G. Dimitrakopoulos, Wei-Ming Lin, Junghee Lee","doi":"10.1145/3073763.3073764","DOIUrl":"https://doi.org/10.1145/3073763.3073764","url":null,"abstract":"Implementing cost effective congestion control within the Network-on-Chip (NoC) is a major design challenge. Whenever congestion awareness and/or mitigation is desired, architects typically rely on the use of adaptive routing algorithms, which aim to (intelligently) balance the traffic load throughout the NoC. Nevertheless, the hardware cost incurred by such solutions is quite considerable, since it entails the collection/propagation of traffic-related information and the provisioning of deadlock freedom guarantees. In this paper, we explore the potential of simultaneous edge and in-network traffic throttling, as a low-cost alternative to adaptive routing techniques. Without any reliance on adaptivity by the routing algorithm, combined throttling is demonstrated to yield better (in most cases) throughput improvements than state-of-the-art adaptive routing algorithms, but at a significantly lower cost.","PeriodicalId":20560,"journal":{"name":"Proceedings of the 2nd International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems","volume":"131 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74798889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 2nd International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems","authors":"","doi":"10.1145/3073763","DOIUrl":"https://doi.org/10.1145/3073763","url":null,"abstract":"","PeriodicalId":20560,"journal":{"name":"Proceedings of the 2nd International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2017-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76329148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}