R. Drost, C. Forrest, B. Guenin, R. Ho, A. Krishnamoorthy, D. Cohen, J. Cunningham, B. Tourancheau, A. Zingher, A. Chow, G. Lauterbach, I. Sutherland
Memory systems for conventional large-scale computers provide only limited data bandwidth (bytes/s) relative to their instruction execution rate (flop/s). The resulting bottleneck limits the bytes/flop that a processor may access from the full memory footprint of a machine and can hinder overall performance. This paper discusses physical and functional views of memory hierarchies and examines existing ratios of bandwidth to execution rate versus memory capacity (or bytes/flop versus capacity) found in a number of large-scale computers. The paper then explores a set of technologies (proximity communication, low-power on-chip networks, dense optical communication, and sea-of-anything interconnect) that can flatten this bandwidth hierarchy to relieve the memory bottleneck in a large-scale computer that we call "Hero".
{"title":"Challenges in building a flat-bandwidth memory hierarchy for a large-scale computer with proximity communication","authors":"R. Drost, C. Forrest, B. Guenin, R. Ho, A. Krishnamoorthy, D. Cohen, J. Cunningham, B. Tourancheau, A. Zingher, A. Chow, G. Lauterbach, I. Sutherland","doi":"10.1109/CONECT.2005.12","DOIUrl":"https://doi.org/10.1109/CONECT.2005.12","url":null,"abstract":"Memory systems for conventional large-scale computers provide only limited bytes/s of data bandwidth when compared to their flop/s of instruction execution rate. The resulting bottleneck limits the bytes/flop that a processor may access from the full memory footprint of a machine and can hinder overall performance. This paper discusses physical and functional views of memory hierarchies and examines existing ratios of bandwidth to execution rate versus memory capacity (or bytes/flop versus capacity) found in a number of large-scale computers. The paper then explores a set of technologies, proximity communication, low-power on-chip networks, dense optical communication, and sea-of-anything interconnect, that can flatten this bandwidth hierarchy to relieve the memory bottleneck in a large-scale computer that we call \"Hero\".","PeriodicalId":148282,"journal":{"name":"13th Symposium on High Performance Interconnects (HOTI'05)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131834307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
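The bytes/flop metric named in the abstract above is the ratio of data bandwidth (bytes/s) to execution rate (flop/s), evaluated at each level of the memory hierarchy. The following is a minimal sketch of that arithmetic. The level names, capacities, bandwidths, and the peak rate are hypothetical values chosen for illustration and are not taken from the paper.

```python
def bytes_per_flop(bandwidth_bytes_per_s: float, peak_flops: float) -> float:
    """Return the ratio of data bandwidth (bytes/s) to execution rate (flop/s)."""
    if peak_flops <= 0:
        raise ValueError("peak_flops must be positive")
    return bandwidth_bytes_per_s / peak_flops


# Hypothetical hierarchy levels: (name, capacity in bytes, bandwidth in bytes/s).
# Illustrative figures only; they do not come from the paper.
LEVELS = [
    ("on-chip cache", 2**20, 64e9),
    ("local DRAM", 2**33, 8e9),
    ("remote memory", 2**40, 1e9),
]
PEAK_FLOPS = 8e9  # hypothetical processor peak execution rate


def hierarchy_profile(levels, peak_flops):
    """Pair each level's capacity with its bytes/flop ratio."""
    return [(name, cap, bytes_per_flop(bw, peak_flops)) for name, cap, bw in levels]


if __name__ == "__main__":
    for name, cap, ratio in hierarchy_profile(LEVELS, PEAK_FLOPS):
        print(f"{name:>14}: capacity {cap:.2e} bytes, {ratio:.3f} bytes/flop")
```

In this made-up profile the ratio falls as capacity grows. A flat-bandwidth hierarchy in the abstract's sense would keep the ratio roughly constant across levels.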
Lloyd Dickman, Greg Lindahl, Dave Olson, Jeffery H. Rubin, J. Broughton
Clusters are now a dominant model for high-capacity, scalable computing based on a commodity cost structure. This paper describes the first-generation PathScale™ InfiniPath™ adapter, a single-chip ASIC that directly connects HyperTransport™-attached processors, such as the AMD Opteron™, to the InfiniBand™ network fabric. In addition to providing ultra-low communications latency, the PathScale InfiniPath adapter achieves high bandwidth from very small to large message sizes. Its performance also scales on multi-core processor nodes. Use of the InfiniBand switching fabric permits high bandwidth to be realized at a commodity fabric price point.
{"title":"Pathscale InfiniPath: a first look","authors":"Lloyd Dickman, Greg Lindahl, Dave Olson, Jeffery H. Rubin, J. Broughton","doi":"10.1109/CONECT.2005.29","DOIUrl":"https://doi.org/10.1109/CONECT.2005.29","url":null,"abstract":"Clusters are now a dominant model for high-capacity, scalable computing based on a commodity cost structure. This paper describes the first generation PathScale™ InfiniPath™ adapter - a single chip ASIC directly connecting HyperTransport™ attached processors, such as the AMD Opteron™, to the InfiniBand™ network fabric. In addition to providing ultra-low communications latency, the PathScale InfiniPath adapter achieves high bandwidth from very small to large message sizes. Its performance also scales on multi-core processor nodes. Use of the InfiniBand switching fabric permits high bandwidth to be realized at a commodity fabric price point.","PeriodicalId":148282,"journal":{"name":"13th Symposium on High Performance Interconnects (HOTI'05)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130353058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The goal of this tutorial is to give its audience a comprehensive understanding of state-of-the-art research and practice in Internet infrastructure security. In addition to attacks and countermeasures, the tutorial will discuss issues such as performance, scalability, deployability, and high-speed implementation.
{"title":"Internet infrastructure security","authors":"Simon Fraser University, Scott Wakelin","doi":"10.1109/CONECT.2005.25","DOIUrl":"https://doi.org/10.1109/CONECT.2005.25","url":null,"abstract":"The goal of this tutorial is to provide a comprehensive understanding of the state-of-the-art research and practice in Internet infrastructure security, to its audience. In addition to discussions on attacks and counter-measures, issues such as performance, scalability, deployability, and high speed implementations will also be discussed.","PeriodicalId":148282,"journal":{"name":"13th Symposium on High Performance Interconnects (HOTI'05)","volume":"9 31","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120966106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This tutorial presents a comprehensive introduction to all aspects of high-speed networking, based on the book High-Speed Networking: A Systematic Approach to High-Bandwidth Low-Latency Communication. The target audience includes computer scientists and engineers who may have expertise in a narrow aspect of high-speed networking but want to gain a broader understanding of the field as a whole and of the impact that their designs have on overall network performance. The tutorial presents a systemic approach to high-speed networks, where the goal is to provide high bandwidth and low latency to distributed applications, and to deal with the high bandwidth-×-delay product that results from high-speed networking over long distances.
{"title":"High-speed networking: a systematic approach to high-bandwidth low-latency communications","authors":"J. Sterbenz","doi":"10.1109/CONECT.2005.21","DOIUrl":"https://doi.org/10.1109/CONECT.2005.21","url":null,"abstract":"This tutorial presents a comprehensive introduction to all aspects of high-speed networking, based on the book high-speed networking: a systematic approach to high-bandwidth low-latency communication. The target audience includes computer scientists and engineers who may have expertise in a narrow aspect of high-speed networking but want to gain a broader understanding of all aspects of high-speed networking and the impact that their designs have on overall network performance. This tutorial presents a systemic approach to high-speed networks, where the goal is to provide high bandwidth and low latency to distribute applications, and to deal with the high bandwidth-x-delay product that results from high-speed networking over long distances.","PeriodicalId":148282,"journal":{"name":"13th Symposium on High Performance Interconnects (HOTI'05)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127671808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
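The bandwidth-×-delay product mentioned in the abstract above is the amount of data that can be in flight on a path at once: link bandwidth multiplied by delay. The following is a minimal sketch of that calculation. The use of round-trip time as the delay, the function name, and the example link figures are assumptions made for illustration and do not come from the tutorial or the book.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Return the data in flight (bytes) for a given bandwidth and round-trip time.

    Assumption: delay is taken as round-trip time, the usual choice when
    sizing a sender's window; one-way delay would give half this value.
    """
    if bandwidth_bits_per_s < 0 or rtt_s < 0:
        raise ValueError("bandwidth and delay must be non-negative")
    return bandwidth_bits_per_s * rtt_s / 8  # 8 bits per byte


if __name__ == "__main__":
    # Hypothetical long-distance path: 1 Gb/s with a 0.5 s round-trip time.
    bdp = bandwidth_delay_product_bytes(1e9, 0.5)
    print(f"{bdp / 1e6:.1f} MB must be in flight to keep the path full")
```

For a fixed delay the product grows linearly with bandwidth, so faster long-distance links need proportionally more buffered and unacknowledged data to stay fully used.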
{"title":"Hybrid Cache Architecture for High Speed Packet Processing","authors":"Zhen Liu, Kai Zheng, Bin Liu","doi":"10.1109/conect.2005.22","DOIUrl":"https://doi.org/10.1109/conect.2005.22","url":null,"abstract":"","PeriodicalId":148282,"journal":{"name":"13th Symposium on High Performance Interconnects (HOTI'05)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121634870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}