
Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects: Latest Publications

Designing packet buffers with statistical guarantees
Pub Date : 2004-08-05 DOI: 10.1109/CONECT.2004.1375202
G. Shrimali, I. Keslassy, N. McKeown
Packet buffers are an essential part of routers. In high-end routers, these buffers need to store a large amount of data at very high speeds. To satisfy these requirements, we need a memory with the speed of SRAM and the density of DRAM. A typical solution is to use hybrid packet buffers built from a combination of SRAM and DRAM, where the SRAM holds the heads and tails of per-flow packet FIFOs and the DRAM is used for bulk storage. The main challenge then is to minimize the size of the SRAM while providing reasonable performance guarantees. We analyze a commonly used hybrid architecture from a statistical perspective, and investigate how small the SRAM can get if the packet buffer designer is willing to tolerate a certain drop probability. We introduce an analytical model to represent the SRAM buffer occupancy, and derive drop probabilities as a function of SRAM size under a wide range of statistical traffic patterns. By our analysis, we show that, for low drop probability, the required SRAM size is proportional to the number of flows.
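The tail-cache behavior described in the abstract can be illustrated with a toy simulation (my sketch, not the paper's model): per-flow tail FIFOs live in SRAM, full blocks are flushed to DRAM, and cells arriving to a full SRAM are dropped. All parameter names and the uniform-arrival assumption are mine.

```python
import random

def simulate_tail_sram(num_flows, sram_cells, block_size, arrivals, seed=0):
    """Toy simulation of the tail-SRAM of a hybrid SRAM/DRAM packet buffer.

    Arriving cells accumulate in per-flow tail FIFOs held in SRAM; once a
    flow has `block_size` cells queued, the whole block is written to DRAM
    in one transfer. A cell is dropped when SRAM is at capacity.
    Returns the observed drop probability.
    """
    random.seed(seed)
    tails = [0] * num_flows          # cells currently in each flow's tail FIFO
    occupancy = 0                    # total cells held in SRAM
    drops = 0
    for _ in range(arrivals):
        f = random.randrange(num_flows)   # uniform arrivals (an assumption)
        if occupancy >= sram_cells:
            drops += 1
            continue
        tails[f] += 1
        occupancy += 1
        if tails[f] == block_size:        # full block: move it to DRAM
            tails[f] = 0
            occupancy -= block_size
    return drops / arrivals
```

With SRAM sized proportionally to the number of flows (roughly `num_flows * block_size` cells), drops vanish in this toy model, consistent with the abstract's scaling result.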
Citations: 12
Non-random generator for IPv6 tables
Pub Date : 2004-08-05 DOI: 10.1109/CONECT.2004.1375198
Mei Wang, S. Deering, T. Hain, L. Dunn
The next generation Internet Protocol, IPv6, has attracted growing attention. The characteristics of future IPv6 routing tables play a key role in router architecture and network design. In order to design and analyze efficient and scalable IP lookup algorithms for IPv6, IPv6 routing tables are needed. Analysis of existing IPv4 tables shows that there is an underlying structure that differs greatly from random distributions. Since there are few users on IPv6 at present, current IPv6 table sizes are small and unlikely to reflect future IPv6 network growth. Thus, neither randomly generated tables nor current IPv6 tables are good benchmarks for analysis. More representative IPv6 lookup tables are needed for the development of IPv6 routers. From an analysis of current IPv4 tables, algorithms are proposed for generating IPv6 lookup tables. Tables generated by the suggested methods exhibit certain features characteristic of real lookup tables, reflecting not only new IPv6 address allocation schemes but also patterns common to IPv4 tables. These tables provide useful research tools by a better representation of future lookup tables as IPv6 becomes more widely deployed.
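The idea of generating tables that are structured rather than uniformly random can be sketched as follows (a hypothetical illustration, not the paper's algorithm): synthetic IPv6 prefixes whose lengths cluster at allocation boundaries such as /32 and /48, instead of being spread uniformly over 1..128. The length mix used here is an assumption.

```python
import random

def generate_ipv6_prefixes(n, seed=0):
    """Toy generator of synthetic IPv6 route prefixes: prefix lengths
    cluster at /32, /48, and /64 (mimicking allocation boundaries) rather
    than being drawn uniformly at random."""
    random.seed(seed)
    prefixes = []
    for _ in range(n):
        plen = random.choice([32] * 5 + [48] * 4 + [64])  # assumed length mix
        # Random prefix bits, with the host part zeroed out.
        value = random.getrandbits(plen) << (128 - plen)
        prefixes.append((value, plen))
    return prefixes
```

A random-uniform generator would miss exactly the structure the paper argues matters; the point of the sketch is that even a crude length distribution already departs sharply from uniform randomness.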
Citations: 56
Efficient multi-match packet classification with TCAM
Pub Date : 2004-08-05 DOI: 10.1109/CONECT.2004.1375197
Fang Yu, R. Katz
Today's packet classification systems are designed to provide the highest priority matching result, e.g., the longest prefix match, even if a packet matches multiple classification rules. However, new network applications, such as intrusion detection systems, require information about all the matching results. We call this the multi-match classification problem. In several complex network applications, multi-match classification is immediately followed by other processing dependent on the classification results. Therefore, classification should be even faster than the line rate. Pure software solutions cannot be used due to their slow speeds. We present a solution based on ternary content addressable memory (TCAM), which produces multi-match classification results with only one TCAM lookup and one SRAM lookup per packet - about ten times fewer memory lookups than a pure software approach. In addition, we present a scheme to remove the negation format in rule sets, which can save up to 95% of TCAM space compared with the straight forward solution. We show that using our pre-processing scheme, header processing for the SNORT rule set can be done with one TCAM and one SRAM lookup using a 135 KB TCAM.
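The multi-match semantics can be made concrete with a small software model (my sketch; a real TCAM returns only the highest-priority match per lookup, and the paper's contribution is obtaining all matches with one TCAM plus one SRAM lookup): each entry is a ternary (value, mask) pattern, and a multi-match lookup returns every entry that covers the key.

```python
def tcam_multi_match(entries, key):
    """Software model of a multi-match TCAM lookup.

    `entries` is an ordered list of (value, mask) pairs; mask bits set to 1
    are 'care' bits, 0 bits are wildcards. Returns the indices of ALL
    entries whose ternary pattern matches `key`, not just the first.
    """
    return [i for i, (value, mask) in enumerate(entries)
            if (key & mask) == (value & mask)]
```

For example, a fully specified rule, a coarser prefix rule, and a wildcard default can all match the same key, and an intrusion detection system needs every one of those hits.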
Citations: 69
Efficient multicast on a terabit router
Pub Date : 2004-08-05 DOI: 10.1109/CONECT.2004.1375203
Punit Bhargava, Sriram C. Krishnan, R. Panigrahy
Multicast routing protocols and routers on the Internet enable multicast transmission by replicating packets close to the destinations, obviating the need for multiple unicast connections, thereby saving network bandwidth and improving throughput. Similarly, within a router, multicast between linecards is enabled by a multicast-capable switch fabric. A multicast cell is sent once from the source linecard to the switch fabric; the switch fabric sends the cells to all the destination linecards, obviating the need for, and the waste of, linecard-to-fabric bandwidth that would result from multiple unicast cell transmissions. For high capacity routers (several terabits), the fixed-size destination field of the cell is inadequate to specify exactly the subset of the switch ports the multicast cell should be sent to, for the number of multicast connections to be supported. Therefore, for several connections, we have to supercast, i.e., send the cell to non-subscribing linecards and have them drop the cell. We study the problem of assigning destination labels for multicast cells so that the amount of supercast, i.e., wasted bandwidth, is minimized, and the throughput of the router is maximized. We formalize this combinatorial optimization problem and prove it NP-complete and hard to approximate. We have devised several heuristic algorithms that we have implemented, and we report the experimental results. Faster heuristics can support a higher multicast connection establishment rate; slower heuristics can be invoked off-line to further optimize multicast label maps.
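The label-assignment problem can be sketched with port bitmasks (my illustration of the problem statement, not one of the paper's heuristics): a label is usable for a group only if its port set covers every destination, and any extra ports it reaches are supercast, i.e. wasted deliveries.

```python
def assign_label(group, labels):
    """Greedy label assignment for one multicast group.

    `group` and each entry of `labels` are port bitmasks. A label is
    feasible only if it covers every destination port of the group; among
    feasible labels, pick the one delivering to the fewest extra
    (non-subscribing) ports. Returns (best_label_index, wasted_ports),
    or (None, None) if no label covers the group.
    """
    best, waste = None, None
    for i, label in enumerate(labels):
        if group & ~label:                      # label misses a destination
            continue
        extra = bin(label & ~group).count("1")  # supercast port count
        if waste is None or extra < waste:
            best, waste = i, extra
    return best, waste
```

Minimizing total waste across all groups simultaneously, under a fixed label budget, is the combinatorial problem the paper proves NP-complete; this per-group greedy step is only the inner building block.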
Citations: 7
Performance evaluation of InfiniBand with PCI Express
Pub Date : 2004-08-05 DOI: 10.1109/CONECT.2004.1375193
Jiuxing Liu, A. Mamidala, Abhinav Vishnu, D. Panda
We present an initial performance evaluation of InfiniBand HCAs (host channel adapters) from Mellanox with PCI Express interfaces. We compare the performance with HCAs using PCI-X interfaces. Our results show that InfiniBand HCAs with PCI Express can achieve significant performance benefits. Compared with HCAs using 64 bit/133 MHz PCI-X interfaces, they can achieve 20%-30% lower latency for small messages. The small message latency achieved with PCI Express is around 3.8 μs, compared with the 5.0 μs with PCI-X. For large messages, HCAs with PCI Express using a single port can deliver unidirectional bandwidth up to 968 MB/s and bidirectional bandwidth up to 1916 MB/s, which are, respectively, 1.24 and 2.02 times the peak bandwidths achieved by HCAs with PCI-X. When both the ports of the HCAs are activated, HCAs with PCI Express can deliver a peak unidirectional bandwidth of 1486 MB/s and aggregate bidirectional bandwidth up to 2729 MB/s, which are 1.93 and 2.88 times the peak bandwidths obtained using HCAs with PCI-X. PCI Express also improves performance at the MPI level. A latency of 4.6 μs with PCI Express is achieved for small messages, and for large messages, unidirectional bandwidth of 1497 MB/s and bidirectional bandwidth of 2724 MB/s are observed.
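As a quick consistency check (mine, not from the paper), dividing the quoted PCI Express peak bandwidths by the quoted speedup ratios recovers the implied PCI-X baselines:

```python
# Reported PCI Express peak bandwidths (MB/s) and speedups over PCI-X.
pci_e = {"uni_1port": 968, "bi_1port": 1916, "uni_2port": 1486, "bi_2port": 2729}
ratio = {"uni_1port": 1.24, "bi_1port": 2.02, "uni_2port": 1.93, "bi_2port": 2.88}

# Implied PCI-X peak bandwidths, rounded to the nearest MB/s.
pci_x = {k: round(pci_e[k] / ratio[k]) for k in pci_e}
```

The single-port and dual-port cases both imply a PCI-X ceiling of roughly 770-950 MB/s, consistent with the 64 bit/133 MHz PCI-X bus the paper compares against.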
Citations: 26
Resilient network infrastructures for global grid computing
Pub Date : 2004-08-05 DOI: 10.1109/CONECT.2004.1375212
L. Valcarenghi
Summary form only given. Grid computing is defined as "coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations". The transport network infrastructure represents one of the main resources to be shared. Emerging high capacity intelligent grid transport network infrastructures, such as optical transport networks based on generalized multiprotocol label switching (GMPLS) and automatically switched optical networks/automatically switched transport networks (ASON/ASTN), are fostering the expansion of grid computing from local area networks (LAN) (i.e., cluster grid) to wide area networks (WAN) (i.e., global grid). Indeed they are able to guarantee the required quality of service (QoS) to heterogeneous grid applications that share the same grid network infrastructure. The tutorial addresses one particular aspect of the grid transport network QoS: resilience, i.e. the ability to overcome failures. In particular, it gives an overview of the current efforts for guaranteeing grid application resilience in spite of different types of failures, such as network infrastructure failures or computer crashes. Finally, it shows that, by tailoring the utilized recovery scheme to the type of failure that occurred, it is possible to optimize the failure recovery process.
Citations: 0
Design of a high-speed optical interconnect for scalable shared memory multiprocessors
Avinash Karanth Kodi, A. Louri
The paper proposes a highly connected optical interconnect based architecture that maximizes the channel availability for future scalable parallel computers, such as distributed shared memory (DSM) multiprocessors and cluster networks. As the system size increases, various messages (requests, responses and acknowledgments) increase in the network resulting in contention. This results in increasing the remote memory access latency and significantly affects the performance of these parallel computers. As a solution, we propose an architecture called RAPID (reconfigurable and scalable all-photonic interconnect for distributed-shared memory), that provides low remote memory access latency by providing fast and efficient unicast, multicast and broadcast capabilities using a combination of aggressively designed WDM, TDM and SDM techniques. We evaluated RAPID based on network characteristics and by simulation using synthetic traffic workloads and compared it against other networks such as electrical ring, torus, mesh and hypercube networks. We found that RAPID outperforms all networks and satisfies most of the requirements of parallel computer design such as low latency, high bandwidth, high connectivity, and easy scalability.
Citations: 45
Studying network protocol offload with emulation: approach and preliminary results
Pub Date : 2004-08-05 DOI: 10.1109/CONECT.2004.1375208
R. Westrelin, Nicolas Fugier, Erik Nordmark, Kai Kunze, E. Lemoine
To take full advantage of high-speed networks while freeing CPU cycles for application processing, the industry is proposing new techniques relying on an extended role for network interface cards such as TCP offload engine and remote direct memory access. The paper presents an experimental study aimed at collecting the performance data needed to assess these techniques. This work is based on the emulation of an advanced network interface card plugged on the I/O bus. In the experimental setting, a processor of a partitioned SMP machine is dedicated to network processing. Achieving a faithful emulation of a network interface card is one of the main concerns and it is guiding the design of the offload engine software. This setting has the advantage of being flexible so that many different offload scenarios can be evaluated. Preliminary throughput results of an emulated TCP offload engine demonstrate a large benefit. The emulated TCP offload engine indeed yields 600% to 900% improvement while still relying on memory copies at the kernel boundary.
Citations: 34
Worms vs. perimeters: the case for hard-LANs
Pub Date : 2004-08-05 DOI: 10.1109/CONECT.2004.1375206
N. Weaver, Daniel P. W. Ellis, Stuart Staniford-Chen, V. Paxson
Network worms - self-propagating network programs - represent a substantial threat to our network infrastructure. Due to the propagation speed of worms, reactive defenses need to be automatic. It is important to understand where and how these defenses need to fit in the network so that they cannot be easily evaded. As there are several mechanisms malcode authors can use to bypass existing perimeter-centric defenses, this position paper argues that substantial defenses need to be embedded in the local area network, thus creating "hard-LANs" designed to detect and respond to worm infections. When compared with conventional network intrusion detection systems (NIDSs), we believe that hard-LAN devices need to have two orders of magnitude better cost/performance, and at least two orders of magnitude better accuracy, resulting in substantial design challenges.
网络蠕虫——自我传播的网络程序——对我们的网络基础设施构成了重大威胁。由于蠕虫的传播速度,反应性防御需要是自动的。重要的是要了解这些防御需要适应网络的位置和方式,以便它们不能轻易被规避。由于恶意代码作者可以使用几种机制来绕过现有的以周边为中心的防御,本立场文件认为,需要在局域网中嵌入实质性的防御,从而创建旨在检测和响应蠕虫感染的“硬局域网”。与传统的网络入侵检测系统(nids)相比,我们认为硬局域网设备需要具有更好的两个数量级的成本/性能,以及至少两个数量级的精度,这导致了大量的设计挑战。
{"title":"Worms vs. perimeters: the case for hard-LANs","authors":"N. Weaver, Daniel P. W. Ellis, Stuart Staniford-Chen, V. Paxson","doi":"10.1109/CONECT.2004.1375206","DOIUrl":"https://doi.org/10.1109/CONECT.2004.1375206","url":null,"abstract":"Network worms - self-propagating network programs - represent a substantial threat to our network infrastructure. Due to the propagation speed of worms, reactive defenses need to be automatic. It is important to understand where and how these defenses need to fit in the network so that they cannot be easily evaded. As there are several mechanisms malcode authors can use to bypass existing perimeter-centric defenses, this position paper argues that substantial defenses need to be embedded in the local area network, thus creating \"hard-LANs\" designed to detect and respond to worm infections. When compared with conventional network intrusion detection systems (NIDSs), we believe that hard-LAN devices need to have two orders of magnitude better cost/performance, and at least two orders of magnitude better accuracy, resulting in substantial design challenges.","PeriodicalId":224195,"journal":{"name":"Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2004-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122413250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 39
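The kind of in-LAN worm containment the authors argue for can be illustrated with a minimal per-host scan-rate detector: a host that contacts too many distinct destinations in a short window is flagged. This is an illustrative sketch only, not the paper's design; the window size and threshold values are assumptions.

```python
from collections import defaultdict, deque

class ScanRateDetector:
    """Flags hosts that contact too many distinct destinations within a
    sliding time window -- a crude stand-in for the in-LAN worm detection
    a "hard-LAN" device would perform (threshold and window are assumed)."""

    def __init__(self, max_distinct_dests=10, window=1.0):
        self.max_distinct_dests = max_distinct_dests  # assumed threshold
        self.window = window                          # seconds, assumed
        self.history = defaultdict(deque)             # src -> deque of (time, dst)

    def observe(self, t, src, dst):
        """Record a connection attempt; return True if src looks like a scanner."""
        h = self.history[src]
        h.append((t, dst))
        # Expire events that fell out of the sliding window.
        while h and t - h[0][0] > self.window:
            h.popleft()
        distinct = len({d for _, d in h})
        return distinct > self.max_distinct_dests

detector = ScanRateDetector()
# A worm-like host sweeping many addresses in well under a second:
flags = [detector.observe(0.01 * i, "10.0.0.5", f"10.0.1.{i}") for i in range(20)]
print(any(flags))  # the fast sweep exceeds the threshold
```

The accuracy requirement the paper emphasizes is exactly why a real device needs something far more discriminating than this fixed threshold: at LAN line rate, even a small false-positive rate would cut off legitimate hosts.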
Evaluation of a wireless enterprise backbone network architecture
Pub Date : 2004-08-05 DOI: 10.1109/CONECT.2004.1375211
Ashish Raniwala, T. Chiueh
IEEE 802.11 wireless LAN technology is mainly used as an access network within corporate enterprises: all the WLAN access points are ultimately connected to a wired backbone to reach the Internet or enterprise computing resources. We aim to expand WLAN into an enterprise-scale backbone network technology by developing a multichannel wireless mesh network architecture called Hyacinth. Hyacinth equips each node with multiple IEEE 802.11a/b NICs and supports distributed channel assignment and routing to increase overall network throughput. We present a detailed performance evaluation of Hyacinth's multichannel mesh networking, based on both NS-2 simulations and empirical measurements from a 9-node Hyacinth prototype testbed. A key result of this study is that equipping each node of a Hyacinth network with just 3 NICs can increase total network bandwidth by a factor of 6 to 7 compared with a single-channel wireless mesh network architecture.
Citations: 56
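The channel-assignment idea behind such multichannel meshes can be illustrated with a greatly simplified, centralized greedy pass: each link picks the channel least used by links sharing one of its endpoints. This is an illustrative sketch, not Hyacinth's algorithm (which runs distributedly under a per-node NIC constraint); the topology and channel set below are invented for the example.

```python
# Greedy channel assignment for a multichannel mesh: each link takes the
# channel with the fewest uses among links that share an endpoint with it.
# Simplified, centralized sketch of the general idea only.
CHANNELS = [1, 6, 11]  # the three non-overlapping 2.4 GHz 802.11b channels

def assign_channels(links):
    """links: list of (node_a, node_b) edges. Returns {link: channel}."""
    assignment = {}
    for link in links:
        a, b = link
        # Channels already assigned to links touching either endpoint.
        used = [ch for other, ch in assignment.items() if a in other or b in other]
        # Pick the channel with the fewest interfering uses (ties -> lowest).
        assignment[link] = min(CHANNELS, key=lambda c: used.count(c))
    return assignment

mesh = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
result = assign_channels(mesh)
print(result)
```

With three orthogonal channels, adjacent links end up on different channels and can transmit simultaneously, which is the source of the multiplicative throughput gain the abstract reports.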
Journal
Proceedings. 12th Annual IEEE Symposium on High Performance Interconnects