Shin'ichi Miura, Takayuki Okamoto, T. Boku, T. Hanawa, M. Sato
2008 IEEE International Conference on Cluster Computing. Published 2008-10-31. DOI: 10.1109/CLUSTR.2008.4663781
RI2N: High-bandwidth and fault-tolerant network with multi-link Ethernet for PC clusters
Although recent high-end interconnection network devices and switches offer a high performance/cost ratio, most small to medium-sized PC clusters are still built on commodity Ethernet. To improve performance on widely used Gigabit Ethernet networks, link aggregation or binding technology is used. The Linux kernel currently includes a software solution named Linux Channel Bonding (LCB), which is based on IEEE 802.3ad Link Aggregation. However, standard LCB is poorly matched to the widely used TCP protocol, which leads to both high latency and unstable bandwidth improvement. LCB also supports fault tolerance, but its usability is insufficient. We have developed a new implementation similar to LCB, named RI2N/DRV (Redundant Interconnection with Inexpensive Network with Driver), for use on Gigabit Ethernet, with a complete software stack that is highly compatible with TCP. Our algorithm suppresses unnecessary ACK packets and packet retransmissions even under imbalanced network traffic and link failures on multiple links. It provides both high-bandwidth and fault-tolerant communication on multi-link Gigabit Ethernet. We confirmed that the system improves network performance and reliability, and that it can be applied to ordinary UNIX services such as NFS without any modification of other modules.
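The abstract does not spell out why striping packets over several links can conflict with TCP, nor how RI2N/DRV works internally. As a rough illustration only, the toy model below assumes the mismatch comes from packet reordering: packets are sent round-robin over links with unequal delay, and a cumulative-ACK receiver emits a duplicate ACK for every packet that arrives ahead of the expected sequence number. The resequencing buffer shown is a generic technique chosen for illustration, not the authors' algorithm, and all function names and delay values are hypothetical.

```python
import heapq


def stripe_round_robin(num_packets, link_delays, send_interval=1.0):
    """Return sequence numbers in arrival order.

    Packet i is sent at time i * send_interval on link i % len(link_delays)
    and arrives after that link's one-way delay (illustrative values only).
    """
    arrivals = []
    for seq in range(num_packets):
        link = seq % len(link_delays)
        arrivals.append((seq * send_interval + link_delays[link], seq))
    arrivals.sort()  # order by arrival time, ties broken by sequence number
    return [seq for _, seq in arrivals]


def count_duplicate_acks(arrival_order):
    """Count duplicate ACKs from a simplified cumulative-ACK receiver.

    Any packet above the next expected sequence number is buffered and
    answered with a duplicate ACK, as a TCP receiver does for
    out-of-order segments.
    """
    expected = 0
    buffered = set()
    dup_acks = 0
    for seq in arrival_order:
        if seq == expected:
            expected += 1
            # Drain packets that were waiting behind the gap.
            while expected in buffered:
                buffered.remove(expected)
                expected += 1
        else:
            buffered.add(seq)
            dup_acks += 1
    return dup_acks


def resequence(arrival_order):
    """Hold out-of-order packets below the transport layer.

    Packets are released strictly in sequence order, so the receiver
    above never sees a gap (assumes no packet is lost).
    """
    expected = 0
    pending = []
    delivered = []
    for seq in arrival_order:
        heapq.heappush(pending, seq)
        while pending and pending[0] == expected:
            delivered.append(heapq.heappop(pending))
            expected += 1
    return delivered


if __name__ == "__main__":
    # Two links with imbalanced delay (hypothetical numbers).
    order = stripe_round_robin(8, link_delays=[1.0, 3.5])
    print("arrival order:", order)
    print("duplicate ACKs without resequencing:", count_duplicate_acks(order))
    print("duplicate ACKs with resequencing:",
          count_duplicate_acks(resequence(order)))
```

In this model, balanced links produce no duplicate ACKs, while imbalanced links do unless packets are put back in order before they reach the transport layer. A real driver would also have to handle loss and link failure, which this sketch ignores.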