Pub Date: 2015-05-07 DOI: 10.1109/ANCS.2015.7110131
Hyesook Lim, Hayoung Byun
Packet classification is one of the most essential functions that Internet routers must perform at wire speed for every incoming packet. An area-based quad-trie (AQT) for packet classification suffers in search performance because many rule nodes can be encountered during a single search. A leaf-pushing AQT improves on this by ensuring that each search path contains exactly one rule node. This paper proposes a new algorithm that further improves the search performance of the leaf-pushing AQT. The proposed algorithm builds a leaf-pushing AQT using a Bloom filter and a hash table stored in on-chip memory. The level of a rule node is identified by sequentially querying the Bloom filter, and a pointer to the rule database is obtained by accessing the hash table.
Title: Packet classification using a bloom filter in a leaf-pushing area-based quad-trie. Venue: 2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
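The level-by-level query described in this abstract can be illustrated with a minimal sketch. It assumes that each leaf node is keyed by its level plus the first `level` bits of the source and destination fields; the `BloomFilter` class, `node_key`, `build`, and `lookup` are hypothetical names chosen for illustration, not the paper's implementation. The hash table is consulted only after a positive Bloom filter answer, which also screens out false positives.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; one byte per bit for readability."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def node_key(src_bits, dst_bits, level):
    # Assumed node identity: the level and the first `level` bits of each field.
    return f"{level}|{src_bits[:level]}|{dst_bits[:level]}"


def build(leaf_nodes):
    """leaf_nodes maps (src_prefix, dst_prefix) of equal length to a rule pointer."""
    bloom, table = BloomFilter(), {}
    for (src, dst), rule_ptr in leaf_nodes.items():
        key = node_key(src, dst, len(src))
        bloom.add(key)
        table[key] = rule_ptr
    return bloom, table


def lookup(bloom, table, src_bits, dst_bits):
    """Query the Bloom filter level by level, then confirm in the hash table."""
    for level in range(min(len(src_bits), len(dst_bits)) + 1):
        key = node_key(src_bits, dst_bits, level)
        if key in bloom and key in table:  # the table check rejects false positives
            return level, table[key]
    return None
```

With leaves `("0", "0")`, `("0", "1")`, and `("10", "11")`, a packet whose fields start with `10` and `11` is resolved at level 2 after two negative queries at levels 0 and 1.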
Pub Date: 2015-05-07 DOI: 10.1109/ANCS.2015.7110137
Jianzong Wang, Lianglun Cheng
Inadequate resource allocation, lack of I/O performance prediction, and insufficient isolation degrade storage performance in multi-tenant cloud storage environments. Software-defined Storage (SDS) is an effective approach to guaranteeing Quality of Service (QoS) in data centers. However, its lack of intelligence, robustness, and self-adjustment heavily hinders the application and adoption of SDS. This paper focuses on the QoS-aware I/O resource scheduling problem, with the aim of building data centers with high availability, scalability, and QoS. We study workload characteristics, requirement analysis, the theory of QoS in SDS, and I/O scheduling strategies. We pursue these goals by proposing a mathematical model of workload burstiness, a QoS semantic description with rule execution mechanisms, and dynamic, robust I/O scheduling algorithms for multi-type resource allocation. So far, qSDS, a QoS-aware I/O scheduling framework for SDS, has been proposed for SSD/HDD hybrid storage. A preliminary evaluation on several benchmarks shows that qSDS achieves better performance than other strategies.
Title: qSDS: A QoS-Aware I/O scheduling framework towards software defined storage. Venue: 2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
Pub Date: 2015-05-07 DOI: 10.1109/ANCS.2015.7110121
Liron Schiff, Y. Afek, A. Bremler-Barr
Configuring range-based packet classification rules in network switches is crucial to all core network functionality, such as firewalls and routing. However, OpenFlow, the leading management protocol for SDN switches, lacks an interface to configure range rules directly and only provides mask-based rules, named flow entries. In this work we present ORange, the first solution to multidimensional range classification in OpenFlow. Our solution is based on paradigms used in state-of-the-art non-OpenFlow classifiers and is designed in a modular fashion, allowing future extensions and improvements. We consider switch space utilization as well as atomic update functionality, and in the network context we provide flow consistency even if flows change their entrance point to the network during policy updates, a property we name cross-entrance consistency. Our scheme achieves remarkable results and is easy to deploy.
Title: Orange: multi field openflow based range classifier. Venue: 2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
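To see why the gap between range rules and mask-based flow entries matters, consider the classic range-to-prefix expansion, which is a well-known baseline and not the ORange scheme itself. The sketch below assumes a single `width`-bit field; `range_to_prefixes` and `matches` are hypothetical helper names.

```python
def range_to_prefixes(lo, hi, width):
    """Cover the inclusive range [lo, hi] of a width-bit field with (value, mask) pairs."""
    full = (1 << width) - 1
    entries = []
    while lo <= hi:
        # Largest power-of-two block that starts at an address aligned to it.
        size = (lo & -lo) if lo else (1 << width)
        # Shrink the block until it fits inside the remaining range.
        while size > hi - lo + 1:
            size >>= 1
        entries.append((lo, full & ~(size - 1)))
        lo += size
    return entries


def matches(entries, value):
    """True if any masked entry matches the value, as a mask-based table would."""
    return any((value & mask) == (val & mask) for val, mask in entries)
```

For a 16-bit port field, the range 1024 to 65535 expands to six masked entries, and a single rule with ranges in several fields multiplies these counts, which is the space cost that range-aware schemes try to avoid.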
Pub Date: 2015-05-07 DOI: 10.1109/ANCS.2015.7110133
Cheng-Liang Hsieh, N. Weng
The OpenFlow switch in Software-Defined Networking (SDN) has changed packet classification from the standard 5-tuple to an arbitrary number of fields. The growing number of fields in a rule and the increasing number of rules in a ruleset pose great challenges for packet classification in terms of performance, storage, and update cost. In this paper, we design a two-stage packet classification system that addresses those issues by exploiting ruleset sparsity and rule-field independence. A ruleset is examined offline with proposed matrices to find representative bits from different fields in a rule. We concatenate those representative bits into sample values that divide the ruleset into several subsets in sample spaces. Each subset is given a unique address in each sample space, so a ruleset update only affects the related addresses. The proposed pre-filtering stage intersects the candidate rules from the different sample spaces and passes only highly related rules to the full-match process. Our system throughput is 356 MPPS for 1K 15-field rules and 213 MPPS for 100K 15-field rules when using a single NVIDIA K20C GPU card.
Title: Scalable many-field packet classification using multidimensional-cutting via selective bit-concatenation. Venue: 2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
Pub Date: 2015-05-07 DOI: 10.1109/ANCS.2015.7110123
Yun Qu, Hao Zhang, Shijie Zhou, V. Prasanna
Due to the rapid growth of the Internet, there is an increasing need to efficiently classify packets with many header fields against large rule sets. For example, in Software Defined Networking (SDN), the OpenFlow table lookup can require 15 packet header fields to be examined. In this paper, we present several decomposition-based packet classification implementations with efficient optimization techniques. In the searching phase, packet headers are split or combined. In the merging phase, the partial search results from all the fields are merged to generate the final result. We prototype our implementations on a state-of-the-art Field Programmable Gate Array (FPGA), a multi-core General Purpose Processor (GPP), and a Graphics Processing Unit (GPU). On FPGA, we propose two optimization techniques to divide generic ranges; modular processing elements are constructed and concatenated into a systolic array. On multi-core GPP, we parallelize both the searching and merging phases using parallel program threads. On the GPU-accelerated platform, we minimize branch divergence and reduce the data communication overhead. Experimental results show that a throughput of 500 million packets per second (MPPS) and a latency of 3 μs can be achieved for 1.5K rule sets on FPGA. We achieve 14.7 MPPS and 30.5 MPPS throughput for 32K rule sets on the multi-core GPP and GPU-accelerated platforms, respectively. As a heterogeneous solution, our GPU-accelerated packet classifier shows a 2x speedup compared to the implementation using multi-core GPP only. Compared with prior works, our designs can match long packet headers against very complex rule sets.
Title: Optimizing many-field packet classification on FPGA, multi-core general purpose processor, and GPU. Venue: 2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
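The two phases named in this abstract, per-field searching followed by merging, can be sketched with rule bit vectors. This is a generic decomposition scheme under simplifying assumptions (every field is an inclusive range, rule index equals priority, fields are scanned linearly), not the paper's optimized FPGA, GPP, or GPU design; `Rule`, `search_field`, and `classify` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    ranges: tuple  # one inclusive (lo, hi) range per header field


def search_field(rules, field, value):
    """Searching phase: bit i is set when rule i matches `value` on this field."""
    vector = 0
    for i, rule in enumerate(rules):
        lo, hi = rule.ranges[field]
        if lo <= value <= hi:
            vector |= 1 << i
    return vector


def classify(rules, header):
    """Merging phase: AND the per-field vectors and pick the lowest set bit."""
    merged = (1 << len(rules)) - 1
    for field, value in enumerate(header):
        merged &= search_field(rules, field, value)
    if merged == 0:
        return None
    # Lowest rule index is treated as highest priority (an assumption here).
    return (merged & -merged).bit_length() - 1
```

Because each field is searched independently, the per-field searches can run in parallel and only the final AND needs all partial results, which is what makes the approach attractive on parallel hardware.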
Pub Date: 2015-05-07 DOI: 10.1109/ANCS.2015.7110125
Haowei Yuan, P. Crowley
Name prefix lookup is a core building block of information-centric networking (ICN). In ICN hierarchical naming schemes, each packet has a name that consists of multiple variable-length name components, and packets are forwarded based on longest name prefix matching (LNPM). LNPM is challenging because names are longer than IP addresses and the namespace is unbounded. Recently proposed solutions have shown encouraging performance; however, most are optimized for or evaluated with a limited number of URL datasets that may not fully characterize the forwarding information base (FIB). Moreover, the worst-case scenarios of several schemes require O(k) string lookups, where k is the number of components in each prefix. Thus, the sustained performance of existing solutions is not guaranteed. In this paper, we present an LNPM design based on the binary search of hash tables, which was originally proposed for IP lookup. With this design, the worst-case number of string lookups is O(log(k)) for prefixes with up to k components, regardless of the characteristics of the FIB. We implemented the design in software and demonstrated 10 Gbps throughput with one billion synthetic longest name prefix matching rules, each containing up to seven components. We also propose level pulling to optimize the average LNPM performance, based on the observation that some prefixes have large numbers of next-level suffixes in the available URL datasets.
Title: Reliably scalable name prefix lookup. Venue: 2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
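The binary search of hash tables mentioned in this abstract can be sketched as follows, assuming one hash table per component count and marker entries that carry the best real match seen so far (the usual device in the IP-lookup version of the technique). Function names and the toy FIB are hypothetical, next hops are assumed to be non-None, and the sketch omits level pulling.

```python
def best_real_match(fib, marker):
    """Next hop of the longest real prefix of `marker`, or None if there is none."""
    for n in range(len(marker), 0, -1):
        if marker[:n] in fib:
            return fib[marker[:n]]
    return None


def build_tables(fib, max_len):
    """fib maps a name prefix (tuple of components) to a next hop."""
    tables = {level: {} for level in range(1, max_len + 1)}
    for prefix, hop in fib.items():
        tables[len(prefix)][prefix] = hop
    # Place markers on the levels that the binary search visits before
    # reaching each prefix, so a hit always means "try longer prefixes".
    for prefix in fib:
        lo, hi = 1, max_len
        while lo <= hi:
            mid = (lo + hi) // 2
            if mid < len(prefix):
                marker = prefix[:mid]
                if marker not in tables[mid]:
                    tables[mid][marker] = best_real_match(fib, marker)
                lo = mid + 1
            elif mid > len(prefix):
                hi = mid - 1
            else:
                break
    return tables


def lookup(tables, max_len, name):
    """Return (next hop or None, number of hash-table probes)."""
    components = tuple(name.strip("/").split("/"))
    lo, hi, best, probes = 1, max_len, None, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        key = components[:mid]  # shorter than mid when the name is short: a miss
        if key in tables[mid]:
            if tables[mid][key] is not None:
                best = tables[mid][key]
            lo = mid + 1
        else:
            hi = mid - 1
    return best, probes
```

With up to seven components per prefix, every lookup in this sketch probes exactly three tables, in line with the O(log(k)) bound stated in the abstract; the price is the extra marker entries stored at build time.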
Pub Date: 2015-05-07 DOI: 10.1109/ANCS.2015.7110129
Antonis Psathakis, Vassilis D. Papaefstathiou, Nikolaos Chrysos, Fabien Chaix, E. Vasilakis, D. Pnevmatikatos, M. Katevenis
This paper studies alternative Network-on-Chip architectures for emerging many-core chip multiprocessors by exploring the following design options on mesh-based networks: multiple physical networks (P), core concentration (C), express channels (X), flit widths (W), and virtual channels (V). We exhaustively evaluate all combinations of the aforementioned parameters (P, C, X, W, V), using the energy-throughput ratio (ETR) as a metric to classify network configurations. Our experimental results show that, on the one hand, with an appropriate selection of parameters (V, W), an optimized baseline 2D mesh offers the best possible ETR for NoCs with up to a few tens of cores (64-core NoC). More complicated networks, using concentration and express channels, can reduce the zero-load latency, but do not necessarily improve ETR. On the other hand, for larger CMPs, a 2D mesh with multiple physical networks is a better option: once optimized, this architectural choice can reduce the ETR by up to 46% for 256 cores.
Title: A systematic evaluation of emerging mesh-like CMP NoCs. Venue: 2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
Pub Date: 2015-05-07 DOI: 10.1109/ANCS.2015.7110114
Y. Bachar
Summary form only given. The talk will discuss how open hardware, open software, and disaggregation make it possible to build efficient data centers of any size, from small to super-big. The talk will show a real-life Facebook architecture in network, hardware, and software that will enable future data center developers to control their destiny and scale data center capacity and size at their own pace, with the most effective PUI in the industry. We will discuss Wedge, 6-Pack, FBOSS, oBMC, and many other aspects of the Facebook data centers.
Title: Disaggregation – the new way to build mega (and micro) data centers. Venue: 2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
Pub Date: 2015-05-07 DOI: 10.1109/ANCS.2015.7110128
Michel Machado, Cody Doucette, J. Byers
With the growing number of proposed clean-slate redesigns of the Internet, the need for a medium that enables all stakeholders to participate in the realization, evaluation, and selection of these designs is increasing. We believe that the missing catalyst is a meta network architecture that welcomes most, if not all, clean-slate designs on a level playing field, lowers deployment barriers, and leaves the final evaluation to the broader community. This paper presents Linux XIA, a native implementation of XIA in the Linux kernel, as a candidate. We first describe Linux XIA in terms of its architectural realizations and algorithmic contributions. We then demonstrate how to port several distinct and unrelated network architectures onto Linux XIA. Finally, we provide a hybrid evaluation of Linux XIA at three levels of abstraction in terms of its ability to: evolve and foster interoperation of new architectures, embed disparate architectures inside the implementation's framework, and maintain a comparable forwarding performance to that of the legacy TCP/IP implementation. Given this evaluation, we substantiate a previously unsupported claim of XIA: that it readily supports and enables network evolution, collaboration, and interoperability - traits we view as central to the success of any future Internet architecture.
Title: Linux XIA: an interoperable meta network architecture to crowdsource the future internet. Venue: 2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
Pub Date: 2015-05-01 DOI: 10.1109/ANCS.2015.7110142
Yi Wang, Dongzhe Tai, Ting Zhang, Linxiao Jin, Huichen Dai, B. Liu, Xin Wu
Updating rules in the flow tables of SDN switches is complex and time-consuming. We therefore propose a cache-based scheme, named FlowShadow, that improves packet processing performance and keeps the switch operating continuously while rules in the flow tables are updated. FlowShadow caches microflows in a hash table to build a fast path for packet processing. By leveraging the Action Table, FlowShadow achieves update consistency and good update performance. To examine the reliability, validity, utility, and scalability of FlowShadow, we implement it on Open vSwitch and conduct numerous experiments with different settings to measure its performance. The experimental results demonstrate that FlowShadow achieves a lookup speed of 75 million packets per second on a commodity PC under real backbone traces, a 3.4× speedup over the original Open vSwitch.
Title: Flowshadow: a fast path for uninterrupted packet processing in SDN switches. Venue: 2015 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS).
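One plausible reading of the fast path and Action Table described in this abstract is an exact-match cache whose entries point into a separate table of actions, so that changing an action does not invalidate cached microflows. The sketch below follows that reading only; `FlowKey`, `CachedSwitch`, and the action strings are hypothetical and are not taken from the FlowShadow or Open vSwitch code.

```python
from collections import namedtuple

FlowKey = namedtuple("FlowKey", "src dst proto sport dport")


class CachedSwitch:
    def __init__(self):
        self.flow_table = []        # slow path: (match dict, action_id) in priority order
        self.action_table = {}      # action_id -> action, shared by both paths
        self.microflow_cache = {}   # fast path: exact FlowKey -> action_id
        self.slow_path_lookups = 0

    def add_rule(self, match, action_id, action):
        self.flow_table.append((match, action_id))
        self.action_table[action_id] = action
        # Conservative choice for this sketch: a new rule may change decisions.
        self.microflow_cache.clear()

    def set_action(self, action_id, action):
        # Cached microflows store only the action_id, so they see the new
        # action immediately and the cache does not need to be flushed.
        self.action_table[action_id] = action

    def _slow_path(self, key):
        self.slow_path_lookups += 1
        for match, action_id in self.flow_table:
            # Fields missing from the match dict act as wildcards.
            if all(getattr(key, field) == value for field, value in match.items()):
                return action_id
        return None

    def process(self, key):
        action_id = self.microflow_cache.get(key)
        if action_id is None:
            action_id = self._slow_path(key)
            if action_id is None:
                return "drop"
            self.microflow_cache[key] = action_id
        return self.action_table[action_id]
```

In this sketch only the first packet of a microflow pays for the wildcard scan; later packets, including those arriving after `set_action`, are served from the hash table.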