LOP: a novel SRAM-based architecture for low power and high throughput packet classification
Xin He, Jorgen Peddersen, S. Parameswaran
International Conference on Hardware/Software Codesign and System Synthesis, 2009-10-11
DOI: 10.1145/1629435.1629455 (https://doi.org/10.1145/1629435.1629455)
Citations: 7
Abstract
Packet classification has become an important problem to solve in modern network processors used in networking embedded systems such as routers. A number of researchers have proposed algorithms for matching incoming packets from the network to pre-defined rules. Current software-based packet classification techniques have low performance, prompting many researchers to move their focus to new architectures encompassing both software and hardware components. Some of the newer hardware architectures exclusively utilize Ternary Content Addressable Memory (TCAM) to improve the performance of rule matching. However, this results in systems with high power consumption. TCAM consumes a large amount of power because it reads the entire memory array on every access, much of which is unnecessary. In this paper, we propose LOP, a novel SRAM-based architecture where incoming packets are compared against parts of all rules simultaneously until a single matching rule is found for the compared bits in the packets. LOP significantly reduces power consumption, as only a segment of the memory is compared against the incoming packet. Despite the additional time penalty to match a single packet, parallel comparison of multiple packets can improve throughput beyond that of the TCAM approaches while consuming significantly less power. Nine different benchmarks were tested in two classification systems, with results showing that LOP architectures provide high lookup rates, high throughput, and low power consumption. Compared with a state-of-the-art TCAM implementation in 65 nm CMOS technology (throughput of 495 million searches per second (Msps)), LOP saves 43% of energy consumption on average, with a throughput of 590 Msps.
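The segment-wise matching idea described in the abstract can be illustrated with a minimal software sketch. This is not the LOP hardware design: the rule encoding (ternary strings over `0`, `1`, `*`), the fixed segment width, the final verification of the surviving rule, and the tie-break by rule order are all assumptions made for illustration, and the names `segment_matches` and `classify` are hypothetical.

```python
def segment_matches(rule_seg, pkt_seg):
    """Ternary compare: '*' in the rule matches either packet bit."""
    return all(r == "*" or r == p for r, p in zip(rule_seg, pkt_seg))


def classify(rules, packet, seg_width=4):
    """Return (matching rule index or None, number of segments compared).

    rules:  equal-length ternary strings, ordered by priority
            (index 0 wins ties; this ordering is an assumption).
    packet: bit string of the same length as each rule.
    """
    candidates = list(range(len(rules)))
    stages = 0
    for start in range(0, len(packet), seg_width):
        # Stop reading further segments once at most one rule survives.
        if len(candidates) <= 1:
            break
        end = start + seg_width
        pkt_seg = packet[start:end]
        # Compare this one segment of every surviving rule with the packet.
        candidates = [i for i in candidates
                      if segment_matches(rules[i][start:end], pkt_seg)]
        stages += 1
    if not candidates:
        return None, stages
    best = candidates[0]
    # Assumed final step: check the whole packet against the single
    # surviving rule, since only some of its bits were compared so far.
    if not segment_matches(rules[best], packet):
        return None, stages
    return best, stages


rules = ["1010****", "1011****", "0000****", "01**1111"]
print(classify(rules, "10111100"))  # (1, 1)
print(classify(rules, "01000000"))  # (None, 1)
```

In this toy rule set, one 4-bit segment is enough to narrow four rules down to a single candidate, so the second segment is never read for the other rules. That is the intuition behind comparing only part of the memory per packet, at the cost of possibly needing several stages for one lookup.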