Estimating the Number of Active Flows in a Data Stream over a Sliding Window
Éric Fusy, F. Giroire
Workshop on Analytic Algorithmics and Combinatorics, 2007-01-06
DOI: 10.1137/1.9781611972979.9 (https://doi.org/10.1137/1.9781611972979.9)
Citations: 50
Abstract
A new algorithm is introduced to estimate the number of distinct flows (or connections) in a data stream. The algorithm maintains an accurate estimate of the number of distinct flows over a sliding window. It is simple to implement, parallelizes optimally, and offers a very good trade-off between auxiliary memory and accuracy of the estimate: a relative accuracy of order 1/√m requires memory of order m ln(n/m) words, where n is an upper bound on the number of flows seen over the sliding window. For instance, only 64 kB of memory is sufficient to maintain an estimate with accuracy of order 4 percent for a stream with several million flows. The algorithm has been validated both by simulations and by experiments on real traffic. It proves very efficient for monitoring traffic and detecting attacks.
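To make the memory/accuracy trade-off concrete, here is a minimal Python sketch of one way a sliding-window distinct counter of this kind can be built. It is an illustration under stated assumptions, not necessarily the authors' exact algorithm: the class name `SlidingDistinctCounter`, the BLAKE2b hash, and the estimator m(m-1)/(sum of bucket minima) are all choices made for this example. Each of the m buckets keeps only the hashed values that could still become the bucket minimum in some future window, a list whose expected length grows logarithmically with the number of flows per bucket, which is consistent with the m ln(n/m) memory figure quoted above.

```python
import hashlib
from collections import deque


class SlidingDistinctCounter:
    """Illustrative sliding-window distinct counter (hypothetical names)."""

    def __init__(self, m: int, window: float):
        self.m = m            # number of buckets; relative error is of order 1/sqrt(m)
        self.window = window  # window length, in the same unit as the timestamps
        # Per bucket: (timestamp, hash value) pairs, timestamps and hash
        # values both increasing from left (oldest) to right (newest).
        self.buckets = [deque() for _ in range(m)]

    def _hash(self, flow_id: str):
        # Assumption: 128 bits of BLAKE2b behave like uniform random bits.
        digest = hashlib.blake2b(flow_id.encode(), digest_size=16).digest()
        bucket = int.from_bytes(digest[:8], "big") % self.m
        value = int.from_bytes(digest[8:], "big") / 2**64  # roughly uniform on [0, 1]
        return bucket, value

    def _expire(self, q, now: float):
        # Drop entries that have left the window (now - window, now].
        while q and q[0][0] <= now - self.window:
            q.popleft()

    def add(self, flow_id: str, t: float):
        """Record a packet of flow `flow_id` at time `t` (non-decreasing)."""
        bucket, value = self._hash(flow_id)
        q = self.buckets[bucket]
        # Older entries with a hash >= value can never be the minimum again:
        # the new entry is both smaller (or equal) and expires later.
        while q and q[-1][1] >= value:
            q.pop()
        q.append((t, value))
        self._expire(q, t)

    def estimate(self, now: float) -> float:
        """Estimate the number of distinct flows seen in (now - window, now]."""
        total, empty = 0.0, 0
        for q in self.buckets:
            self._expire(q, now)
            if q:
                total += q[0][1]  # leftmost entry is the bucket's window minimum
            else:
                total += 1.0      # an empty bucket contributes the maximal value
                empty += 1
        if empty == self.m:
            return 0.0
        return self.m * (self.m - 1) / total
```

A typical use would be to call `add(flow_id, timestamp)` for every packet and `estimate(now)` whenever a reading is needed. Repeated packets of the same flow only refresh its timestamp, so they do not inflate the count. The estimator used here is only reasonable when the number of active flows is much larger than m; a production version would need a small-count correction and a more careful choice of estimator.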