Estimating the Number of Active Flows in a Data Stream over a Sliding Window
Authors: Éric Fusy, F. Giroire
Venue: Workshop on Analytic Algorithmics and Combinatorics
Published: 2007-01-06
DOI: https://doi.org/10.1137/1.9781611972979.9
A new algorithm is introduced to estimate the number of distinct flows (or connections) in a data stream. The algorithm maintains an accurate estimate of the number of distinct flows over a sliding window. It is simple to implement, parallelizes optimally, and has a very good trade-off between auxiliary memory and accuracy of the estimate: a relative accuracy of order 1/√m requires essentially a memory of order m ln(n/m) words, where n is an upper bound on the number of flows to be seen over the sliding window. For instance, a memory of only 64 kB is sufficient to maintain an estimate with accuracy of order 4 percent for a stream with several million flows. The algorithm has been validated both by simulations and by experiments on real traffic. It proves very efficient for monitoring traffic and detecting attacks.
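The abstract does not spell out the algorithm, so the following is only an illustrative sketch of one way a sliding-window distinct counter with this kind of memory profile could look, not the authors' implementation. It assumes flows are hashed into m buckets, that each bucket keeps a short list of "candidate minima" (hash values that could still become the bucket minimum once older entries expire), and that the count is recovered with a simple stand-in estimator, m(m-1) divided by the sum of the bucket minima. The class name, method names, the choice of BLAKE2b as hash, and the estimator are all assumptions made for illustration. The estimator is only meaningful when the number of active flows is much larger than m.

```python
import hashlib
from collections import deque


class SlidingDistinctCounter:
    """Hypothetical sketch: approximate distinct count over the last `window` time units."""

    def __init__(self, m, window):
        self.m = m                    # number of buckets (more buckets, better accuracy)
        self.window = window          # window length, in the same units as timestamps
        # Each bucket holds (timestamp, hash) pairs with increasing timestamps
        # and strictly increasing hash values.
        self.buckets = [deque() for _ in range(m)]

    def _hash(self, flow_id):
        # Assumption: BLAKE2b stands in for a uniform hash function.
        d = hashlib.blake2b(str(flow_id).encode(), digest_size=16).digest()
        bucket = int.from_bytes(d[:8], "big") % self.m
        u = (int.from_bytes(d[8:], "big") + 1) / 2**64   # value in (0, 1]
        return bucket, u

    def add(self, flow_id, t):
        """Record a packet of `flow_id` seen at time `t` (times must be non-decreasing)."""
        bucket, u = self._hash(flow_id)
        q = self.buckets[bucket]
        # Drop entries that have left the window.
        while q and q[0][0] <= t - self.window:
            q.popleft()
        # Older entries with a larger-or-equal hash can never be the minimum again.
        while q and q[-1][1] >= u:
            q.pop()
        q.append((t, u))

    def estimate(self, now):
        """Estimate the number of distinct flows seen in (now - window, now]."""
        total = 0.0
        for q in self.buckets:
            while q and q[0][0] <= now - self.window:
                q.popleft()
            # The front entry is the minimum hash in the window; empty buckets count as 1.
            total += q[0][1] if q else 1.0
        return self.m * (self.m - 1) / total


if __name__ == "__main__":
    counter = SlidingDistinctCounter(m=256, window=10_000)
    for i in range(20_000):
        counter.add(f"flow-{i}", t=i)
    # Only the flows seen during the last 10,000 time units are still in the window.
    print(round(counter.estimate(now=19_999)))
```

In this sketch, each bucket's list stays short because a new arrival evicts every older entry with a larger hash value, which is in line with the logarithmic factor in the memory bound quoted in the abstract. With m = 256 the relative error is expected to be of order 1/√256, roughly 6 percent.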