Evaluating MapReduce for profiling application traffic
T. Vieira, S. Fernandes, V. Garcia
HPPN '13, 2013-06-18
DOI: 10.1145/2465839.2465846 (https://doi.org/10.1145/2465839.2465846)
MapReduce is increasingly used for distributed data processing and has proved beneficial across a range of workloads. It can also be applied to distributed traffic analysis, although network traces have characteristics that differ from the data typically processed with MapReduce. Motivated by the use of MapReduce for profiling application traffic, by the lack of evaluations of MapReduce for network traffic analysis, and by the peculiarities of this kind of data, this paper evaluates the performance of MapReduce in packet-level analysis and deep packet inspection (DPI), analysing its scalability, speed-up, and the behavior of the MapReduce phases. The experiments provide evidence of the predominant phases in this kind of job and show the impact of input size, block size, and number of nodes on MapReduce completion time and scalability.