Chuanxiong Guo, Lihua Yuan, Dong Xiang, Yingnong Dang, Ray Huang, D. Maltz, Zhaoyi Liu, Vin Wang, Bin Pang, Hua Chen, Zhi Lin, Varugis Kurien
{"title":"Pingmesh: A Large-Scale System for Data Center Network Latency Measurement and Analysis","authors":"Chuanxiong Guo, Lihua Yuan, Dong Xiang, Yingnong Dang, Ray Huang, D. Maltz, Zhaoyi Liu, Vin Wang, Bin Pang, Hua Chen, Zhi Lin, Varugis Kurien","doi":"10.1145/2785956.2787496","DOIUrl":null,"url":null,"abstract":"Can we get network latency between any two servers at any time in large-scale data center networks? The collected latency data can then be used to address a series of challenges: telling if an application perceived latency issue is caused by the network or not, defining and tracking network service level agreement (SLA), and automatic network troubleshooting. We have developed the Pingmesh system for large-scale data center network latency measurement and analysis to answer the above question affirmatively. Pingmesh has been running in Microsoft data centers for more than four years, and it collects tens of terabytes of latency data per day. Pingmesh is widely used by not only network software developers and engineers, but also application and service developers and operators.","PeriodicalId":268472,"journal":{"name":"Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication","volume":"71 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"387","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2785956.2787496","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 387
Abstract
Can we get network latency between any two servers at any time in large-scale data center networks? The collected latency data can then be used to address a series of challenges: telling if an application perceived latency issue is caused by the network or not, defining and tracking network service level agreement (SLA), and automatic network troubleshooting. We have developed the Pingmesh system for large-scale data center network latency measurement and analysis to answer the above question affirmatively. Pingmesh has been running in Microsoft data centers for more than four years, and it collects tens of terabytes of latency data per day. Pingmesh is widely used by not only network software developers and engineers, but also application and service developers and operators.