Nahida: In-Band Distributed Tracing with eBPF
Wanqi Yang, Pengfei Chen, Kai Liu, Huxing Zhang
arXiv 2311.09032 (arXiv - CS - Operating Systems), published 2023-11-15

Microservices are commonly used in modern cloud-native applications to
achieve agility. However, the complexity of service dependencies in large-scale
microservices systems can lead to anomaly propagation, making fault
troubleshooting a challenge. To address this issue, distributed tracing systems
have been proposed to trace complete request execution paths, enabling
developers to troubleshoot anomalous services. However, existing distributed
tracing systems have limitations such as invasive instrumentation, trace loss,
or inaccurate trace correlation. To overcome these limitations, we propose a
new tracing system based on eBPF (extended Berkeley Packet Filter), named
Nahida, that can track complete requests in the kernel without intrusion,
regardless of programming language or implementation. Our evaluation results
show that Nahida can track over 92% of requests with stable accuracy, even
under high concurrency of user requests, while the state-of-the-art
non-invasive approaches cannot track any of the requests. Importantly, Nahida
can track requests served by a multi-threaded application that none of the
existing invasive tracing systems can handle by instrumenting tracing code
into libraries. Moreover, the overhead introduced by Nahida is negligible,
increasing service latency by only 1.55%-2.1%. Overall, Nahida provides an
effective and non-invasive solution for distributed tracing.