Nahida: In-Band Distributed Tracing with eBPF

Wanqi Yang, Pengfei Chen, Kai Liu, Huxing Zhang
arXiv - CS - Operating Systems, published 2023-11-15
DOI: arxiv-2311.09032 (https://doi.org/arxiv-2311.09032)

Abstract

Microservices are commonly used in modern cloud-native applications to achieve agility. However, the complexity of service dependencies in large-scale microservice systems can lead to anomaly propagation, which makes fault troubleshooting a challenge. To address this issue, distributed tracing systems have been proposed to trace complete request execution paths, enabling developers to troubleshoot anomalous services. Existing distributed tracing systems nevertheless have limitations such as invasive instrumentation, trace loss, or inaccurate trace correlation. To overcome these limitations, we propose Nahida, a new tracing system based on eBPF (extended Berkeley Packet Filter) that can track complete requests in the kernel without intrusion, regardless of programming language or implementation. Our evaluation results show that Nahida can track over 92% of requests with stable accuracy, even under high concurrency of user requests, while the state-of-the-art non-invasive approaches cannot track any of the requests. Importantly, Nahida can track requests served by multi-threaded applications, which none of the existing invasive tracing systems can handle by instrumenting tracing code into libraries. Moreover, the overhead introduced by Nahida is negligible, increasing service latency by only 1.55%-2.1%. Overall, Nahida provides an effective and non-invasive solution for distributed tracing.
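
The abstract does not spell out how Nahida carries trace context, so the following is only a userspace sketch of the general idea behind in-band tracing: the trace identifier travels inside the request itself, so a downstream service can correlate its work with the upstream caller. It assumes HTTP/1.1 messages and the W3C `traceparent` header layout; the helper names (`extract_context`, `inject_context`, `handle`) are hypothetical and are not taken from the paper, and nothing here runs in the kernel or uses eBPF.

```python
import secrets


def new_trace_id() -> str:
    """Random 16-byte trace id, rendered as 32 hex characters."""
    return secrets.token_hex(16)


def new_span_id() -> str:
    """Random 8-byte span id, rendered as 16 hex characters."""
    return secrets.token_hex(8)


def extract_context(request: bytes):
    """Return (trace_id, parent_span_id) from a traceparent header, or None."""
    head, _, _ = request.partition(b"\r\n\r\n")
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"traceparent":
            parts = value.strip().decode("ascii").split("-")
            if len(parts) == 4 and parts[0] == "00":
                return parts[1], parts[2]
    return None


def inject_context(request: bytes, trace_id: str, span_id: str) -> bytes:
    """Insert a traceparent header directly after the request line.

    Assumes a well-formed request whose first line ends with CRLF.
    """
    request_line, sep, rest = request.partition(b"\r\n")
    header = f"traceparent: 00-{trace_id}-{span_id}-01".encode("ascii")
    return request_line + sep + header + b"\r\n" + rest


def handle(incoming: bytes, outbound: bytes):
    """Simulate one service: reuse the caller's trace id, or start a trace."""
    ctx = extract_context(incoming)
    trace_id = ctx[0] if ctx else new_trace_id()
    tagged = inject_context(outbound, trace_id, new_span_id())
    return trace_id, tagged


if __name__ == "__main__":
    user_request = b"GET / HTTP/1.1\r\nHost: frontend\r\n\r\n"
    to_cart = b"GET /cart HTTP/1.1\r\nHost: cart\r\n\r\n"
    to_db = b"GET /items HTTP/1.1\r\nHost: db\r\n\r\n"

    # The frontend starts a trace; the cart service inherits it in-band.
    trace_a, cart_request = handle(user_request, to_cart)
    trace_b, db_request = handle(cart_request, to_db)
    print(trace_a == trace_b)
    print(db_request.decode("ascii"))
```

In this sketch every hop reuses the trace id it finds in the incoming message, which is what lets the spans be joined into one request path without a separate correlation step. A non-invasive system would have to perform the equivalent injection and extraction outside the application code.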