KEN: Kernel Extensions using Natural Language

Yusheng Zheng, Yiwei Yang, Maolin Chen, Andrew Quinn
Published in arXiv - CS - Operating Systems, 2023-12-09. Identifier: arxiv-2312.05531

Abstract

The ability to modify and extend an operating system is an important feature for improving a system's security, reliability, and performance. The extended Berkeley Packet Filter (eBPF) ecosystem has emerged as the standard mechanism for extending the Linux kernel and has recently been ported to Windows. eBPF programs inject new logic into the kernel, which the system executes before or after existing logic. While the eBPF ecosystem provides a flexible mechanism for kernel extension, writing eBPF programs remains difficult today. An eBPF developer must have deep knowledge of operating system internals to determine where to place logic, and must cope with the limitations that the eBPF verifier enforces on a program's control flow and data accesses. This paper presents KEN, an alternative framework that alleviates the difficulty of writing an eBPF program by allowing Kernel Extensions to be written in Natural language. KEN uses recent advances in large language models (LLMs) to synthesize an eBPF program from a user's English-language prompt. To ensure that the LLM's output is semantically equivalent to the user's prompt, KEN employs a combination of LLM-empowered program comprehension, symbolic execution, and a series of feedback loops. KEN's key novelty is the combination of these techniques. In particular, the system uses symbolic execution in a novel structure that lets it combine the results of program synthesis and program comprehension, building on the recent success that LLMs have shown on each of these tasks individually. To evaluate KEN, we developed a new corpus of natural language prompts for eBPF programs. We show that KEN produces correct eBPF programs for 80% of the prompts in this corpus, a 2.67x improvement over an LLM-empowered program synthesis baseline.
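The feedback-loop idea in the abstract can be illustrated with a minimal sketch. This is not KEN's implementation: the names `synthesize_with_feedback`, `toy_synthesize`, and `toy_check` are hypothetical, the synthesizer stands in for an LLM call, and the checker stands in for whatever validation stages (verifier, symbolic execution) a real system would run. The only point shown is that a rejected candidate's diagnostic is fed back into the next synthesis attempt.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces for illustration only; not KEN's real API.
Synthesizer = Callable[[str, Optional[str]], str]  # (prompt, feedback) -> candidate program text
Checker = Callable[[str], Tuple[bool, str]]        # candidate -> (accepted, diagnostic)


def synthesize_with_feedback(
    prompt: str,
    synthesize: Synthesizer,
    check: Checker,
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Retry synthesis, passing each rejection's diagnostic to the next attempt.

    Returns (program, attempts_used); program is None if no candidate passed.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = synthesize(prompt, feedback)
        accepted, diagnostic = check(candidate)
        if accepted:
            return candidate, attempt
        feedback = diagnostic  # the next attempt sees why this one failed
    return None, max_attempts


def toy_synthesize(prompt: str, feedback: Optional[str]) -> str:
    """Stand-in for an LLM: emits a bounded loop only after receiving feedback."""
    if feedback is None:
        return "while (1) { count++; }"
    return "for (i = 0; i < 16; i++) { count++; }"


def toy_check(program: str) -> Tuple[bool, str]:
    """Stand-in for a checker: rejects an obviously unbounded loop."""
    if "while (1)" in program:
        return False, "unbounded loop rejected"
    return True, "ok"


program, attempts = synthesize_with_feedback("count events", toy_synthesize, toy_check)
print(program, attempts)
```

In this toy run the first candidate is rejected and the second is accepted. A real pipeline would replace both stand-ins and would likely need richer diagnostics than a single string.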