ECG: Augmenting Embedded Operating System Fuzzing via LLM-Based Corpus Generation

IF 2.7 | Zone 3 (Computer Science) | Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | Pub Date: 2024-11-06 | DOI: 10.1109/TCAD.2024.3447220
Qiang Zhang;Yuheng Shen;Jianzhong Liu;Yiru Xu;Heyuan Shi;Yu Jiang;Wanli Chang
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 43, no. 11, pp. 4238-4249, 2024 (Journal Article). https://ieeexplore.ieee.org/document/10745813/
Citations: 0

Abstract

Embedded operating systems (Embedded OSs) power much of our critical infrastructure but are, in general, much less tested for bugs than general-purpose operating systems. Fuzzing Embedded OSs encounters significant roadblocks due to far less thoroughly documented specifications and an inherent ineffectiveness in generating high-quality payloads. In this article, we propose ECG, an Embedded OS fuzzer empowered by large language models (LLMs) to mitigate these issues. ECG approaches Embedded OS fuzzing by automatically generating input specifications based on readily available source code and documentation, instrumenting and intercepting execution behavior for directional guidance information, and generating inputs with payloads according to the pregenerated input specifications and directional hints provided from previous runs. These methods rely on interactive refinement to extract the most from LLMs, with established parsing checkers validating the outputs. Our evaluation results demonstrate that ECG uncovered 32 new vulnerabilities across three popular open-source Embedded OSs (RT-Linux, RaspiOS, and OpenWrt) and detected ten bugs in a commercial Embedded OS running on an actual device. Moreover, compared to Syzkaller, Moonshine, KernelGPT, Rtkaller, and DRLF, ECG has achieved additional kernel code coverage improvements of 23.20%, 19.46%, 10.96%, 15.47%, and 11.05%, respectively, with an overall average improvement of 16.02%. These results underscore ECG’s enhanced capability in uncovering vulnerabilities, thus contributing to the overall robustness and security of the Embedded OS.
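
The abstract's idea of interactive refinement backed by a parsing checker can be illustrated with a small loop: ask a model for an input specification, run a checker over it, and feed any errors back into the next prompt. The following is only a minimal sketch under stated assumptions, not ECG's implementation. The toy specification format (one `name(type, ...)` entry per line), the type set, and the names `check_spec`, `refine`, and `fake_llm` are all hypothetical, and the model is replaced by a stub so the example runs offline.

```python
from typing import Callable, List, Optional

# Hypothetical toy format: one "name(type, type, ...)" entry per line.
# This is NOT the specification language used by ECG or Syzkaller.
ALLOWED_TYPES = {"int", "ptr", "fd", "buf"}


def check_spec(spec: str) -> List[str]:
    """Parse a toy specification; return error messages (empty list = valid)."""
    if not spec.strip():
        return ["empty specification"]
    errors: List[str] = []
    for lineno, raw in enumerate(spec.strip().splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        if "(" not in line or not line.endswith(")"):
            errors.append(f"line {lineno}: expected name(args)")
            continue
        name, args = line[:-1].split("(", 1)
        if not name.isidentifier():
            errors.append(f"line {lineno}: invalid name '{name}'")
        for arg in (a.strip() for a in args.split(",")):
            if arg and arg not in ALLOWED_TYPES:
                errors.append(f"line {lineno}: unknown type '{arg}'")
    return errors


def refine(generate: Callable[[str], str], prompt: str,
           max_rounds: int = 3) -> Optional[str]:
    """Query `generate` until the checker accepts its output or rounds run out."""
    for _ in range(max_rounds):
        spec = generate(prompt)
        errors = check_spec(spec)
        if not errors:
            return spec
        # Feed the checker's diagnostics back into the next prompt.
        prompt = prompt + "\nFix these errors:\n" + "\n".join(errors)
    return None


def fake_llm(prompt: str) -> str:
    """Stand-in for a model: wrong on the first try, corrected after feedback."""
    if "Fix these errors" in prompt:
        return "open(ptr, int)\nread(fd, buf, int)"
    return "open(ptr, int)\nread(fd, string, int)"


if __name__ == "__main__":
    print(refine(fake_llm, "Describe the open and read calls"))
```

Keeping the checker separate from the generator means the loop accepts only output that parses, whatever the model returns; a generator that never produces a valid specification simply yields `None` after `max_rounds` attempts.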
Source journal
CiteScore: 5.60
Self-citation rate: 13.80%
Publication volume: 500
Review time: 7 months
Journal description: The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.