{"title":"ECG:通过基于 LLM 的语料库生成增强嵌入式操作系统模糊测试","authors":"Qiang Zhang;Yuheng Shen;Jianzhong Liu;Yiru Xu;Heyuan Shi;Yu Jiang;Wanli Chang","doi":"10.1109/TCAD.2024.3447220","DOIUrl":null,"url":null,"abstract":"Embedded operating systems (Embedded OSs) power much of our critical infrastructure but are, in general, much less tested for bugs than general-purpose operating systems. Fuzzing Embedded OSs encounter significant roadblocks due to much less documented specifications, an inherent ineffectiveness in generating high-quality payloads. In this article, we propose ECG, an Embedded OS fuzzer empowered by large language models (LLMs) to sufficiently mitigate the aforementioned issues. ECG approaches fuzzing Embedded OS by automatically generating input specifications based on readily available source code and documentation, instrumenting and intercepting execution behavior for directional guidance information, and generating inputs with payloads according to the pregenerated input specifications and directional hints provided from previous runs. These methods are empowered by using an interactive refinement method to extract the most from LLMs while using established parsing checkers to validate the outputs. Our evaluation results demonstrate that ECG uncovered 32 new vulnerabilities across three popular open-source Embedded OS (RT-Linux, RaspiOS, and OpenWrt) and detected ten bugs in a commercial Embedded OS running on an actual device. Moreover, compared to Syzkaller, Moonshine, KernelGPT, Rtkaller, and DRLF, ECG has achieved additional kernel code coverage improvements of 23.20%, 19.46%, 10.96%, 15.47%, and 11.05%, respectively, with an overall average improvement of 16.02%. These results underscore ECG’s enhanced capability in uncovering vulnerabilities, thus contributing to the overall robustness and security of the Embedded OS.","PeriodicalId":13251,"journal":{"name":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","volume":"43 11","pages":"4238-4249"},"PeriodicalIF":2.7000,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"ECG: Augmenting Embedded Operating System Fuzzing via LLM-Based Corpus Generation\",\"authors\":\"Qiang Zhang;Yuheng Shen;Jianzhong Liu;Yiru Xu;Heyuan Shi;Yu Jiang;Wanli Chang\",\"doi\":\"10.1109/TCAD.2024.3447220\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Embedded operating systems (Embedded OSs) power much of our critical infrastructure but are, in general, much less tested for bugs than general-purpose operating systems. Fuzzing Embedded OSs encounter significant roadblocks due to much less documented specifications, an inherent ineffectiveness in generating high-quality payloads. In this article, we propose ECG, an Embedded OS fuzzer empowered by large language models (LLMs) to sufficiently mitigate the aforementioned issues. ECG approaches fuzzing Embedded OS by automatically generating input specifications based on readily available source code and documentation, instrumenting and intercepting execution behavior for directional guidance information, and generating inputs with payloads according to the pregenerated input specifications and directional hints provided from previous runs. These methods are empowered by using an interactive refinement method to extract the most from LLMs while using established parsing checkers to validate the outputs. Our evaluation results demonstrate that ECG uncovered 32 new vulnerabilities across three popular open-source Embedded OS (RT-Linux, RaspiOS, and OpenWrt) and detected ten bugs in a commercial Embedded OS running on an actual device. Moreover, compared to Syzkaller, Moonshine, KernelGPT, Rtkaller, and DRLF, ECG has achieved additional kernel code coverage improvements of 23.20%, 19.46%, 10.96%, 15.47%, and 11.05%, respectively, with an overall average improvement of 16.02%. These results underscore ECG’s enhanced capability in uncovering vulnerabilities, thus contributing to the overall robustness and security of the Embedded OS.\",\"PeriodicalId\":13251,\"journal\":{\"name\":\"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems\",\"volume\":\"43 11\",\"pages\":\"4238-4249\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2024-11-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10745813/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10745813/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
ECG: Augmenting Embedded Operating System Fuzzing via LLM-Based Corpus Generation
Embedded operating systems (Embedded OSs) power much of our critical infrastructure but are, in general, much less tested for bugs than general-purpose operating systems. Fuzzing Embedded OSs encounter significant roadblocks due to much less documented specifications, an inherent ineffectiveness in generating high-quality payloads. In this article, we propose ECG, an Embedded OS fuzzer empowered by large language models (LLMs) to sufficiently mitigate the aforementioned issues. ECG approaches fuzzing Embedded OS by automatically generating input specifications based on readily available source code and documentation, instrumenting and intercepting execution behavior for directional guidance information, and generating inputs with payloads according to the pregenerated input specifications and directional hints provided from previous runs. These methods are empowered by using an interactive refinement method to extract the most from LLMs while using established parsing checkers to validate the outputs. Our evaluation results demonstrate that ECG uncovered 32 new vulnerabilities across three popular open-source Embedded OS (RT-Linux, RaspiOS, and OpenWrt) and detected ten bugs in a commercial Embedded OS running on an actual device. Moreover, compared to Syzkaller, Moonshine, KernelGPT, Rtkaller, and DRLF, ECG has achieved additional kernel code coverage improvements of 23.20%, 19.46%, 10.96%, 15.47%, and 11.05%, respectively, with an overall average improvement of 16.02%. These results underscore ECG’s enhanced capability in uncovering vulnerabilities, thus contributing to the overall robustness and security of the Embedded OS.
期刊介绍:
The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.