Monirul I. Sharif, A. Lanzi, Jonathon T. Giffin, Wenke Lee
{"title":"恶意软件仿真器的自动逆向工程","authors":"Monirul I. Sharif, A. Lanzi, Jonathon T. Giffin, Wenke Lee","doi":"10.1109/SP.2009.27","DOIUrl":null,"url":null,"abstract":"Malware authors have recently begun using emulation technology to obfuscate their code. They convert native malware binaries into bytecode programs written in a randomly generated instruction set and paired with a native binary emulator that interprets the bytecode. No existing malware analysis can reliably reverse this obfuscation technique. In this paper, we present the first work in automatic reverse engineering of malware emulators. Our algorithms are based on dynamic analysis. We execute the emulated malware in a protected environment and record the entire x86 instruction trace generated by the emulator. We then use dynamic data-flow and taint analysis over the trace to identify data buffers containing the bytecode program and extract the syntactic and semantic information about the bytecode instruction set. With these analysis outputs, we are able to generate data structures, such as control-flow graphs, that provide the foundation for subsequent malware analysis. We implemented a proof-of-concept system called Rotalume and evaluated it using both legitimate programs and malware emulated by VMProtect and Code Virtualizer. The results show that Rotalume accurately reveals the syntax and semantics of emulated instruction sets and reconstructs execution paths of original programs from their bytecode representations.","PeriodicalId":161757,"journal":{"name":"2009 30th IEEE Symposium on Security and Privacy","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"210","resultStr":"{\"title\":\"Automatic Reverse Engineering of Malware Emulators\",\"authors\":\"Monirul I. Sharif, A. Lanzi, Jonathon T. Giffin, Wenke Lee\",\"doi\":\"10.1109/SP.2009.27\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Malware authors have recently begun using emulation technology to obfuscate their code. They convert native malware binaries into bytecode programs written in a randomly generated instruction set and paired with a native binary emulator that interprets the bytecode. No existing malware analysis can reliably reverse this obfuscation technique. In this paper, we present the first work in automatic reverse engineering of malware emulators. Our algorithms are based on dynamic analysis. We execute the emulated malware in a protected environment and record the entire x86 instruction trace generated by the emulator. We then use dynamic data-flow and taint analysis over the trace to identify data buffers containing the bytecode program and extract the syntactic and semantic information about the bytecode instruction set. With these analysis outputs, we are able to generate data structures, such as control-flow graphs, that provide the foundation for subsequent malware analysis. We implemented a proof-of-concept system called Rotalume and evaluated it using both legitimate programs and malware emulated by VMProtect and Code Virtualizer. The results show that Rotalume accurately reveals the syntax and semantics of emulated instruction sets and reconstructs execution paths of original programs from their bytecode representations.\",\"PeriodicalId\":161757,\"journal\":{\"name\":\"2009 30th IEEE Symposium on Security and Privacy\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-05-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"210\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 30th IEEE Symposium on Security and Privacy\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SP.2009.27\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 30th IEEE Symposium on Security and Privacy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SP.2009.27","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Automatic Reverse Engineering of Malware Emulators
Malware authors have recently begun using emulation technology to obfuscate their code. They convert native malware binaries into bytecode programs written in a randomly generated instruction set and paired with a native binary emulator that interprets the bytecode. No existing malware analysis can reliably reverse this obfuscation technique. In this paper, we present the first work in automatic reverse engineering of malware emulators. Our algorithms are based on dynamic analysis. We execute the emulated malware in a protected environment and record the entire x86 instruction trace generated by the emulator. We then use dynamic data-flow and taint analysis over the trace to identify data buffers containing the bytecode program and extract the syntactic and semantic information about the bytecode instruction set. With these analysis outputs, we are able to generate data structures, such as control-flow graphs, that provide the foundation for subsequent malware analysis. We implemented a proof-of-concept system called Rotalume and evaluated it using both legitimate programs and malware emulated by VMProtect and Code Virtualizer. The results show that Rotalume accurately reveals the syntax and semantics of emulated instruction sets and reconstructs execution paths of original programs from their bytecode representations.