ORIGEN: Automatic Extraction of Offset-Revealing Instructions for Cross-Version Memory Analysis
Authors: Qian Feng, Aravind Prakash, Minghua Wang, Curtis Carmony, Heng Yin
Venue: Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security
Published: 2016-05-30
DOI: 10.1145/2897845.2897850 (https://doi.org/10.1145/2897845.2897850)
Citation count: 14
Abstract
The semantic gap is a prominent problem in raw memory analysis, especially in Virtual Machine Introspection (VMI) and memory forensics. For COTS software, common memory forensics and VMI tools rely on so-called "data structure profiles": mappings between semantic variables and their relative offsets within a structure in the binary. Constructing such profiles requires expert knowledge of the internal workings of a specific software version. Most of the time, it also requires considerable manual effort, which makes the process cumbersome. In this paper, we propose a notion named "cross-version memory analysis", wherein our goal is to ease profile construction for new versions of a piece of software by transferring knowledge from a model that has already been trained on an old version. To this end, we first identify Offset Revealing Instructions (ORIs) in a given piece of software and then leverage code search techniques to label ORIs in an unknown version of the same software. With the labeled ORIs, we can localize the profile for the new version. We provide a proof-of-concept implementation called ORIGEN. The efficacy and efficiency of ORIGEN have been empirically verified on a number of software packages. The experimental results show that by conducting the ORI search within Windows XP SP0 and Linux 3.5.0, we can successfully recover the data structure profiles for Windows XP SP2, Vista, and Win 7, and for Linux 2.6.32, 3.8.0, and 3.13.0, respectively. A systematic evaluation on 40 versions of OpenSSH demonstrates that ORIGEN can achieve a precision of more than 90%. As a case study, we integrate ORIGEN into a VMI tool to automatically extract the semantic information required for VMI. We also develop two plugins for the Volatility memory forensic framework, one for OpenSSH session key extraction and the other for encrypted filesystem key extraction. Both achieve cross-version analysis through ORIGEN.
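To make the idea of transferring a profile across versions more concrete, the following is a minimal sketch and not ORIGEN's actual implementation. It assumes a toy instruction record (`Instr`), a crude similarity measure over neighbouring mnemonics as a stand-in for real binary code search, and a hypothetical narrowing step by function name. All function names, contexts, and offsets below are invented for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Instr:
    """Toy instruction record (hypothetical, not ORIGEN's representation)."""
    function: str   # enclosing function name (assumed to be known)
    context: tuple  # mnemonics of neighbouring instructions
    offset: int     # displacement used in the memory operand


def similarity(a: Instr, b: Instr) -> float:
    """Crude stand-in for code search: compare mnemonic contexts."""
    return SequenceMatcher(None, a.context, b.context).ratio()


def transfer_profile(old_oris, new_instrs, threshold=0.6):
    """Map each labeled field to the offset of its best match in the new version."""
    profile = {}
    for field, ori in old_oris.items():
        # Assumption: narrow the search to the same-named function.
        candidates = [i for i in new_instrs if i.function == ori.function]
        if not candidates:
            continue  # no counterpart found; leave the field unresolved
        best = max(candidates, key=lambda i: similarity(ori, i))
        if similarity(ori, best) >= threshold:
            profile[field] = best.offset
    return profile


# Labeled ORIs from the old version (all values invented).
old_oris = {
    "task_struct.pid": Instr("get_pid", ("mov", "mov", "test", "jz"), 0x1F4),
    "task_struct.comm": Instr("get_comm", ("lea", "push", "call"), 0x2D0),
}

# Unlabeled instructions from the new version (all values invented).
new_instrs = [
    Instr("get_pid", ("mov", "mov", "test", "jz"), 0x204),
    Instr("get_pid", ("push", "ret"), 0x10),
    Instr("get_comm", ("lea", "push", "push", "call"), 0x2E8),
]

new_profile = transfer_profile(old_oris, new_instrs)
for field, offset in new_profile.items():
    print(f"{field}: {offset:#x}")
```

The threshold trades coverage for precision in this toy setting: raising it leaves more fields unresolved rather than accepting a weak match, which is the safer failure mode when a wrong offset would silently corrupt later memory analysis.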