Modular Decompilation of Low-Level Code by Partial Evaluation

M. Gómez-Zamalloa, E. Albert, G. Puebla
{"title":"Modular Decompilation of Low-Level Code by Partial Evaluation","authors":"M. Gómez-Zamalloa, E. Albert, G. Puebla","doi":"10.1109/SCAM.2008.35","DOIUrl":null,"url":null,"abstract":"Decompiling low-level code to a high-level intermediate representation facilitates the development of analyzers, model checkers, etc. which reason about properties of the low-level code (e.g., bytecode, .NET). Interpretive decompilation consists in partially evaluating an interpreter for the low-level language (written in the high-level language) w.r.t. the code to be decompiled. There have been proofs-of-concept that interpretive decompilation is feasible, butt here remain important open issues when it comes to decompile a real language: does the approach scale up? is the quality of decompiled programs comparable to that obtained by ad-hoc decompilers? do decompiled programs preserve the structure of the original programs? This paper addresses these issues by presenting, to the best of our knowledge, the first modular scheme to enable interpretive decompilation of low-level code to a high-level representation, namely, we decompile bytecode into PROLOG. We introduce two notions of optimality. The first one requires that each method/block is decompiled just once. The second one requires that each program point is traversed at most once during decompilation. We demonstrate the impact of our modular approach and optimality issues on a series of realistic benchmarks. Decompilation times and decompiled program sizes are linear with the size of the input bytecode program. This demostrates empirically the scalability of modular decompilation of low-level code by partial evaluation.","PeriodicalId":433693,"journal":{"name":"2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation","volume":"76 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 Eighth IEEE International Working Conference on Source Code Analysis and Manipulation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SCAM.2008.35","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

Abstract

Decompiling low-level code to a high-level intermediate representation facilitates the development of analyzers, model checkers, etc. which reason about properties of the low-level code (e.g., bytecode, .NET). Interpretive decompilation consists in partially evaluating an interpreter for the low-level language (written in the high-level language) w.r.t. the code to be decompiled. There have been proofs-of-concept that interpretive decompilation is feasible, butt here remain important open issues when it comes to decompile a real language: does the approach scale up? is the quality of decompiled programs comparable to that obtained by ad-hoc decompilers? do decompiled programs preserve the structure of the original programs? This paper addresses these issues by presenting, to the best of our knowledge, the first modular scheme to enable interpretive decompilation of low-level code to a high-level representation, namely, we decompile bytecode into PROLOG. We introduce two notions of optimality. The first one requires that each method/block is decompiled just once. The second one requires that each program point is traversed at most once during decompilation. We demonstrate the impact of our modular approach and optimality issues on a series of realistic benchmarks. Decompilation times and decompiled program sizes are linear with the size of the input bytecode program. This demostrates empirically the scalability of modular decompilation of low-level code by partial evaluation.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
部分求值的低级代码模块化反编译
将低级代码反编译为高级中间表示形式,有助于分析程序、模型检查程序等的开发,这些程序可以推断低级代码的属性(例如,字节码、。net)。解释性反编译包括部分地计算解释器对低级语言(用高级语言编写)而不是要反编译的代码的解释器。已经有概念证明解释式反编译是可行的,但是当涉及到反编译真正的语言时,这里仍然存在重要的开放问题:这种方法是否可以扩展?反编译程序的质量是否可与特设反编译器获得的质量相媲美?反编译程序是否保留原始程序的结构?本文通过提出这些问题,据我们所知,第一个模块化方案能够将低级代码的解释性反编译为高级表示,也就是说,我们将字节码反编译为PROLOG。我们引入两个最优性的概念。第一个要求每个方法/块只反编译一次。第二种方法要求每个程序点在反编译期间最多遍历一次。我们在一系列现实的基准测试中展示了我们的模块化方法和最优性问题的影响。反编译时间和反编译程序大小与输入字节码程序的大小成线性关系。这从经验上证明了通过部分求值对低级代码进行模块化反编译的可伸缩性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
CoordInspector: A Tool for Extracting Coordination Data from Legacy Code An Empirical Study of Function Overloading in C++ DTS - A Software Defects Testing System The Evolution and Decay of Statically Detected Source Code Vulnerabilities TBCppA: A Tracer Approach for Automatic Accurate Analysis of C Preprocessor's Behaviors
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1