[Research Paper] Combining Obfuscation and Optimizations in the Real World

2018 IEEE 18th International Working Conference on Source Code Analysis and Manipulation (SCAM). Pub Date: 2018-09-01. DOI: 10.1109/SCAM.2018.00010

S. Guelton, A. Guinet, Pierrick Brunet, J. Caamaño, F. Dagnat, Nicolas Szlifierski


Citations: 3

Abstract

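Code obfuscation is the de facto standard for protecting intellectual property when delivering code in an unmanaged environment. It relies on additive layers of code tangling techniques, white-box encryption calls, and platform-specific or tool-specific countermeasures to make it harder for a reverse engineer to access critical pieces of data or to understand core algorithms. The literature provides many obfuscation techniques that can be applied at compile time to transform data or control flow and so protect against different reverse engineering scenarios. Scheduling code transformations to optimize a given metric is known as the pass scheduling problem; it is known to be NP-hard, but is solved in practice with hard-coded sequences that are generally satisfactory. Adding code obfuscation to the problem introduces two new dimensions. First, because a code obfuscator must balance obfuscation against performance, pass scheduling becomes a multi-criteria optimization problem. Second, obfuscation passes transform their inputs in unconventional ways, so some pass combinations may be undesirable or even invalid. This paper highlights several issues encountered when blindly chaining different kinds of obfuscation and optimization passes, emphasizing the need for a formal model to combine them. It proposes a non-intrusive formalism that leverages sequential pass management techniques. The model is validated on real-world scenarios gathered during the development of an industrial-strength obfuscator built on top of the LLVM compiler infrastructure.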
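To make the claim that some pass combinations are undesirable or invalid more concrete, here is a minimal sketch of one way such constraints could be checked. It is not the formalism proposed in the paper, which the abstract does not detail. The pass names (`simplify`, `encode-constants`, `flatten-cfg`), the property names, and the `requires`/`provides`/`breaks` model are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Pass:
    """A hypothetical compiler pass described by abstract IR properties."""
    name: str
    requires: frozenset = frozenset()  # properties needed before the pass runs
    provides: frozenset = frozenset()  # properties established by the pass
    breaks: frozenset = frozenset()    # properties invalidated by the pass


def final_properties(sequence, initial=frozenset()):
    """Return the property set after running `sequence`, or None if some
    pass runs at a point where its requirements are not met."""
    props = set(initial)
    for p in sequence:
        if not p.requires <= props:
            return None
        props -= p.breaks
        props |= p.provides
    return frozenset(props)


def valid_orderings(passes, goal):
    """Enumerate orderings that are runnable and end with every property in `goal`."""
    result = []
    for seq in permutations(passes):
        props = final_properties(seq)
        if props is not None and goal <= props:
            result.append([p.name for p in seq])
    return result


# Hypothetical passes, for illustration only (not taken from the paper or LLVM).
SIMPLIFY = Pass("simplify",
                provides=frozenset({"canonical"}),
                breaks=frozenset({"encoded"}))      # an optimizer may undo encoding
ENCODE = Pass("encode-constants",
              provides=frozenset({"encoded"}),
              breaks=frozenset({"canonical"}))      # obfuscation de-canonicalizes the IR
FLATTEN = Pass("flatten-cfg",
               requires=frozenset({"canonical"}),
               provides=frozenset({"flattened"}))

orderings = valid_orderings([SIMPLIFY, ENCODE, FLATTEN],
                            goal=frozenset({"encoded", "flattened"}))
print(orderings)
```

Under these assumed constraints only one of the six orderings survives: running the optimizer after the constant encoding would undo the protection, and flattening cannot start on non-canonical code. Brute-force enumeration is only workable for a handful of passes; it is used here to keep the sketch short, not as a suggestion for solving the NP-hard scheduling problem.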


