Analyzing and revivifying function signature inference using deep learning

IF 3.6 | Region 2, Computer Science | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Empirical Software Engineering | Pub Date: 2024-05-08 | DOI: 10.1007/s10664-024-10453-9

Yan Lin, Trisha Singhal, Debin Gao, David Lo

Citations: 0

Abstract

Function signatures play an important role in binary analysis and security enhancement, with typical applications in bug finding and control-flow integrity enforcement. However, recovering function signatures through static binary analysis is challenging because information crucial to such recovery is stripped during compilation. Although deep learning (DL) has been proposed for function signature recovery to address this challenge, its reported accuracy is low for binaries compiled with optimizations. In this paper, we first perform a systematic study to quantify how much compiler optimizations degrade the accuracy of existing DL techniques based on Recurrent Neural Networks (RNNs) for function signature recovery. Our experiments show that the accuracy of the state-of-the-art DL technique drops from 98.7% to 87.7% when it is trained and tested on optimized binaries. We then use saliency maps to investigate which types of instructions the existing RNN model deems most important when inferring function signatures. The results show that the existing RNN model mistakenly relies on non-argument-accessing instructions to infer the number of arguments, especially for optimized binaries. Finally, we identify specific weaknesses in these existing approaches and propose an enhanced DL approach named ReSIL that incorporates compiler-optimization-specific domain knowledge into the learning process. Our experimental results show that ReSIL significantly improves the accuracy and F1 score of function signature inference; for example, accuracy in inferring the number of arguments for callees compiled with optimization flag O1 rises from 84.83% to 92.68%. Moreover, ReSIL correctly treats argument-accessing instructions as the most important ones for inference. We also demonstrate the security implications of ReSIL in Control-Flow Integrity enforcement, where it helps stop potential Counterfeit Object-Oriented Programming (COOP) attacks.
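To illustrate why argument-count accuracy matters for Control-Flow Integrity, the following is a minimal sketch of one common arity-based policy: an indirect call site may only reach callees that consume no more arguments than the site prepares. The abstract does not specify the exact policy used with ReSIL, so this policy, the `Callee` and `CallSite` types, the function names, and all argument counts here are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Callee:
    name: str
    inferred_arg_count: int  # hypothetical output of a signature-inference model


@dataclass(frozen=True)
class CallSite:
    address: int
    prepared_arg_count: int  # arguments the caller sets up before the indirect call


def allowed_targets(site, callees):
    """Return the callees this site may reach under the assumed arity policy.

    Assumed policy: a callee is a valid target only if it consumes no more
    arguments than the call site prepares.
    """
    return [c for c in callees if c.inferred_arg_count <= site.prepared_arg_count]


# Hypothetical callees with made-up inferred argument counts.
callees = [
    Callee("log_message", 1),
    Callee("copy_buffer", 3),
    Callee("spawn_process", 5),
]

site = CallSite(address=0x401000, prepared_arg_count=3)
permitted = [c.name for c in allowed_targets(site, callees)]
print(permitted)  # ['log_message', 'copy_buffer']
```

Under this assumed policy, an undercounted callee enlarges the set of permitted targets (leaving more room for an attacker to redirect the call), while an overcounted callee can block legitimate calls. This is one way inference errors on optimized binaries could translate into weaker or overly strict enforcement.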

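Source journal

Empirical Software Engineering (Engineering and Technology - Computer Science: Software Engineering)

CiteScore: 8.50

Self-citation rate: 12.20%

Articles published: 169

Review time: >12 weeks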

Journal description: Empirical Software Engineering provides a forum for applied software engineering research with a strong empirical component, and a venue for publishing empirical results relevant to both researchers and practitioners. Empirical studies presented here usually involve the collection and analysis of data and experience that can be used to characterize, evaluate and reveal relationships between software development deliverables, practices, and technologies. Over time, it is expected that such empirical results will form a body of knowledge leading to widely accepted and well-formed theories. The journal also offers industrial experience reports detailing the application of software technologies (processes, methods, or tools) and their effectiveness in industrial settings. Empirical Software Engineering promotes the publication of industry-relevant research, to address the significant gap between research and practice.