SMOKE:数百万行代码的可扩展路径敏感内存泄漏检测

Gang Fan, Rongxin Wu, Qingkai Shi, Xiao Xiao, Jinguo Zhou, Charles Zhang
{"title":"SMOKE:数百万行代码的可扩展路径敏感内存泄漏检测","authors":"Gang Fan, Rongxin Wu, Qingkai Shi, Xiao Xiao, Jinguo Zhou, Charles Zhang","doi":"10.1109/ICSE.2019.00025","DOIUrl":null,"url":null,"abstract":"Detecting memory leak at industrial scale is still not well addressed, in spite of the tremendous effort from both industry and academia in the past decades. Existing work suffers from an unresolved paradox - a highly precise analysis limits its scalability and an imprecise one seriously hurts its precision or recall. In this work, we present SMOKE, a staged approach to resolve this paradox. In the ?rst stage, instead of using a uniform precise analysis for all paths, we use a scalable but imprecise analysis to compute a succinct set of candidate memory leak paths. In the second stage, we leverage a more precise analysis to verify the feasibility of those candidates. The ?rst stage is scalable, due to the design of a new sparse program representation, the use-?ow graph (UFG), that models the problem as a polynomial-time state analysis. The second stage analysis is both precise and ef?cient, due to the smaller number of candidates and the design of a dedicated constraint solver. Experimental results show that SMOKE can ?nish checking industrial-sized projects, up to 8MLoC, in forty minutes with an average false positive rate of 24.4%. Besides, SMOKE is signi?cantly faster than the state-of-the-art research techniques as well as the industrial tools, with the speedup ranging from 5.2X to 22.8X. In the twenty-nine mature and extensively checked benchmark projects, SMOKE has discovered thirty previously unknown memory leaks which were con?rmed by developers, and one even assigned a CVE ID.","PeriodicalId":6736,"journal":{"name":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","volume":"20 1","pages":"72-82"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"46","resultStr":"{\"title\":\"SMOKE: Scalable Path-Sensitive Memory Leak Detection for Millions of Lines of Code\",\"authors\":\"Gang Fan, Rongxin Wu, Qingkai Shi, Xiao Xiao, Jinguo Zhou, Charles Zhang\",\"doi\":\"10.1109/ICSE.2019.00025\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Detecting memory leak at industrial scale is still not well addressed, in spite of the tremendous effort from both industry and academia in the past decades. Existing work suffers from an unresolved paradox - a highly precise analysis limits its scalability and an imprecise one seriously hurts its precision or recall. In this work, we present SMOKE, a staged approach to resolve this paradox. In the ?rst stage, instead of using a uniform precise analysis for all paths, we use a scalable but imprecise analysis to compute a succinct set of candidate memory leak paths. In the second stage, we leverage a more precise analysis to verify the feasibility of those candidates. The ?rst stage is scalable, due to the design of a new sparse program representation, the use-?ow graph (UFG), that models the problem as a polynomial-time state analysis. The second stage analysis is both precise and ef?cient, due to the smaller number of candidates and the design of a dedicated constraint solver. Experimental results show that SMOKE can ?nish checking industrial-sized projects, up to 8MLoC, in forty minutes with an average false positive rate of 24.4%. Besides, SMOKE is signi?cantly faster than the state-of-the-art research techniques as well as the industrial tools, with the speedup ranging from 5.2X to 22.8X. In the twenty-nine mature and extensively checked benchmark projects, SMOKE has discovered thirty previously unknown memory leaks which were con?rmed by developers, and one even assigned a CVE ID.\",\"PeriodicalId\":6736,\"journal\":{\"name\":\"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)\",\"volume\":\"20 1\",\"pages\":\"72-82\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"46\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSE.2019.00025\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSE.2019.00025","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 46

摘要

尽管工业界和学术界在过去几十年做出了巨大的努力,但在工业规模上检测内存泄漏仍然没有得到很好的解决。现有的工作存在一个未解决的悖论——高度精确的分析限制了它的可扩展性,而不精确的分析严重损害了它的精度或召回率。在这项工作中,我们提出了SMOKE,一种分阶段的方法来解决这个悖论。在第一阶段,我们不是对所有路径使用统一的精确分析,而是使用可扩展但不精确的分析来计算一组简洁的候选内存泄漏路径。在第二阶段,我们利用更精确的分析来验证这些候选方案的可行性。第一阶段是可扩展的,由于设计了新的稀疏程序表示,使用-?ow图(UFG),它将问题建模为多项式时间状态分析。第二阶段的分析既精确又有效。由于候选者数量较少,并且设计了专用约束求解器,因此非常方便。实验结果表明,SMOKE可以在40分钟内完成高达8MLoC的工业规模项目的检查,平均误报率为24.4%。此外,烟雾是一种信号。比最先进的研究技术和工业工具快得多,加速速度从5.2倍到22.8倍不等。在29个成熟和广泛检查的基准项目中,SMOKE发现了30个以前未知的内存泄漏。其中一个甚至被分配了一个CVE ID。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
SMOKE: Scalable Path-Sensitive Memory Leak Detection for Millions of Lines of Code
Detecting memory leak at industrial scale is still not well addressed, in spite of the tremendous effort from both industry and academia in the past decades. Existing work suffers from an unresolved paradox - a highly precise analysis limits its scalability and an imprecise one seriously hurts its precision or recall. In this work, we present SMOKE, a staged approach to resolve this paradox. In the ?rst stage, instead of using a uniform precise analysis for all paths, we use a scalable but imprecise analysis to compute a succinct set of candidate memory leak paths. In the second stage, we leverage a more precise analysis to verify the feasibility of those candidates. The ?rst stage is scalable, due to the design of a new sparse program representation, the use-?ow graph (UFG), that models the problem as a polynomial-time state analysis. The second stage analysis is both precise and ef?cient, due to the smaller number of candidates and the design of a dedicated constraint solver. Experimental results show that SMOKE can ?nish checking industrial-sized projects, up to 8MLoC, in forty minutes with an average false positive rate of 24.4%. Besides, SMOKE is signi?cantly faster than the state-of-the-art research techniques as well as the industrial tools, with the speedup ranging from 5.2X to 22.8X. In the twenty-nine mature and extensively checked benchmark projects, SMOKE has discovered thirty previously unknown memory leaks which were con?rmed by developers, and one even assigned a CVE ID.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
VFix: Value-Flow-Guided Precise Program Repair for Null Pointer Dereferences Search-Based Energy Testing of Android Scalable Approaches for Test Suite Reduction A System Identification Based Oracle for Control-CPS Software Fault Localization Training Binary Classifiers as Data Structure Invariants
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1