rCanary：检测 Rust 中半自动内存管理边界的内存泄漏

IF 6.5 1区计算机科学 Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING IEEE Transactions on Software Engineering Pub Date : 2024-08-13 DOI:10.1109/TSE.2024.3443624

Mohan Cui;Hui Xu;Hongliang Tian;Yangfan Zhou

{"title":"rCanary：检测 Rust 中半自动内存管理边界的内存泄漏","authors":"Mohan Cui;Hui Xu;Hongliang Tian;Yangfan Zhou","doi":"10.1109/TSE.2024.3443624","DOIUrl":null,"url":null,"abstract":"Rust is an effective system programming language that guarantees memory safety via compile-time verifications. It employs a novel ownership-based resource management model to facilitate automated deallocation. This model is anticipated to eliminate memory leaks. However, we observed that user intervention drives it into semi-automated memory management and makes it error-prone to cause leaks. In contrast to violating memory-safety guarantees restricted by the \n<italic>unsafe\n keyword, the boundary of leaking memory is implicit, and the compiler would not emit any warnings for developers. In this paper, we present \n<sc>rCanary\n, a static, non-intrusive, and fully automated model checker to detect leaks across the semi-automated boundary. We design an encoder to abstract data with heap allocation and formalize a refined leak-free memory model based on boolean satisfiability. It can generate SMT-Lib2 format constraints for Rust MIR and is implemented as a Cargo component. We evaluate \n<sc>rCanary\n by using flawed package benchmarks collected from the pull requests of open-source Rust projects. The results indicate that it is possible to recall all these defects with acceptable false positives. We further apply our tool to more than 1,200 real-world crates from crates.io and GitHub, identifying 19 crates having memory leaks. Our analyzer is also efficient, that costs 8.4 seconds per package.","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":"50 9","pages":"2472-2484"},"PeriodicalIF":6.5000,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"rCanary: Detecting Memory Leaks Across Semi-Automated Memory Management Boundary in Rust\",\"authors\":\"Mohan Cui;Hui Xu;Hongliang Tian;Yangfan Zhou\",\"doi\":\"10.1109/TSE.2024.3443624\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Rust is an effective system programming language that guarantees memory safety via compile-time verifications. It employs a novel ownership-based resource management model to facilitate automated deallocation. This model is anticipated to eliminate memory leaks. However, we observed that user intervention drives it into semi-automated memory management and makes it error-prone to cause leaks. In contrast to violating memory-safety guarantees restricted by the \\n<italic>unsafe\\n keyword, the boundary of leaking memory is implicit, and the compiler would not emit any warnings for developers. In this paper, we present \\n<sc>rCanary\\n, a static, non-intrusive, and fully automated model checker to detect leaks across the semi-automated boundary. We design an encoder to abstract data with heap allocation and formalize a refined leak-free memory model based on boolean satisfiability. It can generate SMT-Lib2 format constraints for Rust MIR and is implemented as a Cargo component. We evaluate \\n<sc>rCanary\\n by using flawed package benchmarks collected from the pull requests of open-source Rust projects. The results indicate that it is possible to recall all these defects with acceptable false positives. We further apply our tool to more than 1,200 real-world crates from crates.io and GitHub, identifying 19 crates having memory leaks. Our analyzer is also efficient, that costs 8.4 seconds per package.\",\"PeriodicalId\":13324,\"journal\":{\"name\":\"IEEE Transactions on Software Engineering\",\"volume\":\"50 9\",\"pages\":\"2472-2484\"},\"PeriodicalIF\":6.5000,\"publicationDate\":\"2024-08-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Software Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10636096/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, SOFTWARE ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Software Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10636096/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}

引用次数: 0

摘要

Rust 是一种有效的系统编程语言，可通过编译时验证保证内存安全。它采用了一种新颖的基于所有权的资源管理模型，便于自动去分配。该模型有望消除内存泄露。然而，我们观察到，用户的干预使其进入半自动化内存管理，并使其容易出错，从而导致泄露。与违反由不安全关键字限制的内存安全保证相反，泄露内存的边界是隐含的，编译器不会向开发人员发出任何警告。在本文中，我们提出了 rCanary，一种静态、非侵入式、全自动的模型检查器，用于检测跨越半自动边界的泄漏。我们设计了一个编码器来抽象具有堆分配的数据，并基于布尔可满足性形式化了一个精炼的无泄漏内存模型。它可以为 Rust MIR 生成 SMT-Lib2 格式的约束，并作为 Cargo 组件实现。我们使用从开源 Rust 项目的拉取请求中收集的有缺陷的软件包基准对 rCanary 进行了评估。结果表明，它能以可接受的误报率召回所有这些缺陷。我们还将这一工具应用于来自 crates.io 和 GitHub 的 1200 多个实际板条箱，发现了 19 个存在内存泄露的板条箱。我们的分析器也很高效，每个软件包只需 8.4 秒。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

rCanary: Detecting Memory Leaks Across Semi-Automated Memory Management Boundary in Rust

Rust is an effective system programming language that guarantees memory safety via compile-time verifications. It employs a novel ownership-based resource management model to facilitate automated deallocation. This model is anticipated to eliminate memory leaks. However, we observed that user intervention drives it into semi-automated memory management and makes it error-prone to cause leaks. In contrast to violating memory-safety guarantees restricted by the unsafe keyword, the boundary of leaking memory is implicit, and the compiler would not emit any warnings for developers. In this paper, we present rCanary , a static, non-intrusive, and fully automated model checker to detect leaks across the semi-automated boundary. We design an encoder to abstract data with heap allocation and formalize a refined leak-free memory model based on boolean satisfiability. It can generate SMT-Lib2 format constraints for Rust MIR and is implemented as a Cargo component. We evaluate rCanary by using flawed package benchmarks collected from the pull requests of open-source Rust projects. The results indicate that it is possible to recall all these defects with acceptable false positives. We further apply our tool to more than 1,200 real-world crates from crates.io and GitHub, identifying 19 crates having memory leaks. Our analyzer is also efficient, that costs 8.4 seconds per package.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Transactions on Software Engineering 工程技术-工程：电子与电气

CiteScore

9.70

自引率

10.80%

发文量

724

审稿时长

6 months

期刊介绍： IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include: a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models. b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects. c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards. d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues. e) System issues: Hardware-software trade-offs. f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.