
IEEE Transactions on Software Engineering: Latest Articles

MirrorFuzz: Leveraging LLM and Shared Bugs for Deep Learning Framework APIs Fuzzing
IF 5.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-10-10 DOI: 10.1109/TSE.2025.3619966
Shiwen Ou;Yuwei Li;Lu Yu;Chengkun Wei;Tingke Wen;Qiangpu Chen;Yu Chen;Haizhi Tang;Zulie Pan
Deep learning (DL) frameworks serve as the backbone for a wide range of artificial intelligence applications. However, bugs within DL frameworks can cascade into critical issues in higher-level applications, jeopardizing reliability and security. While numerous techniques have been proposed to detect bugs in DL frameworks, research exploring common API patterns across frameworks and the potential risks they entail remains limited. Notably, many DL frameworks expose similar APIs with overlapping input parameters and functionalities, rendering them vulnerable to shared bugs, where a flaw in one API may extend to analogous APIs in other frameworks. To address this challenge, we propose MirrorFuzz, an automated API fuzzing solution to discover shared bugs in DL frameworks. MirrorFuzz operates in three stages: First, MirrorFuzz collects historical bug data for each API within a DL framework to identify potentially buggy APIs. Second, it matches each buggy API in a specific framework with similar APIs within and across other DL frameworks. Third, it employs large language models (LLMs) to synthesize code for the API under test, leveraging the historical bug data of similar APIs to trigger analogous bugs across APIs. We implement MirrorFuzz and evaluate it on four popular DL frameworks (TensorFlow, PyTorch, OneFlow, and Jittor). Extensive evaluation demonstrates that MirrorFuzz improves code coverage by 39.92% and 98.20% compared to state-of-the-art methods on TensorFlow and PyTorch, respectively. Moreover, MirrorFuzz discovers 315 bugs, 262 of which are newly found, and 80 bugs are fixed, with 52 of these bugs assigned CNVD IDs.
IEEE Transactions on Software Engineering, vol. 52, no. 1, pp. 360-375. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11201027
Citations: 0
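The second stage described in the MirrorFuzz abstract, matching a buggy API with similar APIs in other frameworks, can be illustrated with a small sketch. The abstract does not say how similarity is computed, so everything here is an assumption made for illustration: the `Api` record, the equal weighting of name and parameter overlap, the `0.6` threshold, and the abbreviated parameter sets are all hypothetical and are not MirrorFuzz's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Api:
    framework: str
    name: str          # fully qualified name, e.g. "torch.nn.functional.conv2d"
    params: frozenset  # parameter names (illustrative subsets, not full signatures)


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def short_name(api):
    """Last dotted component of the API name, lowercased."""
    return api.name.rsplit(".", 1)[-1].lower()


def similarity(x, y):
    """Hypothetical score: half name match, half parameter overlap."""
    name_score = 1.0 if short_name(x) == short_name(y) else 0.0
    return 0.5 * name_score + 0.5 * jaccard(x.params, y.params)


def match_similar(buggy, candidates, threshold=0.6):
    """Return candidates scoring at least `threshold`, best match first."""
    scored = [(similarity(buggy, c), c) for c in candidates if c != buggy]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored if score >= threshold]


tf_conv = Api("tensorflow", "tf.nn.conv2d",
              frozenset({"input", "filters", "strides", "padding"}))
torch_conv = Api("pytorch", "torch.nn.functional.conv2d",
                 frozenset({"input", "weight", "stride", "padding"}))
torch_abs = Api("pytorch", "torch.abs", frozenset({"input"}))

matches = match_similar(tf_conv, [torch_conv, torch_abs])
```

In this toy setup only the convolution API in the other framework passes the threshold, so a historical bug report for one `conv2d` would be reused as a seed when generating test code for the other.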
Condor: A Code Discriminator Integrating General Semantics with Code Details
IF 7.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-10-10 DOI: 10.1109/tse.2025.3620145
Qingyuan Liang, Zhao Zhang, Chen Liu, Zeyu Sun, Wenjie Zhang, Yizhou Chen, Zixiao Zhao, Qi Luo, Wentao Wang, Yanjie Jiang, Yingfei Xiong, Lu Zhang
Citations: 0
Efficiently Testing Distributed Systems via Abstract State Space Prioritization
IF 5.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-10-09 DOI: 10.1109/TSE.2025.3618976
Yu Gao;Dong Wang;Wensheng Dou;Wenhan Feng;Yu Liang;Jun Wei
The last five years have seen a rise of model checking guided testing (MCGT) approaches for systematically testing distributed systems. MCGT approaches generate test cases for distributed systems by traversing their verified abstract state spaces, simultaneously solving the three key problems faced in testing distributed systems, i.e., test input generation, test oracle construction, and execution space enumeration. However, existing MCGT approaches struggle to traverse the huge state space of distributed systems, which can contain billions of system states. This makes the process of finding bugs time-consuming and expensive, often taking several weeks. In this paper, we propose Mosso to speed up model checking guided testing for distributed systems. We observe that the abstract state space of distributed systems contains many redundant test scenarios. Considering the characteristics of these redundant test scenarios, we propose three strategies, action independence, node symmetry, and scenario equivalence, to identify and prioritize unique test scenarios when traversing the state space. We have applied Mosso to three real-world distributed systems. By employing the three strategies, our approach achieves an average speedup of 56X (up to 208X) compared to the state-of-the-art MCGT approach. Additionally, our approach has uncovered 2 previously unknown bugs.
IEEE Transactions on Software Engineering, vol. 52, no. 2, pp. 395-410.
Citations: 0
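Of the three strategies named in the Mosso abstract, node symmetry is the easiest to illustrate. The sketch below is not Mosso's algorithm, which the abstract does not describe. It assumes a deliberately simplified model in which a global state is a mapping from node id to a local state, and two global states count as symmetric when they differ only by renaming nodes. The functions `canonical` and `prioritize` are hypothetical names.

```python
def canonical(state):
    """Canonical form under node renaming: the sorted multiset of local states.

    `state` maps a node id to a hashable, sortable local state,
    e.g. ("leader", 1) for (role, term).
    """
    return tuple(sorted(state.values()))


def prioritize(states):
    """Order states so that one representative per symmetry class comes first.

    Redundant (symmetric) states are kept, but moved to the back of the queue.
    """
    seen = set()
    unique, redundant = [], []
    for state in states:
        key = canonical(state)
        if key in seen:
            redundant.append(state)
        else:
            seen.add(key)
            unique.append(state)
    return unique + redundant


states = [
    {"n1": ("leader", 1), "n2": ("follower", 1)},
    {"n1": ("follower", 1), "n2": ("leader", 1)},   # same as above with n1/n2 swapped
    {"n1": ("follower", 2), "n2": ("follower", 2)},
]
ordered = prioritize(states)
```

A real implementation would also have to canonicalize node ids that appear inside local states and in-flight messages; this sketch ignores that and only shows why symmetric states need not be explored first.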
Efficient Function Orchestration for Large Language Models
IF 5.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-10-09 DOI: 10.1109/TSE.2025.3619112
Xiaoxia Liu;Peng Di;Cong Li;Jun Sun;Jingyi Wang
Function calling is a fundamental capability of today’s large language models, but sequential function calling poses efficiency problems. Recent studies have proposed requesting function calls with parallelism support to alleviate this issue. However, they either delegate the concurrent function calls to users for execution, where the calls are in turn executed sequentially, or overlook the relations among the function calls, resulting in limited efficiency. This paper introduces LLMOrch, an advanced framework for automated, parallel function calling in large language models. The key principle behind LLMOrch is to identify an available processor to execute a function call while preventing any single processor from becoming overburdened. To this end, LLMOrch models the data relations (i.e., definition-use (def-use) dependencies) among different function calls and coordinates their executions by their control relations (i.e., mutual exclusion) as well as the working status of the underlying processors. Compared with state-of-the-art techniques, LLMOrch demonstrated comparable efficiency improvements in orchestrating I/O-intensive functions, while significantly outperforming them (2$\times$) on compute-intensive functions. LLMOrch’s performance even showed a linear correlation with the number of allocated processors. We believe that these results highlight the potential of LLMOrch as an efficient solution for parallel function orchestration in the context of large language models.
IEEE Transactions on Software Engineering, vol. 52, no. 2, pp. 411-427.
Citations: 0
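The idea of running function calls in parallel while respecting def-use dependencies, as described in the LLMOrch abstract above, can be sketched with Python's standard thread pool. This is only an illustration under stated assumptions: `orchestrate` and its input format are hypothetical, the sketch covers data dependencies only, and it does not model mutual exclusion or processor load, which the abstract says LLMOrch also takes into account.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def orchestrate(calls, max_workers=4):
    """Run calls in parallel once the results they use are available.

    `calls` maps a call name to (func, deps), where `deps` lists the names of
    calls whose results are passed to `func` as positional arguments.
    Returns a dict mapping each call name to its result.
    """
    results = {}
    pending = dict(calls)
    running = {}  # future -> call name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            # A call is ready when every value it uses has been defined.
            ready = [name for name, (_, deps) in pending.items()
                     if all(dep in results for dep in deps)]
            if not ready and not running:
                raise ValueError("cyclic or unsatisfiable dependencies")
            for name in ready:
                func, deps = pending.pop(name)
                future = pool.submit(func, *[results[dep] for dep in deps])
                running[future] = name
            # Block until at least one running call finishes, then record it.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                results[running.pop(future)] = future.result()
    return results


calls = {
    "a": (lambda: 2, []),
    "b": (lambda: 3, []),
    "c": (lambda x, y: x + y, ["a", "b"]),  # uses the results of a and b
    "d": (lambda z: z * 10, ["c"]),         # uses the result of c
}
results = orchestrate(calls)
```

Here `a` and `b` have no dependencies and are submitted together, `c` starts only after both have finished, and `d` starts after `c`. Threads suit I/O-bound calls; compute-bound Python functions would need processes to run truly in parallel.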
Towards Secure Code Generation with LLMs: A Study on Common Weakness Enumeration
IF 7.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-10-09 DOI: 10.1109/tse.2025.3619281
Jianguo Zhao, Yuqiang Sun, Cheng Huang, Chengwei Liu, YaoHui Guan, Yutong Zeng, Yang Liu
Citations: 0
Manifestations of Empathy in Software Engineering: How, Why, and When It Matters
IF 7.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-10-09 DOI: 10.1109/tse.2025.3612888
Hashini Gunatilake, John Grundy, Rashina Hoda, Ingo Mueller
Citations: 0
Aging-Related Bug Prediction Based on Multi-View Graph Feature Learning and Graph-Transformer
IF 5.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-10-08 DOI: 10.1109/TSE.2025.3618113
Chen Zhang;Jianwen Xiang;Rui Hao;Kai Jia;Jing Tian;Roberto Natella;Roberto Pietrantuono;Domenico Cotroneo
Software aging, characterized by an increasing failure rate or performance degradation in long-running software systems, poses significant risks, including substantial financial losses and potential threats to human lives. This phenomenon is primarily driven by the accumulation of runtime errors, commonly referred to as aging-related bugs (ARBs). Aging-related bug prediction (ARBP) has been proposed to facilitate the detection and remediation of ARBs prior to software release. However, ARBP’s effectiveness heavily depends on the quality of dataset features used. Previous research has largely relied on a standard set of manually designed metrics, often overlooking that these metrics may fail to distinguish between code segments with different semantics, even when they exhibit identical metric values. While some studies have attempted to develop models that learn semantic features from source code, they typically focus on token-level or graph-level features, neglecting a comprehensive exploration of ARB characteristics within the source code. Specifically, there is insufficient discussion on whether deep semantic features can adequately capture the essential traits that trigger aging phenomena. In this paper, we propose a novel multi-view graph feature learning framework based on Graph-Transformer, which integrates newly proposed ARB features extracted from Abstract Syntax Trees with Code Property Graphs for feature learning. Our approach effectively captures hierarchical structures and variable dependencies, facilitating the identification of complex interactions that contribute to ARBs. Additionally, we implement sub-graph sampling and class imbalance strategies to enhance model performance. 
Experimental results across three datasets demonstrate that our method surpasses state-of-the-art approaches, specifically SGT, a code property graph-based feature extraction method, achieving precision improvements of 8.2 percentage points on Linux, 15.4 percentage points on MySQL, and 2.5 percentage points on NetBSD, thereby establishing a new benchmark for ARB prediction.
IEEE Transactions on Software Engineering, vol. 52, no. 1, pp. 221-245.
Citations: 0
CEDAR: Silent Control Flow Error Detection via Heterogeneous Relation Learning
IF 7.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-10-08 DOI: 10.1109/tse.2025.3618552
Yang Liu, Jingjing Gu, Jingxuan Zhang, Bao Wen, Yi Zhuang
Citations: 0
Do Automated Fixes Truly Mitigate Smart Contract Exploits?
IF 5.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-10-08 DOI: 10.1109/TSE.2025.3618123
Sofia Bobadilla;Monica Jin;Martin Monperrus
Automated Program Repair (APR) for smart contract security promises to automatically mitigate smart contract vulnerabilities responsible for billions in financial losses. However, the true effectiveness of this research in addressing smart contract exploits remains uncharted territory. This paper bridges this critical gap by introducing a novel and systematic experimental framework for evaluating exploit mitigation of program repair tools for smart contracts. We qualitatively and quantitatively analyze 20 state-of-the-art APR tools using a dataset of 143 vulnerable smart contracts, for which we manually craft 91 executable exploits. We are the very first to define and measure the essential “exploit mitigation rate”, giving researchers and practitioners a real sense of effectiveness. Our findings reveal substantial disparities in the state of the art, with an exploit mitigation rate ranging from a low of 29% to a high of 74%. Our study identifies systemic limitations, such as inconsistent functionality preservation, that must be addressed in future research on program repair for smart contracts.
IEEE Transactions on Software Engineering, vol. 52, no. 1, pp. 100-115. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11197044
Citations: 0
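The abstract above names an "exploit mitigation rate" without giving its formula. One plausible reading, stated here purely as an assumption, is the share of working exploits that no longer succeed once a repair tool's patch is applied. The function and its input format below are hypothetical and may differ from the paper's definition.

```python
def exploit_mitigation_rate(outcomes):
    """Fraction of working exploits that fail against the patched contract.

    `outcomes` holds one (works_on_original, works_on_patched) pair of booleans
    per exploit. Exploits that never worked on the original are ignored.
    """
    valid = [patched for original, patched in outcomes if original]
    if not valid:
        raise ValueError("no exploit succeeds against the original contracts")
    mitigated = sum(1 for patched in valid if not patched)
    return mitigated / len(valid)


outcomes = [
    (True, False),   # exploit blocked by the patch
    (True, True),    # exploit still works after the patch
    (True, False),   # exploit blocked by the patch
    (False, False),  # exploit never worked, so it is excluded
]
rate = exploit_mitigation_rate(outcomes)
```

A rate computed this way says nothing about whether the patch preserves the contract's intended functionality, which the abstract lists as a separate limitation of current tools.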
Enhancing Real-Time Operating System Security Analysis via Slice-based Fuzzing
IF 7.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-10-06 DOI: 10.1109/tse.2025.3615642
Jialu Li, Haoyu Li, Yuchong Xie, Yanhao Wang, Qinsheng Hou, Libo Chen, Bo Zhang, Shenghong Li, Zhi Xue
Citations: 0