
Latest Publications in IEEE Transactions on Software Engineering

Improving Issue-PR Link Prediction via Knowledge-Aware Heterogeneous Graph Learning
IF 6.5 | CAS Tier 1 (Computer Science) | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-06-03 | DOI: 10.1109/TSE.2024.3408448
Shuotong Bai;Huaxiao Liu;Enyan Dai;Lei Liu
Links between issues and pull requests (PRs) assist GitHub developers in tackling technical challenges, gaining development inspiration, and improving repository maintenance. In real-world repositories, however, these links are still insufficiently established. To address this, existing works focus on issues and PRs themselves, employing text similarity together with additional information such as issue size to predict issue-PR links, yet their effectiveness is unsatisfactory. The underlying limitation is that issues and PRs are not isolated on GitHub. Rather, they are related to multiple GitHub sources, including repositories and submitters, whose diverse relationships can supply potential and crucial knowledge about technical domains, development insights, and cross-repository technical details. To this end, we propose Auto IP Linker (AIPL), which introduces a heterogeneous graph to model multiple GitHub sources and their relationships. Further, it leverages a metapath-based technique to reveal and incorporate this latent information for a more comprehensive understanding of issues and PRs. First, we identify four types of GitHub sources related to issues and PRs (repositories, users, issues, PRs) as well as their relationships, and model them as task-specific heterogeneous graphs. Next, we analyze the information transmitted among issues and PRs to reveal which knowledge is crucial for them. Based on this analysis, we formulate a series of metapaths and employ the metapath-based technique to incorporate this information when learning knowledge-aware embeddings of issues and PRs. Finally, we infer whether an issue and a PR can be linked based on their embeddings. We evaluate the performance of AIPL on real-world datasets collected from GitHub. The results show that, compared to the baselines, AIPL achieves average improvements of 15.94%, 15.19%, 20.52%, and 18.50% in Accuracy, Precision, Recall, and F1-score, respectively.
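To make the metapath idea concrete, here is a minimal, illustrative sketch (not the authors' implementation) of metapath-guided neighbour aggregation and dot-product link scoring over a toy heterogeneous graph of repositories (R), users (U), issues (I), and PRs (P); all embeddings, edges, and the Issue-Repo-PR metapath choice are hypothetical.

```python
# Illustrative sketch (not the authors' code): metapath-guided aggregation
# over a toy heterogeneous graph. All node data below is hypothetical.
import numpy as np

rng = np.random.default_rng(0)
dim = 8
# Toy node embeddings keyed by (type, id); types: R(epo), U(ser), I(ssue), P(R).
emb = {(t, i): rng.normal(size=dim) for t in "RUIP" for i in range(3)}
# Typed edges stored in one direction: issue->repo and PR->repo memberships.
edges = {
    ("I", "R"): {0: [0], 1: [0], 2: [1]},  # issue i was filed in repo r
    ("P", "R"): {0: [0], 1: [1], 2: [1]},  # PR p was submitted to repo r
}

def metapath_neighbors(node_id, path):
    """Follow a metapath such as 'IRP' and return reachable end-node ids."""
    frontier = {node_id}
    for src, dst in zip(path, path[1:]):
        nxt = set()
        for n in frontier:
            nxt.update(edges.get((src, dst), {}).get(n, []))
            for m, outs in edges.get((dst, src), {}).items():
                if n in outs:  # walk a stored edge type in reverse
                    nxt.add(m)
        frontier = nxt
    return frontier

def metapath_embedding(node_type, node_id, path):
    """Average a node's vector with its metapath-reachable neighbours."""
    neigh = metapath_neighbors(node_id, path)
    vecs = [emb[(node_type, node_id)]] + [emb[(path[-1], n)] for n in neigh]
    return np.mean(vecs, axis=0)

def link_score(issue_id, pr_id):
    """Dot-product link score between metapath-aware issue and PR embeddings."""
    zi = metapath_embedding("I", issue_id, "IRP")  # issue -> repo -> PRs
    zp = metapath_embedding("P", pr_id, "PRI")     # PR -> repo -> issues
    return float(zi @ zp)

# Issue 0 and PR 0 share repo 0; issue 0 and PR 2 live in different repos.
print(link_score(0, 0), link_score(0, 2))
```

A full model would learn the embeddings end-to-end with a graph neural network rather than averaging random vectors, but the traversal-then-aggregate structure is the same.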
Citations: 0
ExplanaSC: A Framework for Determining Information Requirements for Explainable Blockchain Smart Contracts
IF 6.5 | CAS Tier 1 (Computer Science) | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-06-03 | DOI: 10.1109/TSE.2024.3408632
Hanouf Al Ghanmi;Rami Bahsoon
Blockchain smart contracts (SCs) have emerged as a transformative technology, enabling the automation and execution of contractual agreements without the need for intermediaries. However, as SCs evolve to become more complex in their decentralised decision-making abilities, there are notable difficulties in comprehending the underlying reasoning process and ensuring users' understanding. The existing literature primarily focuses on the technical aspects of SCs, overlooking the decision-making process within these systems and the involvement of humans. In this paper, we propose a framework that integrates human-centered design principles by applying Situation Awareness (SA) and goal-directed task analysis (GDTA) concepts to determine the information requirements necessary to design eXplainable smart contracts (XSC). The framework provides a structured approach for requirements engineers to identify information that can keep users well informed throughout the decision-making process. It considers factors such as the business logic model, the data model, and the roles-and-responsibilities model to define specific information requirements that shape SC behaviour and necessitate explanations. To guide the determination of information requirements, the framework categorises SC decision mechanisms into autonomy, governance, processing, and behaviour. The ExplanaSC framework generates XSC explanations at three levels aligned with SA: XSC explanation for perception, XSC explanation for comprehension, and XSC explanation for projection. Overall, this framework contributes to the development of XSC systems and lays the foundation for more transparent and trustworthy decentralised applications. The XSC explanations aim to facilitate user awareness of complex decision-making processes. The evaluation uses a case study to exemplify the workings of the framework, its added value, and its limitations, and consults experts in the field for feedback and refinement.
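As a rough illustration of how the framework's outputs might be structured (not from the paper; all field names and the example requirement are hypothetical), one could encode the four decision-mechanism categories and the three SA-aligned explanation levels as plain data structures:

```python
# Hypothetical encoding of ExplanaSC's categories and SA-aligned explanation
# levels; the refund scenario and every field name are invented for illustration.
from dataclasses import dataclass
from enum import Enum

class DecisionMechanism(Enum):
    AUTONOMY = "autonomy"
    GOVERNANCE = "governance"
    PROCESSING = "processing"
    BEHAVIOUR = "behaviour"

@dataclass
class XSCExplanation:
    perception: str     # what just happened (SA level 1)
    comprehension: str  # why it happened, given contract state (SA level 2)
    projection: str     # what it implies for future state (SA level 3)

@dataclass
class InformationRequirement:
    mechanism: DecisionMechanism
    business_rule: str          # from the business logic model
    data_inputs: list[str]      # from the data model
    responsible_role: str       # from the roles/responsibilities model
    explanation: XSCExplanation

req = InformationRequirement(
    mechanism=DecisionMechanism.GOVERNANCE,
    business_rule="refund if delivery not confirmed within 30 days",
    data_inputs=["order.timestamp", "delivery.confirmed"],
    responsible_role="escrow contract",
    explanation=XSCExplanation(
        perception="Refund of 1.2 ETH triggered for order #417.",
        comprehension="Delivery was not confirmed within the 30-day window.",
        projection="Funds return to the buyer; the order is closed.",
    ),
)
print(req.mechanism.value, "->", req.explanation.perception)
```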
Citations: 0
Just-In-Time TODO-Missed Commits Detection
IF 7.4 | CAS Tier 1 (Computer Science) | Q1 Computer Science | Pub Date: 2024-05-24 | DOI: 10.1109/tse.2024.3405005
Haoye Wang, Zhipeng Gao, Xing Hu, David Lo, John Grundy, Xinyu Wang
{"title":"Just-In-Time TODO-Missed Commits Detection","authors":"Haoye Wang, Zhipeng Gao, Xing Hu, David Lo, John Grundy, Xinyu Wang","doi":"10.1109/tse.2024.3405005","DOIUrl":"https://doi.org/10.1109/tse.2024.3405005","url":null,"abstract":"","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141096573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
CRPWarner: Warning the Risk of Contract-Related Rug Pull in DeFi Smart Contracts
IF 7.4 | CAS Tier 1 (Computer Science) | Q1 Computer Science | Pub Date: 2024-04-30 | DOI: 10.1109/TSE.2024.3392451
Zewei Lin;Jiachi Chen;Jiajing Wu;Weizhe Zhang;Yongjuan Wang;Zibin Zheng
In recent years, Decentralized Finance (DeFi) has grown rapidly due to the development of blockchain technology and smart contracts. As of March 2023, the estimated global cryptocurrency market cap had reached approximately $949 billion. However, security incidents continue to plague the DeFi ecosystem, and one of the most notorious examples is the "Rug Pull" scam. This type of cryptocurrency scam occurs when the developer of a token project intentionally abandons the project and disappears with investors' funds. Despite only emerging in recent years, Rug Pull events have already caused significant financial losses. In this work, we manually collected and analyzed 103 real-world rug pull events, categorizing them by their scam methods. Two primary categories were identified: Contract-related Rug Pulls (through malicious functions in smart contracts) and Transaction-related Rug Pulls (through cryptocurrency trading without utilizing malicious functions). Based on this analysis, we propose CRPWarner (short for Contract-related Rug Pull Risk Warner) to identify malicious functions in smart contracts and issue warnings regarding potential rug pulls. We evaluated CRPWarner on 69 open-source smart contracts related to rug pull events and achieved 91.8% precision, 85.9% recall, and an 88.7% F1-score. Additionally, when evaluating CRPWarner on 13,484 real-world token contracts on Ethereum, it successfully detected 4,168 smart contracts with malicious functions, including zero-day examples. The precision in this large-scale experiment reaches 84.9%.
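As a quick consistency check on the reported numbers: the F1-score is the harmonic mean of precision and recall, and the reported 91.8% precision and 85.9% recall do reproduce the reported 88.7% F1.

```python
# Sanity-check the reported CRPWarner metrics: F1 is the harmonic mean
# of precision and recall.
precision, recall = 0.918, 0.859
f1 = 2 * precision * recall / (precision + recall)
print(f"{f1:.4f}")  # 0.8875, consistent with the reported 88.7% F1-score
```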
Citations: 0
How do Developers Adapt Code Snippets to Their Contexts? An Empirical Study of Context-Based Code Snippet Adaptations
IF 7.4 | CAS Tier 1 (Computer Science) | Q1 Computer Science | Pub Date: 2024-04-30 | DOI: 10.1109/tse.2024.3395519
Tanghaoran Zhang, Yao Lu, Yue Yu, Xinjun Mao, Yang Zhang, Yuxin Zhao
{"title":"How do Developers Adapt Code Snippets to Their Contexts? An Empirical Study of Context-Based Code Snippet Adaptations","authors":"Tanghaoran Zhang, Yao Lu, Yue Yu, Xinjun Mao, Yang Zhang, Yuxin Zhao","doi":"10.1109/tse.2024.3395519","DOIUrl":"https://doi.org/10.1109/tse.2024.3395519","url":null,"abstract":"","PeriodicalId":13324,"journal":{"name":"IEEE Transactions on Software Engineering","volume":null,"pages":null},"PeriodicalIF":7.4,"publicationDate":"2024-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140818054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Concretely Mapped Symbolic Memory Locations for Memory Error Detection
IF 6.5 | CAS Tier 1 (Computer Science) | Q1 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Pub Date: 2024-04-30 | DOI: 10.1109/TSE.2024.3395412
Haoxin Tu;Lingxiao Jiang;Jiaqi Hong;Xuhua Ding;He Jiang
Memory allocation is a fundamental operation for managing memory objects in many programming languages. Misusing allocated memory objects (e.g., buffer overflow and use-after-free) can have catastrophic consequences. Symbolic execution-based approaches have been used to detect such memory errors, benefiting from their capabilities in automatic path exploration and test case generation. However, existing symbolic execution engines still suffer from fundamental limitations in modeling dynamic memory layouts. They either represent the locations of memory objects as concrete addresses, which limits their analyses to specific address layouts and misses errors that occur only when objects are located at special addresses, or they represent the locations as simple symbolic variables without sufficient constraints, which leads to memory-state explosion when executing read/write operations involving symbolic addresses. Such limitations hinder existing symbolic execution engines from effectively detecting certain memory errors. In this study, we propose SymLoc, a symbolic execution-based approach that uses concretely mapped symbolic memory locations to alleviate these limitations. Specifically, SymLoc integrates three techniques: (1) the symbolization of addresses and the encoding of symbolic addresses into path constraints, (2) symbolic memory read/write operations via a symbolic-concrete memory map, and (3) automatic tracking of the uses of symbolic memory locations. We build SymLoc on top of the well-known symbolic execution engine KLEE and demonstrate its benefits for memory error detection and code coverage. Our evaluation results show that, for address-specific spatial memory errors, SymLoc detects 23 more errors in GNU Coreutils, Make, and m4 programs that are difficult for other approaches to detect, and covers 15% and 48% more unique lines of code than two baseline approaches; for temporal memory errors, SymLoc detects 8%-64% more errors in the Juliet Test Suite than various existing state-of-the-art memory error detectors. We also present two case studies showing sample memory errors detected by SymLoc along with their root causes and implications.
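As a toy illustration of a symbolic-concrete memory map (not KLEE's or SymLoc's actual implementation), the sketch below binds each symbolic address to a concrete slot on first access, records the binding as a path constraint, and tracks every use of a symbolic location; all names and the address layout are hypothetical.

```python
# Toy model of a symbolic-concrete memory map: symbolic addresses get a
# concrete slot on first access, bindings become path constraints, and all
# uses of symbolic locations are tracked for later error reporting.
import itertools

class SymbolicMemory:
    def __init__(self):
        self._next_slot = itertools.count(0x1000, 0x10)  # hypothetical layout
        self.map = {}            # symbolic address name -> concrete slot
        self.store = {}          # concrete slot -> stored value
        self.constraints = []    # accumulated path constraints
        self.uses = []           # (operation, symbolic address) audit trail

    def _resolve(self, sym_addr):
        if sym_addr not in self.map:
            slot = next(self._next_slot)
            self.map[sym_addr] = slot
            self.constraints.append(f"{sym_addr} == {hex(slot)}")
        return self.map[sym_addr]

    def write(self, sym_addr, value):
        self.uses.append(("write", sym_addr))
        self.store[self._resolve(sym_addr)] = value

    def read(self, sym_addr):
        self.uses.append(("read", sym_addr))
        slot = self._resolve(sym_addr)
        if slot not in self.store:
            raise MemoryError(f"read of uninitialised location {sym_addr}")
        return self.store[slot]

mem = SymbolicMemory()
mem.write("a", 42)
print(mem.read("a"), mem.constraints)   # 42 ['a == 0x1000']
try:
    mem.read("b")                       # use of an unwritten location
except MemoryError as e:
    print("detected:", e)
```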
Citations: 0
VarGAN: Adversarial Learning of Variable Semantic Representations
IF 7.4 | CAS Tier 1 (Computer Science) | Q1 Computer Science | Pub Date: 2024-04-25 | DOI: 10.1109/TSE.2024.3391730
Yalan Lin;Chengcheng Wan;Shuwen Bai;Xiaodong Gu
Variable names are of critical importance in code representation learning. However, due to diverse naming conventions, variables often receive arbitrary names, leading to long-tail, out-of-vocabulary (OOV), and other well-known problems. While the Byte-Pair Encoding (BPE) tokenizer has addressed the surface-level recognition of low-frequency tokens, it overlooks the inadequate training of low-frequency identifiers by code representation models, which results in an imbalanced distribution of rare and common identifiers. Consequently, code representation models struggle to effectively capture the semantics of low-frequency variable names. In this paper, we propose VarGAN, a novel method for learning variable name representations. VarGAN strengthens the training of low-frequency variables through adversarial training. Specifically, we regard the code representation model as a generator responsible for producing vectors from source code, and we employ a discriminator that detects whether the code input to the generator contains low-frequency variables. This adversarial setup regularizes the distribution of rare variables, making them overlap with their high-frequency counterparts in the vector space. Experimental results demonstrate that VarGAN empowers CodeBERT to generate code vectors with a more uniform distribution for both low- and high-frequency identifiers, improving similarity and relatedness scores by 8% over VarCLR on the IdBench benchmark. VarGAN is also validated on downstream tasks, where it exhibits enhanced capabilities in capturing token- and code-level semantics.
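A minimal sketch of the adversarial setup, assuming hypothetical toy data and a tiny encoder standing in for CodeBERT (this is not the authors' code): the discriminator learns to flag snippets containing low-frequency identifiers, while the encoder is updated to make rare-identifier snippets indistinguishable from common ones.

```python
# Toy GAN-style training loop in the spirit of VarGAN. Token ids >= 50 play
# the role of rare identifiers; all sizes and data are hypothetical.
import torch
import torch.nn as nn

vocab, dim = 100, 16
encoder = nn.Sequential(nn.EmbeddingBag(vocab, dim), nn.Linear(dim, dim))
discriminator = nn.Sequential(nn.Linear(dim, 16), nn.ReLU(), nn.Linear(16, 1))
opt_g = torch.optim.Adam(encoder.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def batch(low_freq: bool, n=32):
    # Hypothetical snippets of 8 token ids each.
    lo, hi = (50, vocab) if low_freq else (0, 50)
    return torch.randint(lo, hi, (n, 8))

for step in range(200):
    # 1) Train the discriminator to detect low-frequency-variable snippets.
    d_loss = bce(discriminator(encoder(batch(True)).detach()),
                 torch.ones(32, 1)) + \
             bce(discriminator(encoder(batch(False)).detach()),
                 torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # 2) Train the encoder (generator) so rare-identifier snippets look common,
    #    pushing rare and common identifiers to overlap in the vector space.
    g_loss = bce(discriminator(encoder(batch(True))), torch.zeros(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```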
Citations: 0
TransformCode: A Contrastive Learning Framework for Code Embedding via Subtree Transformation
IF 7.4 | CAS Tier 1 (Computer Science) | Q1 Computer Science | Pub Date: 2024-04-25 | DOI: 10.1109/TSE.2024.3393419
Zixiang Xian;Rubing Huang;Dave Towey;Chunrong Fang;Zhenyu Chen
Artificial intelligence (AI) has revolutionized software engineering (SE) by enhancing software development efficiency. The advent of pre-trained models (PTMs) leveraging transfer learning has significantly advanced AI for SE. However, existing PTMs that operate on individual code tokens suffer from several limitations: they are costly to train and fine-tune, and they rely heavily on labeled data for fine-tuning on task-specific datasets. In this paper, we present TransformCode, a novel framework that learns code embeddings in a contrastive learning manner. Our framework is encoder-agnostic and language-agnostic: it can leverage any encoder model and handle any programming language. We also propose a novel data-augmentation technique called abstract syntax tree (AST) transformation, which applies syntactic and semantic transformations to the original code snippets to generate more diverse and robust samples for contrastive learning. Our framework has several advantages over existing methods: (1) it is flexible and adaptable, because it can easily be extended to other downstream tasks that require code representation (such as code-clone detection and classification); (2) it is efficient and scalable, because it does not require a large model or a large amount of training data, and it can support any programming language; (3) it is not limited to unsupervised learning, but can also be applied to some supervised learning tasks by incorporating task-specific labels or objectives; and (4) it can adjust the number of encoder parameters based on available computing resources. We evaluate our framework on several code-related tasks and demonstrate its effectiveness and superiority over state-of-the-art methods such as SourcererCC, Code2vec, and InferCode.
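To make the AST-transformation augmentation concrete, here is a minimal sketch (not the authors' implementation) that uses Python's ast module to produce a surface-level-different view of a snippet by consistently renaming identifiers; pairs of such views could then feed a standard contrastive objective such as NT-Xent.

```python
# Illustrative AST-transformation view generator. A fuller version would skip
# builtins, attributes, and imported names to stay strictly semantics-preserving;
# this toy snippet contains only local names, so the simple rename is safe here.
import ast

class RenameVars(ast.NodeTransformer):
    """Consistently rename variable identifiers to produce an augmented view."""
    def __init__(self):
        self.mapping = {}
    def visit_Name(self, node):
        new = self.mapping.setdefault(node.id, f"v{len(self.mapping)}")
        return ast.copy_location(ast.Name(id=new, ctx=node.ctx), node)

src = "total = 0\nfor item in items:\n    total = total + item\n"
tree = RenameVars().visit(ast.parse(src))
view = ast.unparse(ast.fix_missing_locations(tree))
print(view)
# total -> v0, item -> v1, items -> v2: same structure, different surface form.
```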
Citations: 0
Neural Library Recommendation by Embedding Project-Library Knowledge Graph
IF 7.4 | CAS Tier 1 (Computer Science) | Q1 Computer Science | Pub Date: 2024-04-24 | DOI: 10.1109/TSE.2024.3393504
Bo Li;Haowei Quan;Jiawei Wang;Pei Liu;Haipeng Cai;Yuan Miao;Yun Yang;Li Li
The prosperity of software applications brings fierce market competition to developers. Employing third-party libraries (TPLs) to add new features to projects under development and to reduce time to market has become popular in the community. However, given the tremendous number of TPLs ready for use, it is challenging for developers to effectively and efficiently identify the most suitable ones. To tackle this obstacle, we propose an innovative approach named PyRec that recommends potentially useful TPLs to developers for their projects. Taking Python project development as a use case, PyRec embeds Python projects, TPLs, contextual information, and the relations between those entities into a knowledge graph. It then employs a graph neural network to capture useful information from the graph to make TPL recommendations. Unlike existing approaches, PyRec can make full use not only of project-library interaction information but also of contextual information to make more accurate TPL recommendations. Comprehensive evaluations are conducted on 12,421 Python projects involving 963 TPLs, 9,675 extra entities, 121,474 library usage records, and 73,277 contextual records. Compared with five representative approaches, PyRec improves recommendation performance significantly in all cases.
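As an illustration of the project-library knowledge-graph idea, here is a toy sketch (not the authors' implementation) that embeds a hypothetical graph with one round of GNN-style neighbour averaging and ranks candidate libraries for a project by dot-product score; all node names and edges are made up.

```python
# Toy sketch of knowledge-graph-based library recommendation: one layer of
# neighbour averaging over project/library/context nodes, then dot-product
# scoring of candidate libraries for a project. Data is hypothetical.
import numpy as np

rng = np.random.default_rng(1)
dim = 8
nodes = ["proj_a", "proj_b", "numpy", "pandas", "flask", "topic_web"]
emb = {n: rng.normal(size=dim) for n in nodes}
# Edges: project-uses-library interactions plus a contextual relation.
edges = [("proj_a", "numpy"), ("proj_a", "pandas"),
         ("proj_b", "flask"), ("proj_b", "topic_web"),
         ("flask", "topic_web")]

def propagate(emb, edges):
    """One GNN-style layer: each node averages itself with its neighbours."""
    out = {}
    for n in emb:
        neigh = [emb[v] for u, v in edges if u == n] + \
                [emb[u] for u, v in edges if v == n]
        out[n] = np.mean([emb[n]] + neigh, axis=0)
    return out

h = propagate(emb, edges)
scores = {lib: float(h["proj_b"] @ h[lib])
          for lib in ("numpy", "pandas", "flask")}
# 'flask' typically ranks highest: it shares proj_b's graph neighbourhood.
print(max(scores, key=scores.get), scores)
```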
Citations: 0
No Need to Lift a Finger Anymore? Assessing the Quality of Code Generation by ChatGPT
IF 7.4 | CAS Tier 1 (Computer Science) | Q1 Computer Science | Pub Date: 2024-04-23 | DOI: 10.1109/TSE.2024.3392499
Zhijie Liu;Yutian Tang;Xiapu Luo;Yuming Zhou;Liang Feng Zhang
Large language models (LLMs) have demonstrated impressive capabilities across various natural language processing (NLP) tasks, such as machine translation, question answering, and summarization. LLMs are also highly valuable in supporting software engineering tasks, particularly code generation. Automatic code generation is the process of automatically generating source code or executable code from given specifications or requirements, improving developer productivity. In this study, we perform a systematic empirical assessment of the quality of code generation using ChatGPT, a recent state-of-the-art product LLM. We leverage 728 algorithm problems in five languages (C, C++, Java, Python, and JavaScript) and 18 CWEs with 54 code scenarios for the code generation task. Our evaluation encompasses a comprehensive analysis of code snippets generated by ChatGPT, focusing on three critical aspects: correctness, complexity, and security. We also specifically investigate ChatGPT's ability to engage in a multi-round fixing process (i.e., ChatGPT's dialog ability, with users chatting with ChatGPT to fix generated buggy code) to facilitate code generation. By delving into the generated code and examining the experimental results, this work provides valuable insights into the performance of ChatGPT in tackling code generation tasks across the three critical aspects. The experimental results demonstrate that (1) ChatGPT is better at generating functionally correct code for problems from before 2021 in different languages than for problems from after 2021, with a 48.14% advantage in Accepted rate on the judgment platform, but its ability to directly fix erroneous code through the multi-round fixing process to achieve correct functionality is relatively weak; (2) the distribution of cyclomatic and cognitive complexity levels for code snippets varies across languages, and the multi-round fixing process with ChatGPT generally preserves or increases the complexity levels of code snippets; (3) in algorithm scenarios with C, C++, and Java, and in CWE scenarios with C and Python3, the code generated by ChatGPT has relevant vulnerabilities; however, the multi-round fixing process for vulnerable code snippets demonstrates promising results, with more than 89% of vulnerabilities successfully addressed; and (4) code generation may be affected by ChatGPT's non-determinism, resulting in variations of code snippets in functional correctness, complexity, and security. Overall, our findings uncover potential issues and limitations in ChatGPT-based code generation and lay the groundwork for improving AI- and LLM-based code generation techniques.
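The multi-round fixing process the study evaluates is essentially a generate-test-refine loop. Below is a minimal sketch of such a loop, assuming a hypothetical query_llm stand-in for the ChatGPT API (no real client is implied) and a simple subprocess-based test runner; it illustrates the protocol studied, not the authors' evaluation harness.

```python
# Minimal generate-test-refine loop. `query_llm` is a hypothetical placeholder
# for an LLM API call; plug in a real client to run this end to end.
import pathlib
import subprocess
import tempfile

def query_llm(prompt: str) -> str:
    raise NotImplementedError("hypothetical LLM call; supply a real client")

def run_candidate(code: str, test: str) -> tuple[bool, str]:
    """Execute candidate code plus its test in a subprocess."""
    path = pathlib.Path(tempfile.mkdtemp()) / "candidate.py"
    path.write_text(code + "\n" + test)
    proc = subprocess.run(["python", str(path)], capture_output=True,
                          text=True, timeout=10)
    return proc.returncode == 0, proc.stdout + proc.stderr

def generate_with_fixes(spec: str, test: str, rounds: int = 3):
    """Return passing code, or None if it still fails after `rounds` attempts."""
    prompt = spec
    for _ in range(rounds):
        code = query_llm(prompt)
        ok, output = run_candidate(code, test)
        if ok:
            return code
        # Multi-round repair: hand the error output back as a fix request.
        prompt = f"{spec}\n\nYour code failed with:\n{output}\nPlease fix it."
    return None
```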
Citations: 0