
Latest Publications: 2020 IEEE 2nd International Workshop on Intelligent Bug Fixing (IBF)

Exploring the Differences between Plausible and Correct Patches at Fine-Grained Level
Pub Date : 2020-02-01 DOI: 10.1109/IBF50092.2020.9034821
Bo Yang, Jinqiu Yang
Test-based automated program repair techniques use test cases to validate the correctness of automatically-generated patches. However, insufficient test cases lead to the generation of incorrect patches, i.e., patches that pass all the test cases but are nevertheless incorrect. In this work, we present an exploratory study to understand what runtime behaviours are modified by automatically-generated plausible patches, and how such modifications of runtime behaviours differ from those made by correct patches. We utilized an off-the-shelf invariant generation tool to infer an abstraction of runtime behaviours and computed the modified runtime behaviours at the abstraction level. Our exploratory study shows that the majority of the studied plausible patches (92/96) expose different modifications of runtime behaviours (as captured by the invariant generation tool) compared to correct patches.
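The abstraction-level comparison can be pictured as set arithmetic over inferred invariants. Below is a minimal Python sketch, assuming each program variant's invariants have already been exported to a plain-text file, one invariant per line; the file names and the invariant_delta helper are hypothetical, not a workflow prescribed by the paper.

```python
from pathlib import Path

def load_invariants(path):
    """Read one inferred invariant per line into a set."""
    return {line.strip() for line in Path(path).read_text().splitlines() if line.strip()}

def invariant_delta(before, after):
    """Invariants removed and added by a patch, as a pair of sets."""
    return before - after, after - before

# Hypothetical export files produced by an invariant-generation tool.
buggy = load_invariants("invariants_buggy.txt")
plausible = load_invariants("invariants_plausible_patch.txt")
correct = load_invariants("invariants_correct_patch.txt")

# A plausible patch is suspicious when its behavioural delta differs
# from the delta induced by the correct (developer) patch.
if invariant_delta(buggy, plausible) != invariant_delta(buggy, correct):
    print("plausible patch modifies runtime behaviour differently from the correct patch")
```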
Citations: 21
Can This Fault Be Detected by Automated Test Generation: A Preliminary Study
Pub Date : 2020-02-01 DOI: 10.1109/IBF50092.2020.9034780
Hangyuan Cheng, Ping Ma, Jingxuan Zhang, J. Xuan
Automated test generation can reduce the manual effort required to improve software quality. A test generation method employs code coverage, such as the widely-used branch coverage, to guide the inference of test cases. These test cases can be used to detect hidden faults. An automated tool takes a specific type of code coverage as a configurable parameter. Given an automated test generation tool, a fault may be detected by one type of code coverage but omitted by another. In frequently released software projects, the time budget for testing is limited. Configuring code coverage for a testing tool can effectively improve the quality of projects. In this paper, we conduct a preliminary study on whether a fault can be detected by specific code coverage in automated test generation. We build predictive models with 60 metrics of faulty source code to identify detectable faults under eight types of code coverage, such as branch coverage. In the experiment, an off-the-shelf tool, EvoSuite, is used to generate test data. Experimental results show that different types of code coverage result in the detection of different faults. The extracted metrics of faulty source code can be used to predict whether a fault can be detected with the given code coverage; each of the studied code coverage types can detect additional faults that are missed by the widely-used branch coverage. This study can be viewed as a preliminary step toward supporting the configuration of code coverage in the application of automated test generation.
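As a rough illustration of the prediction step, the sketch below trains a classifier on per-fault metric vectors. The CSV layout, column names, and the choice of Random Forest as the classifier are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch: predict whether a fault is detectable under one coverage
# criterion from its source-code metrics. The file name and columns are
# hypothetical; the paper uses 60 metrics across eight coverage types.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

data = pd.read_csv("fault_metrics_branch_coverage.csv")  # hypothetical dataset
X = data.drop(columns=["detected"])   # the per-fault source-code metrics
y = data["detected"]                  # 1 = fault detected by generated tests, 0 = missed

model = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="f1")
print(f"mean F1 over 5 folds: {scores.mean():.3f}")
```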
Citations: 3
Intelligent Bug Fixing
Pub Date : 2020-02-01 DOI: 10.1109/ibf50092.2020.9034809
Xiapu Luo, Weiyi Shang, Xiaobing Sun, Tao Zhang
{"title":"Intelligent Bug Fixing","authors":"Xiapu Luo, Weiyi Shang, Xiaobing Sun, Tao Zhang","doi":"10.1109/ibf50092.2020.9034809","DOIUrl":"https://doi.org/10.1109/ibf50092.2020.9034809","url":null,"abstract":"","PeriodicalId":190321,"journal":{"name":"2020 IEEE 2nd International Workshop on Intelligent Bug Fixing (IBF)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124957688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Empirical Study of Bug Bounty Programs
Pub Date : 2020-02-01 DOI: 10.1109/IBF50092.2020.9034828
T. Walshe, Andrew C. Simpson
The task of identifying vulnerabilities is commonly outsourced to hackers participating in bug bounty programs. As of July 2019, bug bounty platforms such as HackerOne have over 200 publicly listed programs, with programs listed on HackerOne being responsible for the discovery of tens of thousands of vulnerabilities since 2013. We report the results of an empirical analysis that was undertaken using the data available from two bug bounty platforms to understand the costs and benefits of bug bounty programs, both to participants and to organisations. We consider the economics of bug bounty programs, investigating the costs and benefits to those running such programs and to the hackers who participate in finding vulnerabilities. We find that the average cost of operating a bug bounty program for a year is now less than the cost of hiring two additional software engineers.
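The headline cost comparison reduces to simple arithmetic. The sketch below makes the comparison explicit; every figure is a hypothetical placeholder, not a number from the study.

```python
# Back-of-the-envelope comparison; all figures below are hypothetical
# placeholders, not values reported by the study.
avg_bounty_payout = 2_000        # average payout per rewarded report (USD)
rewarded_reports_per_year = 50   # reports rewarded in one year
platform_overhead = 20_000       # annual platform/triage overhead (USD)

program_cost = avg_bounty_payout * rewarded_reports_per_year + platform_overhead

engineer_total_cost = 120_000    # fully-loaded annual cost of one engineer (USD)
two_engineers = 2 * engineer_total_cost

print(f"bounty program: ${program_cost:,} vs two engineers: ${two_engineers:,}")
```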
Citations: 27
Why Is My Bug Wontfix?
Pub Date : 2020-02-01 DOI: 10.1109/IBF50092.2020.9034539
Qingye Wang
Developers often use bug reports to triage and fix bugs. However, not every bug is eventually fixed. To understand the underlying reasons why bugs are marked wontfix, we conduct an empirical study on three open source projects (i.e., Mozilla, Eclipse and Apache OpenOffice) in Bugzilla. First, we manually analyzed 600 wontfix bug reports. Second, we used the open card sorting approach to label the reasons why these bug reports were marked wontfix, and we summarized 12 categories of reasons. Next, we further studied the frequency distribution of the categories across projects. We found that Not Support bug reports make up the majority of the wontfix bug reports. Moreover, the frequency distribution of wontfix bug reports across the 12 categories is broadly similar across the three open source projects.
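Once each report carries a category label, the per-project frequency distribution is a straightforward count. A minimal sketch follows; apart from Not Support, the category names and counts are illustrative placeholders rather than the study's data.

```python
# Sketch of the frequency-distribution step, assuming card-sorting labels
# were collected as (project, category) pairs. Sample data is illustrative.
from collections import Counter

labels = [
    ("Mozilla", "Not Support"), ("Mozilla", "Hypothetical Category A"),
    ("Eclipse", "Not Support"), ("Eclipse", "Not Support"),
    ("Apache OpenOffice", "Hypothetical Category B"),
]

for project in ("Mozilla", "Eclipse", "Apache OpenOffice"):
    counts = Counter(cat for proj, cat in labels if proj == project)
    total = sum(counts.values())
    dist = {cat: round(n / total, 2) for cat, n in counts.items()}
    print(project, dist)  # per-project share of each wontfix category
```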
Citations: 5
An Empirical Study of High-Impact Factors for Machine Learning-Based Vulnerability Detection
Pub Date : 2020-02-01 DOI: 10.1109/IBF50092.2020.9034888
Wei Zheng, Jialiang Gao, Xiaoxue Wu, Yuxing Xun, Guoliang Liu, Xiang Chen
Vulnerability detection is an important topic in software engineering. To improve the effectiveness and efficiency of vulnerability detection, many traditional machine learning-based and deep learning-based vulnerability detection methods have been proposed. However, the impact of different factors on vulnerability detection is unknown. For example, classification models and vectorization methods can directly affect the detection results, and code replacement can affect the features of vulnerability detection. We conduct a comparative study to evaluate the impact of different classification algorithms, vectorization methods, and user-defined variable and function name replacement. In this paper, we collected three different vulnerability code datasets. These datasets correspond to different types of vulnerabilities and have different proportions of source code. In addition, we extract and analyze the features of the vulnerability code datasets to explain some experimental results. Our findings from the experimental results can be summarized as follows: (i) deep learning performs better than traditional machine learning, and BLSTM achieves the best performance; (ii) CountVectorizer can improve the performance of traditional machine learning; (iii) different vulnerability types and different code sources generate different features. We use the Random Forest algorithm to generate the features of the vulnerability code datasets; these generated features include system-related functions, syntax keywords, and user-defined names. (iv) Datasets without user-defined variable and function name replacement achieve better vulnerability detection results.
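As one concrete illustration of a single configuration from this comparison, the sketch below pairs CountVectorizer with a Random Forest and then ranks tokens by feature importance. The toy samples and labels are illustrative, not the paper's datasets.

```python
# Sketch of one traditional-ML configuration: token counts via CountVectorizer
# feeding a Random Forest, then inspecting which tokens the model relies on.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

code_samples = [                      # toy code fragments, illustrative only
    "strcpy(buf, input)",
    "strncpy(buf, input, size)",
    "gets(line)",
    "fgets(line, size, stdin)",
]
labels = [1, 0, 1, 0]                 # 1 = vulnerable, 0 = not (illustrative)

vec = CountVectorizer(token_pattern=r"[A-Za-z_]\w*")  # keep identifiers intact
X = vec.fit_transform(code_samples)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, labels)

# Tokens ranked by how heavily the forest relies on them (cf. finding iii).
ranked = sorted(zip(vec.get_feature_names_out(), clf.feature_importances_),
                key=lambda pair: -pair[1])
print(ranked[:5])
```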
Citations: 4
Blve: Should the Current Software Version Be Suitable for Release?
Pub Date : 2020-02-01 DOI: 10.1109/IBF50092.2020.9034776
Wei Zheng, Zhao Shi, Xiaojun Chen, Junzheng Chen, Manqing Zhang, Xiang Chen
Recently, agile development has become a popular software development method, and many version iterations occur during agile development. It is very important to ensure the quality of each software version. However, in actual development it is difficult to know every stage or version of large-scale software development. That means developers do not know exactly which version the current project corresponds to. At the same time, there are many necessary requirements for software release in actual development. When we know exactly which version the current project corresponds to, we can tell whether the current software version meets the release requirements. Therefore, we need a good software version division method. This paper presents Blve, a novel software version division method based on machine learning. We construct an accurate division model trained with Support Vector Regression (SVR) to divide software versions by processing the data commonly recorded in the bug list. Then, we post-process the regression results and use classification indicators for evaluation. In addition, we propose a slope-based approach to optimize the model; this optimization improves the accuracy measure to about 95%.
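A minimal sketch of the regression-then-round idea follows, assuming bug-list records have already been reduced to numeric features. The features, data, and hyperparameters are hypothetical, and the slope-based optimization is omitted.

```python
# Sketch of the division model: SVR maps bug-list features to a continuous
# version index, which is then rounded so classification metrics apply.
import numpy as np
from sklearn.svm import SVR

# Each row: hypothetical features from the bug list (e.g., open bugs, fix rate).
X_train = np.array([[120, 0.40], [90, 0.55], [60, 0.70], [30, 0.90]])
y_train = np.array([1.0, 2.0, 3.0, 4.0])   # version index per snapshot

model = SVR(kernel="rbf", C=10.0).fit(X_train, y_train)

# Regression output is continuous; round to a discrete version index so that
# classification indicators (precision/recall) can be computed, as in the paper.
snapshot = np.array([[45, 0.80]])
predicted_version = int(round(model.predict(snapshot)[0]))
print(f"predicted version index: {predicted_version}")
```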
Citations: 0
Utilizing Source Code Embeddings to Identify Correct Patches
Pub Date : 2020-02-01 DOI: 10.1109/IBF50092.2020.9034714
Viktor Csuvik, Dániel Horváth, Ferenc Horváth, László Vidács
The so-called Generate-and-Validate approach of Automatic Program Repair consists of two main activities: the generate activity, which produces candidate solutions to the problem, and the validate activity, which checks the correctness of the generated solutions. The latter, however, might not give a reliable result, since most techniques establish the correctness of the solutions by (re-)running the available test cases. A program is marked as a possible fix if it passes all the available test cases. Although tests can be run automatically, in real-life applications the problem of over- and underfitting often occurs, resulting in inadequate patches. At this point, manual investigation of repair candidates is needed even though they passed the tests. Our goal is to investigate ways to predict correct patches. The core idea is to exploit textual and structural similarity between the original (buggy) program and the generated patches. To do so, we apply the Doc2vec and Bert embedding methods to source code. So far, APR tools generate mostly one-line fixes, leaving most of the original source code intact. Our observation was that patches which introduce new variables or make larger changes to the code are usually the incorrect ones. The proposed approach was evaluated on the QuixBugs dataset, which consists of 40 bugs and their corresponding fixes. Our approach successfully filtered out 45% of the incorrect patches.
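The similarity idea can be sketched with gensim's Doc2Vec: embed the buggy source and each candidate patch, then flag candidates whose embedding drifts far from the original. The corpus, tokenisation, and filtering threshold below are assumptions for illustration, not the paper's setup.

```python
# Sketch: embed buggy code and patch candidates with Doc2Vec, then compare
# cosine similarity to the original; low-similarity candidates would be
# filtered out as likely incorrect. Toy corpus, illustrative only.
import numpy as np
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

buggy = "def gcd ( a , b ) : return gcd ( a % b , b )"
patches = {
    "patch_1": "def gcd ( a , b ) : return gcd ( b , a % b )",  # small edit
    "patch_2": "def gcd ( a , b ) : tmp = a * b ; return tmp",  # larger rewrite
}

corpus = [TaggedDocument(buggy.split(), ["buggy"])] + [
    TaggedDocument(code.split(), [name]) for name, code in patches.items()
]
model = Doc2Vec(corpus, vector_size=32, min_count=1, epochs=200, seed=0)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

base = model.infer_vector(buggy.split())
for name, code in patches.items():
    sim = cosine(base, model.infer_vector(code.split()))
    print(name, f"similarity to buggy source: {sim:.2f}")
```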
Citations: 19