MiSim: A Simulator for Resilience Assessment of Microservice-Based Architectures
Pub Date: 2022-12-01 | DOI: 10.1109/QRS57517.2022.00105
Sebastian Frank, Lion Wagner, M. A. Hakamian, Martin Straesser, A. Hoorn
Increased resilience compared to monolithic architectures is both one of the key promises of microservice-based architectures and a big challenge, e.g., due to the systems’ distributed nature. Resilience assessment through simulation requires fewer resources than the measurement-based techniques used in practice. However, no existing simulation approach is suitable for a holistic resilience assessment of microservices that combines (i) representative fault injections, (ii) common resilience mechanisms, and (iii) time-varying workloads. This paper presents MiSim — an extensible simulator for resilience assessment of microservice-based architectures that overcomes the stated limitations of related work. MiSim fits resilience engineering practices by supporting scenario-based experiments and requiring only lightweight input models. We demonstrate how MiSim simulates (1) common resilience mechanisms — i.e., circuit breaker, connection limiter, retry, load balancer, and autoscaler — and (2) fault injections — i.e., instance/service killing and latency injections. In addition, we use TeaStore, a reference microservice-based architecture, aiming to reproduce the scaling behavior observed in an experiment through simulation. Our results show that MiSim allows for quantitative insights into microservice-based systems’ complex transient behavior by providing up to 25 metrics.
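As a hedged illustration of one of the resilience mechanisms listed above, the following minimal sketch models a circuit breaker as a small state machine; the class name, thresholds, and state names are assumptions for the example and are not taken from MiSim.

```python
# Illustrative sketch (not MiSim's implementation): a minimal circuit breaker
# state machine of the kind such a simulator models.
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout=10.0):
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_timeout = reset_timeout          # seconds before a half-open probe
        self.failures = 0
        self.state = "CLOSED"
        self.opened_at = 0.0

    def call(self, operation):
        if self.state == "OPEN":
            if time.time() - self.opened_at >= self.reset_timeout:
                self.state = "HALF_OPEN"            # allow a single probe request
            else:
                raise RuntimeError("circuit open: request rejected")
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = time.time()
            raise
        self.failures = 0                           # success resets the breaker
        self.state = "CLOSED"
        return result
```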
{"title":"MiSim: A Simulator for Resilience Assessment of Microservice-Based Architectures","authors":"Sebastian Frank, Lion Wagner, M. A. Hakamian, Martin Straesser, A. Hoorn","doi":"10.1109/QRS57517.2022.00105","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00105","url":null,"abstract":"Increased resilience compared to monolithic architectures is both one of the key promises of microservice-based architectures and a big challenge, e.g., due to the systems’ distributed nature. Resilience assessment through simulation requires fewer resources than the measurement-based techniques used in practice. However, there is no existing simulation approach that is suitable for a holistic resilience assessment of microservices comprised of (i) representative fault injections, (ii) common resilience mechanisms, and (iii) time-varying workloads. This paper presents MiSim — an extensible simulator for resilience assessment of microservice-based architectures. It overcomes the stated limitations of related work. MiSim fits resilience engineering practices by supporting scenario-based experiments and requiring only lightweight input models. We demonstrate how MiSim simulates (1) common resilience mechanisms — i.e., circuit breaker, connection limiter, retry, load balancer, and autoscaler — and (2) fault injections — i.e., instance/service killing and latency injections. In addition, we use TeaStore, a reference microservice-based architecture, aiming to reproduce scaling behavior from an experiment by using simulation. Our results show that MiSim allows for quantitative insights into microservice-based systems’ complex transient behavior by providing up to 25 metrics.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":"55 47","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120815913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GOV: A Verification Method for Smart Contract Gas-Optimization
Pub Date: 2022-12-01 | DOI: 10.1109/QRS57517.2022.00055
Yuan Huang, Rong Wang, Xiangping Chen, Xiao-cong Zhou, Ziyan Wang
Developers may not understand the Gas mechanism of Ethereum, so many smart contracts consume a lot of unnecessary Gas. To address this issue, existing studies have proposed several methods to optimize the code of contracts to reduce Gas consumption. To verify their effectiveness, most of these methods deploy a private chain for verification. However, a more reasonable way is to employ the real transactions on Ethereum to trigger the contracts before and after optimization and then compare the Gas consumption. To achieve this goal, we propose a method, GOV, to estimate the Gas consumption of the optimized contract by using the real transactions on Ethereum. Our method enables the optimized contract to follow the execution path of the contract before optimization, thus solving the problem of inconsistent execution paths before and after optimization. A preliminary evaluation shows that GOV can effectively estimate the Gas consumption of the optimized contract.
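A conceptual sketch of the comparison such a verification performs is given below; it is not GOV's implementation, and `replay_gas` is a hypothetical helper that replays one historical transaction against a given contract version and reports the gas it consumed.

```python
# Conceptual sketch (not GOV's implementation): compare gas consumption of a
# contract before and after optimization on the same set of real transactions.

def gas_savings(transactions, replay_gas, original, optimized):
    """Return total and per-transaction gas saved by the optimized contract."""
    per_tx = []
    for tx in transactions:
        gas_before = replay_gas(original, tx)   # gas used by the original contract
        gas_after = replay_gas(optimized, tx)   # gas used by the optimized contract
        per_tx.append(gas_before - gas_after)
    return sum(per_tx), per_tx
```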
{"title":"GOV: A Verification Method for Smart Contract Gas-Optimization","authors":"Yuan Huang, Rong Wang, Xiangping Chen, Xiao-cong Zhou, Ziyan Wang","doi":"10.1109/QRS57517.2022.00055","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00055","url":null,"abstract":"Developers may not understand the Gas mechanism of Ethereum, so many smart contracts consume a lot of unnecessary Gas. To address this issue, existing studies have proposed several methods to optimize the code of the contracts to reduce Gas consumption. To verify the effectiveness, most of the methods deploy a private chain to make verification. However, a more reasonable way is to employ the real transactions on Ethereum to trigger the contracts before and after optimization, and then compare the Gas consumption. To achieve this goal, we proposed a method, GOV, to estimate the Gas consumption of the optimized contract by using the real transactions on Ethereum. Our method enables the optimized contract to follow the execution path of the contract before optimization, thus solving the problem of inconsistent execution paths before and after optimization. A preliminary evaluation shows that GOV can effectively estimate the Gas consumption of optimized contract.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":"51 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120817291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Emotional Dashboard: a Non-Intrusive Approach to Monitor Software Developers' Emotions and Personality Traits
Pub Date: 2022-12-01 | DOI: 10.1109/QRS57517.2022.00045
Leo Silva, Marília Castro, Miriam Silva, Milena Santos, U. Kulesza, Margarida Lima, H. Madeira
Developers' emotions are crucial elements that influence the overall job satisfaction of software engineers, including motivation, productivity, and quality of work, affecting the software development lifecycle. Existing approaches to assess and monitor developers' emotions, such as facial expressions, self-assessed surveys, and biometric sensors, imply considerable intrusiveness on developers' routines and tend to be used only during limited periods. This paper proposes a new non-intrusive and automatable tool (Emotional Dashboard) to assess, monitor, and visualize software developers' emotions over long periods, providing team leaders and project managers with an overview of teams' and individual developers' emotional statuses. The idea is to use posts shared by developers on social media to assess the polarity of their emotions and visualize the emotional situation on a dashboard, allowing the identification of potentially abnormal emotional periods that may affect software development. A first evaluation of the tool’s accuracy, performed by comparing the emotion polarity (negative, positive, or neutral) assigned to a set of posts by our tool with the manual classification of those posts by three psychologists, has shown an accuracy of 77%. The tool is available for analysis at this link: https://emotional-dashboard.herokuapp.com.
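As a hedged, minimal sketch of the kind of agreement check described above (not the tool's actual evaluation code), the snippet below compares tool-assigned polarity labels with a manual reference classification and reports simple accuracy; the label values are assumptions.

```python
# Minimal sketch (assumed label scheme): compare the tool's polarity labels
# against a manual reference classification and report simple accuracy.

def polarity_accuracy(tool_labels, reference_labels):
    """Fraction of posts where the tool agrees with the manual classification."""
    assert len(tool_labels) == len(reference_labels)
    matches = sum(t == r for t, r in zip(tool_labels, reference_labels))
    return matches / len(reference_labels)

# Example: each post is labeled "positive", "negative", or "neutral".
print(polarity_accuracy(["positive", "neutral", "negative", "neutral"],
                        ["positive", "negative", "negative", "neutral"]))  # 0.75
```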
{"title":"Emotional Dashboard: a Non-Intrusive Approach to Monitor Software Developers' Emotions and Personality Traits","authors":"Leo Silva, Marília Castro, Miriam Silva, Milena Santos, U. Kulesza, Margarida Lima, H. Madeira","doi":"10.1109/QRS57517.2022.00045","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00045","url":null,"abstract":"Developers' emotions are crucial elements that influence the overall job satisfaction of software engineers, including motivation, productivity, and quality of the work, affecting the software development lifecycle. Existing approaches to assess and monitor developers' emotions, such as facial expressions, self-assessed surveys, and biometric sensors, imply considerable intrusiveness on developers' routines and tend to be used only during limited periods. This paper proposes a new non-intrusive and automatable tool (Emotional Dashboard) to assess, monitor, and visualize software developers' emotions during long periods, providing team leaders and project managers with an overview of teams' and software developers' emotional statuses. The idea is to use posts shared by developers on social media to assess their emotions' polarity and visualize the emotional situation on a dashboard, allowing the identification of potentially abnormal emotional periods that may affect the software development. A first evaluation of the tool’s accuracy, done by comparing the emotion polarity (negative, positive, or neutral) of posts done by our tool with the manual classification of a set of posts done by three psychologists, has shown an accuracy of 77%. The tool is available for analysis at this link: https://emotional-dashboard.herokuapp.com.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129390281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Division by Zero: Threats and Effects in Spectrum-Based Fault Localization Formulas
Pub Date: 2022-12-01 | DOI: 10.1109/QRS57517.2022.00032
Dániel Vince, Attila Szatmári, Ákos Kiss, Árpád Beszédes
Spectrum-Based Fault Localization (SBFL) relies on risk formulas to rank program elements, and these formulas generally work well in various situations. However, it cannot be ruled out that a division by zero happens during score calculation, which has negative consequences, e.g., essential elements will not appear in the top part of the ranked list. The literature offers several strategies to tackle the problem, although there is little knowledge on which one to use. In our work, we performed a mathematical analysis and an empirical study to find out how this phenomenon affects SBFL. Results show that division by zero happens in many cases, and the strategies mitigate its consequences with varying success. Thus, we propose a combined method to avoid the threat of division by zero and improve the trustworthiness of SBFL. Our proposals should be taken into consideration whenever a formula is used or a new one is proposed.
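To make the threat concrete, here is an illustrative example (not the authors' combined method) with the well-known Ochiai formula, whose denominator becomes zero, e.g., for elements never executed by any test; one common mitigation strategy is to return a score of 0 in that case.

```python
# Illustrative example: the Ochiai SBFL formula and one common mitigation
# strategy (return 0 when the denominator would be 0).
import math

def ochiai(ef, ep, nf):
    """ef/ep: failing/passing tests covering the element; nf: failing tests not covering it."""
    denominator = math.sqrt((ef + nf) * (ef + ep))
    if denominator == 0:              # e.g., the element is never executed (ef == ep == 0)
        return 0.0                    # one mitigation strategy among those compared
    return ef / denominator

print(ochiai(ef=3, ep=1, nf=2))       # normal case: ~0.67
print(ochiai(ef=0, ep=0, nf=5))       # zero denominator: mitigated to 0.0
```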
{"title":"Division by Zero: Threats and Effects in Spectrum-Based Fault Localization Formulas","authors":"Dániel Vince, Attila Szatmári, Ákos Kiss, Árpád Beszédes","doi":"10.1109/QRS57517.2022.00032","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00032","url":null,"abstract":"Spectrum-Based Fault Localization (SBFL) is based on risk formulas to rank program elements, which work generally well in various situations. However, it cannot be ruled out that zero division might happen during score calculation, which has negative consequences, e.g., essential elements will not be in the top part of the rank list. The literature has given several strategies to tackle the problem, although there is little knowledge on which one to use. In our work, we performed mathematical analysis and an empirical study to find out how this phenomenon affects SBFL. Results show that division by zero happens in many cases, and the strategies can mitigate their consequences with varying success. Thus, we propose a combined method to avoid the threat of division by zero and improve the trustworthiness of SBFL. Our proposals should be taken into consideration whenever a formula is being used or a new one is proposed.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128582029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Collaboration-Aware Approach to Profiling Developer Expertise with Cross-Community Data
Pub Date: 2022-12-01 | DOI: 10.1109/QRS57517.2022.00043
Xiaotao Song, Jiafei Yan, Yuexin Huang, Hailong Sun, Hongyu Zhang
Developer expertise is an important factor that should be considered in various software development activities. It is challenging to accurately profile the expertise of developers because their activities often disperse across different online communities, such as Community Question Answering sites (e.g., Stack Overflow) and Open Source Software platforms (e.g., GitHub). In this regard, early work mainly considers a single community, while recent studies are starting to profile developers with cross-community data. However, few works consider the collaborative interactions among developers when evaluating developer expertise across communities. In this work, we propose a collaboration-aware approach to profiling developer expertise using cross-community data that takes into consideration developers’ contributions, collaborative interactions, and the dynamic changes of expertise. Specifically, we are concerned with the common developers of GitHub and Stack Overflow. First, we propose a time-sensitive model to characterize a developer’s expertise in the two communities and integrate the results to generate basic expertise profiles. Second, we build a developer network by analyzing the collaborative interactions among the developers of the two communities. Finally, we apply the topic-sensitive PageRank algorithm to incorporate developer relationships into expertise profiling. Results of extensive experiments on a large number of common developers of GitHub and Stack Overflow demonstrate the effectiveness of our approach.
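The final step can be pictured with a hedged sketch using networkx's personalized (topic-sensitive) PageRank; the graph, edge weights, and personalization vector below are illustrative assumptions, not the authors' data or pipeline.

```python
# Hedged sketch: topic-sensitive PageRank over a developer collaboration graph.
import networkx as nx

G = nx.DiGraph()
# Collaboration interactions, e.g., answering, commenting, reviewing.
G.add_weighted_edges_from([
    ("alice", "bob", 3.0),
    ("bob", "carol", 1.0),
    ("carol", "alice", 2.0),
    ("dave", "alice", 1.0),
])

# Bias the random walk toward developers with contributions in a given topic.
topic_bias = {"alice": 0.6, "bob": 0.2, "carol": 0.1, "dave": 0.1}

scores = nx.pagerank(G, alpha=0.85, personalization=topic_bias, weight="weight")
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```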
{"title":"A Collaboration-Aware Approach to Profiling Developer Expertise with Cross-Community Data","authors":"Xiaotao Song, Jiafei Yan, Yuexin Huang, Hailong Sun, Hongyu Zhang","doi":"10.1109/QRS57517.2022.00043","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00043","url":null,"abstract":"Developer expertise is an important factor that should be considered in various software development activities. And it is challenging to accurately profile the expertise of developers as their activities often disperse across different online communities, such as Community Question Answering sites (e.g., Stack Overflow) and Open Source Software platforms (e.g., GitHub). In this regard, early work mainly considers a single community while recent studies are starting to profile developers with cross-community data. However, few works consider the collaborative interactions among developers in evaluating developer expertise across communities. In this work, we propose a collaboration-aware approach to profiling developer expertise using cross-community data by taking into consideration developers’ contributions, collaborative interactions, and the dynamic changes of expertise. Specifically, we are concerned with the common developers in GitHub and Stack Overflow. First, we propose a time-sensitive model to characterize the developer’s expertise in the two communities and integrate the results to generate basic expertise profiles. Second, we build a developer network by analyzing the collaborative interactions among the developers of the two communities. Finally, we apply the topic-sensitive PageRank algorithm to incorporate developer relationships into expertise profiling. Results of extensive experiments on a large number of common developers of GitHub and Stack Overflow demonstrate the effectiveness of our approach.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":"31 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132993160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Empirical Study of the Bug Link Rate
Pub Date: 2022-12-01 | DOI: 10.1109/QRS57517.2022.00028
Chenglin Li, Yangyang Zhao, Yibiao Yang
Defect data is critical for software defect prediction. To collect defect data, it is essential to establish links between bugs and their fixes. Missing links (i.e., a low link rate) can cause false negatives in the defect dataset and bias the experimental results. Despite the importance of bug links, little prior work has used the bug link rate as a criterion for selecting subjects, and there is no empirical evidence on whether simpler alternative criteria exist for evaluating a project’s link rate to aid selection. To this end, we conduct a comprehensive study on the bug link rate. Based on 34 open-source projects, we make a detailed statistical analysis of the actual link rates of the projects and examine the factors affecting link rates from both quantitative and qualitative perspectives. The findings could improve the understanding of bug link rates and guide the selection of better subjects for defect prediction.
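As a hedged illustration of the metric itself (not the paper's link-recovery procedure), the sketch below estimates a link rate by matching JIRA-style bug identifiers in commit messages; the identifier pattern is an assumption.

```python
# Minimal sketch (assumed conventions): estimate a project's bug link rate by
# matching bug identifiers such as "PROJ-1234" in commit messages.
import re

BUG_ID = re.compile(r"[A-Z][A-Z0-9]+-\d+")   # JIRA-style key, an assumption

def link_rate(bug_ids, commit_messages):
    """Fraction of reported bugs that are referenced by at least one commit."""
    linked = {m for msg in commit_messages for m in BUG_ID.findall(msg)}
    referenced = [b for b in bug_ids if b in linked]
    return len(referenced) / len(bug_ids) if bug_ids else 0.0

print(link_rate(["PROJ-1", "PROJ-2", "PROJ-3"],
                ["Fix NPE, closes PROJ-1", "Refactor parser", "PROJ-3: add test"]))  # ~0.67
```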
{"title":"An Empirical Study of the Bug Link Rate","authors":"Chenglin Li, Yangyang Zhao, Yibiao Yang","doi":"10.1109/QRS57517.2022.00028","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00028","url":null,"abstract":"Defect data is critical for software defect prediction. To collect defect data, it is essential to establish links between bugs and their fixes. Missing links (i.e. low link rate) can cause false negatives in the defect dataset, and bias the experimental results. Despite the importance of bug links, little prior work has used bug link rate as a criterion for selecting subjects, and there is no empirical evidence to know whether there are simpler alternative criteria for evaluating a project’s link rate to aid selection. To this end, we conduct a comprehensive study on the bug link rate. Based on 34 open-source projects, we make a detailed statistical analysis of the actual link rates of the projects, and examine the factors affecting link rates from both quantitative and qualitative perspectives. The findings could improve the understanding of bug link rates, and guide the selection of better subjects for defect prediction.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126878622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
PDG2Vec: Identify the Binary Function Similarity with Program Dependence Graph
Pub Date: 2022-12-01 | DOI: 10.1109/QRS57517.2022.00061
Yuntao Zhang, Yanhao Wang, Yuwei Liu, Zhengyuan Pang, B. Fang
Binary code similarity identification is an important technique applied to many security applications (e.g., plagiarism detection, bug search). The primary challenge of this research topic is how to extract sufficient information from the binary code for similarity comparison. Although numerous approaches have been proposed to address the challenge, most of them leverage features determined by human experience or extracted using machine learning methods and ignore some critical semantic information. Additionally, they assess their approaches exclusively in laboratory environments and lack real-world datasets. Both problems lead to the limited effectiveness of these methods in real application scenarios (e.g., vulnerable function search). In this paper, we propose a novel approach, PDG2Vec, which extracts the data dependence graph and control dependence graph (i.e., the program dependence graph (PDG)) as the features of functions and uses them for identifying function similarity. Meanwhile, we design several strategies to optimize the PDG’s construction and use them in similarity comparison to balance time cost and accuracy. We implement a prototype of PDG2Vec, which can perform binary code similarity comparison across the x86, x86_64, MIPS32, ARM32, and ARM64 architectures. We evaluate PDG2Vec with two datasets. The experimental results show that PDG2Vec is resilient across architectures and extracts more precise semantics than other approaches. Moreover, PDG2Vec outperforms the state-of-the-art tools in the vulnerable function search scenario and has excellent performance.
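The core data structure can be pictured with a hedged toy sketch that represents functions as labeled dependence graphs and compares them with a naive Jaccard similarity over labeled edges; PDG2Vec's actual embedding and matching are more sophisticated, and the statements and edges below are made up.

```python
# Illustrative sketch (not PDG2Vec itself): program dependence graphs with
# "data" and "control" edges, compared by a naive labeled-edge Jaccard score.
import networkx as nx

def pdg(edges):
    """edges: iterable of (src_stmt, dst_stmt, kind) with kind in {"data", "control"}."""
    g = nx.MultiDiGraph()
    for src, dst, kind in edges:
        g.add_edge(src, dst, kind=kind)
    return g

def pdg_similarity(g1, g2):
    e1 = {(u, v, d["kind"]) for u, v, d in g1.edges(data=True)}
    e2 = {(u, v, d["kind"]) for u, v, d in g2.edges(data=True)}
    union = e1 | e2
    return len(e1 & e2) / len(union) if union else 1.0

f1 = pdg([("s1", "s2", "data"), ("s1", "s3", "control")])
f2 = pdg([("s1", "s2", "data"), ("s1", "s4", "control")])
print(pdg_similarity(f1, f2))   # 1 shared edge out of 3 -> ~0.33
```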
{"title":"PDG2Vec: Identify the Binary Function Similarity with Program Dependence Graph","authors":"Yuntao Zhang, Yanhao Wang, Yuwei Liu, Zhengyuan Pang, B. Fang","doi":"10.1109/QRS57517.2022.00061","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00061","url":null,"abstract":"Binary code similarity identification is an important technique applied to many security applications (e.g., plagiarism detection, bug search). The primary challenge of this research topic is how to extract sufficient information from the binary code for similarity comparison. Although numerous approaches have been proposed to address the challenge, most of them leverage features determined by human experience or extracted using machine learning methods and ignore some critical technique semantic information. Additionally, they assess their approach exclusively in laboratory environments and lack real-world datasets. Both problems lead to the limited effectiveness of these methods in real application scenarios (e.g., vulnerable function search).In this paper, we propose a novel approach PDG2Vec, which extracts the data dependence graph and control dependence graph (i.e., program dependence graph (PDG)) as the features of functions and uses them for identifying function similarity. Meanwhile, we design several strategies to optimize the PDG’s construction and use them in similarity comparison to balance time-consuming and accuracy. We implement the prototype of PDG2Vec, which can perform binary code similarity comparison across architectures of x86, x86_64, MIPS32, ARM32, and ARM64. We evaluate PDG2Vec with two datasets. The experimental results show that PDG2Vec is resilient to cross-architecture and extracts more precise semantics than other approaches. Moreover, PDG2Vec outperforms the state-of-the-art tools in the vulnerable function search scenario and has excellent performance.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116092946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Empirical Study on Software Defect Prediction using Function Point Analysis
Pub Date: 2022-12-01 | DOI: 10.1109/QRS57517.2022.00027
Xinghan Zhao, Cong Tian
A software defect prediction method based on the requirement specification is proposed to address defect prediction needs in the requirements phase when the organization adopts the W-model of software development. A theoretical synthesis shows that function points and the number of defects should be positively correlated. The theory’s correctness is verified by analyzing the correlation between function points and the defect distribution of eight software applications. Then, the mathematical equations for software configuration testing defects are derived, and the specific meaning of the equations is explained. Finally, the shortcomings of this study and subsequent research directions are pointed out.
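The kind of correlation check described above can be sketched as follows; the function point and defect counts are made-up illustrative values, not data from the paper.

```python
# Small sketch: correlate function point counts with defect counts across
# applications (illustrative numbers only).
from scipy.stats import pearsonr

function_points = [120, 340, 95, 510, 220, 400, 180, 290]
defect_counts   = [14,  41,  9,  66,  25,  48,  20,  33]

r, p_value = pearsonr(function_points, defect_counts)
print(f"Pearson r = {r:.3f}, p = {p_value:.4f}")   # a strong positive correlation here
```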
{"title":"An Empirical Study on Software Defect Prediction using Function Point Analysis","authors":"Xinghan Zhao, Cong Tian","doi":"10.1109/QRS57517.2022.00027","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00027","url":null,"abstract":"The software defect prediction method based on requirement specification is proposed to address the defect prediction needs in the requirements phase when the organization adopts the W-model of software development. The theoretical synthesis presents that the function point and the number of defects should be positively correlated. The theory’s correctness is verified by analyzing the correlation between function point and defect distribution of eight software applications. Then, the mathematical equations for software configuration testing defects are derived, and the specific meaning of the equation is explained. Finally, the shortcomings of this study and the subsequent research directions are pointed out.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121767526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Context-Aware Program Simplification to Improve Information Retrieval-Based Bug Localization
Pub Date: 2022-12-01 | DOI: 10.1109/QRS57517.2022.00035
Yilin Yang, Ziyuan Wang, Zhenyu Chen, Baowen Xu
Information Retrieval-based Bug Localization (IRBL) techniques have become a hot research topic in bug localization due to their few external dependencies and low execution cost. However, existing IRBL techniques face several challenges regarding localization granularity and applicability. First, existing IRBL techniques have not yet achieved statement-level bug localization. Second, almost all studies are limited to Java-based projects, and the effectiveness of these techniques for other widely used programming languages (e.g., Python) is still unknown. The reason for these deficiencies is that existing IRBL techniques mainly employ conventional NLP techniques to analyze bug reports and have not yet fully exploited the stack trace attached to the bug reports. To improve IRBL techniques in terms of localization granularity and adaptability, we propose a context-aware program simplification technique—COPS—that is able to localize defective statements in suspicious files by analyzing the stack trace in bug reports, which enables statement-level bug localization for Python-based projects. Experiments using 948 bug reports show that our technique can localize the buggy statements with 102.6% higher Top@10, 56.2% higher MAP@10, and 95.6% higher MRR@10 than the baseline. Compared with the state-of-the-art techniques, COPS improves MAP@10 by 19.1% and achieves 92% buggy statement coverage with a full-scope search. Experimental results show that COPS has higher bug localization effectiveness than existing IRBL techniques, and that COPS achieves the same effectiveness with higher execution efficiency than state-of-the-art statement-level defect localization techniques.
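The stack-trace step that enables statement-level candidates can be pictured with a hedged sketch (not COPS itself): parse the Python traceback attached to a bug report into (file, line, function) frames and treat the innermost frames as the most suspicious statements.

```python
# Hedged sketch: extract (file, line, function) candidates from a Python
# stack trace, innermost frame first.
import re

FRAME = re.compile(r'File "(?P<file>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')

def candidate_statements(stack_trace: str):
    frames = [(m.group("file"), int(m.group("line")), m.group("func"))
              for m in FRAME.finditer(stack_trace)]
    return list(reversed(frames))     # innermost frame is usually the most suspicious

trace = '''Traceback (most recent call last):
  File "app/views.py", line 42, in render_page
  File "app/models.py", line 17, in load_user
KeyError: 'user_id'
'''
print(candidate_statements(trace))
```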
{"title":"Context-Aware Program Simplification to Improve Information Retrieval-Based Bug Localization","authors":"Yilin Yang, Ziyuan Wang, Zhenyu Chen, Baowen Xu","doi":"10.1109/QRS57517.2022.00035","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00035","url":null,"abstract":"Information Retrieval-based Bug localization (IRBL) techniques have become a hot research topic in bug localization due to their few external dependencies and low execution cost. However, existing IRBL techniques have many challenges regarding localization granularity and applicability. First, existing IRBL techniques have not yet achieved statement-level bug localization. Second, almost all studies are limited to Java-based projects, and the effectiveness of these techniques for other widely used programming languages (e.g., Python) is still unknown. The reason for these deficiencies is that existing IRBL techniques mainly employ conventional NLP techniques to analyze the bug reports and have not yet fully exploited the stack trace attached to the bug reports. To improve IRBL techniques in terms of localization granularity and adaptability, we propose a context-aware program simplification technique—COPS—that is able to localize defective statements in suspicious files by analyzing the stack trace in bug reports, which enables statement-level bug localization for Python-based projects. Experiments using 948 bug reports show that our technique can localize the buggy statements with 102.6% higher Top@10, 56.2% higher MAP@10, and 95.6% higher MRR@10 than the baseline. Compared with the state-of-the-art techniques, COPS can improve 19.1% in MAP@10 and achieve 92% buggy statement coverage with a full scope search. Experimental results show that COPS has higher bug localization effectiveness than existing IRBL techniques; and that COPS achieves the same effectiveness with higher execution efficiency than state-of-the-art statement-level defect techniques.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133996973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predictive Mutation Analysis of Test Case Prioritization for Deep Neural Networks
Pub Date: 2022-12-01 | DOI: 10.1109/QRS57517.2022.00074
Zhengyuan Wei, Haipeng Wang, Imran Ashraf, William Chan
Testing deep neural networks requires high-quality test cases, but using new test cases incurs the labor-intensive test case labeling issue of the test oracle problem. Prioritizing failure-revealing test cases alleviates the problem. Existing metric-based techniques analyze vector-based prediction outputs and therefore cannot handle regression models. Existing mutation-based techniques either remain ineffective or incur high computational costs. In this paper, we propose EffiMAP, an effective and efficient test case prioritization technique based on predictive mutation analysis. In the test phase, without performing a comprehensive mutation analysis, EffiMAP predicts whether model mutants are killed by a test case from information extracted from the execution trace of that test case. Our experiment shows that EffiMAP significantly outperforms the previous state-of-the-art technique in both effectiveness and efficiency in the test phase when handling test cases of both classification and regression models. This paper is the first work to show the feasibility of predictive mutation analysis for ranking test cases with a higher probability of exposing model prediction failures in the domain of deep neural network testing.
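A hedged, minimal sketch of the predictive idea (not EffiMAP's concrete feature set or model) is shown below: train a classifier on execution-trace features of (test case, mutant) pairs so that, at test time, mutant kills are predicted instead of executed, and test cases can be ranked by predicted kill probability.

```python
# Minimal sketch (illustrative features and labels): predict mutant kills from
# execution-trace features instead of running a full mutation analysis.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row: features of one (test case, mutant) pair extracted from the
# execution trace, e.g., coverage of the mutated layer, activation distance.
X_train = np.array([[0.9, 1.2], [0.1, 0.3], [0.8, 0.9], [0.2, 0.1]])
y_train = np.array([1, 0, 1, 0])            # 1 = mutant killed, 0 = survived

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

X_new = np.array([[0.7, 1.0], [0.05, 0.2]])
kill_prob = clf.predict_proba(X_new)[:, 1]  # rank test cases by predicted kill probability
print(kill_prob)
```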
{"title":"Predictive Mutation Analysis of Test Case Prioritization for Deep Neural Networks","authors":"Zhengyuan Wei, Haipeng Wang, Imran Ashraf, William Chan","doi":"10.1109/QRS57517.2022.00074","DOIUrl":"https://doi.org/10.1109/QRS57517.2022.00074","url":null,"abstract":"Testing deep neural networks requires high-quality test cases, but using new test cases would incur the labor-intensive test case labeling issue in the test oracle problem. Test case prioritization for failure-revealing test cases alleviates the problem. Existing metric-based techniques analyze vector-based prediction outputs. They cannot handle regression models. Existing mutation-based techniques either remain ineffective or incur high computational costs. In this paper, we propose EffiMAP, an effective and efficient test case prioritization technique with predictive mutation analysis. In the test phase, without performing a comprehensive mutation analysis, EffiMAP predicts whether model mutants are killed by a test case by the information extracted from the execution trace of the test case. Our experiment shows that EffiMAP significantly outperforms the previous state-of-the-art technique in both effectiveness and efficiency in the test phase of handling test cases of both classification and regression models. This paper is the first work to show the feasibility of predictive mutation analysis to rank test cases with a higher probability of exposing model prediction failures in the domain of deep neural network testing.","PeriodicalId":143812,"journal":{"name":"2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS)","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114552836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}