On the Developers' Attitude Towards CRAN Checks
Pranjay Kumar, Davin Ie, M. Vidoni
R is a package-based, multi-paradigm programming language for scientific software. It provides an easy way to install third-party code, datasets, tests, documentation and examples through CRAN (the Comprehensive R Archive Network). Prior work indicated that developers tend to code workarounds to bypass CRAN's automated checks (performed when submitting a package) instead of fixing the code; doing so reduces package quality and may become a threat to analyses written in R that rely on mis-checked code. This preliminary study card-sorted source code comments and analysed Stack Overflow (SO) conversations discussing CRAN checks to understand developers' attitudes. We determined that about a quarter of SO posts aim to bypass a check with a workaround; the most affected areas are code-related problems, package dependencies, installation and feasibility. We analyse these checks and outline future steps to improve similar automated analyses.
DOI: https://doi.org/10.1145/3524610.3528389. In 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC), May 2022.
How do I model my system? A Qualitative Study on the Challenges that Modelers Experience
Christopher Vendome, E. J. Rapos, Nick DiGennaro
Model-Driven Software Engineering relies on both domain expertise and software engineering expertise to fully realize its representative power in modeling complex systems. As is typical in the development of any system, modelers face challenges similar to those of classic software developers, whether with general modeling concepts or with specific features of existing tools such as the Eclipse Modeling Framework. In this work, we aim to understand the issues that modelers face by analyzing discussions from Eclipse's modeling tool forums, MATLAB Central, and Stack Overflow. By performing a qualitative study using an open-coding process, we created a taxonomy of common issues faced by modelers. We considered both the difficulties experienced when modeling a system and the issues faced when using existing modeling tools; these form the basis of our two research questions. Based on the taxonomy, we propose nine suggestions and enhancements, in three overarching groups, to improve the experience of modelers at all levels of experience.
DOI: https://doi.org/10.1145/3524610.3529160. In 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC), May 2022.
The Ineffectiveness of Domain-Specific Word Embedding Models for GUI Test Reuse
F. Khalili, Ali Mohebbi, Valerio Terragni, M. Pezzè, L. Mariani, A. Heydarnoori
Reusing test cases across similar applications can significantly reduce testing effort. Some recent test reuse approaches successfully exploit word embedding models to semantically match GUI events across Android apps. It is a common understanding that word embedding models trained on domain-specific corpora perform better on specialized tasks. Our recent study confirms this understanding in the context of Android test reuse: it shows that word embedding models trained on a corpus of English app descriptions from the Google Play Store lead to better semantic matching of Android GUI events. Motivated by this result, we hypothesize that we can further increase the effectiveness of semantic matching by partitioning the corpus of app descriptions into domain-specific corpora. Our experiments do not confirm our hypothesis. This paper sheds light on this unexpected negative result, which contradicts the common understanding.
DOI: https://doi.org/10.1145/3524610.3527873. In 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC), May 2022.
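As a concrete illustration of the matching step the abstract above refers to, the following is a minimal sketch under simple assumptions (not the authors' implementation): each GUI event is represented by the average of its descriptor-word vectors, and candidate events in the target app are ranked by cosine similarity. The toy vectors, event descriptors, and function names are hypothetical.

```python
import numpy as np

# Hypothetical toy word vectors; a real pipeline would load a model trained
# on app descriptions (e.g., a word2vec-style model over Google Play text).
VECTORS = {
    "add":    np.array([0.9, 0.1, 0.0]),
    "create": np.array([0.8, 0.2, 0.1]),
    "delete": np.array([-0.7, 0.3, 0.2]),
    "note":   np.array([0.1, 0.9, 0.0]),
    "task":   np.array([0.2, 0.8, 0.1]),
}

def embed(descriptor_words):
    """Average the word vectors of a GUI event's descriptor words."""
    vecs = [VECTORS[w] for w in descriptor_words if w in VECTORS]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def best_match(source_event, candidate_events):
    """Pick the target-app event most semantically similar to the source event."""
    src = embed(source_event)
    return max(candidate_events, key=lambda ev: cosine(src, embed(ev)))

# A source-app event labelled "add note" matched against two target-app events.
print(best_match(["add", "note"], [["create", "task"], ["delete", "task"]]))
```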
Unified Abstract Syntax Tree Representation Learning for Cross-Language Program Classification
Kesu Wang, Meng Yan, He Zhang, Haibo Hu
Program classification can be regarded as a high-level abstraction of code, laying a foundation for various source code comprehension tasks, and it has a wide range of applications in software engineering, such as code clone detection, code smell classification, and defect classification. Cross-language program classification can support code migration between programming languages and promote cross-language code reuse, thereby helping developers write code quickly and reducing migration time. Most existing studies focus on learning the semantics of code in a single language, while few are devoted to cross-language tasks. The main challenge of cross-language program classification is how to extract semantic features from different programming languages. To cope with this difficulty, we propose a Unified Abstract Syntax Tree (UAST) neural network. The core idea of UAST consists of two unification mechanisms. First, UAST learns an AST representation by unifying the AST traversal sequence and the graph-like AST structure to capture semantic code features. Second, we construct a unified vocabulary, which reduces the feature gap between programming languages and thus enables cross-language program classification. In addition, we collect a dataset containing 20,000 files in five programming languages, which can serve as a benchmark for the cross-language program classification task. Experiments on two datasets show that our approach outperforms state-of-the-art baselines on four evaluation metrics (Precision, Recall, F1-score, and Accuracy).
DOI: https://doi.org/10.1145/3524610.3527915. In 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC), May 2022.
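To illustrate the unified-vocabulary idea described above, here is a small hypothetical sketch: language-specific AST node labels are mapped onto shared tokens so that traversal sequences from different languages land in one feature space. The mapping table is invented for illustration and is not the paper's actual vocabulary.

```python
# Hypothetical mapping from language-specific AST node labels to shared tokens.
UNIFIED = {
    ("java",   "MethodDeclaration"): "FUNC_DEF",
    ("python", "FunctionDef"):       "FUNC_DEF",
    ("java",   "IfStatement"):       "IF",
    ("python", "If"):                "IF",
    ("java",   "ForStatement"):      "LOOP",
    ("python", "For"):               "LOOP",
}

def unify(language, node_labels):
    """Map a pre-order AST traversal to the unified token sequence."""
    return [UNIFIED.get((language, label), label) for label in node_labels]

java_seq = unify("java", ["MethodDeclaration", "ForStatement", "IfStatement"])
py_seq = unify("python", ["FunctionDef", "For", "If"])
assert java_seq == py_seq == ["FUNC_DEF", "LOOP", "IF"]
```

Once sequences share one vocabulary, a single classifier can in principle be trained across languages.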
An Empirical Investigation on the Trade-off between Smart Contract Readability and Gas Consumption
Anna Vacca, Michele Fredella, Andrea Di Sorbo, C. A. Visaggio, G. Canfora
Blockchain technology is becoming increasingly popular, and smart contracts (i.e., programs that run on top of the blockchain) represent a crucial element of this technology. In particular, smart contracts running on Ethereum (i.e., one of the most popular blockchain platforms) are often developed in Solidity, and their deployment and execution consume gas (i.e., a fee compensating for the computing resources required). Smart contract development frequently involves code reuse, but poorly readable smart contracts could hinder their reuse. However, writing readable smart contracts is challenging, since practices for improving readability can conflict with optimization strategies for reducing gas consumption. This paper aims at better understanding (i) the readability aspects in which traditional software and smart contracts differ, and (ii) the specific smart contract readability features exhibiting significant relationships with gas consumption. We leverage a set of metrics that previous research has shown to be correlated with code readability. In particular, we first compare the values of these metrics obtained for Solidity smart contracts and for traditional software systems (written in Java). Then, we investigate the correlations between these metrics and gas consumption, and between each pair of metrics. The results of our study highlight that smart contracts usually exhibit lower readability than traditional software in terms of the number of parentheses, inline comments, and blank lines used. In addition, we found some readability metrics (such as the average length of identifiers and the average number of keywords) that significantly correlate with gas consumption.
DOI: https://doi.org/10.1145/3524610.3529157. In 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC), May 2022.
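As a rough illustration of the kind of analysis described above (not the paper's tooling), the sketch below computes one proxy readability metric, average identifier length, for a few contracts and correlates it with their gas figures using Spearman's rank correlation. The contract snippets and gas values are placeholders.

```python
import re
from statistics import mean
from scipy.stats import spearmanr

def avg_identifier_length(source: str) -> float:
    """Average length of identifier-like tokens: a crude readability proxy."""
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    return mean(len(t) for t in tokens) if tokens else 0.0

# Placeholder Solidity snippets paired with placeholder gas figures.
contracts = [
    ("contract A { function transferOwnership(address newOwner) public {} }", 23_000),
    ("contract B { function f(address a) public {} }", 18_500),
    ("contract C { function setApprovalForAll(address op, bool ok) public {} }", 24_000),
    ("contract D { function g(uint x) public {} }", 19_000),
]

metric_values = [avg_identifier_length(src) for src, _ in contracts]
gas_values = [gas for _, gas in contracts]

rho, p_value = spearmanr(metric_values, gas_values)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.2f}")
```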
Exploring GNN Based Program Embedding Technologies for Binary Related Tasks
Yixin Guo, Pengcheng Li, Yingwei Luo, Xiaolin Wang, Zhenlin Wang
With the rapid growth of program scale, program analysis, maintenance and optimization become increasingly diverse and complex. Applying learning-assisted methodologies to program analysis has attracted ever-increasing attention. However, a large number of program factors, including syntax structures, semantics, running platforms and compilation configurations, hinder the effective realization of these methods. To overcome these obstacles, existing works prefer to operate on source code or abstract syntax trees, which is unfortunately sub-optimal for binary-oriented analysis tasks closely related to the compilation process. To this end, we propose a new program analysis approach that aims at solving program-level and procedure-level tasks with a single model by exploiting the power of graph neural networks at the level of binary code. By fusing the semantics of control flow graphs, data flow graphs and call graphs into one model, and embedding instructions and values simultaneously, our method can effectively handle emerging compilation-related problems. Evaluated on two tasks, binary similarity detection and dead store prediction, our method achieves accuracies as high as 83.25% and 82.77%, respectively.
DOI: https://doi.org/10.1145/3524610.3527900. In 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC), May 2022.
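For readers unfamiliar with the fused-graph idea mentioned above, here is a hypothetical sketch of merging control-flow, data-flow and call edges over binary basic blocks into one typed multigraph, the kind of structure a relation-aware GNN could consume. The block addresses and edges are invented, not taken from the paper.

```python
import networkx as nx

g = nx.MultiDiGraph()

# Basic blocks of two procedures (addresses are placeholders).
for block in ["f:0x400500", "f:0x400520", "g:0x400600"]:
    g.add_node(block)

g.add_edge("f:0x400500", "f:0x400520", relation="control_flow")
g.add_edge("f:0x400500", "f:0x400520", relation="data_flow")  # value defined, then used
g.add_edge("f:0x400520", "g:0x400600", relation="call")

# A relation-aware GNN would apply a different transformation per edge type;
# here we just group the fused edges by relation for inspection.
by_relation = {}
for src, dst, data in g.edges(data=True):
    by_relation.setdefault(data["relation"], []).append((src, dst))
print(by_relation)
```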
Do Visual Issue Reports Help Developers Fix Bugs? A Preliminary Study of Using Videos and Images to Report Issues on GitHub
Hiroki Kuramoto, Masanari Kondo, Yutaro Kashiwa, Yuta Ishimoto, Kaze Shindo, Yasutaka Kamei, Naoyasu Ubayashi
Issue reports are a pivotal interface between developers and users for receiving information about bugs in their products. In practice, issue reports often contain incorrect or insufficient information for reproducing bugs, which delays the entire bug-fixing process. To facilitate bug reproduction, GitHub has provided a new feature that allows users to share videos (e.g., mp4 files). Using such videos, users can report bug details to developers by recording symptoms, reproduction steps, and other important aspects of the bug. While such visual issue reports have the potential to significantly improve the bug-fixing process, no studies have empirically examined this impact. In this paper, we conduct a preliminary study to identify the characteristics of visual issue reports by comparing them with non-visual issue reports. We collect 1,230 videos and 18,760 images from 226,286 issues in 4,173 publicly available repositories. Our preliminary analysis shows that issue reports with images are described in fewer words than non-visual issue reports. In addition, we observe that most discussions in visual issue reports concern either conditions for reproduction (e.g., when) or the GUI (e.g., pageviewcontroller).
DOI: https://doi.org/10.1145/3524610.3527882. In 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC), May 2022.
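A study like the one above needs a way to separate visual from non-visual reports before comparing them; the sketch below shows one simple way to do that (not necessarily the authors' heuristic), by scanning issue bodies for image or video attachment extensions and comparing word counts. The sample issue texts are made up.

```python
import re
from statistics import mean

VISUAL_PATTERN = re.compile(r"\.(png|jpe?g|gif|mp4|mov)\b", re.IGNORECASE)

def is_visual(body: str) -> bool:
    """Treat an issue as visual if its body links an image or video file."""
    return bool(VISUAL_PATTERN.search(body))

def word_count(body: str) -> int:
    # Ignore attachment URLs when counting descriptive words.
    return len(re.sub(r"https?://\S+", "", body).split())

issues = [
    "Clicking save crashes the app. Steps: open editor, type text, press save. https://example.com/crash.mp4",
    "The settings page throws a NullPointerException when the locale is unset; full stack trace pasted below.",
]

visual = [word_count(b) for b in issues if is_visual(b)]
non_visual = [word_count(b) for b in issues if not is_visual(b)]
print(f"avg words, visual: {mean(visual):.1f}; non-visual: {mean(non_visual):.1f}")
```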
Predicting Change Propagation between Code Clone Instances by Graph-based Deep Learning
Bin Hu, Yijian Wu, Xin Peng, Chaofeng Sha, Xiaochen Wang, Baiqiang Fu, Wenyun Zhao
Code clones widely exist in open-source and industrial software projects and are still recognized as a threat to software maintenance, due to the additional effort required to maintain multiple clone instances simultaneously and the potential defects caused by inconsistent changes to clone instances. To alleviate this threat, it is essential to make change propagation decisions between clone instances accurately and efficiently. Based on an exploratory study of clone change propagation in five well-known open-source projects, we find that a clone class can have both propagation-required changes and propagation-free changes, and thus fine-grained change propagation decisions are required. Based on these findings, we propose a graph-based deep learning approach to predict the change propagation requirements of clone instances. We develop a graph representation, named Fused Clone Program Dependency Graph (FC-PDG), to capture the textual and structural code contexts of a pair of clone instances along with the changes to one of them. Based on this representation, we design a deep learning model that uses a Relational Graph Convolutional Network (R-GCN) to predict the change propagation requirement. We evaluate the approach on a dataset constructed from 51 open-source Java projects, which includes 24,672 pairs of matched changes and 38,041 non-matched changes. The results show that the approach achieves high precision (83.1%), recall (81.2%), and F1-score (82.1%). Our further evaluation on three other open-source projects confirms the generality of the trained clone change propagation prediction model.
DOI: https://doi.org/10.1145/3524610.3527912. In 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC), May 2022.
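To make the R-GCN component mentioned above less abstract, here is a from-scratch sketch of a single relational graph convolution layer: each edge relation gets its own weight matrix, and a node sums the transformed features of its neighbours per relation. The toy graph, dimensions, and relation types are invented; this is not the paper's model.

```python
import torch
import torch.nn as nn

class TinyRGCNLayer(nn.Module):
    """One relational graph convolution step with per-relation weights."""

    def __init__(self, in_dim, out_dim, num_relations):
        super().__init__()
        self.rel_weights = nn.ModuleList(
            [nn.Linear(in_dim, out_dim, bias=False) for _ in range(num_relations)]
        )
        self.self_loop = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x, edges):
        # edges: list of (src, dst, relation) index triples
        out = [self.self_loop(x[i]) for i in range(x.size(0))]
        for src, dst, rel in edges:
            out[dst] = out[dst] + self.rel_weights[rel](x[src])
        return torch.relu(torch.stack(out))

# 4 nodes (e.g., statements in a fused dependency graph), 2 edge relations
# (say, 0 = data dependency, 1 = control dependency).
x = torch.randn(4, 8)
edges = [(0, 1, 0), (1, 2, 1), (2, 3, 0)]
layer = TinyRGCNLayer(in_dim=8, out_dim=8, num_relations=2)
print(layer(x, edges).shape)  # torch.Size([4, 8])
```

Stacking such layers over an FC-PDG-like graph and pooling the node features would give a pair-level representation that a classifier can score.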
Context-based Cluster Fault Localization
Ju-Yeol Yu, Yan Lei, Huan Xie, Lingfeng Fu, Chunyan Liu
Automated fault localization techniques collect runtime information as input data to identify suspicious statements potentially responsible for program failures. To discover the statistical coincidences between test results (i.e., failing or passing) and the executions of the different statements of a program (i.e., executed or not executed), researchers have developed suspiciousness methodologies (e.g., spectrum-based formulas and deep neural network models). However, occurrences of coincidental correctness (CC), where the faulty statements are executed but the program output is still correct, affect the effectiveness of fault localization. Many researchers seek to identify CC tests using cluster analysis. However, high-dimensional data containing too much noise reduce the effectiveness of cluster analysis. To overcome this obstacle, we propose CBCFL, a context-based cluster fault localization approach that incorporates a failure context, showing how a failure is produced, into cluster analysis. Specifically, CBCFL uses the failure context, containing the statements whose execution affects the output of a failing test, as the input data for cluster analysis, improving the effectiveness of identifying CC tests. Since CC tests execute the faulty statement, we relabel CC tests as failing tests. We take the context and the corresponding changed labels as the input data for fault localization techniques. To evaluate the effectiveness of CBCFL, we conduct large-scale experiments on six large-sized programs using five state-of-the-art fault localization approaches. The experimental results show that CBCFL is more effective than the baselines; for example, our approach improves the cluster-analysis-based MLP-FL method by up to 200%, 250%, and 320% in Top-1, Top-5, and Top-10 accuracy, respectively.
DOI: https://doi.org/10.1145/3524610.3527891. In 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC), May 2022.
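The following is a simplified sketch of the general relabelling pipeline the abstract builds on (not CBCFL itself): coverage vectors are clustered, passing tests that fall into a cluster containing a failing test are flagged as coincidentally correct and relabelled as failing, and a spectrum-based formula (Ochiai here) is then computed. The coverage matrix is a toy example.

```python
import math
import numpy as np
from sklearn.cluster import KMeans

# Rows = tests, columns = statements (1 = executed by that test).
coverage = np.array([
    [1, 1, 0, 1],   # failing test
    [1, 1, 0, 0],   # passing, but covers the same statements as the failing test
    [0, 0, 1, 1],   # passing
    [0, 1, 1, 0],   # passing
])
labels = np.array([1, 0, 0, 0])  # 1 = failing, 0 = passing

# Coarse clustering of coverage vectors; passing tests sharing a cluster with
# a failing test are flagged as coincidentally-correct candidates.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(coverage)
failing_clusters = set(clusters[labels == 1])
relabelled = np.where(np.isin(clusters, list(failing_clusters)), 1, labels)

def ochiai(column, outcome):
    """Ochiai suspiciousness: ef / sqrt((ef + nf) * (ef + ep))."""
    ef = int(np.sum((column == 1) & (outcome == 1)))
    ep = int(np.sum((column == 1) & (outcome == 0)))
    nf = int(np.sum((column == 0) & (outcome == 1)))
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

for s in range(coverage.shape[1]):
    print(f"statement {s}: ochiai = {ochiai(coverage[:, s], relabelled):.2f}")
```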
Performance Anomaly Detection through Sequence Alignment of System-Level Traces
Madeline Janecek, Naser Ezzati-Jivan, A. Hamou-Lhadj
Identifying and diagnosing performance anomalies is essential for maintaining software quality, yet it can be a complex and time-consuming task. Low-level kernel events are an excellent data source for monitoring performance, but raw trace data are often too large to analyse effectively. To address this shortcoming, in this paper we propose a framework for uncovering performance problems using execution critical path data. A critical path is the longest execution sequence without wait delays, and it can provide valuable insight into a program's internal and external dependencies. Upon extracting this data, coarse-grained anomaly detection techniques are employed to determine whether a finer-grained analysis is required. If so, the critical paths of individual executions are grouped together with machine learning clustering to identify different execution types, and outlying anomalies are identified using performance indicators. Finally, multiple sequence alignment is used to pinpoint specific abnormalities in the identified anomalous executions, allowing for improved application performance diagnosis and overall program comprehension.
DOI: https://doi.org/10.1145/3524610.3527898. In 2022 IEEE/ACM 30th International Conference on Program Comprehension (ICPC), May 2022.
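As an illustration of the alignment step described above, the sketch below runs a classic Needleman-Wunsch global alignment between a baseline critical path and an anomalous one, so the extra event in the anomalous execution stands out as a gap. The event names are invented placeholders, and the real framework aligns many sequences rather than a single pair.

```python
def align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment of two event sequences."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Traceback to recover aligned pairs (None marks a gap).
    i, j, pairs = n, m, []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            pairs.append((a[i - 1], b[j - 1])); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None)); i -= 1
        else:
            pairs.append((None, b[j - 1])); j -= 1
    return list(reversed(pairs))

baseline = ["syscall_entry", "disk_read", "syscall_exit"]
anomalous = ["syscall_entry", "disk_read", "lock_contention", "syscall_exit"]
for x, y in align(baseline, anomalous):
    marker = "" if x == y else "   <-- divergence"
    print(f"{str(x):20} {str(y):20}{marker}")
```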