Pub Date : 2022-05-01DOI: 10.1109/icse-nier55298.2022.9793508
{"title":"Program Committee of ICSE-NIER 2022","authors":"","doi":"10.1109/icse-nier55298.2022.9793508","DOIUrl":"https://doi.org/10.1109/icse-nier55298.2022.9793508","url":null,"abstract":"","PeriodicalId":416186,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)","volume":"168 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123554266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-05-01DOI: 10.1109/icse-nier55298.2022.9793535
Marcel Bohme
We discuss the advent of a new program analysis paradigm that allows anyone to make precise statements about the behavior of programs as they run in production across hundreds and millions of machines or devices. The scale-oblivious, in vivo program analysis leverages an almost inconceivable rate of user-generated program executions across large fleets to analyze programs of arbitrary size and composition with negligible performance overhead. In this paper, we reflect on the program analysis problem, the prevalent paradigm, and the practical reality of program analysis at large software companies. We illustrate the new paradigm using several success stories and suggest a number of exciting new research directions.
{"title":"Statistical Reasoning About Programs","authors":"Marcel Bohme","doi":"10.1109/icse-nier55298.2022.9793535","DOIUrl":"https://doi.org/10.1109/icse-nier55298.2022.9793535","url":null,"abstract":"We discuss the advent of a new program analysis paradigm that allows anyone to make precise statements about the behavior of programs as they run in production across hundreds and millions of machines or devices. The scale-oblivious, in vivo program analysis leverages an almost inconceivable rate of user-generated program executions across large fleets to analyze programs of arbitrary size and composition with negligible performance overhead. In this paper, we reflect on the program analysis problem, the prevalent paradigm, and the practical reality of program analysis at large software companies. We illustrate the new paradigm using several success stories and suggest a number of exciting new research directions.","PeriodicalId":416186,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)","volume":"230 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121465537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Title Page iii","authors":"Los Alamitos, C. Washington, bullet Tokyo","doi":"10.1109/pads.2008.1","DOIUrl":"https://doi.org/10.1109/pads.2008.1","url":null,"abstract":"","PeriodicalId":416186,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123223436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reading through code, finding relevant methods, classes and files takes a significant portion of software development time. Having good tool support for this code browsing activity can reduce human effort and increase overall developer productivity. To help with program comprehension activities, building an abstract code summary of a software system from its call graph is an active research area. A call graph is a visual representation of the caller-callee relationships between different methods of a software system. Call graphs can be difficult to comprehend for a large code-base. Previous work by Gharibi et al. on abstract code summarizing suggested using the Agglomerative Hierarchical Clustering (AHC) tree for understanding the codebase. Each node in the tree is associated with the top five method names. When we replicated the previous approach, we observed that the number of nodes in the AHC tree is burdensome for developers to explore. We also noticed only five method names for each node is not sufficient to comprehend an abstract node. We propose a technique to transform the AHC tree using cluster flattening for natural grouping and reduced nodes. We also generate a natural text summary for each abstract node derived from method comments. In order to evaluate our proposed approach, we collected developers’ opinions about the abstract code summary tree based on their codebase. The evaluation results confirm that our approach can not only help developers get an overview of their codebases but also could assist them in doing specific software maintenance tasks.
{"title":"Supporting program comprehension by generating abstract code summary tree","authors":"Avijit Bhattacharjee, B. Roy, Kevin A. Schneider","doi":"10.1145/3510455.3512793","DOIUrl":"https://doi.org/10.1145/3510455.3512793","url":null,"abstract":"Reading through code, finding relevant methods, classes and files takes a significant portion of software development time. Having good tool support for this code browsing activity can reduce human effort and increase overall developer productivity. To help with program comprehension activities, building an abstract code summary of a software system from its call graph is an active research area. A call graph is a visual representation of the caller-callee relationships between different methods of a software system. Call graphs can be difficult to comprehend for a large code-base. Previous work by Gharibi et al. on abstract code summarizing suggested using the Agglomerative Hierarchical Clustering (AHC) tree for understanding the codebase. Each node in the tree is associated with the top five method names. When we replicated the previous approach, we observed that the number of nodes in the AHC tree is burdensome for developers to explore. We also noticed only five method names for each node is not sufficient to comprehend an abstract node. We propose a technique to transform the AHC tree using cluster flattening for natural grouping and reduced nodes. We also generate a natural text summary for each abstract node derived from method comments. In order to evaluate our proposed approach, we collected developers’ opinions about the abstract code summary tree based on their codebase. The evaluation results confirm that our approach can not only help developers get an overview of their codebases but also could assist them in doing specific software maintenance tasks.","PeriodicalId":416186,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127429119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Android byte-code transformations are used to optimize applications (apps) in terms of run-time performance and size. But do they affect the energy consumption during this process? If they do, can we employ them to reduce an app’s energy consumption? Given that most existing energy optimization techniques require developers to modify their code, a byte-code level modification technique will save developers’ time and effort. In this paper, we investigate if byte-code transformations combined with genetic search can reduce an app’s energy consumption. After applying our technique on four real-world apps, we find that some combinations of the byte-code transformations reduce the energy consumption by up to 11%.
{"title":"Black Box Technique to Reduce Energy Consumption of Android Apps","authors":"A. A. Bangash, Karim Ali, Abram Hindle","doi":"10.1145/3510455.3512795","DOIUrl":"https://doi.org/10.1145/3510455.3512795","url":null,"abstract":"Android byte-code transformations are used to optimize applications (apps) in terms of run-time performance and size. But do they affect the energy consumption during this process? If they do, can we employ them to reduce an app’s energy consumption? Given that most existing energy optimization techniques require developers to modify their code, a byte-code level modification technique will save developers’ time and effort. In this paper, we investigate if byte-code transformations combined with genetic search can reduce an app’s energy consumption. After applying our technique on four real-world apps, we find that some combinations of the byte-code transformations reduce the energy consumption by up to 11%.","PeriodicalId":416186,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115685852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The robustness and availability of cloud services are becoming increasingly important as more applications migrate to the cloud. The operations landscape today is more complex, than ever. Site reliability engineers (SREs) are expected to handle more incidents than ever before with shorter service-level agreements (SLAs). By exploiting log, tracing, metric, and network data, Artificial Intelligence for IT Operations (AIOps) enables detection of faults and anomalous issues of services. A wide variety of anomaly detection techniques have been incorporated in various AIOps platforms (e.g. PCA and autoencoder), but they all suffer from false positives. In this paper, we propose an unsupervised approach for persistent anomaly detection on top of the traditional anomaly detection approaches, with the goal of reducing false positives and providing more trustworthy alerting signals. We test our method on both simulated and real-world datasets. Our technique reduces false positive anomalies by at least 28%, resulting in more reliable and trustworthy notifications. CCS CONCEPTS • Computing methodologies $rightarrow$ Anomaly detection;. Software and its engineering $rightarrow$Maintaining software. ACM Reference Format: Dipanwita Guhathakurta, Pooja Aggarwal, Seema Nagar, and Rohan Arora, Bing Zhou. 2022. Utilizing Persistence for Post Facto Suppression of Invalid Anomalies Using System Logs. In New Ideas and Emerging Results (ICSENIER’22), May 21-29, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3510455.3512774
{"title":"Utilizing Persistence for Post Facto Suppression of Invalid Anomalies Using System Logs","authors":"Dipanwita Guhathakurta, Pooja Aggarwal, Seema Nagar, Rohan Arora, Bing Zhou","doi":"10.1145/3510455.3512774","DOIUrl":"https://doi.org/10.1145/3510455.3512774","url":null,"abstract":"The robustness and availability of cloud services are becoming increasingly important as more applications migrate to the cloud. The operations landscape today is more complex, than ever. Site reliability engineers (SREs) are expected to handle more incidents than ever before with shorter service-level agreements (SLAs). By exploiting log, tracing, metric, and network data, Artificial Intelligence for IT Operations (AIOps) enables detection of faults and anomalous issues of services. A wide variety of anomaly detection techniques have been incorporated in various AIOps platforms (e.g. PCA and autoencoder), but they all suffer from false positives. In this paper, we propose an unsupervised approach for persistent anomaly detection on top of the traditional anomaly detection approaches, with the goal of reducing false positives and providing more trustworthy alerting signals. We test our method on both simulated and real-world datasets. Our technique reduces false positive anomalies by at least 28%, resulting in more reliable and trustworthy notifications. CCS CONCEPTS • Computing methodologies $rightarrow$ Anomaly detection;. Software and its engineering $rightarrow$Maintaining software. ACM Reference Format: Dipanwita Guhathakurta, Pooja Aggarwal, Seema Nagar, and Rohan Arora, Bing Zhou. 2022. Utilizing Persistence for Post Facto Suppression of Invalid Anomalies Using System Logs. In New Ideas and Emerging Results (ICSENIER’22), May 21-29, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3510455.3512774","PeriodicalId":416186,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121330537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Georges Aaron Randrianaina, D. Khelladi, Olivier Zendra, M. Acher
Building software is a crucial task to compile, test, and deploy software systems while continuously ensuring quality. As software is more and more configurable, building multiple configurations is a pressing need, yet, costly and challenging to instrument. The common practice is to independently build (a.k.a., clean build) a software for a subset of configurations. While incremental build has been considered for software evolution and relatively small modifications of the source code, it has surprisingly not been considered for software configurations. In this vision paper, we formulate the hypothesis that incremental build can reduce the cost of exploring the configuration space of software systems. We detail how we apply incremental build for two real-world application scenarios and conduct a preliminary evaluation on two case studies, namely x264 and Linux Kernel. For x264, we found that one can incrementally build configurations in an order such that overall build time is reduced. Nevertheless, we could not find any optimal order with the Linux Kernel, due to a high distance between random configurations. Therefore, we show it is possible to control the process of generating configurations: we could reuse commonality and gain up to 66% of build time compared to only clean builds.CCS CONCEPTS • Software and its engineering $rightarrow$ Software configuration management and version control systems.
{"title":"Towards Incremental Build of Software Configurations","authors":"Georges Aaron Randrianaina, D. Khelladi, Olivier Zendra, M. Acher","doi":"10.1145/3510455.3512792","DOIUrl":"https://doi.org/10.1145/3510455.3512792","url":null,"abstract":"Building software is a crucial task to compile, test, and deploy software systems while continuously ensuring quality. As software is more and more configurable, building multiple configurations is a pressing need, yet, costly and challenging to instrument. The common practice is to independently build (a.k.a., clean build) a software for a subset of configurations. While incremental build has been considered for software evolution and relatively small modifications of the source code, it has surprisingly not been considered for software configurations. In this vision paper, we formulate the hypothesis that incremental build can reduce the cost of exploring the configuration space of software systems. We detail how we apply incremental build for two real-world application scenarios and conduct a preliminary evaluation on two case studies, namely x264 and Linux Kernel. For x264, we found that one can incrementally build configurations in an order such that overall build time is reduced. Nevertheless, we could not find any optimal order with the Linux Kernel, due to a high distance between random configurations. Therefore, we show it is possible to control the process of generating configurations: we could reuse commonality and gain up to 66% of build time compared to only clean builds.CCS CONCEPTS • Software and its engineering $rightarrow$ Software configuration management and version control systems.","PeriodicalId":416186,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133974986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-05-01DOI: 10.1109/icse-nier55298.2022.9793512
{"title":"Message from the NIER Chairs of ICSE 2022","authors":"","doi":"10.1109/icse-nier55298.2022.9793512","DOIUrl":"https://doi.org/10.1109/icse-nier55298.2022.9793512","url":null,"abstract":"","PeriodicalId":416186,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132859825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The latest trend of incorporating various data-centric machine learning (ML) models in software-intensive systems has posed new challenges in the quality assurance practice of software engineering, especially in a high-risk environment. ML experts are now focusing on explaining ML models to assure the safe behavior of ML-based systems. However, not enough attention has been paid to explain the inherent uncertainty of the training data. The current practice of ML-based system engineering lacks transparency in the systematic fitness assessment process of the training data before engaging in the rigorous ML model training. We propose a method of assessing the collective confidence in the quality of a training dataset by using Dempster Shafer theory and its modified combination rule (Yager’s rule). With the example of training datasets for pedestrian detection of autonomous vehicles, we demonstrate how the proposed approach can be used by the stakeholders with diverse expertise to combine their beliefs in the quality arguments and evidences about the data. Our results open up a scope of future research on data requirements engineering that can facilitate evidence-based data assurance for ML-based safety-critical systems. CCS CONCEPTS•Software and its engineering $rightarrow$Risk management; Collaboration in software development;•Mathematics of computing $rightarrow$ Hypothesis testing and confidence interval computation.
{"title":"Are We Training with The Right Data? Evaluating Collective Confidence in Training Data using Dempster Shafer Theory","authors":"Sangeeta Dey, Seok-Won Lee","doi":"10.1145/3510455.3512779","DOIUrl":"https://doi.org/10.1145/3510455.3512779","url":null,"abstract":"The latest trend of incorporating various data-centric machine learning (ML) models in software-intensive systems has posed new challenges in the quality assurance practice of software engineering, especially in a high-risk environment. ML experts are now focusing on explaining ML models to assure the safe behavior of ML-based systems. However, not enough attention has been paid to explain the inherent uncertainty of the training data. The current practice of ML-based system engineering lacks transparency in the systematic fitness assessment process of the training data before engaging in the rigorous ML model training. We propose a method of assessing the collective confidence in the quality of a training dataset by using Dempster Shafer theory and its modified combination rule (Yager’s rule). With the example of training datasets for pedestrian detection of autonomous vehicles, we demonstrate how the proposed approach can be used by the stakeholders with diverse expertise to combine their beliefs in the quality arguments and evidences about the data. Our results open up a scope of future research on data requirements engineering that can facilitate evidence-based data assurance for ML-based safety-critical systems. CCS CONCEPTS•Software and its engineering $rightarrow$Risk management; Collaboration in software development;•Mathematics of computing $rightarrow$ Hypothesis testing and confidence interval computation.","PeriodicalId":416186,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132098936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jai Kannan, Scott Barnett, Luís Cruz, Anj Simmons, Akash Agarwal
Meeting the rise of industry demand to incorporate machine learning (ML) components into software systems requires interdisciplinary teams contributing to a shared code base. To maintain consistency, reduce defects and ensure maintainability, developers use code analysis tools to aid them in identifying defects and maintaining standards. With the inclusion of machine learning, tools must account for the cultural differences within the teams which manifests as multiple programming languages, and conflicting definitions and objectives. Existing tools fail to identify these cultural differences and are geared towards software engineering which reduces their adoption in ML projects. In our approach we attempt to resolve this problem by exploring the use of context which includes i) purpose of the source code, ii) technical domain, iii) problem domain, iv) team norms, v) operational environment, and vi) development lifecycle stage to provide contextualised error reporting for code analysis. To demonstrate our approach, we adapt Pylint as an example and apply a set of contextual transformations to the linting results based on the domain of individual project files under analysis. This allows for contextualised and meaningful error reporting for the end user. CCS CONCEPTS • Software and its engineering → Software maintenance tools.
{"title":"MLSmellHound: A Context-Aware Code Analysis Tool","authors":"Jai Kannan, Scott Barnett, Luís Cruz, Anj Simmons, Akash Agarwal","doi":"10.1145/3510455.3512773","DOIUrl":"https://doi.org/10.1145/3510455.3512773","url":null,"abstract":"Meeting the rise of industry demand to incorporate machine learning (ML) components into software systems requires interdisciplinary teams contributing to a shared code base. To maintain consistency, reduce defects and ensure maintainability, developers use code analysis tools to aid them in identifying defects and maintaining standards. With the inclusion of machine learning, tools must account for the cultural differences within the teams which manifests as multiple programming languages, and conflicting definitions and objectives. Existing tools fail to identify these cultural differences and are geared towards software engineering which reduces their adoption in ML projects. In our approach we attempt to resolve this problem by exploring the use of context which includes i) purpose of the source code, ii) technical domain, iii) problem domain, iv) team norms, v) operational environment, and vi) development lifecycle stage to provide contextualised error reporting for code analysis. To demonstrate our approach, we adapt Pylint as an example and apply a set of contextual transformations to the linting results based on the domain of individual project files under analysis. This allows for contextualised and meaningful error reporting for the end user. CCS CONCEPTS • Software and its engineering → Software maintenance tools.","PeriodicalId":416186,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125612706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}