
2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER): Latest Papers

Program Committee of ICSE-NIER 2022
DOI: https://doi.org/10.1109/icse-nier55298.2022.9793508 (published 2022-05-01)
Citations: 0
Statistical Reasoning About Programs
Marcel Böhme
We discuss the advent of a new program analysis paradigm that allows anyone to make precise statements about the behavior of programs as they run in production across hundreds and millions of machines or devices. The scale-oblivious, in vivo program analysis leverages an almost inconceivable rate of user-generated program executions across large fleets to analyze programs of arbitrary size and composition with negligible performance overhead. In this paper, we reflect on the program analysis problem, the prevalent paradigm, and the practical reality of program analysis at large software companies. We illustrate the new paradigm using several success stories and suggest a number of exciting new research directions.
DOI: https://doi.org/10.1109/icse-nier55298.2022.9793535 (published 2022-05-01)
Citations: 1
Title Page iii
Los Alamitos, California • Washington • Tokyo
DOI: https://doi.org/10.1109/pads.2008.1 (published 2022-05-01)
Citations: 0
Supporting program comprehension by generating abstract code summary tree
Avijit Bhattacharjee, B. Roy, Kevin A. Schneider
Reading through code and finding relevant methods, classes, and files takes a significant portion of software development time. Good tool support for this code browsing activity can reduce human effort and increase overall developer productivity. Building an abstract code summary of a software system from its call graph, to help with program comprehension, is an active research area. A call graph is a visual representation of the caller-callee relationships between the methods of a software system. Call graphs can be difficult to comprehend for a large codebase. Previous work by Gharibi et al. on abstract code summarization suggested using an Agglomerative Hierarchical Clustering (AHC) tree for understanding the codebase. Each node in the tree is associated with the top five method names. When we replicated that approach, we observed that the number of nodes in the AHC tree is burdensome for developers to explore. We also noticed that only five method names per node are not sufficient to comprehend an abstract node. We propose a technique that transforms the AHC tree using cluster flattening for natural grouping and fewer nodes. We also generate a natural-text summary for each abstract node, derived from method comments. To evaluate our approach, we collected developers’ opinions about the abstract code summary tree built from their codebase. The evaluation results confirm that our approach can not only help developers get an overview of their codebases but can also assist them in specific software maintenance tasks.
DOI: https://doi.org/10.1145/3510455.3512793 (published 2022-05-01)
Citations: 1
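The abstract above mentions agglomerative hierarchical clustering of methods and a flattening step, but gives no algorithmic detail. The following is only a minimal sketch of the general idea, not the authors' technique: it assumes methods are compared by the Jaccard distance between their callee sets, uses single linkage, and "flattens" by stopping merges once clusters are farther apart than a threshold. The call graph, the function names, and the threshold are all hypothetical.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of callee names."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def flat_clusters(callees, threshold):
    """Single-linkage agglomerative clustering that stops merging once the
    closest pair of clusters is farther apart than `threshold`.
    (Assumed flattening rule, chosen for illustration only.)"""
    clusters = [frozenset([m]) for m in sorted(callees)]

    def linkage(c1, c2):
        return min(jaccard_distance(callees[x], callees[y])
                   for x in c1 for y in c2)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        if linkage(c1, c2) > threshold:
            break  # remaining clusters are the flat groups
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return sorted(sorted(c) for c in clusters)

# Hypothetical call graph: method -> set of methods it calls.
calls = {
    "parse_header": {"read_bytes", "log"},
    "parse_body": {"read_bytes", "log", "decode"},
    "render_page": {"draw", "layout"},
    "render_menu": {"draw", "layout", "log"},
}
groups = flat_clusters(calls, threshold=0.5)
print(groups)
# [['parse_body', 'parse_header'], ['render_menu', 'render_page']]
```

With this toy input the two parsing methods and the two rendering methods end up in separate flat groups, which is the kind of reduced, naturally grouped view the abstract argues for.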
Black Box Technique to Reduce Energy Consumption of Android Apps
A. A. Bangash, Karim Ali, Abram Hindle
Android byte-code transformations are used to optimize applications (apps) in terms of run-time performance and size. But do they affect energy consumption in the process? If they do, can we employ them to reduce an app’s energy consumption? Given that most existing energy optimization techniques require developers to modify their code, a byte-code level modification technique will save developers’ time and effort. In this paper, we investigate whether byte-code transformations combined with genetic search can reduce an app’s energy consumption. After applying our technique to four real-world apps, we find that some combinations of the byte-code transformations reduce energy consumption by up to 11%.
DOI: https://doi.org/10.1145/3510455.3512795 (published 2022-05-01)
Citations: 4
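To make "byte-code transformations combined with genetic search" in the entry above more concrete, here is a small sketch of a genetic search over on/off combinations of transformations. Everything specific in it is an assumption: the transformation names are invented, and `measure_energy` is a synthetic stand-in for a real on-device energy measurement, not anything taken from the paper.

```python
import random

# Hypothetical transformation names; a real study would use actual byte-code passes.
TRANSFORMS = ["inline_getters", "remove_logging", "merge_classes", "unroll_loops"]

def measure_energy(mask):
    """Synthetic stand-in for an energy measurement (arbitrary units)."""
    effect = [-4.0, -6.0, 3.0, 1.5]  # made-up per-transformation change
    energy = 100.0 + sum(e for e, on in zip(effect, mask) if on)
    if mask[0] and mask[3]:          # made-up interaction penalty
        energy += 2.0
    return energy

def genetic_search(generations=20, pop_size=8, seed=0):
    rng = random.Random(seed)
    n = len(TRANSFORMS)
    # Seed the population with the untransformed app plus random combinations.
    population = [tuple([False] * n)] + [
        tuple(rng.random() < 0.5 for _ in range(n)) for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=measure_energy)
        parents = population[: pop_size // 2]  # elitism: best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = list(a[:cut] + b[cut:])
            i = rng.randrange(n)               # single-bit mutation
            child[i] = not child[i]
            children.append(tuple(child))
        population = parents + children
    return min(population, key=measure_energy)

best = genetic_search()
baseline = measure_energy((False,) * len(TRANSFORMS))
enabled = [name for name, on in zip(TRANSFORMS, best) if on]
print(enabled, measure_energy(best), baseline)
```

Because the untransformed app is part of the initial population and the best half always survives, the returned combination never consumes more (synthetic) energy than the baseline. The search is black-box in the sense that it only needs the fitness value, not the app's source code.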
Utilizing Persistence for Post Facto Suppression of Invalid Anomalies Using System Logs
Dipanwita Guhathakurta, Pooja Aggarwal, Seema Nagar, Rohan Arora, Bing Zhou
The robustness and availability of cloud services are becoming increasingly important as more applications migrate to the cloud. The operations landscape today is more complex than ever. Site reliability engineers (SREs) are expected to handle more incidents than ever before with shorter service-level agreements (SLAs). By exploiting log, tracing, metric, and network data, Artificial Intelligence for IT Operations (AIOps) enables detection of faults and anomalous issues of services. A wide variety of anomaly detection techniques have been incorporated in various AIOps platforms (e.g., PCA and autoencoders), but they all suffer from false positives. In this paper, we propose an unsupervised approach for persistent anomaly detection on top of the traditional anomaly detection approaches, with the goal of reducing false positives and providing more trustworthy alerting signals. We test our method on both simulated and real-world datasets. Our technique reduces false positive anomalies by at least 28%, resulting in more reliable and trustworthy notifications. CCS CONCEPTS • Computing methodologies → Anomaly detection; • Software and its engineering → Maintaining software. ACM Reference Format: Dipanwita Guhathakurta, Pooja Aggarwal, Seema Nagar, Rohan Arora, and Bing Zhou. 2022. Utilizing Persistence for Post Facto Suppression of Invalid Anomalies Using System Logs. In New Ideas and Emerging Results (ICSE-NIER’22), May 21-29, 2022, Pittsburgh, PA, USA. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3510455.3512774
DOI: https://doi.org/10.1145/3510455.3512774 (published 2022-05-01)
Citations: 2
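The abstract above does not say how persistence is measured, so the following is only one simple interpretation, offered as an illustration: a base detector emits one anomaly flag per time window, and a flag is kept only if it belongs to a run of at least `min_run` consecutive flagged windows. The function name, the run-length rule, and the sample data are assumptions.

```python
def persistent_anomalies(flags, min_run=3):
    """Keep an anomaly flag only if it belongs to a run of at least
    `min_run` consecutive flagged windows; shorter runs are suppressed
    after the fact as likely false positives (assumed rule)."""
    kept = [False] * len(flags)
    start = None
    # A trailing False acts as a sentinel that closes a run ending at the last window.
    for i, flagged in enumerate(list(flags) + [False]):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            if i - start >= min_run:
                kept[start:i] = [True] * (i - start)
            start = None
    return kept

# Hypothetical per-window output of a base detector (e.g. PCA or an autoencoder).
raw = [False, True, False, True, True, True, True, False, True, True]
filtered = persistent_anomalies(raw, min_run=3)
print(filtered)
# [False, False, False, True, True, True, True, False, False, False]
```

The isolated flag at index 1 and the two-window run at the end are dropped, while the four-window run survives. The suppression is post facto because a run can only be judged once it has ended.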
Towards Incremental Build of Software Configurations
Georges Aaron Randrianaina, D. Khelladi, Olivier Zendra, M. Acher
Building software is a crucial task to compile, test, and deploy software systems while continuously ensuring quality. As software is more and more configurable, building multiple configurations is a pressing need, yet costly and challenging to instrument. The common practice is to independently build (a.k.a. clean build) the software for a subset of configurations. While incremental build has been considered for software evolution and relatively small modifications of the source code, it has surprisingly not been considered for software configurations. In this vision paper, we formulate the hypothesis that incremental build can reduce the cost of exploring the configuration space of software systems. We detail how we apply incremental build for two real-world application scenarios and conduct a preliminary evaluation on two case studies, namely x264 and the Linux Kernel. For x264, we found that one can incrementally build configurations in an order such that overall build time is reduced. Nevertheless, we could not find any optimal order with the Linux Kernel, due to a high distance between random configurations. Therefore, we show it is possible to control the process of generating configurations: we could reuse commonality and save up to 66% of build time compared to only clean builds. CCS CONCEPTS • Software and its engineering → Software configuration management and version control systems.
DOI: https://doi.org/10.1145/3510455.3512792 (published 2022-05-01)
Citations: 1
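The entry above reports that building configurations in a suitable order can reduce overall build time, and that a high distance between configurations works against this. As a rough sketch of that intuition (not the authors' procedure), the code below orders configurations greedily so that each build is as close as possible to the previous one. It assumes the Hamming distance between option vectors is a usable proxy for incremental rebuild cost; the configurations are invented.

```python
def hamming(a, b):
    """Number of options whose values differ between two configurations."""
    return sum(x != y for x, y in zip(a, b))

def greedy_build_order(configs):
    """Start from the first configuration, then repeatedly pick the
    not-yet-built configuration closest to the one just built."""
    remaining = list(configs)
    order = [remaining.pop(0)]
    while remaining:
        nxt = min(remaining, key=lambda c: hamming(order[-1], c))
        remaining.remove(nxt)
        order.append(nxt)
    return order

def total_distance(order):
    """Sum of distances between consecutive builds (assumed cost proxy)."""
    return sum(hamming(a, b) for a, b in zip(order, order[1:]))

# Hypothetical configurations: one 0/1 value per compile-time option.
configs = [
    (1, 0, 0, 0),
    (0, 1, 1, 1),
    (1, 1, 0, 0),
    (0, 1, 1, 0),
    (1, 1, 1, 0),
]
ordered = greedy_build_order(configs)
print(total_distance(configs), total_distance(ordered))
# 10 4
```

The greedy nearest-neighbour order is a heuristic and is not guaranteed to be optimal. On this toy input it cuts the number of option flips between consecutive builds from 10 to 4.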
Message from the NIER Chairs of ICSE 2022
DOI: https://doi.org/10.1109/icse-nier55298.2022.9793512 (published 2022-05-01)
Citations: 0
Are We Training with The Right Data? Evaluating Collective Confidence in Training Data using Dempster Shafer Theory
Sangeeta Dey, Seok-Won Lee
The latest trend of incorporating various data-centric machine learning (ML) models in software-intensive systems has posed new challenges in the quality assurance practice of software engineering, especially in high-risk environments. ML experts are now focusing on explaining ML models to assure the safe behavior of ML-based systems. However, not enough attention has been paid to explaining the inherent uncertainty of the training data. The current practice of ML-based system engineering lacks transparency in the systematic fitness assessment of the training data before rigorous ML model training begins. We propose a method of assessing the collective confidence in the quality of a training dataset by using Dempster Shafer theory and its modified combination rule (Yager’s rule). With the example of training datasets for pedestrian detection in autonomous vehicles, we demonstrate how stakeholders with diverse expertise can use the proposed approach to combine their beliefs in the quality arguments and evidence about the data. Our results open up a scope of future research on data requirements engineering that can facilitate evidence-based data assurance for ML-based safety-critical systems. CCS CONCEPTS • Software and its engineering → Risk management; Collaboration in software development; • Mathematics of computing → Hypothesis testing and confidence interval computation.
DOI: https://doi.org/10.1145/3510455.3512779 (published 2022-05-01)
Citations: 2
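The entry above names Yager's rule as the combination rule used with Dempster Shafer theory. As a small illustration of that rule, the sketch below combines two mass functions over a two-element frame. Unlike Dempster's rule, which renormalises the conflicting mass away, Yager's rule assigns it to the whole frame, that is, to ignorance. The frame ("good" versus "poor" data quality) and the two experts' masses are made up for this example and are not taken from the paper.

```python
from itertools import product

def yager_combine(m1, m2, frame):
    """Combine two mass functions (dict: frozenset -> mass) with Yager's rule:
    products of masses go to the intersection of their focal sets, and the
    mass of empty intersections (conflict) is added to the whole frame."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    combined[frame] = combined.get(frame, 0.0) + conflict
    return combined

GOOD, POOR = frozenset({"good"}), frozenset({"poor"})
FRAME = GOOD | POOR

# Hypothetical beliefs of two stakeholders about a dataset's quality.
expert_a = {GOOD: 0.75, FRAME: 0.25}
expert_b = {GOOD: 0.5, POOR: 0.25, FRAME: 0.25}

fused = yager_combine(expert_a, expert_b, FRAME)
print(fused[GOOD], fused[POOR], fused[FRAME])
# 0.6875 0.0625 0.25
```

The conflicting mass here is 0.75 × 0.25 = 0.1875 (expert A says "good", expert B says "poor"). It ends up in the frame's mass of 0.25 together with the 0.0625 both experts leave uncommitted, so disagreement shows up as added uncertainty and does not inflate either hypothesis.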
MLSmellHound: A Context-Aware Code Analysis Tool
Jai Kannan, Scott Barnett, Luís Cruz, Anj Simmons, Akash Agarwal
Meeting the rise of industry demand to incorporate machine learning (ML) components into software systems requires interdisciplinary teams contributing to a shared code base. To maintain consistency, reduce defects, and ensure maintainability, developers use code analysis tools to aid them in identifying defects and maintaining standards. With the inclusion of machine learning, tools must account for the cultural differences within the teams, which manifest as multiple programming languages and conflicting definitions and objectives. Existing tools fail to identify these cultural differences and are geared towards software engineering, which reduces their adoption in ML projects. In our approach, we attempt to resolve this problem by exploring the use of context, which includes i) purpose of the source code, ii) technical domain, iii) problem domain, iv) team norms, v) operational environment, and vi) development lifecycle stage, to provide contextualised error reporting for code analysis. To demonstrate our approach, we adapt Pylint as an example and apply a set of contextual transformations to the linting results based on the domain of the individual project files under analysis. This allows for contextualised and meaningful error reporting for the end user. CCS CONCEPTS • Software and its engineering → Software maintenance tools.
DOI: https://doi.org/10.1145/3510455.3512773 (published 2022-05-01)
Citations: 1
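The entry above describes applying contextual transformations to linting results depending on the domain of each file. The sketch below shows what such a post-processing step could look like. It does not use or reproduce MLSmellHound or any Pylint API: findings are plain records, the path-based context heuristic and the rule table are hypothetical, and the severity labels are simplified. Only the message names `invalid-name` and `missing-module-docstring` correspond to real Pylint checks.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Finding:
    path: str
    symbol: str    # linter message name, e.g. "invalid-name"
    severity: str  # simplified labels: "error", "warning", "info"

def file_context(path):
    """Hypothetical heuristic: guess a file's purpose from its location."""
    if "/notebooks/" in path:
        return "exploration"
    if "/models/" in path:
        return "ml"
    return "general"

# Hypothetical rules: (context, symbol) -> new severity, or None to drop the finding.
RULES = {
    ("exploration", "missing-module-docstring"): None,
    ("ml", "invalid-name"): "info",  # e.g. names like X and y are conventional in ML code
}

def contextualise(findings):
    """Rewrite or drop findings according to the context of their file."""
    out = []
    for f in findings:
        key = (file_context(f.path), f.symbol)
        if key not in RULES:
            out.append(f)  # no rule for this context: report unchanged
        elif RULES[key] is not None:
            out.append(replace(f, severity=RULES[key]))
    return out

findings = [
    Finding("src/models/train.py", "invalid-name", "warning"),
    Finding("src/notebooks/eda.py", "missing-module-docstring", "warning"),
    Finding("src/api/server.py", "invalid-name", "warning"),
]
result = contextualise(findings)
for f in result:
    print(f.path, f.symbol, f.severity)
```

On this sample, the naming warning in the model code is downgraded, the docstring warning in the exploratory script is dropped, and the same naming warning in general application code is reported unchanged.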