Bus Factor in Practice (DOI: 10.1145/3510457.3513082)
Elgun Jabrayilzade, Mikhail Evtikhiev, Eray Tüzün, V. Kovalenko
The bus factor is a metric that identifies how resilient a project is to sudden engineer turnover: the minimum number of engineers that have to be hit by a bus for the project to stall. Even though the metric is often discussed in the community, few studies consider its general relevance. Moreover, existing tools for bus factor estimation focus solely on data from version control systems, even though other channels for knowledge generation and distribution exist. With a survey of 269 engineers, we find that the bus factor is perceived as an important problem in collective development, and we determine the highest-impact channels of knowledge generation and distribution in software development teams. We also propose a multimodal bus factor estimation algorithm that uses data on code reviews and meetings together with the VCS data. We test the algorithm on 13 projects developed at JetBrains and compare its results to those of the state-of-the-art tool by Avelino et al. against the ground truth collected in a survey of the engineers working on these projects. Our algorithm is slightly better at predicting both the bus factor and the key developers than the tool by Avelino et al. Finally, we use the interviews and the surveys to derive a set of best practices to address the bus factor issue and proposals for a possible bus factor assessment tool.
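The abstract does not spell out the algorithm; the sketch below only illustrates the general shape of a multimodal, greedy bus factor estimation: accumulate per-developer knowledge from several channels, then remove the most knowledgeable developer until most files are effectively abandoned. The channel weights and the abandonment threshold are illustrative assumptions, not the paper's actual parameters.

```python
# Hypothetical per-channel weights combining VCS, code review, and meeting data.
WEIGHTS = {"commits": 1.0, "reviews": 0.5, "meetings": 0.3}

def knowledge(events):
    """events: list of (developer, file, channel) tuples -> knowledge scores."""
    k = {}
    for dev, path, channel in events:
        k[(dev, path)] = k.get((dev, path), 0.0) + WEIGHTS[channel]
    return k

def bus_factor(events, abandon_ratio=0.5):
    k = knowledge(events)
    devs = {d for d, _ in k}
    files = {f for _, f in k}
    total = {f: sum(k.get((d, f), 0.0) for d in devs) for f in files}
    removed = 0
    while devs:
        covered = {f: sum(k.get((d, f), 0.0) for d in devs) for f in files}
        # A file counts as abandoned when the remaining developers hold
        # less than abandon_ratio of its original knowledge.
        abandoned = sum(1 for f in files if covered[f] < abandon_ratio * total[f])
        if abandoned > len(files) / 2:  # project stalls: most files abandoned
            return removed
        top = max(devs, key=lambda d: sum(k.get((d, f), 0.0) for f in files))
        devs.remove(top)
        removed += 1
    return removed

events = [
    ("alice", "core.py", "commits"), ("alice", "core.py", "reviews"),
    ("alice", "ui.py", "commits"), ("bob", "ui.py", "reviews"),
    ("bob", "core.py", "meetings"),
]
print(bus_factor(events))  # -> 1: losing alice stalls this toy project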
{"title":"Bus Factor in Practice","authors":"Elgun Jabrayilzade, Mikhail Evtikhiev, Eray Tüzün, V. Kovalenko","doi":"10.1145/3510457.3513082","DOIUrl":"https://doi.org/10.1145/3510457.3513082","url":null,"abstract":"Bus factor is a metric that identifies how resilient is the project to the sudden engineer turnover. It states the minimal number of engineers that have to be hit by a bus for a project to be stalled. Even though the metric is often discussed in the community, few studies consider its general relevance. Moreover, the existing tools for bus factor estimation focus solely on the data from version control systems, even though there exists other channels for knowledge generation and distribution. With a survey of 269 engineers, we find that the bus factor is perceived as an important problem in collective development, and determine the highest impact channels of knowledge generation and distribution in software development teams. We also propose a multimodal bus factor estimation algorithm that uses data on code reviews and meetings together with the VCS data. We test the algorithm on 13 projects developed at JetBrains and compared its results to the results of the state-of-the-art tool by Avelino et al. against the ground truth collected in a survey of the engineers working on these projects. Our algorithm is slightly better in terms of both predicting the bus factor as well as key developers compared to the results of Avelino et al. Finally, we use the interviews and the surveys to derive a set of best practices to address the bus factor issue and proposals for the possible bus factor assessment tool.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125793822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Unexplored Terrain of Compiler Warnings (DOI: 10.1145/3510457.3513057)
Gunnar Kudrjavets, Aditya Kumar, Nachiappan Nagappan, Ayushi Rastogi
The authors' industry experiences suggest that compiler warnings, a lightweight form of program analysis, are valuable early bug detection tools. Significant costs are associated with patches and security bulletins for issues that could have been avoided had compiler warnings been addressed. Yet the industry's attitude towards compiler warnings is mixed: practices range from silencing all compiler warnings to a zero-tolerance policy for any warning. Current published data indicates that addressing compiler warnings early is beneficial, but support for this value theory stems from grey literature or is anecdotal. Additional focused research is needed to truly assess the cost-benefit of addressing warnings.
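As a concrete illustration of the zero-tolerance end of the spectrum the abstract mentions, a build gate can promote all warnings to errors. A minimal sketch, assuming a C codebase and GCC; the file names are placeholders, not artifacts from the paper:

```python
import subprocess
import sys

# -Wall -Wextra enable a broad set of warnings; -Werror promotes every
# warning to an error, so compilation fails unless the code is warning-clean.
result = subprocess.run(
    ["gcc", "-Wall", "-Wextra", "-Werror", "-c", "module.c", "-o", "module.o"],
    capture_output=True, text=True,
)
if result.returncode != 0:
    print(result.stderr, file=sys.stderr)
    sys.exit("zero-tolerance policy: warnings must be fixed before merging")
```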
{"title":"The Unexplored Terrain of Compiler Warnings","authors":"Gunnar Kudrjavets, Aditya Kumar, Nachiappan Nagappan, Ayushi Rastogi","doi":"10.1145/3510457.3513057","DOIUrl":"https://doi.org/10.1145/3510457.3513057","url":null,"abstract":"The authors' industry experiences suggest that compiler warnings, a lightweight version of program analysis, are valuable early bug detection tools. Significant costs are associated with patches and security bulletins for issues that could have been avoided if compiler warnings were addressed. Yet, the industry's attitude towards compiler warnings is mixed. Practices range from silencing all compiler warnings to having a zero-tolerance policy as to any warnings. Current published data indicates that addressing compiler warnings early is beneficial. However, support for this value theory stems from grey literature or is anecdotal. Additional focused research is needed to truly assess the cost-benefit of addressing warnings.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132962351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
“Project smells” - Experiences in Analysing the Software Quality of ML Projects with mllint (DOI: 10.1145/3510457.3513041)
B. V. Oort, L. Cruz, B. Loni, A. Deursen
Machine Learning (ML) projects face novel challenges in their development and productionisation compared to traditional software applications, though established principles and best practices for ensuring software quality still apply. While using static analysis to catch code smells has been shown to improve software quality attributes, it is only a small piece of the software quality puzzle, especially for ML projects, given their additional challenges and the lower degree of Software Engineering (SE) experience of the data scientists who develop them. We introduce the novel concept of project smells, which consider deficits in project management as a more holistic perspective on software quality in ML projects. We also implemented mllint, an open-source static analysis tool, to help detect and mitigate these smells. Our research evaluates this novel concept of project smells in the industrial context of ING, a global bank and large software- and data-intensive organisation. We also investigate the perceived importance of these project smells for proof-of-concept versus production-ready ML projects, as well as the perceived obstructions and benefits of using static analysis tools such as mllint. Our findings indicate a need for context-aware static analysis tools that fit the needs of the project at its current stage of development while requiring minimal configuration effort from the user.
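To make the notion of a project smell concrete, here is a minimal hand-rolled sketch of the same idea: checks on project-level hygiene rather than on individual code fragments. The specific checks and file names below are illustrative assumptions, not mllint's actual rule set.

```python
from pathlib import Path

def project_smells(root: str) -> list[str]:
    """Flag a few hypothetical project-level deficits in an ML repository."""
    base = Path(root)
    smells = []
    if not (base / ".git").exists():
        smells.append("code is not under version control")
    if not any((base / f).exists() for f in ("requirements.txt", "pyproject.toml")):
        smells.append("no declared dependency management")
    if not any(base.rglob("test_*.py")):
        smells.append("no automated tests found")
    if any(p.suffix == ".csv" and p.stat().st_size > 50_000_000
           for p in base.rglob("*.csv")):
        smells.append("large data files committed instead of data versioning")
    return smells

print(project_smells("."))
```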
{"title":"“Project smells” - Experiences in Analysing the Software Quality of ML Projects with mllint","authors":"B. V. Oort, L. Cruz, B. Loni, A. Deursen","doi":"10.1145/3510457.3513041","DOIUrl":"https://doi.org/10.1145/3510457.3513041","url":null,"abstract":"Machine Learning (ML) projects incur novel challenges in their development and productionisation over traditional software applications, though established principles and best practices in ensuring the project's software quality still apply. While using static analysis to catch code smells has been shown to improve software quality attributes, it is only a small piece of the software quality puzzle, especially in the case of ML projects given their additional challenges and lower degree of Software Engineering (SE) experience in the data scientists that develop them. We introduce the novel concept of project smells which consider deficits in project management as a more holistic perspective on software quality in ML projects. An open-source static analysis tool mllint was also implemented to help detect and mitigate these. Our research evaluates this novel concept of project smells in the industrial context of ING, a global bank and large software- and data-intensive organisation. We also investigate the perceived importance of these project smells for proof-of-concept versus production-ready ML projects, as well as the perceived obstructions and benefits to using static analysis tools such as mllint. Our findings indicate a need for context-aware static analysis tools, that fit the needs of the project at its current stage of development, while requiring minimal configuration effort from the user.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115532178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Toward Among-Device AI from On-Device AI with Stream Pipelines (DOI: 10.1145/3510457.3513026)
MyungJoo Ham, Sangjung Woo, Jaeyun Jung, Wook Song, Gichan Jang, Y. Ahn, Hyoungjoo Ahn
Modern consumer electronic devices often provide intelligence services with deep neural networks. We have started migrating the computing location of intelligence services from cloud servers (traditional AI systems) to the corresponding devices (on-device AI systems). On-device AI systems generally have the advantages of preserving privacy, removing network latency, and saving cloud costs. With the emergence of on-device AI systems having relatively low computing power, the inconsistent and varying hardware resources and capabilities pose difficulties. The authors' affiliation has started applying a stream pipeline framework, NNStreamer, to on-device AI systems, saving development costs and hardware resources and improving performance. We want to expand the types of devices and applications offering on-device AI services to products of both the affiliation and second/third parties. We also want to make each AI service atomic, re-deployable, and shared among connected devices of arbitrary vendors; as always, yet another requirement has now been introduced. The new requirement of “among-device AI” includes connectivity between AI pipelines so that they may share computing resources and hardware capabilities across a wide range of devices regardless of vendors and manufacturers. We propose extensions of the stream pipeline framework NNStreamer for on-device AI so that it may provide among-device AI capability. This work is a Linux Foundation (LF AI & Data) open source project accepting contributions from the general public.
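NNStreamer builds on GStreamer, so an on-device inference pipeline is typically expressed as a pipeline description string. A minimal sketch, assuming a Linux device with NNStreamer's GStreamer plugins installed; the camera source, caps, and model.tflite path are illustrative assumptions:

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Camera frames -> tensor conversion -> TensorFlow Lite inference -> sink.
# tensor_converter, tensor_filter, and tensor_sink are NNStreamer elements.
pipeline = Gst.parse_launch(
    "v4l2src ! videoconvert ! video/x-raw,format=RGB,width=224,height=224 ! "
    "tensor_converter ! "
    "tensor_filter framework=tensorflow-lite model=model.tflite ! "
    "tensor_sink name=sink"
)
pipeline.set_state(Gst.State.PLAYING)
```

The proposed among-device extension would let stages of such a pipeline run on different connected devices rather than within a single process.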
{"title":"Toward Among-Device AI from On-Device AI with Stream Pipelines","authors":"MyungJoo Ham, Sangjung Woo, Jaeyun Jung, Wook Song, Gichan Jang, Y. Ahn, Hyoungjoo Ahn","doi":"10.1145/3510457.3513026","DOIUrl":"https://doi.org/10.1145/3510457.3513026","url":null,"abstract":"Modern consumer electronic devices often provide intelligence services with deep neural networks. We have started migrating the computing locations of intelligence services from cloud servers (traditional AI systems) to the corresponding devices (on-device AI systems). On-device AI systems generally have the advantages of preserving privacy, removing network latency, and saving cloud costs. With the emergence of on-device AI systems having relatively low computing power, the inconsistent and varying hardware resources and capabilities pose difficulties. Authors' affiliation has started applying a stream pipeline framework, NNStreamer, for on-device AI systems, saving developmental costs and hardware resources and improving performance. We want to expand the types of devices and applications with on-device AI services products of both the affiliation and second/third parties. We also want to make each AI service atomic, re-deployable, and shared among connected devices of arbitrary vendors; we now have yet another requirement introduced as it always has been. The new requirement of “among-device AI” includes connectivity between AI pipelines so that they may share computing resources and hardware capabilities across a wide range of devices regardless of vendors and manufacturers. We propose extensions of the stream pipeline framework, NNStreamer, for on-device AI so that NNStreamer may provide among-device AI capability. This work is a Linux Foundation (LF AI & Data) open source project accepting contributions from the general public.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123312439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Decision Models for Selecting Patterns and Strategies in Microservices Systems and their Evaluation by Practitioners (DOI: 10.1145/3510457.3513079)
M. Waseem, Peng Liang, Aakash Ahmad, Mojtaba Shahin, A. Khan, Gastón Márquez
Researchers and practitioners have recently proposed many Microservices Architecture (MSA) patterns and strategies covering various aspects of the microservices system life cycle, such as service design and security. However, selecting and implementing these patterns and strategies can entail various challenges for microservices practitioners. To this end, this study proposes decision models for selecting patterns and strategies covering four MSA design areas: application decomposition into microservices, microservices security, microservices communication, and service discovery. We used peer-reviewed and grey literature to identify the patterns, strategies, and quality attributes for creating these decision models. To evaluate the familiarity, understandability, completeness, and usefulness of the decision models, we conducted semi-structured interviews with 24 microservices practitioners from 12 countries across five continents. Our evaluation results show that the practitioners found the decision models an effective guide for selecting microservices patterns and strategies.
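A decision model of this kind can be viewed as a mapping from desired quality attributes to candidate patterns. The sketch below is a deliberately simplified, hypothetical encoding for the service discovery design area; the attribute-to-pattern table is illustrative and not the paper's actual model:

```python
# Hypothetical fragment of a decision model: each pattern lists quality
# attributes it tends to promote (+) or inhibit (-).
SERVICE_DISCOVERY = {
    "client-side discovery": {"+": {"simplicity", "low latency"},
                              "-": {"language neutrality"}},
    "server-side discovery": {"+": {"language neutrality", "decoupling"},
                              "-": {"extra hop latency"}},
    "self-registration":     {"+": {"simplicity"},
                              "-": {"coupling to registry"}},
}

def recommend(desired: set[str]) -> list[tuple[str, int]]:
    """Rank patterns by how many desired quality attributes they promote."""
    scored = [(p, len(q["+"] & desired)) for p, q in SERVICE_DISCOVERY.items()]
    return sorted(scored, key=lambda x: -x[1])

print(recommend({"language neutrality", "decoupling"}))
# -> server-side discovery ranks first under these (illustrative) trade-offs.
```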
{"title":"Decision Models for Selecting Patterns and Strategies in Microservices Systems and their Evaluation by Practitioners","authors":"M. Waseem, Peng Liang, Aakash Ahmad, Mojtaba Shahin, A. Khan, Gast'on M'arquez","doi":"10.1145/3510457.3513079","DOIUrl":"https://doi.org/10.1145/3510457.3513079","url":null,"abstract":"Researchers and practitioners have recently proposed many Mi-croservices Architecture (MSA) patterns and strategies covering various aspects of microservices system life cycle, such as service design and security. However, selecting and implementing these patterns and strategies can entail various challenges for microser-vices practitioners. To this end, this study proposes decision models for selecting patterns and strategies covering four MSA design ar-eas: application decomposition into microservices, microservices security, microservices communication, and service discovery. We used peer-reviewed and grey literature to identify the patterns, strategies, and quality attributes for creating these decision models. To evaluate the familiarity, understandability, completeness, and usefulness of the decision models, we conducted semi-structured interviews with 24 microservices practitioners from 12 countries across five continents. Our evaluation results show that the prac-titioners found the decision models as an effective guide to select microservices patterns and strategies.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125517891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
What are Weak Links in the npm Supply Chain? (DOI: 10.1145/3510457.3513044)
N. Zahan, L. Williams, Thomas Zimmermann, Patrice Godefroid, Brendan Murphy, C. Maddila
Modern software development frequently uses third-party packages, raising the concern of supply chain security attacks. Many attackers target popular package managers, like npm, and their users with supply chain attacks. In 2021, security attacks exploiting the open source software supply chain grew by 650% year over year. Proactive approaches are needed to predict package vulnerability to high-risk supply chain attacks. The goal of this work is to help software developers and security specialists measure npm supply chain weak link signals to prevent future supply chain attacks, by empirically studying npm package metadata. In this paper, we analyzed the metadata of 1.63 million JavaScript npm packages. We propose six signals of security weakness in a software supply chain, such as the presence of install scripts, maintainer accounts associated with an expired email domain, and inactive packages with inactive maintainers. One of our case studies identified 11 malicious packages from the install scripts signal. We also found 2,818 maintainer email addresses associated with expired domains, allowing an attacker to hijack 8,494 packages by taking over the npm accounts. We obtained feedback on our weak link signals through a survey answered by 470 npm package developers. The majority of the developers supported three of our six proposed weak link signals, and they indicated that they would want to be notified about weak link signals before using third-party packages. Additionally, we discuss eight new signals suggested by package developers.
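The install scripts signal, for instance, can be computed directly from registry metadata. A minimal sketch using the public npm registry API; the flagged hook names follow npm's documented lifecycle scripts, and the queried package name is just an example:

```python
import json
import urllib.request

INSTALL_HOOKS = {"preinstall", "install", "postinstall"}  # npm lifecycle scripts

def has_install_scripts(package: str) -> bool:
    """Flag the 'install scripts' weak link signal for a package's latest version."""
    with urllib.request.urlopen(f"https://registry.npmjs.org/{package}/latest") as r:
        meta = json.load(r)
    scripts = meta.get("scripts", {})
    return bool(INSTALL_HOOKS & scripts.keys())

print(has_install_scripts("left-pad"))  # example package; most packages print False
```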
{"title":"What are Weak Links in the npm Supply Chain?","authors":"N. Zahan, L. Williams, Thomas Zimmermann, Patrice Godefroid, Brendan Murphy, C. Maddila","doi":"10.1145/3510457.3513044","DOIUrl":"https://doi.org/10.1145/3510457.3513044","url":null,"abstract":"Modern software development frequently uses third-party packages, raising the concern of supply chain security attacks. Many attackers target popular package managers, like npm, and their users with supply chain attacks. In 2021 there was a 650% year-on-year growth in security attacks by exploiting Open Source Software's supply chain. Proactive approaches are needed to predict package vulnerability to high-risk supply chain attacks. The goal of this work is to help software developers and security specialists in measuring npm supply chain weak link signals to prevent future supply chain attacks by empirically studying npm package metadata. In this paper, we analyzed the metadata of 1.63 million JavaScript npm packages. We propose six signals of security weaknesses in a software supply chain, such as the presence of install scripts, maintainer accounts associated with an expired email domain, and inactive packages with inactive maintainers. One of our case studies identified 11 malicious packages from the install scripts signal. We also found 2,818 maintainer email addresses associated with expired domains, allowing an attacker to hijack 8,494 packages by taking over the npm accounts. We obtained feedback on our weak link signals through a survey responded to by 470 npm package developers. The majority of the developers supported three out of our six proposed weak link signals. The developers also indicated that they would want to be notified about weak links signals before using third-party packages. Additionally, we discussed eight new signals suggested by package developers.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114312752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript (DOI: 10.1145/3510457.3513048)
Saikat Dutta, D. Garbervetsky, Shuvendu K. Lahiri, Max Schäfer
Static analysis has established itself as a weapon of choice for detecting security vulnerabilities. Taint analysis in particular is a very general and powerful technique, where security policies are expressed in terms of forbidden flows, either from untrusted input sources to sensitive sinks (in integrity policies) or from sensitive sources to untrusted sinks (in confidentiality policies). The appeal of this approach is that the taint-tracking mechanism has to be implemented only once, and can then be parameterized with different taint specifications (that is, sets of sources and sinks, as well as any sanitizers that render otherwise problematic flows innocuous) to detect many different kinds of vulnerabilities. But while techniques for implementing scalable inter-procedural static taint tracking are fairly well established, crafting taint specifications is still more of an art than a science, and in practice tends to involve a lot of manual effort. Past work has focussed on automated techniques for inferring taint specifications for libraries either from their implementation or from the way they tend to be used in client code. Among the latter, machine learning-based approaches have shown great promise. In this work we present our experience combining an existing machine-learning approach to mining sink specifications for JavaScript libraries with manual taint modelling in the context of GitHub's CodeQL analysis framework. We show that the machine-learning component can successfully infer many new taint sinks that either are not part of the manual modelling or are not detected due to analysis incompleteness. Moreover, we present techniques for organizing sink predictions using automated ranking and code-similarity metrics that allow an analysis engineer to efficiently sift through large numbers of predictions to identify true positives.
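The ranking-and-grouping step described at the end could look roughly like the following. This is a minimal sketch assuming each prediction carries a model score and a code snippet; the similarity metric and data shapes are illustrative assumptions, not InspectJS's actual implementation:

```python
from difflib import SequenceMatcher

# Hypothetical predictions: (candidate sink expression, model confidence score).
predictions = [
    ("res.send(userInput)", 0.94),
    ("res.send(body)", 0.91),
    ("logger.debug(msg)", 0.42),
]

def similar(a: str, b: str, threshold: float = 0.7) -> bool:
    return SequenceMatcher(None, a, b).ratio() >= threshold

def triage(preds):
    """Sort by score, then cluster near-duplicates so an engineer can
    accept or reject a whole group of similar sink candidates at once."""
    groups: list[list[tuple[str, float]]] = []
    for snippet, score in sorted(preds, key=lambda p: -p[1]):
        for group in groups:
            if similar(snippet, group[0][0]):
                group.append((snippet, score))
                break
        else:
            groups.append([(snippet, score)])
    return groups

for g in triage(predictions):
    print(g)  # the two res.send(...) candidates land in one cluster
```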
{"title":"InspectJS: Leveraging Code Similarity and User-Feedback for Effective Taint Specification Inference for JavaScript","authors":"Saikat Dutta, D. Garbervetsky, Shuvendu K. Lahiri, Max Schäfer","doi":"10.1145/3510457.3513048","DOIUrl":"https://doi.org/10.1145/3510457.3513048","url":null,"abstract":"Static analysis has established itself as a weapon of choice for detecting security vulnerabilities. Taint analysis in particular is a very general and powerful technique, where security policies are expressed in terms of forbidden flows, either from untrusted input sources to sensitive sinks (in integrity policies) or from sensitive sources to untrusted sinks (in confidentiality policies). The appeal of this approach is that the taint-tracking mechanism has to be implemented only once, and can then be parameterized with different taint specifications (that is, sets of sources and sinks, as well as any sanitizers that render otherwise problematic flows innocuous) to detect many different kinds of vulnerabilities. But while techniques for implementing scalable inter-procedural static taint tracking are fairly well established, crafting taint specifications is still more of an art than a science, and in practice tends to involve a lot of manual effort. Past work has focussed on automated techniques for inferring taint specifications for libraries either from their implementation or from the way they tend to be used in client code. Among the latter, machine learning-based approaches have shown great promise. In this work we present our experience combining an existing machine-learning approach to mining sink specifications for JavaScript libraries with manual taint modelling in the context of GitHub's CodeQL analysis framework. We show that the machine-learning component can successfully infer many new taint sinks that either are not part of the manual modelling or are not detected due to analysis incompleteness. Moreover, we present techniques for organizing sink predictions using automated ranking and code-similarity metrics that allow an analysis engineer to efficiently sift through large numbers of predictions to identify true positives.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115745697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Software engineering for Responsible AI: An empirical study and operationalised patterns (DOI: 10.1109/ICSE-SEIP55303.2022.9793864)
Q. Lu, Liming Zhu, Xiwei Xu, J. Whittle, David M. Douglas, Conrad Sanderson
AI ethics principles and guidelines are typically high-level and do not provide concrete guidance on how to develop responsible AI systems. To address this shortcoming, we perform an empirical study involving interviews with 21 scientists and engineers to understand practitioners' views on AI ethics principles and their implementation. Our major findings are: (1) current practice is often a done-once-and-forget type of ethical risk assessment at a particular development step, which is not sufficient for highly uncertain and continual-learning AI systems; (2) ethical requirements are either omitted or mostly stated as high-level objectives, and not specified explicitly in a verifiable way as system outputs or outcomes; (3) although ethical requirements have the characteristics of cross-cutting quality and non-functional requirements amenable to architecture and design analysis, system-level architecture and design are under-explored; (4) there is a strong desire for continuously monitoring and validating AI systems post-deployment for ethical requirements, but current operational practices provide limited guidance. To address these findings, we suggest a preliminary list of patterns to provide operationalised guidance for developing responsible AI systems.
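Findings (2) and (4) together suggest stating ethical requirements as verifiable, continuously monitored system outputs. A minimal sketch of what one such requirement might look like, using demographic parity as an illustrative fairness metric; the threshold and names are assumptions, not patterns taken from the paper:

```python
def demographic_parity_gap(decisions, groups):
    """Absolute difference in positive-decision rates between groups A and B."""
    rate = lambda g: (
        sum(d for d, grp in zip(decisions, groups) if grp == g)
        / max(1, sum(1 for grp in groups if grp == g))
    )
    return abs(rate("A") - rate("B"))

THRESHOLD = 0.1  # illustrative value for the verifiable requirement

def monitor(decisions, groups):
    """Post-deployment check: in production this would alert or roll back."""
    gap = demographic_parity_gap(decisions, groups)
    status = "met" if gap < THRESHOLD else "VIOLATED"
    print(f"fairness requirement {status} (gap={gap:.2f})")

monitor([1, 0, 1, 1, 0, 1], ["A", "A", "A", "B", "B", "B"])  # gap = 0.00 -> met
```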
{"title":"Software engineering for Responsible AI: An empirical study and operationalised patterns","authors":"Q. Lu, Liming Zhu, Xiwei Xu, J. Whittle, David M. Douglas, Conrad Sanderson","doi":"10.1109/ICSE-SEIP55303.2022.9793864","DOIUrl":"https://doi.org/10.1109/ICSE-SEIP55303.2022.9793864","url":null,"abstract":"AI ethics principles and guidelines are typically high-level and do not provide concrete guidance on how to develop responsible AI systems. To address this shortcoming, we perform an empirical study involving interviews with 21 scientists and engineers to understand the practitioners' views on AI ethics principles and their implementation. Our major findings are: (1) the current practice is often a done-once-and-forget type of ethical risk assessment at a particular development step, which is not sufficient for highly uncertain and continual learning AI systems; (2) ethical requirements are either omitted or mostly stated as high-level objectives, and not specified explicitly in verifiable way as system outputs or outcomes; (3) although ethical requirements have the characteristics of cross-cutting quality and non-functional requirements amenable to architecture and design analysis, system-level architecture and design are under-explored; (4) there is a strong desire for continuously monitoring and validating AI systems post deployment for ethical requirements but current operation practices provide limited guidance. To address these findings, we suggest a preliminary list of patterns to provide operationalised guidance for developing responsible AI systems.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127110915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Counterfactual Explanations for Models of Code (DOI: 10.1145/3510457.3513081)
Jürgen Cito, Işıl Dillig, V. Murali, S. Chandra
Machine learning (ML) models play an increasingly prevalent role in many software engineering tasks. However, because most models are now powered by opaque deep neural networks, it can be difficult for developers to understand why a model came to a certain conclusion and how to act upon its prediction. Motivated by this problem, this paper explores counterfactual explanations for models of source code. Such counterfactual explanations constitute minimal changes to the source code under which the model “changes its mind”. We integrate counterfactual explanation generation into models of source code in a real-world setting. We describe considerations that impact both the ability to find realistic and plausible counterfactual explanations and the usefulness of such explanations to the developers who use the model. In a series of experiments, we investigate the efficacy of our approach on three different models, each based on a BERT-like architecture operating over source code.
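One common way to search for such counterfactuals is a perturbation loop over the input tokens. The sketch below shows the general shape under the assumption of a black-box classify(tokens) function; the masking operator and model interface are illustrative, not the paper's method:

```python
def find_counterfactual(tokens, classify, mask="<unk>"):
    """Search for a minimally perturbed token list that flips the model's label.

    tokens:   list of source-code tokens
    classify: black-box model, tokens -> label (assumed interface)
    Returns a counterfactual token list, or None if none is found.
    """
    original = classify(tokens)
    # Pass 1: try to flip the label by masking a single token.
    for i in range(len(tokens)):
        candidate = tokens[:i] + [mask] + tokens[i + 1:]
        if classify(candidate) != original:
            return candidate
    # Pass 2 (naive): accumulate masks left to right until the label flips.
    current = list(tokens)
    for i in range(len(tokens)):
        current[i] = mask
        if classify(current) != original:
            return current
    return None  # no counterfactual under this crude perturbation space
```

A realistic generator would restrict perturbations to plausible code edits rather than mask tokens, which is exactly the "realistic and plausible" concern the abstract raises.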
{"title":"Counterfactual Explanations for Models of Code","authors":"Jürgen Cito, Işıl Dillig, V. Murali, S. Chandra","doi":"10.1145/3510457.3513081","DOIUrl":"https://doi.org/10.1145/3510457.3513081","url":null,"abstract":"Machine learning (ML) models play an increasingly prevalent role in many software engineering tasks. However, because most models are now powered by opaque deep neural networks, it can be difficult for developers to understand why the model came to a certain conclusion and how to act upon the model's prediction. Motivated by this problem, this paper explores counterfactual explanations for models of source code. Such counterfactual explanations constitute minimal changes to the source code under which the model “changes its mind”. We integrate counterfactual explanation generation to models of source code in a real-world setting. We describe considerations that impact both the ability to find realistic and plausible counterfactual explanations, as well as the usefulness of such explanation to the developers that use the model. In a series of experiments we investigate the efficacy of our approach on three different models, each based on a BERT-like architecture operating over source code.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121705000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
When Cyber-Physical Systems Meet AI: A Benchmark, an Evaluation, and a Way Forward (DOI: 10.1145/3510457.3513049)
Jiayang Song, Deyun Lyu, Zhenya Zhang, Zhijie Wang, Tianyi Zhang, L. Ma
Cyber-Physical Systems (CPS) have been broadly deployed in safety-critical domains such as automotive systems, avionics, and medical devices. In recent years, Artificial Intelligence (AI) has been increasingly adopted to control CPS. Despite the popularity of AI-enabled CPS, few benchmarks are publicly available, and there is a lack of deep understanding of the performance and reliability of AI-enabled CPS across different industrial domains. To bridge this gap, we present a public benchmark of industry-level CPS in seven domains and build AI controllers for them via state-of-the-art deep reinforcement learning (DRL) methods. Based on that, we perform a systematic evaluation of these AI-enabled systems against their traditional counterparts to identify current challenges and future opportunities. Our key findings are: (1) AI controllers do not always outperform traditional controllers; (2) existing CPS testing techniques (falsification, specifically) fall short of analyzing AI-enabled CPS; and (3) a hybrid system that strategically combines and switches between AI controllers and traditional controllers can achieve better performance across different domains. Our results highlight the need for new testing techniques for AI-enabled CPS and for more investigation into hybrid CPS to achieve optimal performance and reliability. Our benchmark, code, detailed evaluation results, and experiment scripts are available at https://sites.google.com/view/ai-cps-benchmark.
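Finding (3) describes hybrid control; one plausible switching scheme is a supervisor that falls back from the DRL policy to a conventional controller whenever the state leaves a safe envelope. A minimal sketch; the envelope predicate and controller interfaces are illustrative assumptions, not the paper's design:

```python
class HybridController:
    """Use the DRL policy inside the safe envelope, a classical fallback outside."""

    def __init__(self, drl_policy, fallback, safe_envelope):
        self.drl = drl_policy        # state -> action (e.g., a trained network)
        self.fallback = fallback     # state -> action (e.g., a PID controller)
        self.is_safe = safe_envelope # state -> bool (illustrative predicate)

    def act(self, state):
        if self.is_safe(state):
            return self.drl(state)
        return self.fallback(state)  # conservative control near the boundary

# Example wiring with toy 1-D components (all hypothetical):
pid = lambda s: -0.5 * s                     # proportional-only fallback
drl = lambda s: 0.0                          # stand-in for a trained policy
ctrl = HybridController(drl, pid, lambda s: abs(s) < 1.0)
print(ctrl.act(0.4), ctrl.act(2.3))          # DRL inside, fallback outside
```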
{"title":"When Cyber-Physical Systems Meet AI: A Benchmark, an Evaluation, and a Way Forward","authors":"Jiayang Song, Deyun Lyu, Zhenya Zhang, Zhijie Wang, Tianyi Zhang, L. Ma","doi":"10.1145/3510457.3513049","DOIUrl":"https://doi.org/10.1145/3510457.3513049","url":null,"abstract":"Cyber-Physical Systems (CPS) have been broadly deployed in safety-critical domains, such as automotive systems, avionics, medical devices, etc. In recent years, Artificial Intelligence (AI) has been increasingly adopted to control CPS. Despite the popularity of AI-enabled CPS, few benchmarks are publicly available. There is also a lack of deep understanding on the performance and reliability of AI-enabled CPS across different industrial domains. To bridge this gap, we present a public benchmark of industry-level CPS in seven domains and build AI controllers for them via state-of-the-art deep reinforcement learning (DRL) methods. Based on that, we further perform a systematic evaluation of these AI-enabled systems with their traditional counterparts to identify current challenges and future opportunities. Our key findings include (1) AI controllers do not always outperform traditional controllers, (2) existing CPS testing techniques (falsification, specifically) fall short of analyzing AI-enabled CPS, and (3) building a hybrid system that strategically combines and switches between AI controllers and traditional controllers can achieve better performance across different domains. Our results highlight the need for new testing techniques for AI-enabled CPS and the need for more investigations into hybrid CPS to achieve optimal performance and reliability. Our benchmark, code, detailed evaluation results, and experiment scripts are available on https://sites.google.com/view/ai-cps-benchmark.","PeriodicalId":119790,"journal":{"name":"2022 IEEE/ACM 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125767927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}