An empirical investigation of challenges of specifying training data and runtime monitors for critical software with machine learning and their relation to architectural decisions
DOI: 10.1007/s00766-024-00415-4
Journal: Requirements Engineering (Q3, Computer Science, Information Systems; Impact Factor 2.1)
Published: 2024-03-21 (Journal Article)
Source: https://doi.org/10.1007/s00766-024-00415-4
Citations: 0
Abstract
The development and operation of critical software that contains machine learning (ML) models requires diligence and established processes. In particular, the training data used during the development of ML models has a major influence on the later behaviour of the system. Runtime monitors are used to provide guarantees for that behaviour; for example, they check that the data encountered at runtime is compatible with the data used to train the model. As a first step towards identifying challenges in specifying requirements for training data and runtime monitors, we conducted and thematically analysed ten interviews with practitioners who develop ML models for critical applications in the automotive industry. We identified 17 themes describing the challenges and classified them into six challenge groups. As a second step, we found interconnections between the challenge themes through an additional semantic analysis of the interviews. We then explored how the identified challenge themes and their interconnections can be mapped to different architecture views. This step involved identifying relevant architecture views, such as data, context, hardware, AI model, and functional safety views, that can address the identified challenges. The article presents a list of the identified underlying challenges, the relations identified between those challenges, and a mapping to architecture views. The intention of this work is to highlight once more that requirement specifications and system architecture are interlinked, even for AI-specific specification challenges such as specifying requirements for training data and runtime monitoring.
Journal introduction:
The journal provides a focus for the dissemination of new results on the elicitation, representation, and validation of requirements for software-intensive information systems or applications. Theoretical and applied submissions are welcome, but all papers must explicitly address:
-the practical consequences of the ideas for the design of complex systems
-how the ideas should be evaluated by the reflective practitioner
The journal is motivated by a multi-disciplinary view that considers requirements not only in terms of software component specification but also in terms of the activities for their elicitation, representation, and agreement, carried out within an organisational and social context. To this end, contributions are sought from fields such as software engineering, information systems, occupational sociology, cognitive and organisational psychology, human-computer interaction, computer-supported cooperative work, linguistics, and philosophy, for work specifically addressing requirements engineering issues.