Agile Methodology for the Standardization of Engineering Requirements Using Large Language Models

Systems Pub Date: 2023-07-10 DOI: 10.3390/systems11070352

Archana Tikayat Ray, B. Cole, Olivia Fischer, Anirudh Prabhakara Bhat, Ryan T. White, D. Mavris


Citations: 1

Abstract

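The increasing complexity of modern systems calls for an integrated and comprehensive approach to system design and development and, in particular, a shift toward Model-Based Systems Engineering (MBSE) approaches for system design. The requirements that serve as the foundation for these intricate systems are still primarily expressed in Natural Language (NL), which can contain ambiguities and inconsistencies and lacks the structure needed for direct translation into models. Advances in Natural Language Processing (NLP) in general, and Large Language Models (LLMs) in particular, can enable the conversion of NL requirements into machine-readable requirements. Doing so is expected to facilitate their standardization and use in a model-based environment. This paper discusses a two-fold strategy for converting NL requirements into machine-readable requirements using language models. The first approach involves creating a requirements table by extracting information from free-form NL requirements. The second approach consists of an agile methodology that facilitates the identification of boilerplate templates for different types of requirements based on observed linguistic patterns. Three different LLMs are used in this study. Two of them are fine-tuned versions of Bidirectional Encoder Representations from Transformers (BERT), namely aeroBERT-NER and aeroBERT-Classifier, which are trained on annotated aerospace corpora. The third, flair/chunk-english, is used to identify the sentence chunks present in NL requirements. All three language models are used together to achieve the standardization of requirements. The effectiveness of the methodologies is demonstrated through the semi-automated creation of boilerplates for requirements from Parts 23 and 25 of Title 14 of the Code of Federal Regulations (CFR).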
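The two approaches in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the three models have already been run and stands in for their outputs with hand-written data. The example sentences, the entity labels `SYS` and `VAL`, the type names `performance` and `design`, and every function and field name are hypothetical and chosen only for illustration. `table_row` shows the first approach (a requirements-table row built from extracted information), and `candidate_boilerplates` shows the second in its simplest form (the most frequent chunk-tag sequence per requirement type, offered as a template for human review).

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedRequirement:
    """One requirement plus hand-written stand-ins for the three models' outputs."""
    text: str
    req_type: str                          # stand-in for a classifier label (hypothetical)
    chunks: tuple[tuple[str, str], ...]    # (chunk text, chunk tag), in sentence order
    entities: tuple[tuple[str, str], ...]  # (entity text, entity label), labels hypothetical


def chunk_pattern(req: TaggedRequirement) -> str:
    """Join the chunk tags in sentence order, e.g. 'NP-VP-NP'."""
    return "-".join(tag for _, tag in req.chunks)


def table_row(req: TaggedRequirement) -> dict[str, object]:
    """Approach 1: flatten one tagged requirement into a requirements-table row."""
    by_label: dict[str, list[str]] = {}
    for entity_text, label in req.entities:
        by_label.setdefault(label, []).append(entity_text)
    return {
        "requirement": req.text,
        "type": req.req_type,
        "systems": by_label.get("SYS", []),
        "values": by_label.get("VAL", []),
        "pattern": chunk_pattern(req),
    }


def candidate_boilerplates(reqs: list[TaggedRequirement]) -> dict[str, str]:
    """Approach 2: most frequent chunk pattern per requirement type."""
    counts: dict[str, Counter[str]] = {}
    for req in reqs:
        counts.setdefault(req.req_type, Counter())[chunk_pattern(req)] += 1
    return {req_type: c.most_common(1)[0][0] for req_type, c in counts.items()}


# Illustrative, hand-tagged requirements (not taken from 14 CFR).
REQS = [
    TaggedRequirement(
        "The fuel system shall deliver fuel at 20 psi.", "performance",
        (("The fuel system", "NP"), ("shall deliver", "VP"), ("fuel", "NP"),
         ("at", "PP"), ("20 psi", "NP")),
        (("fuel system", "SYS"), ("20 psi", "VAL")),
    ),
    TaggedRequirement(
        "The hydraulic pump shall supply fluid at 3000 psi.", "performance",
        (("The hydraulic pump", "NP"), ("shall supply", "VP"), ("fluid", "NP"),
         ("at", "PP"), ("3000 psi", "NP")),
        (("hydraulic pump", "SYS"), ("3000 psi", "VAL")),
    ),
    TaggedRequirement(
        "The cabin door shall open within 5 seconds.", "performance",
        (("The cabin door", "NP"), ("shall open", "VP"), ("within", "PP"),
         ("5 seconds", "NP")),
        (("cabin door", "SYS"), ("5 seconds", "VAL")),
    ),
    TaggedRequirement(
        "The cockpit shall have an emergency exit.", "design",
        (("The cockpit", "NP"), ("shall have", "VP"), ("an emergency exit", "NP")),
        (("cockpit", "SYS"),),
    ),
]

if __name__ == "__main__":
    for requirement in REQS:
        print(table_row(requirement))
    print(candidate_boilerplates(REQS))
```

Picking the single most common pattern is a deliberate simplification. In a semi-automated workflow such as the one the abstract describes, a reviewer would presumably inspect several frequent patterns per requirement type before settling on boilerplates.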



