Dachuan Shi , Olga Meyer , Michael Oberle , Thomas Bauernhansl
{"title":"利用微调大型语言模型和资产管理外壳进行双重数据映射,实现可互操作的知识表示法","authors":"Dachuan Shi , Olga Meyer , Michael Oberle , Thomas Bauernhansl","doi":"10.1016/j.rcim.2024.102837","DOIUrl":null,"url":null,"abstract":"<div><p>In the context of Industry 4.0, ensuring the compatibility of digital twins (DTs) with existing software systems in the manufacturing sector presents a significant challenge. The Asset Administration Shell (AAS), conceptualized as the standardized DT for an asset, offers a powerful framework that connects the DT with the established software infrastructure through interoperable knowledge representation. Although the IEC 63278 series specifies the AAS metamodel, it lacks a matching strategy for automating the mapping between proprietary data from existing software and AAS information models. Addressing this gap, we introduce a novel dual data mapping system (DDMS) that utilizes a fine-tuned open-source large language model (LLM) for entity matching. This system facilitates not only the mapping between existing software and AAS models but also between AAS models and standardized vocabulary dictionaries, thereby enhancing the model's semantic interoperability. A case study within the injection molding domain illustrates the practical application of DDMS for the automated creation of AAS instances, seamlessly integrating the manufacturer's existing data. Furthermore, we extensively investigate the potential of fine-tuning decode-only LLMs as generative classifiers and encoding-based classifiers for the entity matching task. To this end, we establish two AAS-specific datasets by collecting and compiling AAS-related resources. In addition, supplementary experiments are performed on general entity-matching benchmark datasets to ensure that our empirical conclusions and insights are generally applicable. The experiment results indicate that the fine-tuned generative LLM classifier achieves slightly better results, while the encoding-based classifier enables much faster inference. Furthermore, the fine-tuned LLM surpasses all state-of-the-art approaches for entity matching, including GPT-4 enhanced with in-context learning and chain of thoughts. This evidence highlights the effectiveness of the proposed DDMS in bridging the interoperability gap within DT applications, offering a scalable solution for the manufacturing industry.</p></div>","PeriodicalId":21452,"journal":{"name":"Robotics and Computer-integrated Manufacturing","volume":"91 ","pages":"Article 102837"},"PeriodicalIF":9.1000,"publicationDate":"2024-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0736584524001248/pdfft?md5=ce13f8902e3f0294b6ee2bb0f02f505e&pid=1-s2.0-S0736584524001248-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Dual data mapping with fine-tuned large language models and asset administration shells toward interoperable knowledge representation\",\"authors\":\"Dachuan Shi , Olga Meyer , Michael Oberle , Thomas Bauernhansl\",\"doi\":\"10.1016/j.rcim.2024.102837\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>In the context of Industry 4.0, ensuring the compatibility of digital twins (DTs) with existing software systems in the manufacturing sector presents a significant challenge. The Asset Administration Shell (AAS), conceptualized as the standardized DT for an asset, offers a powerful framework that connects the DT with the established software infrastructure through interoperable knowledge representation. Although the IEC 63278 series specifies the AAS metamodel, it lacks a matching strategy for automating the mapping between proprietary data from existing software and AAS information models. Addressing this gap, we introduce a novel dual data mapping system (DDMS) that utilizes a fine-tuned open-source large language model (LLM) for entity matching. This system facilitates not only the mapping between existing software and AAS models but also between AAS models and standardized vocabulary dictionaries, thereby enhancing the model's semantic interoperability. A case study within the injection molding domain illustrates the practical application of DDMS for the automated creation of AAS instances, seamlessly integrating the manufacturer's existing data. Furthermore, we extensively investigate the potential of fine-tuning decode-only LLMs as generative classifiers and encoding-based classifiers for the entity matching task. To this end, we establish two AAS-specific datasets by collecting and compiling AAS-related resources. In addition, supplementary experiments are performed on general entity-matching benchmark datasets to ensure that our empirical conclusions and insights are generally applicable. The experiment results indicate that the fine-tuned generative LLM classifier achieves slightly better results, while the encoding-based classifier enables much faster inference. Furthermore, the fine-tuned LLM surpasses all state-of-the-art approaches for entity matching, including GPT-4 enhanced with in-context learning and chain of thoughts. This evidence highlights the effectiveness of the proposed DDMS in bridging the interoperability gap within DT applications, offering a scalable solution for the manufacturing industry.</p></div>\",\"PeriodicalId\":21452,\"journal\":{\"name\":\"Robotics and Computer-integrated Manufacturing\",\"volume\":\"91 \",\"pages\":\"Article 102837\"},\"PeriodicalIF\":9.1000,\"publicationDate\":\"2024-07-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S0736584524001248/pdfft?md5=ce13f8902e3f0294b6ee2bb0f02f505e&pid=1-s2.0-S0736584524001248-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Robotics and Computer-integrated Manufacturing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0736584524001248\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Robotics and Computer-integrated Manufacturing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0736584524001248","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Dual data mapping with fine-tuned large language models and asset administration shells toward interoperable knowledge representation
In the context of Industry 4.0, ensuring the compatibility of digital twins (DTs) with existing software systems in the manufacturing sector presents a significant challenge. The Asset Administration Shell (AAS), conceptualized as the standardized DT for an asset, offers a powerful framework that connects the DT with the established software infrastructure through interoperable knowledge representation. Although the IEC 63278 series specifies the AAS metamodel, it lacks a matching strategy for automating the mapping between proprietary data from existing software and AAS information models. Addressing this gap, we introduce a novel dual data mapping system (DDMS) that utilizes a fine-tuned open-source large language model (LLM) for entity matching. This system facilitates not only the mapping between existing software and AAS models but also between AAS models and standardized vocabulary dictionaries, thereby enhancing the model's semantic interoperability. A case study within the injection molding domain illustrates the practical application of DDMS for the automated creation of AAS instances, seamlessly integrating the manufacturer's existing data. Furthermore, we extensively investigate the potential of fine-tuning decode-only LLMs as generative classifiers and encoding-based classifiers for the entity matching task. To this end, we establish two AAS-specific datasets by collecting and compiling AAS-related resources. In addition, supplementary experiments are performed on general entity-matching benchmark datasets to ensure that our empirical conclusions and insights are generally applicable. The experiment results indicate that the fine-tuned generative LLM classifier achieves slightly better results, while the encoding-based classifier enables much faster inference. Furthermore, the fine-tuned LLM surpasses all state-of-the-art approaches for entity matching, including GPT-4 enhanced with in-context learning and chain of thoughts. This evidence highlights the effectiveness of the proposed DDMS in bridging the interoperability gap within DT applications, offering a scalable solution for the manufacturing industry.
期刊介绍:
The journal, Robotics and Computer-Integrated Manufacturing, focuses on sharing research applications that contribute to the development of new or enhanced robotics, manufacturing technologies, and innovative manufacturing strategies that are relevant to industry. Papers that combine theory and experimental validation are preferred, while review papers on current robotics and manufacturing issues are also considered. However, papers on traditional machining processes, modeling and simulation, supply chain management, and resource optimization are generally not within the scope of the journal, as there are more appropriate journals for these topics. Similarly, papers that are overly theoretical or mathematical will be directed to other suitable journals. The journal welcomes original papers in areas such as industrial robotics, human-robot collaboration in manufacturing, cloud-based manufacturing, cyber-physical production systems, big data analytics in manufacturing, smart mechatronics, machine learning, adaptive and sustainable manufacturing, and other fields involving unique manufacturing technologies.