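Towards a self-cognitive complex product design system: A fine-grained multi-modal feature recognition and semantic understanding approach using large language models in mechanical engineering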

IF 9.9; Zone 1 (Engineering & Technology); JCR Q1 (Computer Science, Artificial Intelligence); Advanced Engineering Informatics, Volume 65, Article 103265; Pub Date: 2025-05-01; Epub Date: 2025-03-22; DOI: 10.1016/j.aei.2025.103265
Xinxin Liang, Zuoxu Wang, Jihong Liu
Citations: 0

Abstract

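With human-artificial intelligence (AI) collaborative product design gaining momentum, fine-grained, multi-modal mechanical part recognition and semantic understanding have become a basic task for achieving a self-cognitive product design system. However, traditional semantic understanding approaches for mechanical parts can handle only single-modal data, either text or images. This leads to two limitations: 1) insufficient mining of fine-grained functional, behavioral, and structural information about parts, and 2) ineffective alignment of multi-modal part information. Both restrict the intelligence level of previous product design assistants. To mitigate these challenges, this paper proposes a fine-grained multimodal reasoning approach for mechanical part semantic understanding. The approach uses a pre-trained Convolutional Neural Network (CNN) for visual feature extraction, the large language model (LLM) LLaMA3 for advanced textual analysis, and a Unified Feature Fusion Module (UFFM) to facilitate robust cross-modal interactions. A positive and negative sample generation mechanism refines the model's ability to discern subtle variations in complex components. Experimental evaluations on the Industrial Part Multimodal Dataset (IPMD) demonstrate a significant improvement in classification accuracy, providing a more precise and intelligent solution for semantic understanding in complex product design systems.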
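
The abstract names three ingredients: per-modality feature extraction, a fusion step, and training with positive and negative samples. It gives no implementation details, so the following is only a minimal sketch of the general idea and not the authors' method. The `fuse` function (normalize, concatenate, renormalize) is a hypothetical stand-in for a fusion module, the InfoNCE-style `contrastive_loss` is a swapped-in, commonly used objective for positive/negative pairs, and the tiny hand-written vectors stand in for CNN and LLM embeddings.

```python
import math
from typing import List, Sequence

Vector = List[float]


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def fuse(visual: Sequence[float], textual: Sequence[float]) -> Vector:
    """Hypothetical fusion step (an assumption, not the paper's UFFM):
    normalize each modality, concatenate, then renormalize."""
    return l2_normalize(l2_normalize(visual) + l2_normalize(textual))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def contrastive_loss(anchor: Vector, positive: Vector,
                     negatives: Sequence[Vector],
                     temperature: float = 0.1) -> float:
    """InfoNCE-style loss: small when the anchor is closer to the
    positive sample than to every negative sample."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy embeddings (made up for illustration): two views of the same part
# and one view of a visually and textually different part.
anchor = fuse([1.0, 0.0], [0.9, 0.1])
positive = fuse([0.9, 0.1], [1.0, 0.0])
negative = fuse([0.0, 1.0], [0.1, 0.9])

# Correct pairing versus deliberately swapped pairing.
loss_good = contrastive_loss(anchor, positive, [negative])
loss_bad = contrastive_loss(anchor, negative, [positive])
print(f"correct pairing: {loss_good:.4f}, swapped pairing: {loss_bad:.4f}")
```

In this toy setting the correctly paired sample yields the lower loss, which is the signal such an objective would use to separate similar-looking parts. A real system would replace the hand-written vectors with learned embeddings and the concatenation with a trained fusion module.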
Source journal
Advanced Engineering Informatics (Engineering & Technology: Engineering, Multidisciplinary)
CiteScore: 12.40
Self-citation rate: 18.20%
Articles published: 292
Review time: 45 days
About the journal: Advanced Engineering Informatics is an international journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus, and INSPEC.