Harnessing unsupervised learning for retrieving CAD assembly models from public datasets

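Advanced Engineering Informatics, Volume 65, Article 103182. IF 9.9, Zone 1 (Engineering Technology), JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE). Pub Date: 2025-05-01. Epub Date: 2025-02-17. DOI: 10.1016/j.aei.2025.103182
URL: https://www.sciencedirect.com/science/article/pii/S1474034625000758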
Yixuan Li, Jie Zhang, Jiazhen Pang, Ya Yao
Citations: 0

Abstract

Retrieving assembly models from public datasets can yield richer results and broaden the range of insights. However, public datasets present their own challenges, such as variation in the quality and granularity of assembly models and the lack of standardized methods for organizing and labeling them, which hinder efficient and accurate retrieval. To address these issues, this paper presents a robust two-step retrieval method tailored for CAD assembly models from public datasets. The first phase uses hierarchical clustering in an unsupervised learning framework to systematically organize CAD assembly models. Each assembly model is represented by a feature vector that encapsulates geometric and topological features derived from its Boundary Representation (B-rep) and reflects the hierarchical relationships among parts and components. These feature vectors serve as the basis for systematic indexing: hierarchical clustering groups models according to a similarity measure, and each cluster's centroid, which represents the cluster's collective feature vector, supports efficient and targeted retrieval. In the second phase, the query model is compared directly to the cluster centroids, enabling rapid identification of collections of similar assemblies. To enhance precision within the identified clusters, we introduce a fine-grained retrieval technique that integrates Optimal Subsequence Bijection (OSB) with Maximum Mean Discrepancy (MMD). Evaluations on a heterogeneous dataset demonstrate that our method not only streamlines dataset organization but also effectively addresses quality variations, significantly improving retrieval efficiency across extensive collections.
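
The two-step flow described in the abstract can be illustrated with a short plain-Python sketch. It is a toy illustration under stated assumptions, not the authors' implementation. The abstract does not specify the linkage criterion, the distance measure, or the kernel, so the sketch assumes centroid linkage with Euclidean distance and an RBF kernel with a hypothetical bandwidth `gamma`. The 2-D "part descriptors" are made-up numbers standing in for B-rep-derived features, the model-level feature vector is assumed to be their mean, and the OSB component of the fine-grained step is omitted entirely; only an MMD-based ranking is shown.

```python
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def agglomerate(vectors, n_clusters):
    """Bottom-up clustering: repeatedly merge the two clusters whose
    centroids are closest (centroid linkage, an assumption of this sketch)."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        cents = [centroid([vectors[i] for i in c]) for c in clusters]
        _, a, b = min(
            (euclid(cents[p], cents[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters

def rbf(x, y, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between two descriptor sets."""
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy

def retrieve(query_parts, vectors, parts, clusters, top_k=3):
    query_vec = centroid(query_parts)
    # Coarse step: pick the cluster whose centroid is nearest to the query.
    cents = [centroid([vectors[i] for i in c]) for c in clusters]
    best = min(range(len(clusters)), key=lambda k: euclid(query_vec, cents[k]))
    # Fine step: rank that cluster's members by MMD to the query (OSB omitted).
    ranked = sorted(clusters[best], key=lambda i: mmd2(query_parts, parts[i]))
    return best, ranked[:top_k]

# Hypothetical toy data: six "assemblies", each a set of three part descriptors.
parts = [
    [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3)],
    [(0.1, 0.1), (0.3, 0.0), (0.0, 0.2)],
    [(0.2, 0.2), (0.1, 0.0), (0.3, 0.3)],
    [(5.0, 5.0), (5.2, 5.1), (5.1, 5.3)],
    [(5.1, 4.9), (4.8, 5.2), (5.3, 5.0)],
    [(4.9, 5.1), (5.0, 4.8), (5.2, 5.2)],
]
vectors = [centroid(p) for p in parts]   # model-level feature vectors (assumed)
clusters = agglomerate(vectors, n_clusters=2)

query = [(5.05, 5.0), (5.2, 5.15), (5.1, 5.25)]  # a slightly perturbed model 3
best, ranked = retrieve(query, vectors, parts, clusters)
print("clusters:", clusters)
print("nearest cluster:", best, "ranked members:", ranked)
```

The point of the coarse step is that the query is compared against one centroid per cluster rather than against every model, so the more expensive set-to-set comparison runs only inside the selected cluster. A real system would replace the toy descriptors with features extracted from B-rep data and would choose the number of clusters (or a dendrogram cut) from the data rather than fixing it at two.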
Source journal
Advanced Engineering Informatics (Engineering Technology - Engineering: Multidisciplinary)
CiteScore: 12.40
Self-citation rate: 18.20%
Articles published: 292
Review time: 45 days
Journal description: Advanced Engineering Informatics is an international journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus and INSPEC.