Advanced Engineering Informatics: Latest Articles, Page 10

Hierarchical fair competition-based differential evolution algorithm for global optimization and application in LED spectral matching coefficients searching

IF 9.9 Zone 1 (Engineering Technology) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Advanced Engineering Informatics

Pub Date : 2026-01-17 DOI: 10.1016/j.aei.2026.104338

Yongjun Wang , Xinglei Xu , Chengliang Jin , Luhao Cao , Qiming Tian , Shiwei Xu

Evolutionary optimization algorithms continue to attract attention for addressing challenging global optimization (GO) problems owing to their flexibility and adaptability. This study proposes an improved differential evolution (DE)-based algorithm within the hierarchical fair competition (HFC) framework, named HFCDE. In contrast to traditional DE, which has only one phase, one layer, and one population, HFCDE allows individuals to evolve across multiple phases and layers in sub-populations: in each hierarchical layer of each phase, each sub-population evolves iteratively using traditional DE operators until the termination condition is reached. In addition, parameters such as the number of phases, the number of layers, and the portion coefficients (used to regulate the maximum iterations in each phase) can be adjusted independently for specific problems. A typical version of this approach, a two-layer, three-phase HFCDE algorithm, was investigated in detail. Experiments were conducted on 70 benchmark functions (including 13 high-dimensional ones) as well as a complex optimization problem in industrial lighting systems involving the spectral coefficient of a light-emitting diode (LED). Numerical results demonstrated faster global convergence, greater robustness, and higher solution accuracy compared with some state-of-the-art evolutionary optimization methods. The percentage of cases where HFCDE outperformed competitors ranged from 71% to 100%. The key parameter settings were also investigated and discussed in detail, showing that HFCDE requires only relaxed parameter tuning. Furthermore, the HFCDE framework has great potential for addressing GO challenges, given its flexible parameter setting mechanism and its extensibility to different layers and phases, to adaptive or dynamic parameter adjustment strategies, and to the replacement of DE operators by other optimization operators.
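The "traditional DE operators" that each sub-population applies can be sketched as follows. This is a minimal single-population DE/rand/1/bin baseline, not the HFCDE algorithm itself: the function name, the parameter values (`F`, `CR`, `pop_size`, `max_iter`), and the sphere test function are assumptions chosen for illustration and do not come from the paper.

```python
import numpy as np

def differential_evolution(f, bounds, pop_size=20, F=0.5, CR=0.9, max_iter=200, seed=0):
    """Classic DE/rand/1/bin on one population (illustrative baseline only)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    dim = lo.size
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fit = np.array([f(x) for x in pop])
    for _ in range(max_iter):
        new_pop, new_fit = pop.copy(), fit.copy()
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            candidates = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(candidates, size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)
            # Binomial crossover: take at least one gene from the mutant.
            mask = rng.random(dim) < CR
            mask[rng.integers(dim)] = True
            trial = np.where(mask, mutant, pop[i])
            # Greedy selection: keep the trial only if it is no worse.
            f_trial = f(trial)
            if f_trial <= fit[i]:
                new_pop[i], new_fit[i] = trial, f_trial
        pop, fit = new_pop, new_fit
    best = int(np.argmin(fit))
    return pop[best], float(fit[best])

def sphere(x):
    """Hypothetical test objective with its minimum of 0 at the origin."""
    return float(np.sum(x ** 2))

best_x, best_f = differential_evolution(sphere, [(-5.0, 5.0)] * 5)
```

A hierarchical variant in the spirit of the abstract would run a loop like this separately in each sub-population of each layer and phase, with a per-phase iteration budget; how individuals move between layers is not described in the abstract and is therefore left out here.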



Citations: 0

A transformer-based framework for cross-material in situ monitoring in extrusion-based bioprinting

IF 9.9 Zone 1 (Engineering Technology) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Advanced Engineering Informatics

Pub Date : 2026-01-16 DOI: 10.1016/j.aei.2026.104323

Jiayi Zhang , Kaicheng Yu , Yifeng Yao , Lihua Lu , Qiang Gao , Peng Zhang , Guoyin Shang , Swee Leong Sing

Additive manufacturing is advancing toward intelligent and functionally reliable fabrication, particularly in biomedical applications. In extrusion-based three-dimensional (3D) bioprinting, machine learning (ML)-enabled in situ monitoring is crucial for improving print quality and ensuring the functional performance of tissue engineering constructs. This study proposes a transformer-based transfer learning framework for cross-material monitoring that efficiently transfers knowledge across diverse polymer systems under limited-data conditions. The model extracts geometric features of extruded filaments from in situ monitoring images and achieves 99.55% classification accuracy on PLCL and over 98% accuracy for PCL/β-TCP and GelMA datasets with less than 0.1% trainable parameters. Beyond visual monitoring, the predicted filament process states were quantitatively correlated with downstream mechanical performance, demonstrating a 24.6% improvement in tensile strength and enhanced geometric fidelity under optimal-heating conditions. Furthermore, in vivo wound-healing experiments using the bioprinted constructs verified the biological relevance and translational potential of the proposed monitoring strategy. Constructs fabricated under optimal conditions promoted accelerated tissue regeneration and vascularization, achieving faster wound closure within 10 days compared with suboptimal printing conditions. Overall, the proposed transformer-based cross-material framework establishes a generalizable and biologically validated paradigm for vision-guided process monitoring, providing a key step toward intelligent and adaptive bioprinting.



Citations: 0

AECBench: A hierarchical benchmark for knowledge evaluation of large language models in the AEC field

IF 9.9 Zone 1 (Engineering Technology) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Advanced Engineering Informatics

Pub Date : 2026-01-16 DOI: 10.1016/j.aei.2026.104314

Chen Liang , Zhaoqi Huang , Haofen Wang , Fu Chai , Chunying Yu , Huanhuan Wei , Zhengjie Liu , Yanpeng Li , Hongjun Wang , Ruifeng Luo , Xianzhong Zhao

Large language models (LLMs), as a novel information technology, are seeing increasing adoption in the Architecture, Engineering, and Construction (AEC) field. They have shown their potential to streamline processes throughout the building lifecycle. However, the robustness and reliability of LLMs in such a specialized and safety-critical domain remain to be evaluated. To address this challenge, this paper establishes AECBench, a comprehensive benchmark designed to quantify the strengths and limitations of current LLMs in the AEC domain. The benchmark features a five-level, cognition-oriented evaluation framework (i.e., Knowledge Memorization, Knowledge Understanding, Knowledge Reasoning, Knowledge Calculation, and Knowledge Application). Based on the framework, 23 representative evaluation tasks were defined. These tasks were derived from authentic AEC practice, with scope ranging from code retrieval to specialized document generation. Subsequently, a 4800-question dataset encompassing diverse formats, including open-ended questions, was crafted primarily by engineers and validated through a two-round expert review. Furthermore, an “LLM-as-a-Judge” approach was introduced to provide a scalable and consistent methodology for evaluating complex, long-form responses using expert-derived rubrics. The evaluation of nine LLMs revealed a clear performance decline across the five cognitive levels. Despite demonstrating proficiency in foundational tasks at the Knowledge Memorization and Understanding levels, the models showed significant performance deficits, particularly in interpreting knowledge from tables in building codes, executing complex reasoning and calculation, and generating domain-specific documents. Consequently, this study lays the groundwork for future research and development aimed at the robust and reliable integration of LLMs into safety-critical engineering practices.



Citations: 0

Efficiency-aware seismic fragility analysis of super-high arch dam using unsupervised ground motion clustering with probabilistic representation

IF 9.9 Zone 1 (Engineering Technology) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Advanced Engineering Informatics

Pub Date : 2026-01-15 DOI: 10.1016/j.aei.2026.104318

Yingbo Chen , Mingchao Li , Qiubing Ren , Zhiyong Qi , Hui Liang

Machine learning (ML)-driven approaches have been employed to replace computationally intensive seismic simulations of hydraulic engineering structures. For the complex seismic responses of arch dams, constructing a metamodel that captures the nonlinear relationship between ground motion inputs and structural response outputs from a limited set of numerical simulations can significantly reduce the computational cost. However, conventional deterministic predictions and fragility analyses fail to account for the high aleatory and epistemic uncertainties inherent in the seismic response of arch dams. To this end, this paper proposes an efficient fragility analysis method for arch dams that integrates probabilistic ML algorithms with traditional Incremental Dynamic Analysis. Constructing a Natural Gradient Boosting (NGBoost) metamodel for the arch dam dynamic response yields not only the predicted mean of each response sample but also its conditional probability distribution. The simulation data are superimposed with the response distribution predicted by NGBoost to estimate the binary parameters of the fragility function, generating both the fragility curve and the uncertain fragility interval of the arch dam. Additionally, representative Ground Motion Records (GMRs) for the arch dam are selected using the Partitioning Around Medoids (PAM) unsupervised clustering technique, which determines the minimum subset proportion that effectively represents the whole GMR dataset. The effectiveness of the proposed method is validated on a super-high arch dam. A 40% GMR proportion is found to adequately reproduce the fragility curves of the whole dataset, with the reference curve falling within the derived uncertainty interval, achieving a 56.8% reduction in computational cost. A 60% GMR proportion yields fragility curves with balanced accuracy and effectiveness, exhibiting maximum mean differences of 0.058 and maximum standard deviation differences of 0.031 from the reference curves across all damage levels, while reducing computational cost by 39.7%. Comparative results demonstrate the superiority of NGBoost and PAM over existing deterministic metamodels and GMR selection techniques, respectively. The efficient fragility analysis method proposed in this study ultimately enables the direct characterization of uncertainties in arch dam seismic responses.
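To make the notion of a fragility curve concrete, the sketch below evaluates the widely used two-parameter lognormal fragility function, P(damage | IM) = Φ((ln IM − ln θ) / β). The abstract does not state which functional form the paper uses, so the lognormal form, the parameter names `theta` (median capacity) and `beta` (dispersion), and the numeric values are all assumptions for illustration.

```python
import math

def lognormal_fragility(im, theta, beta):
    """Probability of reaching a damage state at intensity measure `im`,
    assuming a lognormal CDF with median `theta` and dispersion `beta`."""
    z = (math.log(im) - math.log(theta)) / beta
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical parameters, not taken from the paper.
theta, beta = 0.4, 0.5
curve = [(im, lognormal_fragility(im, theta, beta)) for im in (0.1, 0.2, 0.4, 0.8)]
```

Under this form, an uncertainty interval of the kind the abstract describes could be drawn by evaluating the same function for a range of plausible `theta` and `beta` pairs rather than a single point estimate.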



Citations: 0

Pilot-driven deep learning based RIS-assisted beamforming for secrecy rate maximization

IF 9.9 Zone 1 (Engineering Technology) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Advanced Engineering Informatics

Pub Date : 2026-01-15 DOI: 10.1016/j.aei.2026.104352

Natasha Elizabeth Francis, Khoa Tran Phan, Peng Cheng

Secure beamforming in reconfigurable intelligent surface (RIS)-assisted multiuser downlink systems is challenging due to high computational complexity and complex channel state information (CSI) estimation. This work proposes a pilot-driven beamforming network (PilotBeamNet) that jointly designs base-station (BS) transmit beamforming and quantized RIS phases directly from received uplink pilot signals, using legitimate-user location cues to capture geometry. The framework avoids explicit channel estimation and slow iterative algorithms. A convolutional module reads each pilot frame, a long short-term memory (LSTM) block with lightweight temporal attention aggregates them, and two simple heads output the beamformers and the discrete RIS phases. The location cues are embedded and fused with the features extracted from the pilot frames by the convolutional, LSTM, and temporal attention modules. Training maximizes the ergodic secrecy rate (ESR) through Monte Carlo sampling of unknown eavesdropper channels, enabling robustness without requiring eavesdropper CSI. Once trained, PilotBeamNet performs single-pass inference with latency determined only by network depth. Across all tested conditions, PilotBeamNet achieves 10%–30% ESR improvement depending on the signal-to-noise ratio (SNR), pilot length, and RIS size, while reducing inference latency by more than an order of magnitude compared to alternating optimization (AO) and outperforming multilayer perceptron (MLP) baselines. It also maintains consistent performance under phase quantization and delivers higher secrecy rates across all evaluated configurations.
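The training objective named in the abstract, the ergodic secrecy rate estimated by Monte Carlo sampling, can be illustrated in a much simplified setting. The sketch below assumes one legitimate user and one eavesdropper, a fixed user SNR, and an exponentially distributed eavesdropper SNR (as under Rayleigh fading); the function names and all values are hypothetical, and the paper's multiuser RIS model is not reproduced.

```python
import numpy as np

def secrecy_rate(snr_user, snr_eve):
    """Instantaneous secrecy rate in bit/s/Hz: user rate minus eavesdropper
    rate, floored at zero."""
    return np.maximum(np.log2(1.0 + snr_user) - np.log2(1.0 + snr_eve), 0.0)

def ergodic_secrecy_rate(snr_user, mean_snr_eve, n_samples=10_000, seed=0):
    """Monte Carlo average of the secrecy rate over random eavesdropper SNRs.
    The exponential SNR distribution is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    snr_eve = rng.exponential(mean_snr_eve, size=n_samples)
    return float(np.mean(secrecy_rate(snr_user, snr_eve)))

# Hypothetical linear-scale SNRs, not taken from the paper.
esr = ergodic_secrecy_rate(snr_user=15.0, mean_snr_eve=2.0)
```

Averaging over sampled eavesdropper channels is what lets such an objective be optimized without knowing the eavesdropper's actual CSI; in a learning setup the negative of this average would serve as the loss.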



Citations: 0

DirA-Net: Directional awareness enhancement network for long-distance road crack detection

IF 9.9 Zone 1 (Engineering Technology) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Advanced Engineering Informatics

Pub Date : 2026-01-15 DOI: 10.1016/j.aei.2026.104344

Chenglong Mi, Yanling Chen, Quan Qi, Huaibin Qin

To tackle the challenges of long-range dependency modeling and noise interference in road crack detection, this paper presents the directional awareness enhancement network (DirA-Net). Conventional CNNs struggle to capture the linear continuity of cracks due to limited receptive fields. To address this, an eight-directional spatial state module (8-SSM) captures directional structures through multi-directional state propagation, effectively modeling elongated crack patterns. To mitigate background noise and enhance multi-scale feature fusion, a median receptive field fusion module (MRF) integrates median filtering with multi-scale dilated convolutions, improving noise suppression and receptive field integration. A global shuffle attention module (GSA) further strengthens long-range feature dependencies by combining channel attention, channel shuffling, and spatial attention. Furthermore, a spatial position awareness module (SPA) leverages coordinate-guided attention and global spatial context to enhance crack localization and structural perception. Experiments on five public datasets show that the proposed method outperforms eight state-of-the-art models in terms of average F1-score and mIoU across different thresholds, while achieving an average inference speed of 12.77 FPS, demonstrating both accuracy and efficiency.
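The two reported metrics, F1-score and mIoU, can be computed for a binary crack mask as sketched below. This is a generic illustration: the function name and the tiny example masks are hypothetical, and averaging IoU over the crack and background classes is one common convention that the abstract does not confirm.

```python
import numpy as np

def f1_and_miou(pred, target):
    """F1-score for the crack class and mean IoU over crack and background."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = int(np.sum(pred & target))    # crack pixels correctly detected
    fp = int(np.sum(pred & ~target))   # background predicted as crack
    fn = int(np.sum(~pred & target))   # crack pixels missed
    tn = int(np.sum(~pred & ~target))  # background correctly rejected
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou_crack = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    iou_background = tn / (tn + fp + fn) if tn + fp + fn else 0.0
    return f1, (iou_crack + iou_background) / 2

# Hypothetical 2x2 masks: one hit, one false alarm, one miss, one true negative.
f1, miou = f1_and_miou([[1, 1], [0, 0]], [[1, 0], [1, 0]])
```

Evaluating "across different thresholds", as the abstract puts it, would mean binarizing the network's probability map at several cut-offs and averaging these metrics over them.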



Citations: 0

Depth-guided cross-modal fusion and diffusion-based enhancement for robust pavement defect segmentation

IF 9.9 Zone 1 (Engineering Technology) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Advanced Engineering Informatics

Pub Date : 2026-01-14 DOI: 10.1016/j.aei.2026.104339

Yihui Shan , Wei Li , Jiaqi Shi , Yansong Wang , Zhenzhen Xing , Jiangang Ding , Lili Pei

Accurate and efficient perception of pavement conditions is essential for maintaining transportation infrastructure and ensuring driving safety. In real-world road environments, visual and depth data collected by inspection or autonomous vehicles are often affected by modality imbalance, sensor noise, and image degradation, which compromise the reliability of defect segmentation. To address these challenges, this study proposes a cross-modal segmentation framework that integrates depth-guided fusion with generative latent feature enhancement to achieve robust pavement defect perception under diverse conditions. A defect-centric and class-aware depth prompting strategy is developed to transform geometric priors into explicit guidance for the intensity stream, enabling background suppression before encoding and boundary refinement within intermediate layers. In parallel, a latent feature enhancement module aligns the Segment Anything Model (SAM) feature space with a pretrained diffusion latent space and performs efficient one-step denoising, restoring structural consistency while avoiding the heavy overhead of iterative diffusion sampling. The overall design preserves the efficiency and generalization of SAM while introducing lightweight trainable adapters and low-rank diffusion updates. Experimental evaluations on multimodal pavement datasets demonstrate that the proposed approach achieves higher segmentation accuracy and robustness compared with state-of-the-art fusion methods. The results highlight the potential of the proposed framework to support intelligent pavement inspection, condition assessment, and maintenance decision-making.
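The abstract mentions low-rank updates without giving their form. The sketch below assumes a LoRA-style update, in which a frozen weight matrix W is adjusted by the product of two small trainable matrices; this is an assumption about the mechanism, and all function names and sizes are hypothetical. It also shows why such updates are lightweight: the trainable share of parameters is rank × (rows + cols) / (rows × cols).

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def low_rank_update(weight, up, down, scale=1.0):
    """Return weight + scale * (up @ down) without modifying `weight`.

    weight: rows x cols (frozen), up: rows x rank, down: rank x cols.
    """
    delta = matmul(up, down)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


def trainable_fraction(rows, cols, rank):
    """Share of parameters trained by a rank-`rank` update vs. the full matrix."""
    return rank * (rows + cols) / (rows * cols)


frozen = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical 2x2 frozen weight
up = [[1.0], [0.0]]                # rows x rank, rank = 1
down = [[0.0, 2.0]]                # rank x cols
print(low_rank_update(frozen, up, down, scale=0.5))
print(trainable_fraction(1024, 1024, 8))
```

For a hypothetical 1024 × 1024 layer with rank 8, the update trains 1/64 of the parameters a full fine-tune would touch.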


Volume 71, Article 104339

Citations: 0

Motion-prior and Confidence-aware Gaussian Splatting (MCGS) SLAM for 3D scene reconstruction of indoor built environments

IF 9.9 Zone 1 (Engineering Technology) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Advanced Engineering Informatics

Pub Date : 2026-01-14 DOI: 10.1016/j.aei.2026.104317

Yuanyuan Deng, Vincent J.L. Gan

3D Gaussian Splatting SLAM is an emerging area for 3D scene reconstruction, yet it faces challenges in maintaining motion and temporal consistency under rapid motion, illumination changes, and dynamic object occlusion in the built environment. This paper proposes a motion-prior and confidence-aware Gaussian Splatting (MCGS) SLAM. First, it harnesses a probabilistic motion-prior framework that enforces kinematic constraints and adapts to varying motion states. Second, a confidence estimation mechanism evaluates the reliability of Gaussian primitives based on temporal, geometric, photometric, and structural indicators to guide a balanced spatial representation. Third, an adaptive keyframe selection method optimizes keyframe density and improves temporal coherence by dynamically adjusting keyframe frequency. Lastly, multi-task optimization combines photometric and geometric losses, probabilistic motion constraints, and confidence-based weighting, enabling joint optimization of pose tracking accuracy and mapping quality. Experiments show that MCGS-SLAM achieves an 83% reduction in trajectory error and gains in 3D mapping quality while maintaining a competitive frame rate.
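The abstract does not state how the four indicators are combined or how confidence enters the loss. As a rough sketch under stated assumptions (a plain average of indicators in [0, 1] and a confidence-weighted mean of per-primitive residuals, both hypothetical choices rather than the paper's formulas), confidence-based weighting could look like this:

```python
def primitive_confidence(temporal, geometric, photometric, structural):
    """Average four reliability indicators, each assumed to lie in [0, 1].

    The equal weighting is an illustrative assumption.
    """
    return (temporal + geometric + photometric + structural) / 4.0


def confidence_weighted_loss(residuals, confidences):
    """Confidence-weighted mean of per-primitive residuals."""
    total = sum(confidences)
    if total == 0:
        raise ValueError("at least one primitive needs non-zero confidence")
    return sum(c * r for c, r in zip(confidences, residuals)) / total


# Two hypothetical Gaussian primitives: one reliable, one doubtful.
confidences = [
    primitive_confidence(1.0, 1.0, 1.0, 1.0),
    primitive_confidence(0.5, 0.5, 0.5, 0.5),
]
residuals = [0.2, 0.8]  # hypothetical photometric/geometric residuals
print(confidence_weighted_loss(residuals, confidences))
```

With these numbers the doubtful primitive's large residual counts half as much as it would in an unweighted mean, which pulls the loss toward the reliable primitive.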


Volume 71, Article 104317

Citations: 0

Dynamic reliability informed adaptive task scheduling for multirobot manufacturing system

IF 9.9 Zone 1 (Engineering Technology) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Advanced Engineering Informatics

Pub Date : 2026-01-13 DOI: 10.1016/j.aei.2026.104335

Jian Zhou , Hang Zhang , Buyun Tang , Lianyu Zheng , Yiwei Wang

Efficient and reliable scheduling is critical for multirobot manufacturing systems, yet existing performance diagnosis and reliability evaluation methods fail to capture the dynamic evolution of system states, and current scheduling often neglects reliability information. This paper proposes a collaborative scheduling method integrating dynamic reliability assessment. First, multisource operational signals are collected and fused to identify degradation stages. Based on a classification probability mapping mechanism, the service state is then converted into probabilistic attributes applicable to modeling and scheduling. Subsequently, a system model incorporating structure, state, and behavior is constructed, and reliability indicators of the system are dynamically evaluated through logical model evolution. On this basis, a heuristic multiagent reinforcement learning scheduling algorithm is designed, using reliability attributes and constraint graph structures as inputs to achieve collaborative scheduling of multiple robots. Finally, during task execution, real-time state changes are dynamically perceived, and the scheduling plan is adaptively updated by triggering rescheduling based on the real-time evaluation results, thus forming a reliability-informed closed-loop scheduling mechanism. Case studies demonstrate an 18.2% reduction in task completion time for static allocation compared to four baseline methods, along with a 17.1% improvement for dynamic rescheduling against engineering practices. These quantitative results confirm the method’s significant enhancements in scheduling responsiveness to degradation, adaptive task optimization, and overall system stability and efficiency.


Volume 71, Article 104335

Citations: 0

Enabling AI-driven modular building design: an auto-decoder approach for IFC 3D geometry representation

IF 9.9 Zone 1 (Engineering Technology) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Advanced Engineering Informatics

Pub Date : 2026-01-13 DOI: 10.1016/j.aei.2026.104326

Sang Du , Lei Hou , Guomin (Kevin) Zhang , Yang Zou , Haosen Chen

Modular building design requires numerous context-dependent component variants that traditional constraint-based methods cannot exhaustively enumerate. Industry Foundation Classes (IFC) models encode rich spatial and semantic context from completed modular projects. This context could enable Artificial Intelligence (AI) models to generate component variants and complement constraint-based methods. However, IFC 3D geometry that carries spatial context is not directly usable by AI models because of IFC’s complex data structure. To address this limitation, this paper proposes a readily deployable auto-decoder method that produces AI-compatible vectors from IFC geometry. First, an IFC export strategy that retains component spatial context is employed. Second, a sampling method that pairs 3D points with their distances to the nearest surface is applied. Third, an auto-decoder neural network that jointly optimises per-component vectors and the model weights is presented, yielding context-aware representation vectors for modular components. Finally, an octree-based decoder for accurate geometry recovery from vectors is employed. Experiments on real-world modular project data demonstrate that the resulting vectors preserve geometric fidelity and support component variant generation. Geometric fidelity is confirmed by mean and maximum surface reconstruction errors of 14.57 mm and 51.94 mm, sufficient for modular building design analysis. Support for component variant generation is evidenced by geometric interpolation linearity exceeding 0.98 out of 1, showing excellent variant generation suitability. This method makes IFC spatial context accessible to AI-driven modular design methods, transforming Design for Manufacture and Assembly (DfMA) data into actionable knowledge. Code is available on GitHub.
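The second step, pairing 3D points with their distance to the nearest surface, can be illustrated on a simple shape. The sketch below assumes a signed distance (negative inside, positive outside) and stands in an axis-aligned box for a modular component; the abstract specifies neither, and the function names, margin, and dimensions are hypothetical.

```python
import math
import random


def box_signed_distance(point, half_extents):
    """Signed distance from a point to an axis-aligned box centred at the origin.

    Negative inside the box, zero on the surface, positive outside.
    """
    q = [abs(p) - h for p, h in zip(point, half_extents)]
    outside = math.sqrt(sum(max(c, 0.0) ** 2 for c in q))
    inside = min(max(q), 0.0)
    return outside + inside


def sample_distance_pairs(half_extents, count, margin=0.5, seed=0):
    """Draw random points around the box and pair each with its signed distance."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        point = tuple(rng.uniform(-(h + margin), h + margin)
                      for h in half_extents)
        pairs.append((point, box_signed_distance(point, half_extents)))
    return pairs


# Hypothetical wall panel, half-extents in metres (x, y, z).
panel = (1.5, 0.1, 1.2)
for point, distance in sample_distance_pairs(panel, count=3):
    print(point, distance)
```

Pairs like these are the kind of training data an auto-decoder could fit: the network receives a point and a per-component vector and is trained to predict the distance, so the vector ends up encoding the component's shape.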


Volume 71, Article 104326

Citations: 0