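Natural language processing-based deep transfer learning model across diverse tabular datasets for bond strength prediction of composite bars in concrete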

IF 8.5; Region 1, Engineering and Technology; JCR Q1, Computer Science, Interdisciplinary Applications; Computer-Aided Civil and Infrastructure Engineering; Pub Date: 2024-10-12; DOI: 10.1111/mice.13357

Pei‐Fu Zhang, Daxu Zhang, Xiao‐Ling Zhao, Xuan Zhao, Mudassir Iqbal, Yiliyaer Tuerxunmaimaiti, Qi Zhao


Citations: 0



Natural language processing‐based deep transfer learning model across diverse tabular datasets for bond strength prediction of composite bars in concrete

Conventional machine learning models often struggle when training data are scarce or vary in structure. This paper proposes a regression transfer learning framework, the transferable tabular regressor (TransTabRegressor), to address this challenge. TransTabRegressor integrates natural language processing (NLP) for feature encoding, a transformer for enhanced feature representation, and deep learning (DL) for robust modeling, facilitating effective transfer learning across tabular datasets with a reduced set of input parameters. By leveraging the NLP data processor, the framework embeds both parameter names and values, so it can recognize and adapt to different expressions of similar parameters. As a motivating case, the bond strength of fiber-reinforced polymer (FRP) bars embedded in ultra-high-performance concrete (UHPC) is critical for ensuring the integrity of FRP-UHPC structures. Pullout tests are widely adopted because their simplicity allows substantial data to be generated, whereas beam tests approximate actual stress conditions more closely but are more complex, so beam test data are limited. As verification, the framework is applied to predict the bond strength of FRP bars embedded in UHPC using limited beam test data. A pre-trained model is first established using 479 pullout test data points. Subsequently, two transfer learning models are developed by fine-tuning on 115 beam test data points, of which 66 correspond to concrete splitting failure and 49 to pullout failure. For comparison, XGBoost and neural network models are trained directly on the beam test data. Evaluation results demonstrate that the transfer learning models achieve significantly improved prediction accuracy and generalization capability. This study highlights the effectiveness of the proposed TransTabRegressor in handling data scarcity and variability in input parameters across various engineering applications.
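The idea of embedding both parameter names and values can be illustrated with a toy sketch. This is not the authors' implementation: the hash-based `word_vector` is a hypothetical stand-in for a learned NLP embedding, scaling a name vector by its numeric value is an assumption made for illustration, and the column names and values below are invented. The sketch only shows why name-aware tokens let two tables with differently worded or missing columns share one fixed-width representation.

```python
import hashlib
import math

DIM = 8  # toy embedding width (assumption, not from the paper)


def word_vector(word):
    """Deterministic pseudo-embedding for one word (stand-in for a learned embedding)."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    raw = [b / 255.0 - 0.5 for b in digest[:DIM]]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]


def name_vector(name):
    """Average the word vectors of a parameter name, ignoring word order."""
    words = sorted(name.lower().replace("_", " ").split())
    vectors = [word_vector(w) for w in words]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def encode_row(row):
    """Turn one table row into feature tokens: name vector scaled by the value."""
    return [[value * x for x in name_vector(name)] for name, value in row.items()]


def pool(tokens):
    """Sum-pool tokens into one DIM-wide vector, whatever the column count."""
    return [sum(col) for col in zip(*tokens)]


# Hypothetical rows: the beam table words one column differently and lacks another.
pullout_row = {"bar diameter": 12.0, "embedment length": 60.0, "compressive strength": 150.0}
beam_row = {"diameter bar": 12.0, "compressive strength": 150.0}

pullout_repr = pool(encode_row(pullout_row))
beam_repr = pool(encode_row(beam_row))
```

Both `pullout_repr` and `beam_repr` have length `DIM`, so a single downstream regressor could in principle be pre-trained on one table and fine-tuned on the other. In a real system the pooling step would be replaced by a transformer over the tokens, as the abstract describes.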


Source journal

Computer-Aided Civil and Infrastructure Engineering (Engineering and Technology; Engineering: Civil)

CiteScore: 17.60

Self-citation rate: 19.80%

Articles published: 146

Review time: 1 month

Journal description: Computer-Aided Civil and Infrastructure Engineering stands as a scholarly, peer-reviewed archival journal, serving as a vital link between advancements in computer technology and civil and infrastructure engineering. The journal serves as a distinctive platform for the publication of original articles, spotlighting novel computational techniques and inventive applications of computers. Specifically, it concentrates on recent progress in computer and information technologies, fostering the development and application of emerging computing paradigms. Encompassing a broad scope, the journal addresses bridge, construction, environmental, highway, geotechnical, structural, transportation, and water resources engineering. It extends its reach to the management of infrastructure systems, covering domains such as highways, bridges, pavements, airports, and utilities. The journal delves into areas like artificial intelligence, cognitive modeling, concurrent engineering, database management, distributed computing, evolutionary computing, fuzzy logic, genetic algorithms, geometric modeling, internet-based technologies, knowledge discovery and engineering, machine learning, mobile computing, multimedia technologies, networking, neural network computing, optimization and search, parallel processing, robotics, smart structures, software engineering, virtual reality, and visualization techniques.