Automated path-planning strategy for robotic inspection of underground utilities based on building information model
Zihan Yang, Jiangpeng Shu, Jishuang Jiang, Wentao Han, Yichang Wang, Liang Zhao, Yong Bai
Computer-Aided Civil and Infrastructure Engineering, 40(29), 5554–5575. https://doi.org/10.1111/mice.70107

This paper proposes a fully automated end-to-end inspection-path-planning strategy for underground utilities, such as pipelines, based on building information modeling (BIM). An automatic extraction method is developed to process utility information from BIM models, using a registration step that pairs each pipeline with its corresponding utility branch. This is followed by geometric modification via offset algorithms that account for obstacle dimensions to generate safe navigation paths. A novel inspection algorithm, the utility-Chinese postman problem (U-CPP), is introduced to generate a topological map and ensure full-coverage inspection. A Dynamo prototype integrates all these algorithms, minimizing manual intervention and achieving full-process automation. The method is validated with three real-world utility BIM models featuring diverse cross-sectional configurations. The U-CPP algorithm achieves 100% coverage with minimal repetition rates and computes optimized inspection paths in 24, 23, and 23 ms, respectively. Results demonstrate that the proposed strategy efficiently automates both information extraction and full-coverage path planning. The U-CPP algorithm proves to be robust, computationally efficient, and effective in handling diverse utility configurations.
Hypothesis generation from pragmatic causal relationships for latent knowledge reasoning in the civil engineering domain
Sangbin Lee, Robin Eunju Kim
Computer-Aided Civil and Infrastructure Engineering, 40(29), 5447–5473. https://doi.org/10.1111/mice.70101

Structural health monitoring (SHM) research generates vast amounts of information, much of it in unstructured data formats. To date, most natural language processing (NLP) applications focus on extracting information at the syntactic or semantic level rather than surfacing latent knowledge and generating new information at the pragmatic level. This study therefore proposes a pragmatic NLP framework integrating a named entity recognition (NER) model (BERT–BiLSTM–CRF), a domain-specific knowledge graph (KG), and hypothesis generation. Using a labeled dataset, the semantic-aware NER model achieved an accuracy of 0.8998 and an F1 score of 0.8705, allowing precise label prediction for unseen texts. The domain-specific KG then encoded interrelations across diverse literature, blending insights from multiple sources. From this enriched KG, the framework generated candidate hypotheses that surface latent knowledge. The generated hypotheses are validated by showing a strong correlation with the literature. The results demonstrate the potential of pragmatic NLP for SHM, offering pathways for latent knowledge reasoning and cross-disciplinary research insight discovery.
A spatial graph learning framework for multi-scale road safety management based on road-curve features and open-source data
Zhixiang Gao, Hanzhang Ge, Said M. Easa, Yue Liu, HengYan Pan, Yonggang Wang
Computer-Aided Civil and Infrastructure Engineering, 40(28), 5228–5252. https://doi.org/10.1111/mice.70104

Horizontal and vertical curves significantly affect crash risk through their impact on driver behavior, vehicle dynamics, and sight distance. However, their combined effects and spatial interactions remain underexplored in large-scale safety assessments. To address the scarcity of high-resolution geometric data and the limitations of existing spatial modeling, this study proposes a geometry-oriented crash risk assessment framework based on graph neural networks. Leveraging open-source geospatial data, the study extracts fine-grained curve features and constructs a GraphSAGE model to capture spatial dependencies among road segments. A dual-graph architecture is developed to jointly encode segment-level and network-level information. In large-scale empirical evaluations, the proposed model exhibits excellent predictive performance (F1 > 0.985) and strong spatial correlation with historical crash distributions (r > 0.7). It effectively identifies high-risk segments characterized by poor geometric continuity or abrupt structural transitions, thereby supporting informed decisions for alignment improvements. This research deepens the understanding of the geometry–safety relationship and offers a scalable, open-source tool to support local and regional traffic safety interventions.
Methodology for generating diverse geotechnical datasets using Monte Carlo simulation and genetic algorithms
Junghee Park, Hyung-Koo Yoon
Computer-Aided Civil and Infrastructure Engineering, 40(29), 5494–5511. https://doi.org/10.1111/mice.70106

The reliability of machine learning heavily depends on training data; however, in geotechnical engineering, diverse datasets are difficult to obtain due to economic and accessibility constraints. This study proposes a method for generating training data for machine learning by combining Monte Carlo simulations and genetic algorithms. The original data sample is constructed on a 1 × 1 m grid over a slope, based on geotechnical properties measured in 23 regions, including soil cohesion, slope angle, soil density, soil depth, and friction angle. From this original sample, predictions are made at an additional 1777 grid locations to estimate the spatial distribution of geotechnical properties across the entire slope. When a single variable is used as input, the log-likelihood values (e.g., –5.4 to –144.5) serve only as relative indicators, not as absolute measures. The results are also compared with those generated by existing algorithms such as the synthetic minority oversampling technique and adaptive synthetic sampling: the data generated by the proposed method exhibit fewer duplicate values, broader distribution ranges, and greater diversity. To ensure that the generated data closely match the statistical characteristics of the actual data, the combination of input variables is configured to maximize the log-likelihood value; to achieve this, Pearson correlation values are referenced, and multivariate inputs are constructed from highly correlated factors. With this approach, the log-likelihood value increased by 21% to 96%. The study demonstrates that combining Monte Carlo simulations and genetic algorithms generates data with more diverse distributions than existing methods, and that constructing multivariable input data is preferable for improving reliability.
Freight rail activity inventory system using a vision-based deep learning framework
Guoliang Feng, Yiqiao Li, Andre Y. C. Tok, Stephen G. Ritchie
Computer-Aided Civil and Infrastructure Engineering, 40(27), 4692–4717. https://doi.org/10.1111/mice.70083

Rail freight serves as a reliable, cost-effective, and fuel-efficient mode of long-distance ground freight transportation. Existing rail data sources rely heavily on aggregate reports, leaving significant spatiotemporal data gaps for infrastructure planning and regulatory evaluation. This paper presents RailVM, a vision-based deep learning framework for freight rail monitoring using infrared-enabled side-fire cameras. RailVM accurately identifies railcar and locomotive classes and extracts unique locomotive tag identifiers for continuous 24/7 monitoring. It introduces three key innovations: first, a depth-aware background modeling module that incorporates depth information to improve foreground extraction across diverse environments; second, an advanced you-only-look-once (YOLO)-based object-detection model, rail-specific-YOLO, that integrates a triplet attention mechanism and a Rail-intersection-over-union loss function to improve the identification of low-profile railcars; and third, continuous day–night monitoring using infrared imaging to ensure accurate performance even in low-visibility conditions. RailVM was designed and validated for independent transferability at major rail freight gateways in California. It reduced gondola count errors from 41% to 2% and achieved under 5% mean error across 14 railcar classes in both RGB and infrared modes of operation, outperforming baselines and demonstrating potential for robust real-world generalization.
The cover image is based on the article "Muck volume measurement of earth pressure balance shield using 3D point cloud based on deep learning" by Shaojie Qin et al., https://doi.org/10.1111/mice.70067.