Transformer-based deep learning model and video dataset for installation action recognition in offsite projects

Automation in Construction, Volume 172, Article 106042
IF 11.5 | Region 1 (Engineering Technology) | JCR Q1, Construction & Building Technology
Pub Date: 2025-04-01 | Epub Date: 2025-02-06 | DOI: 10.1016/j.autcon.2025.106042
Full text: https://www.sciencedirect.com/science/article/pii/S0926580525000822
Junyoung Jang, Eunbeen Jeong, Tae Wan Kim
Citations: 0

Abstract

This paper developed and evaluated the Precast Concrete Installation Dataset (PCI-Dataset), a large-scale video dataset for automatically recognizing precast concrete (PC) installation activities. The dataset comprises 12,791 video clips (5 s each, 1080 × 1080 resolution, 30 fps) from actual PC construction sites, covering 12 balanced activity classes that combine three component types and four work stages. Evaluation of six Transformer-based video classification models showed that VideoMAE V2 achieved the highest overall accuracy of 98.10%, followed by UniFormer V2, Video Swin, MVIT, ViViT, and TimeSformer. VideoMAE V2 achieved F1 scores above 80% for most activities, with a peak of 92.20% for slab assembly. In a case study on a real PC construction site, the model demonstrated high recognition accuracies: 100% for lifting, 85.83–100% for rigging, and 93.75–100% for assembly operations. The paper contributes to PC construction management theory by applying computer vision for real-time and automated work recognition and analysis.
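
The abstract reports overall accuracy and per-class F1 over 12 classes formed from three component types and four work stages. The following is a minimal Python sketch of how such a label space and those two metrics could be computed. It is not the authors' code. The component and stage names are hypothetical placeholders, since the abstract does not list them all, and the helper names (`confusion_matrix`, `overall_accuracy`, `per_class_f1`) are illustrative.

```python
from itertools import product

# Hypothetical placeholder names: the abstract does not list every
# component type or work stage, so generic labels are used here.
COMPONENTS = ["component_a", "component_b", "component_c"]
STAGES = ["stage_1", "stage_2", "stage_3", "stage_4"]

# 3 component types x 4 work stages = 12 activity classes.
CLASSES = [f"{c}/{s}" for c, s in product(COMPONENTS, STAGES)]

# Clip geometry stated in the abstract: 5 s at 30 fps.
CLIP_SECONDS = 5
FPS = 30
FRAMES_PER_CLIP = CLIP_SECONDS * FPS  # 150 frames per clip


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true class indices, columns are predicted class indices."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of clips on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_f1(matrix):
    """F1 = 2*TP / (2*TP + FP + FN), computed for each class."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[i][k] for i in range(n)) - tp  # predicted k, true other
        fn = sum(matrix[k]) - tp                       # true k, predicted other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores
```

Given integer class indices for the ground truth and the model predictions, `per_class_f1(confusion_matrix(y_true, y_pred, len(CLASSES)))` returns one score per entry of `CLASSES`. Overall accuracy and per-class F1 measure different things, so the two can diverge for individual classes.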

Source journal
Automation in Construction (Engineering Technology: Civil Engineering)
CiteScore: 19.20
Self-citation rate: 16.50%
Articles published: 563
Review time: 8.5 months
About the journal: Automation in Construction is an international journal that publishes original research papers on the use of information technologies in the construction industry. Topics include design, engineering, construction technologies, and the maintenance and management of constructed facilities. Its scope covers all stages of the construction life cycle: initial planning and design, construction of the facility, operation and maintenance, and the eventual dismantling and recycling of buildings and engineering structures.