Task recognition integrating worker actions and machine operations: A video-based sensing approach without physical sensors

Engineering Applications of Artificial Intelligence (IF 7.5; Region 2, Computer Science; JCR Q1, Automation & Control Systems), Volume 147, Article 110232
Pub Date: 2025-02-21 DOI: 10.1016/j.engappai.2025.110232
Shotaro Kataoka, Masashi Oba, Hirofumi Nonaka
Full text: https://www.sciencedirect.com/science/article/pii/S0952197625002325

Abstract

Automating work process analysis is crucial in manufacturing to improve efficiency and productivity. However, traditional deep learning methods often fail to capture subtle temporal changes in machine operations, such as varying speeds. We propose a cost-effective approach called pseudo-sensing, which simulates sensor data by measuring machine speeds directly from video using wavelet transformation, a mathematical tool for time-frequency analysis. This approach eliminates the need for physical sensors.
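
The idea of reading a machine's speed from video with a wavelet transform can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the video has already been reduced to one brightness value per frame inside a region of interest, it uses a Morlet-style wavelet (a Gaussian-windowed complex exponential), and the names `morlet_power`, `dominant_frequency`, and `synthetic_brightness`, as well as the candidate frequencies, are hypothetical.

```python
import cmath
import math


def morlet_power(signal, fps, freq_hz, center, n_cycles=6.0):
    """Power of a Morlet-style wavelet response at frame `center` and `freq_hz`."""
    sigma_t = n_cycles / (2.0 * math.pi * freq_hz)  # Gaussian envelope width (s)
    half = int(3.0 * sigma_t * fps)                 # truncate wavelet at 3 sigma
    acc, norm = 0j, 0.0
    for k in range(-half, half + 1):
        idx = center + k
        if 0 <= idx < len(signal):                  # skip samples outside the clip
            t = k / fps
            envelope = math.exp(-t * t / (2.0 * sigma_t ** 2))
            acc += signal[idx] * envelope * cmath.exp(-2j * math.pi * freq_hz * t)
            norm += envelope
    return abs(acc / norm) ** 2


def dominant_frequency(signal, fps, candidates_hz, center):
    """Candidate frequency with the largest wavelet power around frame `center`."""
    mean = sum(signal) / len(signal)
    centered = [x - mean for x in signal]           # remove constant brightness
    return max(candidates_hz, key=lambda f: morlet_power(centered, fps, f, center))


def synthetic_brightness(fps=30, seconds_per_phase=5):
    """Hypothetical ROI brightness: a machine cycling at 2 Hz, then at 4 Hz."""
    n = fps * seconds_per_phase
    slow = [100 + 20 * math.sin(2 * math.pi * 2.0 * i / fps) for i in range(n)]
    fast = [100 + 20 * math.sin(2 * math.pi * 4.0 * i / fps) for i in range(n)]
    return slow + fast


signal = synthetic_brightness()
candidates = [1.0, 2.0, 3.0, 4.0, 5.0]
slow_hz = dominant_frequency(signal, 30, candidates, center=75)   # mid slow phase
fast_hz = dominant_frequency(signal, 30, candidates, center=225)  # mid fast phase
print(slow_hz, fast_hz)
```

Because the wavelet is localized in time, the estimate follows the speed change between the two phases, which a single Fourier transform over the whole clip would blur together. The per-frame estimates could then serve as the simulated sensor channel.
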
We evaluated pseudo-sensing by integrating it into two task classification models. The first is a convolutional neural network-long short-term memory (CNN-LSTM) model, which extracts spatial features via a CNN and learns temporal patterns using an LSTM. The second is a three-dimensional residual network (3D ResNet, R3D), designed to process spatiotemporal data simultaneously. With pseudo-sensing, the CNN-LSTM’s micro-F1 score (the F1 score computed from true positives, false positives, and false negatives pooled across all classes) improved from 0.712 to 0.736 (+2.4 points), while R3D’s score rose from 0.675 to 0.701 (+2.7 points).
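
For readers unfamiliar with the metric, micro-F1 can be computed as in the sketch below. It assumes single-label classification, where each clip has exactly one true task and one predicted task; the function name `micro_f1` and the task labels are hypothetical and not taken from the paper's dataset.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1 once."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    if total_tp == 0:
        return 0.0
    precision = total_tp / (total_tp + total_fp)
    recall = total_tp / (total_tp + total_fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical task labels for four video clips.
y_true = ["load", "load", "press", "inspect"]
y_pred = ["load", "press", "press", "inspect"]
score = micro_f1(y_true, y_pred)
print(score)
```

In this single-label setting every misclassification counts once as a false positive and once as a false negative, so micro-averaged precision, recall, and F1 all equal plain accuracy.
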
To assess general applicability, we tested pseudo-sensing on another dataset featuring diverse machine motions: unidirectional movements (e.g., conveyor belts), oscillatory movements (e.g., pendulum-like motions), rotational movements (e.g., rotary presses), and intermittent movements (e.g., blinking or toggling mechanisms). The method achieved an 83% success rate in identifying machine dynamics.
By leveraging deep learning, this method integrates video-based machine operation sensing with task recognition, considering both human actions and machine states. Eliminating additional sensors while enhancing accuracy and efficiency, pseudo-sensing offers broad potential for advancing manufacturing process analysis.
Source journal
Engineering Applications of Artificial Intelligence (category: Engineering Technology, Engineering: Electrical & Electronic)
CiteScore: 9.60
Self-citation rate: 10.00%
Articles published: 505
Review time: 68 days
Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.