A semantic real-time activity recognition system for sequential procedures in vocational learning
J. Magro, Daren Scerri
Proceedings of the 6th International Conference on Advances in Artificial Intelligence, 2022-10-21
DOI: 10.1145/3571560.3571579 (https://doi.org/10.1145/3571560.3571579)
Citations: 0
Abstract
In many areas of study, established standard procedures are critical to the successful completion of a kinaesthetic task. Such procedures are important in industries such as engineering and health. This study makes a case for the development of intelligent activity monitoring systems for learning purposes through a proof of concept in first-aid training. Minor injuries such as simple cuts, bruises and minor burns are frequently treated without emergency medical services. However, an incorrect first-aid procedure may lead to medical complications. This study aims to help a learner practise a first-aid procedure for treating a wound through real-time monitoring, instructions and feedback. We propose a three-phase system in which fast object detection, activity recognition in the temporal dimension, and sequencing are used to semantically understand learner actions. In phase 1, You Only Look Once (YOLOv5) was used to detect multiple objects such as wounds and bandages, and MediaPipe was used to detect hand landmarks. Each class was assigned its own detection threshold for more accurate detections. The object detection model achieved a mean Average Precision (mAP) of 72.74% on the validation set and was subsequently applied over time to recognize actions. The temporal method for recognizing the action of applying pressure over a wound achieved an F1-score of 91.67%. The ontology-based method for recognizing the action of applying a bandage achieved an F1-score of 90.91%. The optimum camera distance was found to be the one at which the wounded actor's arm occupies a significant portion of the viewport, whilst the optimum camera angle was found to be 110°. The sequencing algorithm was tested in three different scenarios with a number of participants. The overall accuracy was 83.33%, which indicates that the algorithm can identify the sequence being performed even when bandage application involves minimal movement. The proposed system shows strong potential for addressing challenges in real-world environments.
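
The per-class thresholds and the temporal use of per-frame detections described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Detection` type, the threshold values, the bounding-box overlap test, the window length and the `min_ratio` vote are all assumptions chosen for illustration, and the hand is represented here by a box that is assumed to be derived from the hand landmarks.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str
    confidence: float
    box: tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


# Hypothetical per-class confidence thresholds (values are illustrative only).
CLASS_THRESHOLDS = {"wound": 0.40, "bandage": 0.55}
DEFAULT_THRESHOLD = 0.50


def filter_detections(detections):
    """Keep only detections that meet the threshold assigned to their class."""
    return [
        d for d in detections
        if d.confidence >= CLASS_THRESHOLDS.get(d.label, DEFAULT_THRESHOLD)
    ]


def boxes_overlap(a, b):
    """True if two axis-aligned boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def hand_on_wound(detections):
    """Per-frame cue: some accepted hand box overlaps some accepted wound box."""
    kept = filter_detections(detections)
    wounds = [d.box for d in kept if d.label == "wound"]
    hands = [d.box for d in kept if d.label == "hand"]
    return any(boxes_overlap(w, h) for w in wounds for h in hands)


class TemporalRecognizer:
    """Reports the action once the cue holds in enough of the recent frames."""

    def __init__(self, window=30, min_ratio=0.8):
        self.history = deque(maxlen=window)  # sliding window of per-frame cues
        self.min_ratio = min_ratio

    def update(self, frame_detections):
        self.history.append(hand_on_wound(frame_detections))
        full = len(self.history) == self.history.maxlen
        return full and sum(self.history) / len(self.history) >= self.min_ratio
```

Calling `update` once per video frame returns `True` only after the window has filled and the cue has held in at least `min_ratio` of the frames in it, so a single missed detection does not interrupt the recognized action.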