Yoga Posture Recognition by Learning Spatial-Temporal Feature with Deep Learning Techniques

International Journal of Image and Graphics (IF 0.8, Q4, Computer Science, Software Engineering). Pub Date: 2023-07-21. DOI: 10.1142/s0219467824500554

J. Palanimeera, K. Ponmozhi

{"title":"Yoga Posture Recognition by Learning Spatial-Temporal Feature with Deep Learning Techniques","authors":"J. Palanimeera, K. Ponmozhi","doi":"10.1142/s0219467824500554","DOIUrl":null,"url":null,"abstract":"Yoga posture recognition remains a difficult issue because of crowded backgrounds, varied settings, occlusions, viewpoint alterations, and camera motions, despite recent promising advances in deep learning. In this paper, the method for accurately detecting various yoga poses using DL (Deep Learning) algorithms is provided. Using a standard RGB camera, six yoga poses — Sukhasana, Kakasana, Naukasana, Dhanurasana, Tadasana, and Vrikshasana — were captured on ten people, five men and five women. In this study, a brand-new DL model is presented for representing the spatio-temporal (ST) variation of skeleton-based yoga poses in movies. It is advised to use a variety of representation learners to pry video-level temporal recordings, which combine spatio-temporal sampling with long-range time mastering to produce a successful and effective training approach. A novel feature extraction method using Open Pose is described, together with a DenceBi-directional LSTM network to represent spatial-temporal links in both the forward and backward directions. This will increase the efficacy and consistency of modeling long-range action detection. To improve temporal pattern modeling capability, they are stacked and combined with dense skip connections. To improve performance, two modalities from look and motion are fused with a fusion module and compared to other deep learning models are LSTMs including LSTM, Bi-LSTM, Res-LSTM, and Res-BiLSTM. 
Studies on real-time datasets of yoga poses show that the suggested DenseBi-LSTM model performs better and yields better results than state-of-the-art techniques for yoga pose detection.","PeriodicalId":44688,"journal":{"name":"International Journal of Image and Graphics","volume":null,"pages":null},"PeriodicalIF":0.8000,"publicationDate":"2023-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Image and Graphics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/s0219467824500554","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, SOFTWARE ENGINEERING","Score":null,"Total":0}

Citation count: 0

Abstract

Despite recent advances in deep learning, yoga posture recognition remains difficult because of cluttered backgrounds, varied settings, occlusions, viewpoint changes, and camera motion. This paper presents a method for accurately detecting various yoga poses using deep learning (DL) algorithms. Six yoga poses (Sukhasana, Kakasana, Naukasana, Dhanurasana, Tadasana, and Vrikshasana) were captured with a standard RGB camera from ten people, five men and five women. A new DL model is presented for representing the spatio-temporal (ST) variation of skeleton-based yoga poses in videos. Several representation learners are used to model video-level temporal structure, combining spatio-temporal sampling with long-range temporal learning to yield an effective and efficient training approach. A novel feature extraction method using OpenPose is described, together with a dense bidirectional LSTM (DenseBi-LSTM) network that represents spatio-temporal relations in both the forward and backward directions; this is intended to make the modeling of long-range actions more effective and consistent. To improve temporal pattern modeling, the bidirectional LSTM layers are stacked and combined with dense skip connections. To improve performance, two modalities, appearance and motion, are fused by a fusion module, and the resulting model is compared with other LSTM-based deep learning models: LSTM, Bi-LSTM, Res-LSTM, and Res-BiLSTM. Experiments on real-time yoga pose datasets show that the proposed DenseBi-LSTM model outperforms state-of-the-art techniques for yoga pose detection.
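The dense skip connections mentioned in the abstract can be illustrated by tracking feature widths through a stacked bidirectional LSTM. The sketch below is not the authors' implementation. It assumes that each layer receives the raw skeleton features concatenated with the outputs of all preceding layers, that the classifier sees the concatenation of everything, and that OpenPose yields 18 keypoints with (x, y) coordinates per frame. The abstract states none of these details, and every name and size below is hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Assumption: 18 OpenPose keypoints with (x, y) per joint, i.e. 36 values per frame.
NUM_KEYPOINTS = 18
COORDS_PER_KEYPOINT = 2

# The six poses named in the abstract; their count sets the classifier output size.
POSES = ["Sukhasana", "Kakasana", "Naukasana",
         "Dhanurasana", "Tadasana", "Vrikshasana"]


@dataclass
class LayerSpec:
    index: int
    input_width: int   # per-frame feature width fed into this layer
    output_width: int  # per-frame feature width this layer emits


def dense_bilstm_specs(feature_dim: int, hidden_size: int,
                       num_layers: int) -> List[LayerSpec]:
    """Per-layer widths for a densely connected stack of Bi-LSTM layers."""
    if feature_dim < 1 or hidden_size < 1 or num_layers < 1:
        raise ValueError("all sizes must be positive")
    # A bidirectional layer emits forward and backward hidden states side by side.
    out_width = 2 * hidden_size
    specs = []
    for k in range(num_layers):
        # Dense connectivity: raw features plus the outputs of all k earlier layers.
        in_width = feature_dim + k * out_width
        specs.append(LayerSpec(k, in_width, out_width))
    return specs


def classifier_input_width(specs: List[LayerSpec], feature_dim: int) -> int:
    """Assumed classifier input: raw features plus every layer's output."""
    return feature_dim + sum(s.output_width for s in specs)


feature_dim = NUM_KEYPOINTS * COORDS_PER_KEYPOINT  # 36
specs = dense_bilstm_specs(feature_dim, hidden_size=64, num_layers=3)
print([s.input_width for s in specs])              # [36, 164, 292]
print(classifier_input_width(specs, feature_dim))  # 420
print(len(POSES))                                  # 6 output classes
```

Under these assumptions the input width grows linearly with depth, which is the usual cost of dense connectivity compared with residual (Res-LSTM style) connections, where the width stays fixed.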


Source journal

International Journal of Image and Graphics (Computer Science, Software Engineering)

CiteScore

2.40

Self-citation rate

18.80%

Number of publications