Video-Based Pose Estimation for Gait Analysis in Stroke Survivors during Clinical Assessments: A Proof-of-Concept Study

Digital Biomarkers (Q1, Computer Science) | Pub Date: 2022-01-13 | DOI: 10.1159/000520732
L. Lonini, Y. Moon, Kyle R. Embry, R. Cotton, K. McKenzie, Sophia Jenz, A. Jayaraman
{"title":"Video-Based Pose Estimation for Gait Analysis in Stroke Survivors during Clinical Assessments: A Proof-of-Concept Study","authors":"L. Lonini, Y. Moon, Kyle R. Embry, R. Cotton, K. McKenzie, Sophia Jenz, A. Jayaraman","doi":"10.1159/000520732","DOIUrl":null,"url":null,"abstract":"Recent advancements in deep learning have produced significant progress in markerless human pose estimation, making it possible to estimate human kinematics from single camera videos without the need for reflective markers and specialized labs equipped with motion capture systems. Such algorithms have the potential to enable the quantification of clinical metrics from videos recorded with a handheld camera. Here we used DeepLabCut, an open-source framework for markerless pose estimation, to fine-tune a deep network to track 5 body keypoints (hip, knee, ankle, heel, and toe) in 82 below-waist videos of 8 patients with stroke performing overground walking during clinical assessments. We trained the pose estimation model by labeling the keypoints in 2 frames per video and then trained a convolutional neural network to estimate 5 clinically relevant gait parameters (cadence, double support time, swing time, stance time, and walking speed) from the trajectory of these keypoints. These results were then compared to those obtained from a clinical system for gait analysis (GAITRite®, CIR Systems). Absolute accuracy (mean error) and precision (standard deviation of error) for swing, stance, and double support time were within 0.04 ± 0.11 s; Pearson’s correlation with the reference system was moderate for swing times (r = 0.4–0.66), but stronger for stance and double support time (r = 0.93–0.95). Cadence mean error was −0.25 steps/min ± 3.9 steps/min (r = 0.97), while walking speed mean error was −0.02 ± 0.11 m/s (r = 0.92). These preliminary results suggest that single camera videos and pose estimation models based on deep networks could be used to quantify clinically relevant gait metrics in individuals poststroke, even while using assistive devices in uncontrolled environments. Such development opens the door to applications for gait analysis both inside and outside of clinical settings, without the need of sophisticated equipment.","PeriodicalId":11242,"journal":{"name":"Digital Biomarkers","volume":"6 1","pages":"9 - 18"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"27","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Digital Biomarkers","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1159/000520732","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Computer Science","Score":null,"Total":0}
Citations: 27

Abstract

Recent advancements in deep learning have produced significant progress in markerless human pose estimation, making it possible to estimate human kinematics from single-camera videos without the need for reflective markers and specialized labs equipped with motion capture systems. Such algorithms have the potential to enable the quantification of clinical metrics from videos recorded with a handheld camera. Here we used DeepLabCut, an open-source framework for markerless pose estimation, to fine-tune a deep network to track 5 body keypoints (hip, knee, ankle, heel, and toe) in 82 below-waist videos of 8 patients with stroke performing overground walking during clinical assessments. We trained the pose estimation model by labeling the keypoints in 2 frames per video and then trained a convolutional neural network to estimate 5 clinically relevant gait parameters (cadence, double support time, swing time, stance time, and walking speed) from the trajectories of these keypoints. These results were then compared to those obtained from a clinical system for gait analysis (GAITRite®, CIR Systems). Absolute accuracy (mean error) and precision (standard deviation of error) for swing, stance, and double support time were within 0.04 ± 0.11 s; Pearson's correlation with the reference system was moderate for swing time (r = 0.4–0.66) but stronger for stance and double support time (r = 0.93–0.95). Cadence mean error was −0.25 ± 3.9 steps/min (r = 0.97), while walking speed mean error was −0.02 ± 0.11 m/s (r = 0.92). These preliminary results suggest that single-camera videos and pose estimation models based on deep networks could be used to quantify clinically relevant gait metrics in individuals poststroke, even while using assistive devices in uncontrolled environments. Such developments open the door to applications of gait analysis both inside and outside of clinical settings, without the need for sophisticated equipment.
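The abstract outlines a two-stage pipeline: fine-tune a DeepLabCut model on sparsely labeled frames, then derive gait parameters from the resulting keypoint trajectories. Below is a minimal sketch of the first stage using DeepLabCut's public Python API; the project name, experimenter, and video paths are illustrative assumptions, and the authors' exact training configuration is not given in the abstract.

```python
# Minimal sketch of the DeepLabCut fine-tuning stage described in the
# abstract. Project name, experimenter, and video paths are illustrative
# assumptions; the authors' exact configuration is not published here.
import deeplabcut

# Create a project pointing at the below-waist walking videos.
config_path = deeplabcut.create_new_project(
    "stroke-gait",                    # project name (assumed)
    "lab",                            # experimenter (assumed)
    ["videos/patient01_walk.mp4"],    # video paths (assumed)
    copy_videos=False,
)

# The paper tracks 5 keypoints; in DeepLabCut these are declared in the
# project's config.yaml, e.g.:
#   bodyparts: [hip, knee, ankle, heel, toe]
# The paper labels 2 frames per video, which would correspond to
#   numframes2pick: 2

# Extract frames, label them in the GUI, then build the training set
# and fine-tune the network on the labeled frames.
deeplabcut.extract_frames(config_path, mode="automatic", algo="kmeans")
deeplabcut.label_frames(config_path)   # opens the manual labeling GUI
deeplabcut.create_training_dataset(config_path)
deeplabcut.train_network(config_path)

# Inference: per-frame keypoint trajectories, saved alongside each video.
deeplabcut.analyze_videos(
    config_path, ["videos/patient01_walk.mp4"], save_as_csv=True
)
```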
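For the second stage, the paper trains a convolutional neural network to map keypoint trajectories to the five gait parameters; that model is not described in enough detail in the abstract to reproduce. As a simpler, hypothetical illustration of what the trajectories encode, the sketch below estimates cadence from heel-strike events detected in the heel keypoint's horizontal trajectory. This is a rule-based stand-in, not the authors' method; `cadence_from_heel` and the peak-detection settings are assumptions.

```python
# The paper estimates the 5 gait parameters with a convolutional neural
# network trained on keypoint trajectories; that model is not specified in
# the abstract. As a hypothetical rule-based stand-in (NOT the authors'
# method), this sketch estimates cadence from heel strikes detected in the
# heel keypoint's horizontal trajectory.
import numpy as np
from scipy.signal import find_peaks


def cadence_from_heel(heel_x: np.ndarray, fps: float) -> float:
    """Approximate cadence (steps/min) from one heel's horizontal track.

    In a sagittal-plane walking video the heel's forward position
    oscillates once per stride; each forward extremum relative to the
    overall walking direction roughly marks a heel strike of that foot.
    """
    # Remove the overall forward progression so peaks reflect the
    # stride-by-stride oscillation, not motion across the frame.
    trend = np.linspace(heel_x[0], heel_x[-1], len(heel_x))
    strikes, _ = find_peaks(heel_x - trend, distance=max(1, int(0.5 * fps)))
    duration_s = len(heel_x) / fps
    # One foot's heel strikes count strides; doubling assumes roughly
    # symmetric step timing, which may not hold poststroke (one reason
    # the paper uses a learned model instead).
    return 2 * 60.0 * len(strikes) / duration_s


# Hypothetical usage with the CSV/HDF output of analyze_videos, assuming
# a 30 fps video and DeepLabCut's (scorer, bodypart, coord) columns:
# heel_x = df[(scorer, "heel", "x")].to_numpy()
# print(f"cadence ~ {cadence_from_heel(heel_x, fps=30.0):.1f} steps/min")
```

Swing, stance, and double support times could in principle be derived from paired heel-strike and toe-off events of both feet; presumably this is the temporal structure the paper's CNN learns to extract directly.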
Source Journal

Digital Biomarkers
Field: Medicine (miscellaneous)
CiteScore: 10.60
Self-citation rate: 0.00%
Articles per year: 12
Review time: 23 weeks
Latest Articles in This Journal

The Imperative of Voice Data Collection in Clinical Trials
eHealth and mHealth in Antimicrobial Stewardship Programs
Detecting Longitudinal Trends between Passively Collected Phone Use and Anxiety among College Students
Video Assessment to Detect Amyotrophic Lateral Sclerosis
Digital Vocal Biomarker of Smoking Status Using Ecological Audio Recordings: Results from the Colive Voice Study