Kaixin Chen, Lin Zhang, Zhong Wang, Shengjie Zhao, Yicong Zhou
Skeleton-aware Graph-based Adversarial Networks for Human Pose Estimation from Sparse IMUs
ACM Transactions on Multimedia Computing, Communications, and Applications (Impact Factor 5.2; JCR Q1, Computer Science, Information Systems)
Published: 2024-05-29
DOI: https://doi.org/10.1145/3669904
Citations: 0
Abstract
Sparse-inertial human pose estimation (SI-HPE), which uses only a few IMUs, has recently shown great potential in various fields. The most advanced work in this area achieves reasonable results using only six IMUs. However, two major issues remain. First, existing methods typically treat SI-HPE as a temporal sequence learning problem and often ignore the important spatial prior of skeletal topology. Second, their training data contain far more synthetic data than real data, and the two distributions differ considerably, which makes it difficult to apply the models to more diverse real data. To address these issues, we propose the “Graph-based Adversarial Inertial Poser (GAIP)”, which tracks body movements using sparse data from six IMUs. To make full use of the spatial prior, we design a multi-stage pose regressor with graph convolution to explicitly learn the skeletal topology. A joint position loss is also introduced to implicitly mine spatial information. To enhance generalization, we supervise the pose regression with an adversarial loss from a discriminator, exploiting the ability of adversarial networks to learn implicit constraints. Additionally, we construct a real dataset that includes hip support movements and a synthetic dataset containing various motion categories to enrich the diversity of inertial data for SI-HPE. Extensive experiments demonstrate that GAIP produces results with more precise limb movement amplitudes and relative joint positions, along with smaller joint angle and position errors than state-of-the-art counterparts. The datasets and code are publicly available at https://cslinzhang.github.io/GAIP/.
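To make the idea of learning skeletal topology with graph convolution more concrete, here is a minimal sketch of a generic graph convolution layer and a mean per-joint position loss. It is not the GAIP implementation: the five-joint toy skeleton, the feature sizes, the symmetric normalization, and the function names are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy skeleton (not the skeleton used in the paper):
# 0 = pelvis, 1 = spine, 2 = head, 3 = left hip, 4 = right hip.
EDGES = [(0, 1), (1, 2), (0, 3), (0, 4)]
NUM_JOINTS = 5


def normalized_adjacency(edges, num_joints):
    """Build D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph."""
    a = np.eye(num_joints)  # self-loops so each joint keeps its own features
    for i, j in edges:
        a[i, j] = 1.0
        a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(features, a_hat, weight):
    """One generic graph convolution: ReLU(A_hat @ X @ W)."""
    return np.maximum(a_hat @ features @ weight, 0.0)


def joint_position_loss(pred, target):
    """Mean Euclidean distance between predicted and target joint positions."""
    return float(np.mean(np.linalg.norm(pred - target, axis=-1)))


rng = np.random.default_rng(0)
a_hat = normalized_adjacency(EDGES, NUM_JOINTS)
features = rng.normal(size=(NUM_JOINTS, 6))  # assumed 6 input features per joint
weight = rng.normal(size=(6, 3))             # assumed 3 output features per joint
out = graph_conv(features, a_hat, weight)
```

Because the adjacency matrix is zero for unconnected joints, each joint's output mixes only its own features with those of its skeletal neighbors, which is one way a spatial prior can be built into a regressor.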
About the journal:
The ACM Transactions on Multimedia Computing, Communications, and Applications is the flagship publication of the ACM Special Interest Group in Multimedia (SIGMM). It is soliciting paper submissions on all aspects of multimedia. Papers on single media (for instance, audio, video, animation) and their processing are also welcome.
TOMM is a peer-reviewed, archival journal, available in both print and digital form. The journal is published quarterly, with roughly seven 23-page articles in each issue. In addition, all Special Issues are published online-only to ensure timely publication. The journal consists primarily of research papers. As an archival journal, it is intended that its papers will have lasting importance and value over time. In general, papers whose primary focus is on particular multimedia products or the current state of the industry will not be included.