Detecting Arbitrary Keypoints on Limbs and Skis with Sparse Partly Correct Segmentation Masks

2023 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW). Pub Date: 2022-11-17. DOI: 10.1109/WACVW58289.2023.00051

K. Ludwig, Daniel Kienzle, Julian Lorenz, R. Lienhart


Citations: 3

Abstract



Analyses based on the body posture are crucial for top-class athletes in many sports disciplines. If at all, coaches label only the most important keypoints, since manual annotations are very costly. This paper proposes a method to detect arbitrary keypoints on the limbs and skis of professional ski jumpers that requires a few, only partly correct segmentation masks during training. Our model is based on the Vision Transformer architecture with a special design for the input tokens to query for the desired keypoints. Since we use segmentation masks only to generate ground truth labels for the freely selectable keypoints, partly correct segmentation masks are sufficient for our training procedure. Hence, there is no need for costly hand-annotated segmentation masks. We analyze different training techniques for freely selected and standard keypoints, including pseudo labels, and show in our experiments that only a few partly correct segmentation masks are sufficient for learning to detect arbitrary keypoints on limbs and skis.
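The abstract does not spell out how a ground truth label for a freely selectable keypoint is derived from a segmentation mask. The sketch below shows one plausible reading and should not be taken as the authors' implementation. It assumes a limb or ski is enclosed by two standard keypoints, and that an arbitrary keypoint is described by a relative position along the line between them (`along`) and a relative position across the mask, perpendicular to that line (`across`). The function names, the parameterization, and the rule of skipping a sample when the mask is missing at the queried location are all illustrative assumptions.

```python
import numpy as np


def boundary_distance(mask, start, direction, max_steps=1000):
    """Walk from `start` along `direction` in unit steps and return the
    distance covered while still inside the mask (hypothetical helper)."""
    h, w = mask.shape
    dist = 0.0
    for step in range(1, max_steps + 1):
        x, y = start + step * direction
        xi, yi = int(round(x)), int(round(y))
        if not (0 <= xi < w and 0 <= yi < h) or not mask[yi, xi]:
            break
        dist = float(step)
    return dist


def sample_keypoint(mask, kp_a, kp_b, along, across):
    """Return (x, y) for an arbitrary keypoint, or None if no label can be made.

    mask:   boolean array (height, width), possibly only partly correct
    kp_a/b: (x, y) standard keypoints enclosing the limb or ski
    along:  position in [0, 1] on the line from kp_a to kp_b
    across: position in [-1, 1]; 0 is on the line, +/-1 on the mask boundary
    """
    kp_a = np.asarray(kp_a, dtype=float)
    kp_b = np.asarray(kp_b, dtype=float)
    axis = kp_b - kp_a
    length = np.linalg.norm(axis)
    if length == 0:
        return None
    unit = axis / length
    normal = np.array([-unit[1], unit[0]])  # perpendicular to the limb axis

    center = kp_a + along * axis
    cx, cy = int(round(center[0])), int(round(center[1]))
    h, w = mask.shape
    # Assumption: if the mask is wrong or missing here, skip this sample.
    if not (0 <= cx < w and 0 <= cy < h) or not mask[cy, cx]:
        return None

    side = normal if across >= 0 else -normal
    return center + abs(across) * boundary_distance(mask, center, side) * side
```

Under these assumptions, a defect in one region of the mask only prevents sampling labels in that region, which is one way a partly correct mask could still be useful for training.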
