Benchmarking deep neural networks for gesture recognition on embedded devices *

Stefano Bini, Antonio Greco, Alessia Saggese, M. Vento
DOI: 10.1109/RO-MAN53752.2022.9900705 (https://doi.org/10.1109/RO-MAN53752.2022.9900705)
Published in: 2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)
Publication date: 2022-08-29
Citations: 1

Abstract

Gestures are among the most common forms of human communication. In recent years, as factories adapt to the Industry 4.0 paradigm, the scientific community has shown growing interest in designing Gesture Recognition (GR) algorithms for Human-Robot Interaction (HRI) applications. In this context, a GR algorithm must run in real time on embedded platforms with limited resources. However, the neural networks proposed in the literature (2D and 3D) and the modalities used to feed them (RGB, RGB-D, optical flow) are typically optimized for accuracy, with little attention paid to feasibility on low-power hardware. Analyzing the trade-off between accuracy and computational burden, for both networks and modalities, is therefore important if GR algorithms are to work in industrial robotics applications. In this paper, we perform a wide benchmark that considers not only accuracy but also computational burden, involving two architectures (2D and 3D), two backbones (MobileNet, ResNeXt), and four types of input modality (RGB, Depth, Optical Flow, Motion History Image) and their combinations.
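The benchmark design described in the abstract can be pictured as a grid of configurations. The following is a minimal sketch, not the authors' code: it assumes that every non-empty combination of the four modalities is a candidate input, which the abstract does not confirm (the paper may evaluate only a subset). The function names `modality_combinations` and `benchmark_grid` are hypothetical.

```python
from itertools import combinations, product

# Names taken from the abstract.
ARCHITECTURES = ["2D", "3D"]
BACKBONES = ["MobileNet", "ResNeXt"]
MODALITIES = ["RGB", "Depth", "Optical Flow", "Motion History Image"]


def modality_combinations(modalities):
    """Return every non-empty subset of the modalities, smallest subsets first."""
    return [
        combo
        for size in range(1, len(modalities) + 1)
        for combo in combinations(modalities, size)
    ]


def benchmark_grid():
    """Hypothetical full grid: architecture x backbone x modality subset.

    Assumption: all subsets are evaluated; the paper may cover fewer.
    """
    return list(
        product(ARCHITECTURES, BACKBONES, modality_combinations(MODALITIES))
    )


if __name__ == "__main__":
    for architecture, backbone, inputs in benchmark_grid():
        print(architecture, backbone, " + ".join(inputs))
```

Under this assumption the grid is an upper bound on the number of configurations; each entry would then be scored on both accuracy and computational burden on the target embedded device.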