AveRobot: An Audio-visual Dataset for People Re-identification and Verification in Human-Robot Interaction

M. Marras, Pedro A. Marín-Reyes, J. Lorenzo-Navarro, M. C. Santana, G. Fenu
{"title":"AveRobot: An Audio-visual Dataset for People Re-identification and Verification in Human-Robot Interaction","authors":"M. Marras, Pedro A. Marín-Reyes, J. Lorenzo-Navarro, M. C. Santana, G. Fenu","doi":"10.5220/0007690902550265","DOIUrl":null,"url":null,"abstract":"Intelligent technologies have pervaded our daily life, making it easier for people to complete their activities. One emerging application is involving the use of robots for assisting people in various tasks (e.g., visiting a museum). In this context, it is crucial to enable robots to correctly identify people. Existing robots often use facial information to establish the identity of a person of interest. But, the face alone may not offer enough relevant information due to variations in pose, illumination, resolution and recording distance. Other biometric modalities like the voice can improve the recognition performance in these conditions. However, the existing datasets in robotic scenarios usually do not include the audio cue and tend to suffer from one or more limitations: most of them are acquired under controlled conditions, limited in number of identities or samples per user, collected by the same recording device, and/or not freely available. In this paper, we propose AveRobot, an audio-visual dataset of 111 participants vocalizing short sentences under robot assistance scenarios. The collection took place into a three-floor building through eight different cameras with built-in microphones. The performance for face and voice re-identification and verification was evaluated on this dataset with deep learning baselines, and compared against audio-visual datasets from diverse scenarios. The results showed that AveRobot is a challenging dataset for people re-identification and verification.","PeriodicalId":410036,"journal":{"name":"International Conference on Pattern Recognition Applications and Methods","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Pattern Recognition Applications and Methods","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5220/0007690902550265","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

Intelligent technologies have pervaded our daily life, making it easier for people to complete their activities. One emerging application involves the use of robots to assist people in various tasks (e.g., visiting a museum). In this context, it is crucial to enable robots to correctly identify people. Existing robots often use facial information to establish the identity of a person of interest. Yet the face alone may not offer enough relevant information due to variations in pose, illumination, resolution and recording distance. Other biometric modalities, such as the voice, can improve recognition performance under these conditions. However, existing datasets in robotic scenarios usually do not include the audio cue and tend to suffer from one or more limitations: most of them are acquired under controlled conditions, limited in the number of identities or samples per user, collected with a single recording device, and/or not freely available. In this paper, we propose AveRobot, an audio-visual dataset of 111 participants vocalizing short sentences under robot assistance scenarios. The collection took place in a three-floor building, using eight different cameras with built-in microphones. Face and voice re-identification and verification performance was evaluated on this dataset with deep learning baselines and compared against audio-visual datasets from diverse scenarios. The results show that AveRobot is a challenging dataset for people re-identification and verification.
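The abstract does not detail how the verification baselines are scored; the snippet below is a minimal sketch of the common approach of comparing two embeddings (face or voice) with cosine similarity and applying a decision threshold. The embedding dimensionality, the threshold value, and the function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(probe_emb: np.ndarray, enrolled_emb: np.ndarray, threshold: float = 0.5) -> bool:
    """Accept the claimed identity if similarity exceeds the threshold.

    The threshold here is a placeholder; in practice it would be tuned
    on a development split (e.g., at the equal error rate).
    """
    return cosine_similarity(probe_emb, enrolled_emb) >= threshold

# Illustrative usage with random 512-d vectors standing in for the
# outputs of a face or voice deep learning baseline.
rng = np.random.default_rng(0)
probe, enrolled = rng.normal(size=512), rng.normal(size=512)
print(verify(probe, enrolled))
```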