Computational Audiovisual Scene Analysis in Online Adaptation of Audio-Motor Maps

Rujiao Yan, Tobias Rodemann, B. Wrede
DOI: 10.1109/TAMD.2013.2257766
IEEE Transactions on Autonomous Mental Development, vol. 5, no. 1, pp. 273-287, December 2013
Citations: 3

Abstract

For sound localization, the binaural auditory system of a robot needs audio-motor maps, which represent the relationship between certain audio features and the position of the sound source. This mapping is normally learned during an offline calibration in controlled environments, but we show that using computational audiovisual scene analysis (CAVSA), it can be adapted online in free interaction with a number of a priori unknown speakers. CAVSA enables a robot to understand dynamic dialog scenarios, such as the number and position of speakers, as well as who is the current speaker. Our system does not require specific robot motions and thus can work during other tasks. The performance of online-adapted maps is continuously monitored by computing the difference between online-adapted and offline-calibrated maps and also comparing sound localization results with ground truth data (if available). We show that our approach is more robust in multiperson scenarios than the state of the art in terms of learning progress. We also show that our system is able to bootstrap with a randomized audio-motor map and adapt to hardware modifications that induce a change in audio-motor maps.
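To make the idea concrete, here is a minimal toy sketch of an audio-motor map adapted online from visual feedback. Everything in it is an assumption for illustration: the map is a 1-D lookup from a discretized audio-feature bin to an azimuth angle, the "teaching signal" is a visually confirmed speaker position, and the monitoring signal is the mean absolute difference against an offline-calibrated reference, as the abstract describes. The class and function names and the linear reference map are invented for this sketch and are not the paper's implementation.

```python
import numpy as np

class AudioMotorMap:
    """Toy 1-D audio-motor map: audio-feature bin -> azimuth (degrees)."""

    def __init__(self, n_bins=21, seed=0):
        # Bootstrap with a randomized map, as the abstract says is possible.
        self.angles = np.random.default_rng(seed).uniform(-90.0, 90.0, n_bins)

    def localize(self, feature_bin):
        # Look up the azimuth currently associated with this audio feature.
        return self.angles[feature_bin]

    def adapt(self, feature_bin, visual_angle, lr=0.2):
        # Online update: nudge the map entry toward the visually observed
        # speaker position, which serves as the teaching signal.
        self.angles[feature_bin] += lr * (visual_angle - self.angles[feature_bin])

def map_difference(map_a, map_b):
    # Monitoring signal: mean absolute difference between two maps,
    # e.g. online-adapted vs. offline-calibrated.
    return float(np.mean(np.abs(map_a - map_b)))

# Usage: adapt toward a linear reference map; the difference shrinks.
offline = np.linspace(-90.0, 90.0, 21)   # stands in for an offline-calibrated map
online = AudioMotorMap()
rng = np.random.default_rng(1)
before = map_difference(online.angles, offline)
for _ in range(500):
    b = rng.integers(21)                 # a speaker heard at a random feature bin
    online.adapt(b, offline[b])          # visual position assumed available
after = map_difference(online.angles, offline)
```

In this sketch the map converges even from a random initialization, mirroring the bootstrap experiment; the same update rule would track a hardware modification, since the map simply drifts toward whatever the visual feedback reports.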