Learning the Correspondence between Continuous Speeches and Motions

O. Natsuki, N. Arata, I. Yoshiaki
DOI: 10.1109/DEVLRN.2005.1490983
Journal: Proceedings of the 4th International Conference on Development and Learning, 2005
Publication date: 2005-07-19
Citations: 0

Abstract

Summary form only given. Roy (1999) developed a computational model of early lexical learning to address three questions: First, how do infants discover linguistic units? Second, how do they learn perceptually-grounded semantic categories? And third, how do they learn to associate linguistic units with appropriate semantic categories? His model coupled speech recordings with static images of objects, and acquired a lexicon of shape names. Kaplan et al. (2001) presented a model for teaching names of actions to an enhanced version of AIBO; that AIBO had built-in speech recognition facilities and behaviors. In this paper, we try to build a system that learns the correspondence between continuous speech and continuous motion without a built-in speech recognizer or built-in behaviors. We teach a RobotPHONE to respond properly to voices by guiding its hands. For example, one says 'bye-bye' to the RobotPHONE while holding its hand and waving it. From continuous input, the system must segment speech and discover the acoustic units that correspond to words. The segmentation is based on recurring patterns found by incremental reference interval-free continuous DP (IRIFCDP; Kiyama et al., 1996; Utsunomiya et al., 2004), and we accelerate the IRIFCDP using ShiftCDP (Itoh and Tanaka, 2004). The system also segments motion with the accelerated IRIFCDP, and it memorizes co-occurring speech and motion patterns. It can then respond properly to taught words by detecting them in the speech input with ShiftCDP. We gave a demonstration with a RobotPHONE at the conference. We expect that the system can learn words in any language because it has no built-in facilities specific to a particular language.
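The IRIFCDP and ShiftCDP algorithms themselves are not reproduced here, but the core operation they share — matching a stored feature pattern against a running input stream by continuous dynamic programming, with a free start point — can be sketched with a simplified subsequence DTW. This is a hypothetical illustration, not the authors' implementation: the function names, the Euclidean local distance, and the fixed detection threshold are all assumptions.

```python
import numpy as np

def subsequence_dtw_costs(template, stream):
    """Accumulated DTW cost of aligning `template` (T x d feature frames)
    inside `stream` (S x d), with a free start point in the stream.
    Returns the best warping-path cost ending at each stream frame,
    normalized by template length."""
    T, S = len(template), len(stream)
    # Local frame-to-frame Euclidean distances, shape (T, S).
    dist = np.linalg.norm(template[:, None, :] - stream[None, :, :], axis=2)
    D = np.full((T, S), np.inf)
    D[0] = dist[0]  # free start: a match may begin at any stream frame
    for i in range(1, T):
        for j in range(1, S):
            # Standard DTW recursion: insertion, deletion, or match step.
            D[i, j] = dist[i, j] + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[-1] / T

def spot(template, stream, threshold):
    """Stream frames where a taught pattern is judged to end
    (cost below threshold), in the spirit of ShiftCDP word spotting."""
    costs = subsequence_dtw_costs(template, stream)
    return np.where(costs < threshold)[0]
```

In the paper's setting, `template` would be the feature sequence of a taught word (or motion segment) discovered by IRIFCDP, and `stream` the live input; a low-cost ending frame triggers the co-occurring memorized motion. ShiftCDP achieves the same spotting effect far more efficiently by reusing partial DP results as the input shifts, which this naive O(T·S) sketch does not attempt.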