HIroshi G. Okuno, K. Nakadai, K. Hidai, H. Mizoguchi, H. Kitano
{"title":"Human-robot interaction through real-time auditory and visual multiple-talker tracking","authors":"HIroshi G. Okuno, K. Nakadai, K. Hidai, H. Mizoguchi, H. Kitano","doi":"10.1109/IROS.2001.977177","DOIUrl":null,"url":null,"abstract":"Nakadai et al. (2001) have developed a real-time auditory and visual multiple-talker tracking technique. In this paper, this technique is applied to human-robot interaction including a receptionist robot and a companion robot at a party. The system includes face identification, speech recognition, focus-of-attention control, and sensorimotor task in tracking multiple talkers. The system is implemented on a upper-torso humanoid and the talker tracking is attained by distributed processing on three nodes connected by 100Base-TX network. The delay of tracking is 200 msec. Focus-of-attention is controlled by associating auditory and visual streams by using the sound source direction and talker position as a clue. Once an association is established, the humanoid keeps its face to the direction of the associated talker.","PeriodicalId":319679,"journal":{"name":"Proceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems. Expanding the Societal Role of Robotics in the the Next Millennium (Cat. No.01CH37180)","volume":"134 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"87","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems. Expanding the Societal Role of Robotics in the the Next Millennium (Cat. No.01CH37180)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IROS.2001.977177","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 87
Abstract
Nakadai et al. (2001) have developed a real-time auditory and visual multiple-talker tracking technique. In this paper, this technique is applied to human-robot interaction including a receptionist robot and a companion robot at a party. The system includes face identification, speech recognition, focus-of-attention control, and sensorimotor task in tracking multiple talkers. The system is implemented on a upper-torso humanoid and the talker tracking is attained by distributed processing on three nodes connected by 100Base-TX network. The delay of tracking is 200 msec. Focus-of-attention is controlled by associating auditory and visual streams by using the sound source direction and talker position as a clue. Once an association is established, the humanoid keeps its face to the direction of the associated talker.