{"title":"Towards affective integration of vision, behavior, and speech processing","authors":"Naoyuki Okada, Kentaro Inui, M. Tokuhisa","doi":"10.1109/ISIU.1999.824850","DOIUrl":null,"url":null,"abstract":"In each subfield of artificial intelligence such as image understanding, speech understanding, robotics, etc., a tremendous amount of research effort has so far yielded considerable results. Unfortunately, they have ended up too different to combine with one another straight-forwardly. We have been conducting a case study, or AESOPWORLD project, aiming at establishing an architectural foundation of \"integrated\" intelligent agents. In this article, we first review our agent model, which integrates the seven mental and the two physical faculties: recognition, planning, action, desire, emotion, memory, language, and sensor, actuator. We then describe each faculty of recognition, action, and planning, and their interaction by centering around planning. Image understanding is understood as a part of this recognition. Next, we show dialogue processing, where the faculties of recognition and planning also play an essential role for communications. Finally, we discuss the faculty of emotions to show an application of our agent to affective communications. This computation of emotions could be expected to be a base's for human-friendly interfaces.","PeriodicalId":227256,"journal":{"name":"Proceedings Integration of Speech and Image Understanding","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1999-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings Integration of Speech and Image Understanding","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISIU.1999.824850","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 11
Abstract
In each subfield of artificial intelligence, such as image understanding, speech understanding, and robotics, a tremendous amount of research effort has yielded considerable results. Unfortunately, these results have ended up too different to combine with one another straightforwardly. We have been conducting a case study, the AESOPWORLD project, aimed at establishing an architectural foundation for "integrated" intelligent agents. In this article, we first review our agent model, which integrates seven mental faculties (recognition, planning, action, desire, emotion, memory, and language) and two physical faculties (sensor and actuator). We then describe the faculties of recognition, action, and planning, and their interaction, centering on planning. Image understanding is treated as part of the recognition faculty. Next, we describe dialogue processing, where the faculties of recognition and planning also play an essential role in communication. Finally, we discuss the faculty of emotion to show an application of our agent to affective communication. This computation of emotions is expected to serve as a basis for human-friendly interfaces.
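The abstract describes an agent architecture in which recognition, emotion, memory, and action all feed into a planning-centered loop. The sketch below is purely illustrative of that faculty layout: every class and method name (IntegratedAgent, appraise, step, and so on) is an assumption made for exposition, not the paper's actual design.

```python
# Hypothetical sketch of a faculty-integrated agent in the spirit of the
# abstract's seven mental + two physical faculties. All names here are
# illustrative assumptions, not taken from the AESOPWORLD paper.

from dataclasses import dataclass


@dataclass
class Percept:
    """A recognized event, e.g. an image- or speech-understanding result."""
    description: str


@dataclass
class Emotion:
    """A simple appraisal result; the paper computes emotions for affective communication."""
    label: str        # e.g. "joy", "neutral"
    intensity: float  # 0.0 .. 1.0


class IntegratedAgent:
    """Toy agent whose faculties interact through a planning-centered loop."""

    def __init__(self, desires: list[str]):
        self.desires = desires                   # desire faculty
        self.memory: list[Percept] = []          # memory faculty
        self.emotion = Emotion("neutral", 0.0)   # emotion faculty

    def recognize(self, raw_input: str) -> Percept:
        # Recognition faculty: image/speech understanding would live here.
        percept = Percept(raw_input)
        self.memory.append(percept)
        return percept

    def appraise(self, percept: Percept) -> None:
        # Emotion faculty: naive appraisal of the percept against desires.
        if any(d in percept.description for d in self.desires):
            self.emotion = Emotion("joy", 0.8)
        else:
            self.emotion = Emotion("neutral", 0.2)

    def plan(self, percept: Percept) -> str:
        # Planning faculty: the hub that the other faculties feed into.
        goal = self.desires[0] if self.desires else "idle"
        tone = "warmly" if self.emotion.label == "joy" else "plainly"
        return f"pursue '{goal}' and respond {tone} to '{percept.description}'"

    def act(self, plan: str) -> str:
        # Action faculty driving the actuator; language output doubles
        # as the dialogue channel.
        return f"[actuator] executing: {plan}"

    def step(self, sensor_input: str) -> str:
        # One sensor -> recognition -> emotion -> planning -> action cycle.
        percept = self.recognize(sensor_input)
        self.appraise(percept)
        return self.act(self.plan(percept))


agent = IntegratedAgent(desires=["greet the visitor"])
print(agent.step("a visitor says greet the visitor"))
```

The design choice mirrored here is the abstract's emphasis on planning as the integration point: recognition and emotion do not act directly, but shape the plan that the actuator executes.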