Devamanyu Hazarika, Soujanya Poria, Amir Zadeh, Erik Cambria, Louis-Philippe Morency, Roger Zimmermann
{"title":"二元对话视频中情感识别的会话记忆网络。","authors":"Devamanyu Hazarika, Soujanya Poria, Amir Zadeh, Erik Cambria, Louis-Philippe Morency, Roger Zimmermann","doi":"10.18653/v1/n18-1193","DOIUrl":null,"url":null,"abstract":"<p><p>Emotion recognition in conversations is crucial for the development of empathetic machines. Present methods mostly ignore the role of inter-speaker dependency relations while classifying emotions in conversations. In this paper, we address recognizing utterance-level emotions in dyadic conversational videos. We propose a deep neural framework, termed conversational memory network, which leverages contextual information from the conversation history. The framework takes a multimodal approach comprising audio, visual and textual features with gated recurrent units to model past utterances of each speaker into memories. Such memories are then merged using attention-based hops to capture inter-speaker dependencies. Experiments show an accuracy improvement of 3-4% over the state of the art.</p>","PeriodicalId":74542,"journal":{"name":"Proceedings of the conference. Association for Computational Linguistics. North American Chapter. Meeting","volume":"2018 ","pages":"2122-2132"},"PeriodicalIF":0.0000,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.18653/v1/n18-1193","citationCount":"282","resultStr":"{\"title\":\"Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos.\",\"authors\":\"Devamanyu Hazarika, Soujanya Poria, Amir Zadeh, Erik Cambria, Louis-Philippe Morency, Roger Zimmermann\",\"doi\":\"10.18653/v1/n18-1193\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Emotion recognition in conversations is crucial for the development of empathetic machines. Present methods mostly ignore the role of inter-speaker dependency relations while classifying emotions in conversations. In this paper, we address recognizing utterance-level emotions in dyadic conversational videos. We propose a deep neural framework, termed conversational memory network, which leverages contextual information from the conversation history. The framework takes a multimodal approach comprising audio, visual and textual features with gated recurrent units to model past utterances of each speaker into memories. Such memories are then merged using attention-based hops to capture inter-speaker dependencies. Experiments show an accuracy improvement of 3-4% over the state of the art.</p>\",\"PeriodicalId\":74542,\"journal\":{\"name\":\"Proceedings of the conference. Association for Computational Linguistics. North American Chapter. Meeting\",\"volume\":\"2018 \",\"pages\":\"2122-2132\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.18653/v1/n18-1193\",\"citationCount\":\"282\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the conference. Association for Computational Linguistics. North American Chapter. 
Meeting\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18653/v1/n18-1193\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the conference. Association for Computational Linguistics. North American Chapter. Meeting","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/n18-1193","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos.
Emotion recognition in conversations is crucial for the development of empathetic machines. Present methods mostly ignore the role of inter-speaker dependency relations while classifying emotions in conversations. In this paper, we address recognizing utterance-level emotions in dyadic conversational videos. We propose a deep neural framework, termed conversational memory network, which leverages contextual information from the conversation history. The framework takes a multimodal approach comprising audio, visual and textual features with gated recurrent units to model past utterances of each speaker into memories. Such memories are then merged using attention-based hops to capture inter-speaker dependencies. Experiments show an accuracy improvement of 3-4% over the state of the art.
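To make the described architecture concrete, below is a minimal PyTorch sketch of the idea in the abstract: one GRU per speaker encodes that speaker's past utterances into memory cells, the current utterance queries both memories over several attention hops to capture inter-speaker dependencies, and the refined representation is classified into an emotion. The class name, dimensions, hop update, and the assumption that the multimodal (audio, visual, textual) features are already fused into a single vector per utterance are illustrative choices, not the authors' reference implementation.

```python
# Hedged sketch of a conversational-memory-style model; not the official CMN code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConversationalMemorySketch(nn.Module):
    def __init__(self, feat_dim=100, mem_dim=100, num_emotions=6, hops=3):
        super().__init__()
        # One GRU per speaker turns that speaker's utterance history into memory cells.
        self.gru_a = nn.GRU(feat_dim, mem_dim, batch_first=True)
        self.gru_b = nn.GRU(feat_dim, mem_dim, batch_first=True)
        self.query_proj = nn.Linear(feat_dim, mem_dim)
        self.hops = hops
        self.classifier = nn.Linear(mem_dim, num_emotions)

    def attend(self, query, memory):
        # Soft attention of the current query over one speaker's memory cells.
        scores = torch.bmm(memory, query.unsqueeze(2)).squeeze(2)    # (B, T)
        weights = F.softmax(scores, dim=1)
        return torch.bmm(weights.unsqueeze(1), memory).squeeze(1)    # (B, mem_dim)

    def forward(self, history_a, history_b, current_utt):
        # history_a, history_b: (B, T, feat_dim) fused multimodal features of past utterances.
        # current_utt: (B, feat_dim) features of the utterance to classify.
        mem_a, _ = self.gru_a(history_a)
        mem_b, _ = self.gru_b(history_b)
        q = self.query_proj(current_utt)
        for _ in range(self.hops):
            # Each hop reads both speakers' memories and refines the query,
            # which is how inter-speaker dependencies enter the representation.
            q = q + self.attend(q, mem_a) + self.attend(q, mem_b)
        return self.classifier(q)

# Usage with random stand-in features (real features would come from audio/visual/text encoders):
model = ConversationalMemorySketch()
logits = model(torch.randn(4, 10, 100), torch.randn(4, 10, 100), torch.randn(4, 100))
```

The speaker-specific memories are what distinguish this setup from a single context encoder: attending over each speaker's history separately lets the model weight self-dependencies and inter-speaker dependencies differently when classifying the current utterance.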