EAV: EEG-Audio-Video Dataset for Emotion Recognition in Conversational Contexts.

IF 5.8 · CAS Tier 2 (Multidisciplinary) · Q1 MULTIDISCIPLINARY SCIENCES · Scientific Data · Pub Date: 2024-09-19 · DOI: 10.1038/s41597-024-03838-4
Min-Ho Lee, Adai Shomanov, Balgyn Begim, Zhuldyz Kabidenova, Aruna Nyssanbay, Adnan Yazici, Seong-Whan Lee
{"title":"EAV:用于对话语境中情感识别的脑电图-音频-视频数据集。","authors":"Min-Ho Lee, Adai Shomanov, Balgyn Begim, Zhuldyz Kabidenova, Aruna Nyssanbay, Adnan Yazici, Seong-Whan Lee","doi":"10.1038/s41597-024-03838-4","DOIUrl":null,"url":null,"abstract":"<p><p>Understanding emotional states is pivotal for the development of next-generation human-machine interfaces. Human behaviors in social interactions have resulted in psycho-physiological processes influenced by perceptual inputs. Therefore, efforts to comprehend brain functions and human behavior could potentially catalyze the development of AI models with human-like attributes. In this study, we introduce a multimodal emotion dataset comprising data from 30-channel electroencephalography (EEG), audio, and video recordings from 42 participants. Each participant engaged in a cue-based conversation scenario, eliciting five distinct emotions: neutral, anger, happiness, sadness, and calmness. Throughout the experiment, each participant contributed 200 interactions, which encompassed both listening and speaking. This resulted in a cumulative total of 8,400 interactions across all participants. We evaluated the baseline performance of emotion recognition for each modality using established deep neural network (DNN) methods. The Emotion in EEG-Audio-Visual (EAV) dataset represents the first public dataset to incorporate three primary modalities for emotion recognition within a conversational context. We anticipate that this dataset will make significant contributions to the modeling of the human emotional process, encompassing both fundamental neuroscience and machine learning viewpoints.</p>","PeriodicalId":21597,"journal":{"name":"Scientific Data","volume":null,"pages":null},"PeriodicalIF":5.8000,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413008/pdf/","citationCount":"0","resultStr":"{\"title\":\"EAV: EEG-Audio-Video Dataset for Emotion Recognition in Conversational Contexts.\",\"authors\":\"Min-Ho Lee, Adai Shomanov, Balgyn Begim, Zhuldyz Kabidenova, Aruna Nyssanbay, Adnan Yazici, Seong-Whan Lee\",\"doi\":\"10.1038/s41597-024-03838-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Understanding emotional states is pivotal for the development of next-generation human-machine interfaces. Human behaviors in social interactions have resulted in psycho-physiological processes influenced by perceptual inputs. Therefore, efforts to comprehend brain functions and human behavior could potentially catalyze the development of AI models with human-like attributes. In this study, we introduce a multimodal emotion dataset comprising data from 30-channel electroencephalography (EEG), audio, and video recordings from 42 participants. Each participant engaged in a cue-based conversation scenario, eliciting five distinct emotions: neutral, anger, happiness, sadness, and calmness. Throughout the experiment, each participant contributed 200 interactions, which encompassed both listening and speaking. This resulted in a cumulative total of 8,400 interactions across all participants. We evaluated the baseline performance of emotion recognition for each modality using established deep neural network (DNN) methods. The Emotion in EEG-Audio-Visual (EAV) dataset represents the first public dataset to incorporate three primary modalities for emotion recognition within a conversational context. 
We anticipate that this dataset will make significant contributions to the modeling of the human emotional process, encompassing both fundamental neuroscience and machine learning viewpoints.</p>\",\"PeriodicalId\":21597,\"journal\":{\"name\":\"Scientific Data\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":5.8000,\"publicationDate\":\"2024-09-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413008/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Scientific Data\",\"FirstCategoryId\":\"103\",\"ListUrlMain\":\"https://doi.org/10.1038/s41597-024-03838-4\",\"RegionNum\":2,\"RegionCategory\":\"综合性期刊\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"MULTIDISCIPLINARY SCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scientific Data","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1038/s41597-024-03838-4","RegionNum":2,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
Citations: 0

Abstract


Understanding emotional states is pivotal for the development of next-generation human-machine interfaces. Human behaviors in social interactions have resulted in psycho-physiological processes influenced by perceptual inputs. Therefore, efforts to comprehend brain functions and human behavior could potentially catalyze the development of AI models with human-like attributes. In this study, we introduce a multimodal emotion dataset comprising data from 30-channel electroencephalography (EEG), audio, and video recordings from 42 participants. Each participant engaged in a cue-based conversation scenario, eliciting five distinct emotions: neutral, anger, happiness, sadness, and calmness. Throughout the experiment, each participant contributed 200 interactions, which encompassed both listening and speaking. This resulted in a cumulative total of 8,400 interactions across all participants. We evaluated the baseline performance of emotion recognition for each modality using established deep neural network (DNN) methods. The Emotion in EEG-Audio-Visual (EAV) dataset represents the first public dataset to incorporate three primary modalities for emotion recognition within a conversational context. We anticipate that this dataset will make significant contributions to the modeling of the human emotional process, encompassing both fundamental neuroscience and machine learning viewpoints.
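The paper reports baseline deep-neural-network results for each modality. As a rough illustration of what such an EEG baseline could look like, the sketch below builds a minimal 5-class temporal CNN in PyTorch. It is a hypothetical example, not the authors' reference model: the window length, assumed sampling rate, and architecture are illustrative guesses, and the synthetic input merely mimics the described shape of 30-channel EEG epochs.

```python
import torch
import torch.nn as nn

# Illustrative shapes only: 30 EEG channels, 2-second windows at an
# assumed 500 Hz sampling rate. The actual EAV epoch layout is defined
# in the data descriptor and may differ.
N_CHANNELS, N_SAMPLES, N_CLASSES = 30, 1000, 5
# The 5 classes correspond to the elicited emotions:
# neutral, anger, happiness, sadness, calmness.

class EEGBaseline(nn.Module):
    """A minimal temporal-CNN sketch, not the paper's baseline model."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(N_CHANNELS, 64, kernel_size=25, stride=4),  # temporal filters
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(8),   # collapse time axis to a fixed length
            nn.Flatten(),              # (batch, 64 * 8)
            nn.Linear(64 * 8, N_CLASSES),
        )

    def forward(self, x):
        # x: (batch, channels, samples) -> (batch, N_CLASSES) logits
        return self.net(x)

model = EEGBaseline()
x = torch.randn(16, N_CHANNELS, N_SAMPLES)  # synthetic mini-batch of EEG epochs
logits = model(x)
print(logits.shape)  # torch.Size([16, 5])
```

In actual use one would load the EAV epochs per modality as described in the data descriptor, train with a cross-entropy loss, and evaluate EEG, audio, and video models separately to reproduce the reported per-modality baselines.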

Source journal: Scientific Data
Category: Social Sciences - Education
CiteScore: 11.20
Self-citation rate: 4.10%
Articles published: 689
Review time: 16 weeks
About the journal: Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data. The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than testing hypotheses or presenting new interpretations, methods, or in-depth analyses.
Latest articles in this journal:
- A chromosome-level genome assembly of Cape hare (Lepus capensis).
- Chromosome-level genome assembly of American sweetgum (Liquidambar styraciflua, Altingiaceae).
- Combining citizen science data and literature to build a traits dataset of Taiwan's birds.
- EEG Dataset for the Recognition of Different Emotions Induced in Voice-User Interaction.
- Energy dataset of Frontier supercomputer for waste heat recovery.