Diagnosis of depression based on facial multimodal data

Frontiers in Psychiatry · IF 3.2 · JCR Q2 (Psychiatry) · CAS Region 3 (Medicine) · Pub Date: 2025-01-28 · eCollection Date: 2025-01-01 · DOI: 10.3389/fpsyt.2025.1508772
Nani Jin, Renjia Ye, Peng Li
Citations: 0


Diagnosis of depression based on facial multimodal data.

Introduction: Depression is a serious mental health disorder. Traditional scale-based diagnostic methods suffer from strong subjectivity and high misdiagnosis rates, so developing automated diagnostic tools based on objective indicators is particularly important.

Methods: This study proposes a deep learning method that fuses multimodal data to automatically diagnose depression from facial video and audio. We use a spatiotemporal attention module to enhance the extraction of visual features, and combine a Graph Convolutional Network (GCN) with a Long Short-Term Memory (LSTM) network to analyze the audio features. Through multimodal feature fusion, the model can effectively capture the different feature patterns associated with depression.
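The paper's actual architecture is not reproduced here; as a purely illustrative sketch of the general idea, the following NumPy snippet applies one graph-convolution layer over a small set of audio-feature nodes, pools the result, and fuses it with a pooled visual feature vector through a linear head that outputs a scalar severity score. All shapes, weights, and variable names are hypothetical assumptions, not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(0)

def gcn_layer(A, X, W):
    """One graph-convolution layer: normalized adjacency @ features @ weights, with ReLU."""
    A_hat = A + np.eye(A.shape[0])                      # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

# Toy inputs: 5 audio-feature nodes (dim 8) and one pooled visual vector (dim 8).
A = (rng.random((5, 5)) > 0.5).astype(float)
A = np.maximum(A, A.T)                                  # symmetric adjacency
X_audio = rng.standard_normal((5, 8))
v_visual = rng.standard_normal(8)

W_g = rng.standard_normal((8, 8)) * 0.1
h_audio = gcn_layer(A, X_audio, W_g).mean(axis=0)       # pool nodes -> (8,)

# Late fusion: concatenate modalities, linear head -> scalar severity score.
fused = np.concatenate([h_audio, v_visual])             # (16,)
w_head = rng.standard_normal(16) * 0.1
score = float(fused @ w_head)
print(score)
```

In a trained model the weights would be learned end-to-end and the visual branch would itself come from a spatiotemporal attention module over video frames; this sketch only shows how the two modality embeddings can be fused before the regression head.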

Results: We conduct extensive experiments on a publicly available clinical dataset, the Extended Distress Analysis Interview Corpus (E-DAIC). The results show robust accuracy on E-DAIC, with a Mean Absolute Error (MAE) of 3.51 in estimating PHQ-8 scores from recorded interviews.
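For reference, the reported metric is straightforward: MAE is the mean of the absolute differences between predicted and questionnaire-derived PHQ-8 scores (0–24 scale). A minimal illustration with made-up numbers, not data from the paper:

```python
# Hypothetical predicted vs. true PHQ-8 scores (0-24 scale); illustrative only.
y_true = [4, 10, 15, 7, 20]
y_pred = [6, 9, 12, 8, 18]

# Mean Absolute Error: average of |prediction - ground truth|.
mae = sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)
print(mae)  # -> 1.8
```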

Discussion: Compared with existing methods, our model shows excellent performance in multimodal information fusion and is suitable for early assessment of depression.

Source journal: Frontiers in Psychiatry (Medicine – Psychiatry and Mental Health)
CiteScore: 6.20 · Self-citation rate: 8.50% · Articles published: 2813 · Review time: 14 weeks
About the journal: Frontiers in Psychiatry publishes rigorously peer-reviewed research across a wide spectrum of translational, basic and clinical research. Field Chief Editor Stefan Borgwardt at the University of Basel is supported by an outstanding Editorial Board of international researchers. This multidisciplinary open-access journal is at the forefront of disseminating and communicating scientific knowledge and impactful discoveries to researchers, academics, clinicians and the public worldwide. The journal's mission is to use translational approaches to improve therapeutic options for mental illness and consequently to improve patient treatment outcomes.