RADIANCE: Reliable and interpretable depression detection from speech using transformer

Impact Factor: 7.0 | CAS Zone 2 (Medicine) | JCR Q1 (Biology) | Computers in Biology and Medicine | Pub Date: 2024-11-02 | DOI: 10.1016/j.compbiomed.2024.109325
Anup Kumar Gupta, Ashutosh Dhamaniya, Puneet Gupta
Volume 183, Article 109325. Journal Article. https://www.sciencedirect.com/science/article/pii/S0010482524014100
Citations: 0

Abstract

Depression is a common but severe mental disorder that impairs an individual's ability to function normally in day-to-day life. A majority of depressed individuals remain undiagnosed due to factors such as social stigma and a shortage of healthcare professionals. Consequently, several speech-based Machine Learning and Deep Learning (DL) models have been proposed for automatic depression detection, with the latter generally outperforming the former. However, DL models are black boxes and offer no transparency, whereas healthcare professionals prefer models that are interpretable as well as accurate. To this end, we propose RADIANCE (Reliable AnD InterpretAble depressioN deteCtion transformErs). RADIANCE incorporates a novel FilterBank VIsion Transformer (FBViT) network, which provides the symptoms of depression as interpretable features. Additionally, we employ a novel loss function that handles the class imbalance in the datasets and includes a penalty term that accounts for the hierarchy of misclassification errors. We also propose a reliability predictor based on low-level descriptors that outputs a reliability score indicating the trustworthiness of each FBViT prediction. Furthermore, in contrast to conventional averaging and majority pooling, RADIANCE consolidates predictions from multiple clips of the input audio by weighting each prediction by its reliability score, yielding a more accurate overall prediction. RADIANCE outperforms state-of-the-art depression detection methods, achieving accuracies of 89.36%, 80.36%, and 94.44% on the DAIC-WOZ, E-DAIC, and CMDC datasets, respectively. Further, RADIANCE achieves MAE scores of 3.27 and 5.04 on the DAIC-WOZ and E-DAIC datasets, respectively.
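
The contrast between conventional pooling and reliability-weighted pooling can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so the normalized weighted average below is an assumption, not the authors' implementation. The function names, the 0.5 decision threshold, and the example numbers are all hypothetical.

```python
from collections import Counter
from typing import Sequence


def mean_pool(scores: Sequence[float]) -> float:
    """Conventional averaging: every clip contributes equally."""
    if not scores:
        raise ValueError("at least one clip score is required")
    return sum(scores) / len(scores)


def majority_pool(scores: Sequence[float], threshold: float = 0.5) -> int:
    """Conventional majority voting over per-clip binary decisions.

    The threshold is an illustrative assumption; ties resolve to class 0.
    """
    if not scores:
        raise ValueError("at least one clip score is required")
    votes = Counter(int(s >= threshold) for s in scores)
    return int(votes[1] > votes[0])


def reliability_weighted_pool(
    scores: Sequence[float], reliabilities: Sequence[float]
) -> float:
    """Assumed form of reliability-weighted pooling: a weighted average in
    which each clip's score is weighted by its (non-negative) reliability."""
    if len(scores) != len(reliabilities):
        raise ValueError("scores and reliabilities must have the same length")
    total = sum(reliabilities)
    if total <= 0:
        # No usable reliability information: fall back to plain averaging.
        return mean_pool(scores)
    return sum(s * r for s, r in zip(scores, reliabilities)) / total


# Hypothetical per-clip depression probabilities and reliability scores.
clip_scores = [0.9, 0.2, 0.3]
clip_reliabilities = [0.9, 0.1, 0.2]

print(mean_pool(clip_scores))
print(majority_pool(clip_scores))
print(reliability_weighted_pool(clip_scores, clip_reliabilities))
```

In this made-up example, one highly reliable clip points toward depression while two low-reliability clips point away from it. Averaging and majority voting both fall below the 0.5 threshold, whereas the weighted pool lands above it, which is the kind of disagreement that reliability weighting is meant to resolve.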