Detecting somatisation disorder via speech: introducing the Shenzhen Somatisation Speech Corpus

Intelligent Medicine (IF 4.4; JCR Q1, Computer Science, Interdisciplinary Applications) Pub Date: 2024-05-01 DOI: 10.1016/j.imed.2023.03.001
Kun Qian, Ruolan Huang, Zhihao Bao, Yang Tan, Zhonghao Zhao, Mengkai Sun, Bin Hu, Björn W. Schuller, Yoshiharu Yamamoto
Intelligent Medicine, volume 4, issue 2, pages 96-103. Full text: https://www.sciencedirect.com/science/article/pii/S2667102623000219
Citations: 0

Abstract

Objective

Speech recognition technology is a mature approach that is widely used in many fields. In depression recognition research, speech signals are commonly used because they are convenient and easy to acquire. Although speech-based methods are popular in depression recognition, they have received little attention in the recognition of somatisation disorder. The reason is the lack of a publicly accessible speech database and of benchmark studies. To this end, we introduce our somatisation disorder speech database and report benchmark results.

Methods

In cooperation with the Shenzhen University General Hospital, we collected speech samples from patients with somatisation disorder and compiled them into our somatisation disorder speech database, the Shenzhen Somatisation Speech Corpus (SSSC). We also propose a benchmark for SSSC using classic acoustic features and a machine learning model.

Results

To obtain a more rigorous benchmark, we compared and analysed the performance of different acoustic features, i.e., the full ComParE feature set, or only Mel-frequency cepstral coefficients (MFCCs), fundamental frequency (F0), and frequency and bandwidth of the formants (F1–F3). The best benchmark result was an unweighted average recall of 76.0%, achieved by a support vector machine using the formants F1–F3.
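
The metric reported above, unweighted average recall (UAR), is the mean of the per-class recalls, so each class counts equally regardless of how many samples it has. The sketch below is only an illustration of that definition in plain Python. The function name and the toy labels are hypothetical and are not taken from SSSC or from the authors' code, and the abstract does not state the class balance of the corpus.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (NOT SSSC data): 8 controls (0) and 2 patients (1).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]  # every control correct, one of two patients correct

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
# prints "UAR = 0.750, accuracy = 0.900"
```

In this toy case plain accuracy (0.900) looks better than UAR (0.750) because the larger class dominates it. For a two-class task, a classifier that always predicts the same class scores a UAR of 0.5, which is the chance level against which a figure such as 76.0% is usually read.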

Conclusion

The SSSC may help bridge a research gap in somatisation disorder by providing researchers with a publicly accessible speech database. In addition, the benchmark results suggest the scientific validity and feasibility of computer audition for speech-based recognition of somatisation disorder.
