Kun Qian, Ruolan Huang, Zhihao Bao, Yang Tan, Zhonghao Zhao, Mengkai Sun, Bin Hu, Björn W. Schuller, Yoshiharu Yamamoto
Intelligent Medicine, vol. 4, no. 2, pp. 96-103. Published 2024-05-01. DOI: 10.1016/j.imed.2023.03.001
https://www.sciencedirect.com/science/article/pii/S2667102623000219
Detecting somatisation disorder via speech: introducing the Shenzhen Somatisation Speech Corpus
Objective
Speech recognition technology is a mature approach that is widely used in many fields. In depression recognition, speech signals are commonly used because they are convenient and easy to acquire. Although speech recognition is popular in depression research, it has been little studied for somatisation disorder recognition, because no publicly accessible database of relevant speech and no benchmark studies are available. To this end, we introduced our somatisation disorder speech database and reported benchmark results.
Methods
In cooperation with the Shenzhen University General Hospital, we collected speech samples from patients with somatisation disorder and introduced our somatisation disorder speech database, the Shenzhen Somatisation Speech Corpus (SSSC). We also proposed a benchmark for SSSC that uses classic acoustic features and a machine learning model.
Results
To obtain a more rigorous benchmark, we compared and analysed the performance of different acoustic features, i.e., the full ComParE feature set, or only Mel-frequency cepstral coefficients (MFCCs), fundamental frequency (F0), and the frequency and bandwidth of the formants (F1–F3). The best benchmark result was an unweighted average recall of 76.0%, achieved by a support vector machine using the formants F1–F3.
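For readers unfamiliar with the metric, unweighted average recall (UAR) is the mean of the per-class recalls, so each class counts equally regardless of how many samples it has. The following is a minimal sketch of that calculation, not the authors' evaluation code. The function name, the class labels "patient" and "control", and the toy predictions are assumptions made for illustration and are not data from SSSC.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (illustration only, not SSSC data).
y_true = ["patient"] * 2 + ["control"] * 8
y_pred = ["patient", "control"] + ["control"] * 8

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"UAR: {uar:.2f}, accuracy: {accuracy:.2f}")
```

In this toy case one of the two "patient" samples is missed, so the per-class recalls are 0.5 and 1.0. Plain accuracy is dominated by the larger class, whereas UAR is not, which is why UAR is often preferred when classes are imbalanced.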
Conclusion
The SSSC may bridge a research gap in somatisation disorder by providing researchers with a publicly accessible speech database. In addition, the benchmark results suggest that computer audition is a scientifically valid and feasible approach to speech recognition in somatisation disorder.