Rafiul Islam, Md Taimur Ahad, Faruk Ahmed, Bo Song, Yan Li
Title: Mental Health Diagnosis From Voice Data Using Convolutional Neural Networks and Vision Transformers
Journal: Journal of Voice (JCR Q1, Audiology & Speech-Language Pathology; impact factor 2.5)
Published: 2024-11-15
DOI: 10.1016/j.jvoice.2024.10.010 (https://doi.org/10.1016/j.jvoice.2024.10.010)
Citations: 0
Abstract
Integrating Convolutional Neural Networks and Vision Transformers in voice analysis has unveiled a new horizon in mental health identification. Human voice, a powerful indicator of mental health, was the focus of this study. Human voice data representing stable and unstable conditions were gathered from various mental health institutions in Bangladesh. The results of the experiment suggest that the proposed model achieved 91% accuracy, precision of 92% for the "Unstable" category and 90% for the "Stable" category, and recall of 91% for the "Stable" category and 92% for the "Unstable" category. In addition, a high F1 score of 91% was achieved. This study significantly contributes to computer-aided diagnosis in mental health by using deep learning (DL) to diagnose mental well-being. Our research underscores the substantial impact of DL on the advancement of mental health care, instilling hope for a brighter future in mental health care.
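As a quick consistency check on the reported figures, the per-class F1 scores can be recomputed from the precision and recall values quoted in the abstract. The sketch below is illustrative only: the function and variable names are hypothetical, and the use of an unweighted (macro) average across the two classes is an assumption, since the abstract does not say how the overall F1 score was aggregated.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class precision and recall as quoted in the abstract.
reported = {
    "Stable": {"precision": 0.90, "recall": 0.91},
    "Unstable": {"precision": 0.92, "recall": 0.92},
}

# F1 for each class, computed from the quoted values.
per_class_f1 = {
    label: f1_score(m["precision"], m["recall"]) for label, m in reported.items()
}

# Assumption: unweighted (macro) average over the two classes.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

for label, value in per_class_f1.items():
    print(f"{label}: F1 = {value:.3f}")
print(f"Macro F1 = {macro_f1:.3f}")
```

Under that assumption the macro-averaged value rounds to 0.91, which agrees with the 91% F1 score stated in the abstract.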
About the journal:
The Journal of Voice is widely regarded as the world's premier journal for voice medicine and research. This peer-reviewed publication is listed in Index Medicus and is indexed by the Institute for Scientific Information. The journal contains articles written by experts throughout the world on all topics in voice sciences, voice medicine and surgery, and speech-language pathologists' management of voice-related problems. The journal includes clinical articles, clinical research, and laboratory research. Members of the Foundation receive the journal as a benefit of membership.