A Robust Voice Pathology Detection System Based on the Combined BiLSTM–CNN Architecture

Mendel, Pub Date: 2023-12-20, DOI: 10.13164/mendel.2023.2.202

Rim Amami, Rim Amami, Chiraz Trabelsi, Sherin Hassan Mabrouk, Hassan A. Khalil


Citations: 1

Abstract

Voice recognition systems have become increasingly important in recent years because of the growing need for more efficient and intuitive human-machine interfaces. Hybrid LSTM networks and other deep learning methods have been very successful in improving speech detection systems. This paper develops a novel approach to detecting voice pathologies with a hybrid deep learning model that combines the Bidirectional Long Short-Term Memory (BiLSTM) and Convolutional Neural Network (CNN) architectures. The proposed model uses a combination of temporal and spectral features extracted from speech signals to detect different types of voice pathologies. Its performance is evaluated on the MEEI database, a publicly available dataset of speech signals from individuals with various voice pathologies. In the experiments, the hybrid BiLSTM-CNN model outperforms several classifiers, achieving an accuracy of 98.86%. The proposed model has the potential to assist health care professionals in the accurate diagnosis and treatment of voice pathologies and to improve the quality of life of affected individuals.
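The abstract does not say which temporal and spectral features the model consumes. As a rough illustration only, the sketch below computes two common temporal descriptors (zero-crossing rate and short-time energy) and one spectral descriptor (spectral centroid) for a single frame, then concatenates them into one feature vector. The choice of features, the function names, the 8 kHz sample rate, and the synthetic 200 Hz test tone are all assumptions made for this example; none of them is taken from the paper.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Temporal feature: fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Temporal feature: mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def spectral_centroid(frame, sample_rate):
    """Spectral feature: magnitude-weighted mean frequency in Hz (naive DFT)."""
    n = len(frame)
    weighted, total = 0.0, 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        mag = abs(coeff)
        weighted += (k * sample_rate / n) * mag
        total += mag
    return weighted / total if total > 0 else 0.0


def frame_features(frame, sample_rate):
    """Concatenate temporal and spectral descriptors into one feature vector."""
    return [
        zero_crossing_rate(frame),
        short_time_energy(frame),
        spectral_centroid(frame, sample_rate),
    ]


# Hypothetical input: a synthetic 200 Hz tone at 8 kHz, one 20 ms frame.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE) for t in range(160)]
features = frame_features(frame, SAMPLE_RATE)
```

For this synthetic tone the frame holds a whole number of cycles, so the centroid should sit very close to 200 Hz and the energy close to 0.5. In a real pipeline a sequence of such per-frame vectors, or a richer representation such as a spectrogram, would be passed to the classifier; how the paper actually feeds its BiLSTM and CNN branches is not described in the abstract.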

