Deep learning and robotics enabled approach for audio based emotional pragmatics deficits identification in social communication disorders.
Authors: Muskan Chawla, Surya Narayan Panda, Vikas Khullar
Journal: Proceedings of the Institution of Mechanical Engineers, Part H: Journal of Engineering in Medicine (impact factor 1.7; JCR Q3, Engineering, Biomedical)
Published: 2025-03-13
DOI: 10.1177/09544119251325331 (https://doi.org/10.1177/09544119251325331)
Citation count: 0
Abstract
This study aims to develop Deep Learning (DL) enabled robotic systems that identify audio-based emotional pragmatics deficits in individuals with social pragmatic communication deficits. The novelty of the work lies in integrating deep learning with a robotics platform to identify these deficits. The proposed methodology applies machine learning and DL-based classification techniques to a collection of open-source datasets to identify emotions in audio. Pre-processing the audio signals of different emotions and converting them to Mel-Frequency Cepstral Coefficients (MFCC) improved emotion classification. The MFCC features were used to train the machine learning and DL models, and the trained models were then tested on a randomly selected dataset. DL proved more effective at identifying emotions on the robotic structure. Because the MFCC-derived data are one-dimensional, one-dimensional DL algorithms were used: a 1D Convolutional Neural Network, Long Short-Term Memory, and Bidirectional Long Short-Term Memory. The Bidirectional Long Short-Term Memory model outperformed the other machine learning and DL algorithms, with an accuracy of 96.24%, a loss of 0.2524, a precision of 92.87%, and a recall of 92.87%. The proposed model was then deployed on the robotic structure for real-time detection, with the aim of improving social-emotional pragmatic responses in individuals with deficits. The approach can serve as a potential tool for individuals with pragmatic communication deficits.
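The abstract does not give the MFCC settings, so the following is only a minimal textbook sketch of how an audio clip might be reduced to a one-dimensional MFCC vector before it is passed to a 1D model. The frame length, hop size, filter count, number of coefficients, the averaging over frames, and the synthetic 440 Hz test tone are all assumptions made for illustration, not the authors' pipeline.

```python
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum of one frame (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2)
    mel_points = [i * top / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mel_points]
    bank = []
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        row = [0.0] * n_bins
        for b in range(left, center):      # rising edge of the triangle
            row[b] = (b - left) / (center - left)
        for b in range(center, right):     # falling edge of the triangle
            row[b] = (right - b) / (right - center)
        bank.append(row)
    return bank

def mfcc(signal, sample_rate, frame_len=128, hop=64, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs cepstral coefficients per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * t / (frame_len - 1))
              for t in range(frame_len)]   # Hamming window
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(frame_len)]
        power = power_spectrum(frame)
        # Log mel energies, floored to avoid log(0) on empty filters.
        log_e = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
                 for row in bank]
        # DCT-II of the log energies gives the cepstral coefficients.
        frames.append([sum(e * math.cos(math.pi * k * (m + 0.5) / n_filters)
                           for m, e in enumerate(log_e))
                       for k in range(n_coeffs)])
    return frames

def mean_mfcc(frames):
    """Average over frames to get a single one-dimensional feature vector."""
    return [sum(f[k] for f in frames) / len(frames) for k in range(len(frames[0]))]

# Hypothetical input: a synthetic 440 Hz tone standing in for a speech clip.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(1024)]
frames = mfcc(tone, sample_rate)
features = mean_mfcc(frames)
```

In practice an audio library would replace the naive DFT here. The sketch only shows how the framing, mel filtering, logarithm, and DCT steps yield a fixed-length vector per clip.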
About the journal:
The Journal of Engineering in Medicine is an interdisciplinary journal encompassing all aspects of engineering in medicine. The Journal is a vital tool for maintaining an understanding of the newest techniques and research in medical engineering.