A novel SDA-CNN few-shot domain adaptation framework for silent speech recognition

Journal of Intelligent & Fuzzy Systems, Pub Date: 2024-03-09, DOI: 10.3233/jifs-237890

N. Ramkumar, D. Karthika Renuka


Citations: 0

Abstract

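In brain-computer interface (BCI) applications, well-labeled EEG data are hard to obtain in sufficient quantity because annotation is expensive and data capture is time-consuming. Conventional classification techniques that reuse EEG data across domains and subjects suffer significant drops in silent speech recognition accuracy. This research proposes a supervised domain adaptation framework based on a convolutional neural network (SDA-CNN) to address the distribution divergence that arises in cross-domain speech recognition classification. The framework derives deep features from raw EEG data, and the proposed feature selection method additionally retrieves statistical features from the corresponding channels. To reduce the distribution divergence caused by differences between people and settings, it aligns the correlation of the source and target EEG feature distributions. In the final stage, the classification loss and the adaptation loss are optimized simultaneously to obtain minimal feature distribution divergence together with discriminative classification performance. Extensive experiments on the KaraOne dataset demonstrate that the approach reduces the distribution divergence between source and target electroencephalography (EEG) data. On the thinking task, the method achieves an average classification accuracy of 87.4% for single-subject classification and 88.6% for cross-subject classification, which the authors report as surpassing existing state-of-the-art techniques. On the speaking task, the median single-subject classification accuracy is 86.8% and the average cross-subject classification accuracy is 87.8%. These results indicate that SDA-CNN mitigates distribution discrepancies while optimizing classification performance, offering a promising way to improve accuracy and adaptability in brain-computer interface applications.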
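The abstract describes three ingredients: per-channel statistical features, alignment of the correlation between source and target feature distributions, and a final stage that optimizes the classification and adaptation losses together. The sketch below illustrates how these could fit together. It is not the authors' implementation: the abstract gives no formulas, so the CORAL-style covariance distance, the particular statistics (mean, standard deviation, minimum, maximum), the weighting parameter `lam`, and all function names are assumptions made for illustration.

```python
import numpy as np


def channel_statistics(eeg: np.ndarray) -> np.ndarray:
    """Per-channel statistical features for one trial shaped (channels, samples).

    Assumption: the abstract only says "statistical features"; mean, standard
    deviation, minimum and maximum are chosen here purely for illustration.
    """
    return np.concatenate(
        [eeg.mean(axis=1), eeg.std(axis=1), eeg.min(axis=1), eeg.max(axis=1)]
    )


def coral_loss(source: np.ndarray, target: np.ndarray) -> float:
    """CORAL-style adaptation loss between two (n_samples, d) feature batches.

    Assumption: the squared Frobenius distance between feature covariances
    stands in for the paper's unspecified correlation alignment.
    Each batch needs at least two samples and d >= 2.
    """
    d = source.shape[1]
    cov_s = np.cov(source, rowvar=False)  # (d, d) covariance of source features
    cov_t = np.cov(target, rowvar=False)  # (d, d) covariance of target features
    return float(np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d))


def cross_entropy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Mean softmax cross-entropy for (n, classes) logits and integer labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())


def joint_loss(
    logits: np.ndarray,
    labels: np.ndarray,
    source_feats: np.ndarray,
    target_feats: np.ndarray,
    lam: float = 1.0,
) -> float:
    """Classification loss plus lam times the adaptation loss.

    Assumption: a simple weighted sum; lam is a hypothetical trade-off weight.
    """
    return cross_entropy(logits, labels) + lam * coral_loss(source_feats, target_feats)
```

In a supervised, few-shot setting the labeled batch passed to `joint_loss` would presumably include the few labeled target trials alongside the source trials; in a real training loop the same objective would be written in an autodiff framework so that the CNN weights receive gradients from both terms.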


