Audio-Visual Speech Recognition Using Convolutive Bottleneck Networks for a Person with Severe Hearing Loss

IPSJ Transactions on Computer Vision and Applications, pp. 64-68 | JCR: Q1 (Computer Science) | Pub Date: 2015-01-01 | DOI: 10.2197/ipsjtcva.7.64

Yuki Takashima, Yasuhiro Kakihara, Ryo Aihara, T. Takiguchi, Y. Ariki, Nobuyuki Mitani, K. Omori, Kaoru Nakazono


Citations: 5

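Abstract

In this paper, we propose an audio-visual speech recognition system for a person with an articulation disorder resulting from severe hearing loss. The speech style of a person with this type of articulation disorder differs considerably from that of people without hearing loss, so a speaker-independent model for unimpaired persons is of little use for recognizing it. We investigate an audio-visual speech recognition system for a person with severe hearing loss in noisy environments, in which a robust feature extraction method based on a convolutive bottleneck network (CBN) is applied to audio-visual data. Word-recognition experiments in noisy environments confirmed the effectiveness of this approach: the CBN-based feature extraction method outperformed conventional methods.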

Audio-Visual Speech Recognition Using Convolutive Bottleneck Networks for a Person with Severe Hearing Loss

In this paper, we propose an audio-visual speech recognition system for a person with an articulation disorder resulting from severe hearing loss. In the case of a person with this type of articulation disorder, the speech style is quite different from that of people without hearing loss, with the result that a speaker-independent model for unimpaired persons is hardly useful for recognizing it. We investigate an audio-visual speech recognition system for a person with severe hearing loss in noisy environments, where a robust feature extraction method using a convolutive bottleneck network (CBN) is applied to audio-visual data. We confirmed the effectiveness of this approach through word-recognition experiments in noisy environments, where the CBN-based feature extraction method outperformed the conventional methods.
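The abstract does not spell out the network layout, so the following is only a minimal sketch of what bottleneck feature extraction could look like, under stated assumptions: one convolution layer, max pooling, and two fully connected layers, the second of which is deliberately narrow and serves as the feature vector. The function names, layer sizes, input patch shapes, and the random (untrained) weights are all hypothetical and are not taken from the paper; training and the downstream recognizer are omitted.

```python
import math
import random

def conv2d_valid(image, kernel):
    """2-D 'valid' cross-correlation of one image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool(fmap, size):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [[max(fmap[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(out_w)]
            for i in range(out_h)]

def dense(vec, weights, biases):
    """Fully connected layer followed by a sigmoid activation."""
    return [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, vec)) + b)))
            for row, b in zip(weights, biases)]

def random_matrix(rng, rows, cols, scale=0.1):
    """Matrix of uniform random values in [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def init_params(rng, patch_shape, n_kernels=4, kernel_size=3, pool=2,
                n_hidden=32, n_bottleneck=8):
    """Random, untrained parameters; every size here is an assumption."""
    h, w = patch_shape
    pooled_h = (h - kernel_size + 1) // pool
    pooled_w = (w - kernel_size + 1) // pool
    n_flat = n_kernels * pooled_h * pooled_w
    return {
        "kernels": [random_matrix(rng, kernel_size, kernel_size)
                    for _ in range(n_kernels)],
        "pool": pool,
        "w_hidden": random_matrix(rng, n_hidden, n_flat),
        "b_hidden": [0.0] * n_hidden,
        "w_bottleneck": random_matrix(rng, n_bottleneck, n_hidden),
        "b_bottleneck": [0.0] * n_bottleneck,
    }

def cbn_bottleneck_features(patch, params):
    """Forward pass up to the narrow layer; returns its activations."""
    maps = [max_pool(conv2d_valid(patch, k), params["pool"])
            for k in params["kernels"]]
    flat = [v for m in maps for row in m for v in row]
    hidden = dense(flat, params["w_hidden"], params["b_hidden"])
    return dense(hidden, params["w_bottleneck"], params["b_bottleneck"])

rng = random.Random(0)
# Placeholder inputs: random patches standing in for an audio
# time-frequency segment and a mouth-region image (assumed shapes).
audio_patch = random_matrix(rng, 12, 10, scale=1.0)
visual_patch = random_matrix(rng, 12, 12, scale=1.0)

audio_feat = cbn_bottleneck_features(audio_patch, init_params(rng, (12, 10)))
visual_feat = cbn_bottleneck_features(visual_patch, init_params(rng, (12, 12)))

# One simple way to combine the two streams is concatenation; how the
# paper integrates audio and visual information is not stated here.
audio_visual_feat = audio_feat + visual_feat
print(len(audio_feat), len(visual_feat), len(audio_visual_feat))
```

The narrow layer forces the network to compress each input patch into a few values, which is the general idea behind using bottleneck activations as features. In this sketch each stream gets its own network and the two feature vectors are simply concatenated, which is an illustrative choice rather than the authors' integration scheme.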

Source journal

IPSJ Transactions on Computer Vision and Applications (Computer Science: Computer Vision and Pattern Recognition)

Self-citation rate

0.00%

Number of articles published