An investigation of audio-visual speech recognition as applied to multimedia speech therapy applications

V. Georgopoulos
Proceedings IEEE International Conference on Multimedia Computing and Systems, 1999-06-07. DOI: 10.1109/MMCS.1999.779249 (https://doi.org/10.1109/MMCS.1999.779249)
Citations: 8

Abstract

A multimedia speech therapy system should support customized therapy for different speech problems and for patients of different ages. Its speech recognizer must be designed to work with high inter- and intra-speaker variability. Beyond displaying text on a screen, recording the patient reading the text aloud, analyzing the recorded speech signal, and performing speech recognition (including identifying speech irregularities and tracking patient progress), the system should be able to analyze the visual signal of the patient's speech and provide visual as well as audio feedback. This implies that synchronization of the different media is important in realizing effective multimedia speech therapy applications. To perform the speech recognition and identification tasks, time-frequency analysis and neural networks are proposed, with visual information integrated.
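The abstract names time-frequency analysis and the integration of visual information but gives no implementation details. The following is only a minimal sketch of one way those two ideas could fit together, assuming a Hann-windowed short-time Fourier transform as the time-frequency analysis and a simple nearest-frame alignment to synchronize audio and video features before they are fed to a neural network. The function names, frame sizes, sample rates, and the 4-dimensional visual feature vector are all hypothetical and are not taken from the paper.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One row per audio frame, one column per frequency bin.
    return np.abs(np.fft.rfft(frames, axis=1))

def fuse_audio_visual(audio_feats, audio_rate, visual_feats, video_rate):
    """Early fusion: pair each audio frame with the nearest-in-time video frame.

    audio_rate and video_rate are frame rates (frames per second).
    """
    t_audio = np.arange(len(audio_feats)) / audio_rate
    idx = np.round(t_audio * video_rate).astype(int)
    idx = np.clip(idx, 0, len(visual_feats) - 1)
    return np.concatenate([audio_feats, visual_feats[idx]], axis=1)

# Hypothetical input: 1 s of a 440 Hz tone sampled at 8 kHz, plus 25 video
# frames, each described by 4 made-up visual (e.g. lip-shape) features.
sample_rate = 8000
hop = 128
t = np.arange(sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 440.0 * t)
visual = np.random.default_rng(0).random((25, 4))

spec = stft_magnitude(audio, frame_len=256, hop=hop)
fused = fuse_audio_visual(spec, sample_rate / hop, visual, video_rate=25)
```

Each row of `fused` holds the spectral magnitudes of one audio frame followed by the visual features of the video frame closest to it in time, which is one simple way to present synchronized audio-visual input to a classifier. Other fusion strategies, such as combining the outputs of separate audio and visual recognizers, are equally compatible with the abstract.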