仅使用语音信息的口语文档主题建模

Timothy J. Hazen, M. Siu, H. Gish, S. Lowe, Arthur Chan
{"title":"仅使用语音信息的口语文档主题建模","authors":"Timothy J. Hazen, M. Siu, H. Gish, S. Lowe, Arthur Chan","doi":"10.1109/ASRU.2011.6163964","DOIUrl":null,"url":null,"abstract":"This paper explores both supervised and unsupervised topic modeling for spoken audio documents using only phonetic information. In cases where word-based recognition is unavailable or infeasible, phonetic information can be used to indirectly learn and capture information provided by topically relevant lexical items. In some situations, a lack of transcribed data can prevent supervised training of a same-language phonetic recognition system. In these cases, phonetic recognition can use cross-language models or self-organizing units (SOUs) learned in a completely unsupervised fashion. This paper presents recent improvements in topic modeling using only phonetic information. We present new results using recently developed techniques for discriminative training for topic identification used in conjunction with recent improvements in SOU learning. A preliminary examination of the use of unsupervised latent topic modeling for unsupervised discovery of topics and topically relevant lexical items from phonetic information is also presented.","PeriodicalId":338241,"journal":{"name":"2011 IEEE Workshop on Automatic Speech Recognition & Understanding","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":"{\"title\":\"Topic modeling for spoken documents using only phonetic information\",\"authors\":\"Timothy J. Hazen, M. Siu, H. Gish, S. Lowe, Arthur Chan\",\"doi\":\"10.1109/ASRU.2011.6163964\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper explores both supervised and unsupervised topic modeling for spoken audio documents using only phonetic information. In cases where word-based recognition is unavailable or infeasible, phonetic information can be used to indirectly learn and capture information provided by topically relevant lexical items. In some situations, a lack of transcribed data can prevent supervised training of a same-language phonetic recognition system. In these cases, phonetic recognition can use cross-language models or self-organizing units (SOUs) learned in a completely unsupervised fashion. This paper presents recent improvements in topic modeling using only phonetic information. We present new results using recently developed techniques for discriminative training for topic identification used in conjunction with recent improvements in SOU learning. A preliminary examination of the use of unsupervised latent topic modeling for unsupervised discovery of topics and topically relevant lexical items from phonetic information is also presented.\",\"PeriodicalId\":338241,\"journal\":{\"name\":\"2011 IEEE Workshop on Automatic Speech Recognition & Understanding\",\"volume\":\"25 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"19\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE Workshop on Automatic Speech Recognition & Understanding\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU.2011.6163964\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE Workshop on Automatic Speech Recognition & Understanding","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2011.6163964","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 19

摘要

本文探讨了仅使用语音信息的语音文档的有监督和无监督主题建模。在基于词的识别不可用或不可行的情况下,语音信息可以间接地学习和获取与主题相关的词汇项提供的信息。在某些情况下,缺乏转录数据可能会阻碍对同语言语音识别系统的监督训练。在这些情况下,语音识别可以使用跨语言模型或以完全无监督的方式学习的自组织单元(soe)。本文介绍了仅使用语音信息进行主题建模的最新进展。我们提出了新的结果,使用了最近开发的鉴别训练技术,用于主题识别,并结合了最近在SOU学习方面的改进。本文还提出了使用无监督潜在主题模型从语音信息中无监督地发现主题和话题相关词汇项目的初步研究。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Topic modeling for spoken documents using only phonetic information
This paper explores both supervised and unsupervised topic modeling for spoken audio documents using only phonetic information. In cases where word-based recognition is unavailable or infeasible, phonetic information can be used to indirectly learn and capture information provided by topically relevant lexical items. In some situations, a lack of transcribed data can prevent supervised training of a same-language phonetic recognition system. In these cases, phonetic recognition can use cross-language models or self-organizing units (SOUs) learned in a completely unsupervised fashion. This paper presents recent improvements in topic modeling using only phonetic information. We present new results using recently developed techniques for discriminative training for topic identification used in conjunction with recent improvements in SOU learning. A preliminary examination of the use of unsupervised latent topic modeling for unsupervised discovery of topics and topically relevant lexical items from phonetic information is also presented.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Applying feature bagging for more accurate and robust automated speaking assessment Towards choosing better primes for spoken dialog systems Accent level adjustment in bilingual Thai-English text-to-speech synthesis Fast speaker diarization using a high-level scripting language Evaluating prosodic features for automated scoring of non-native read speech
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1