Stochastic method for automatic recognition of topics
K. Scheffler, J. du Preez
Proceedings of the 1998 South African Symposium on Communications and Signal Processing-COMSIG '98 (Cat. No. 98EX214), 1998-09-07
DOI: 10.1109/COMSIG.1998.736924 (https://doi.org/10.1109/COMSIG.1998.736924)
Citations: 1
Abstract
Topic spotting in conversational speech has received growing attention in recent years. The goal is to develop a system that can identify topics of interest in large volumes of speech data. For practical reasons, researchers are concentrating on phoneme-based methods, which eliminate the need to hand-transcribe topic-specific data. Of the phoneme-based approaches recently proposed, the Euclidean nearest wrong neighbour (ENWN) system (Kuhn et al., 1997) has yielded the most promising experimental results. A phoneme-based topic spotter uses a phoneme recogniser to transcribe the speech data. The main problem with this approach is that such transcriptions are very inaccurate: typically, only 40 to 50 percent of the phonemes are transcribed correctly. It is therefore important to compensate for the low quality of the transcriptions, yet existing techniques make no use of statistical modelling to do so. In this research, a stochastic method for automatic recognition of topics (SMART) was developed to address this problem. The resulting system extends the existing ENWN algorithm. Comparative results show that SMART improves on ENWN, with a 26% reduction in ROC (receiver operating characteristic) error area. This difference was found to be statistically significant.
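The abstract reports its headline result as a relative reduction in ROC error area but does not define that quantity. The following is a minimal sketch, assuming the error area is the area above the ROC curve (1 minus the area under it) and that the 26% figure is a relative reduction against the ENWN baseline. The function names and the toy scores are hypothetical illustrations, not the authors' evaluation code or data.

```python
def roc_points(scores, labels):
    """Return ROC points as (false-alarm rate, detection rate) pairs.

    scores: detector outputs, higher meaning "more likely on-topic".
    labels: 1 for on-topic, 0 for off-topic (assumed encoding).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one on-topic and one off-topic example")

    # Sweep the decision threshold from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    hits = false_alarms = 0
    i = 0
    while i < len(ranked):
        # Items with equal scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                hits += 1
            else:
                false_alarms += 1
            j += 1
        points.append((false_alarms / neg, hits / pos))
        i = j
    return points


def roc_error_area(scores, labels):
    """Area above the ROC curve (assumed meaning of "ROC error area")."""
    pts = roc_points(scores, labels)
    area_under = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area_under += (x1 - x0) * (y0 + y1) / 2.0  # trapezoidal rule
    return 1.0 - area_under


def relative_reduction(baseline_area, new_area):
    """Fractional reduction of new_area relative to baseline_area."""
    return (baseline_area - new_area) / baseline_area


if __name__ == "__main__":
    # Toy scores, invented purely to exercise the functions.
    toy_scores = [0.9, 0.8, 0.7, 0.6]
    toy_labels = [1, 0, 1, 0]
    print(roc_error_area(toy_scores, toy_labels))
```

Under this reading, a baseline error area of 0.25 and a new error area of 0.185 would correspond to a 26% reduction; those two numbers are chosen only to illustrate the arithmetic and do not come from the paper.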