Substate Detection Based Confidence Scoring in Speech Recognition
A. Punnoose
2020 National Conference on Communications (NCC), published 2020-02-01
DOI: https://doi.org/10.1109/NCC48643.2020.9056075
This paper discusses an approach to confidence scoring at the phoneme level. Various features that indicate the strength of a phoneme detection are derived from multilayer perceptron (MLP) posteriors. The capability of these features to discriminate between true positive and false positive phoneme detections is demonstrated. Appropriate distributions are fitted to these features and combined to derive the posterior odds ratio, which signals the confidence of a phoneme detection. Finally, simple thresholding on the posterior odds ratio is used to classify a detected phoneme as a true or false positive. The proposed approach is benchmarked on relevant real-world datasets.
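The scoring step described above can be illustrated with a minimal sketch. The abstract does not name the features or the distributions fitted to them, so everything specific here is an assumption made for illustration: two hypothetical features per detection, a Gaussian fitted to each feature for each class, conditional independence between features, and an equal prior for true and false positives. The toy numbers are invented and are not results from the paper.

```python
import math
from statistics import mean, stdev


def fit_gaussian(values):
    """Fit a Gaussian by sample moments (assumed distribution family)."""
    return mean(values), stdev(values)


def log_gaussian_pdf(x, mu, sigma):
    """Log density of a univariate Gaussian at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def log_posterior_odds(features, tp_params, fp_params, prior_tp=0.5):
    """log [P(TP | features) / P(FP | features)].

    Assumes the features are conditionally independent given the class,
    so the per-feature log likelihood ratios simply add up.
    """
    log_odds = math.log(prior_tp / (1.0 - prior_tp))
    for x, (mu_t, sd_t), (mu_f, sd_f) in zip(features, tp_params, fp_params):
        log_odds += log_gaussian_pdf(x, mu_t, sd_t) - log_gaussian_pdf(x, mu_f, sd_f)
    return log_odds


def is_true_positive(features, tp_params, fp_params, threshold=0.0, prior_tp=0.5):
    """Accept a detection when its log posterior odds exceed the threshold."""
    return log_posterior_odds(features, tp_params, fp_params, prior_tp) > threshold


# Invented toy training data: one row per detection, one column per
# hypothetical feature derived from MLP posteriors.
tp_train = [[0.90, 0.80], [0.85, 0.75], [0.95, 0.70], [0.80, 0.85]]
fp_train = [[0.40, 0.30], [0.50, 0.20], [0.35, 0.40], [0.45, 0.25]]

# Fit one Gaussian per feature and per class.
tp_params = [fit_gaussian(col) for col in zip(*tp_train)]
fp_params = [fit_gaussian(col) for col in zip(*fp_train)]

print(is_true_positive([0.88, 0.78], tp_params, fp_params))
print(is_true_positive([0.42, 0.28], tp_params, fp_params))
```

Working in the log domain keeps the odds numerically stable when several likelihood ratios are multiplied together. A threshold of 0 corresponds to a posterior odds ratio of 1; raising it trades missed true positives for fewer accepted false positives.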