Substate Detection Based Confidence Scoring in Speech Recognition
A. Punnoose
2020 National Conference on Communications (NCC), 2020-02-01
DOI: 10.1109/NCC48643.2020.9056075
Abstract
This paper discusses an approach to confidence scoring at the phoneme level. It introduces several features, derived from multilayer perceptron (MLP) posteriors, that indicate the strength of a phoneme detection, and demonstrates that these features can discriminate between true positive and false positive phoneme detections. Appropriate distributions are fitted to these features and combined to derive the posterior odds ratio, which signals the confidence of a phoneme detection. Finally, simple thresholding on the posterior odds ratio is used to classify a detected phoneme as a true or false positive. The proposed approach is benchmarked on relevant real-world datasets.
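The scoring step described above can be illustrated with a minimal sketch. The abstract does not name the features or the distribution families, so everything specific here is an assumption made for illustration: each feature is modelled with a Gaussian per class, the features are treated as independent, and the function names, the toy feature values, the prior, and the threshold are all hypothetical rather than taken from the paper.

```python
import math
from statistics import NormalDist


def fit_class_model(feature_rows):
    """Fit one Gaussian per feature column.

    The Gaussian family is an assumption; the paper only says that
    "appropriate distributions" are fitted.
    """
    columns = list(zip(*feature_rows))
    return [NormalDist.from_samples(column) for column in columns]


def log_posterior_odds(features, tp_model, fp_model, prior_tp=0.5):
    """Return log[P(TP | x) / P(FP | x)], assuming independent features."""
    # Prior odds of a detection being a true positive (hypothetical value).
    log_odds = math.log(prior_tp) - math.log(1.0 - prior_tp)
    # Add the log-likelihood ratio contributed by each feature.
    for x, tp_dist, fp_dist in zip(features, tp_model, fp_model):
        log_odds += math.log(tp_dist.pdf(x)) - math.log(fp_dist.pdf(x))
    return log_odds


def is_true_positive(features, tp_model, fp_model, threshold=0.0, prior_tp=0.5):
    """Classify a detection by thresholding the log posterior odds."""
    return log_posterior_odds(features, tp_model, fp_model, prior_tp) > threshold


# Toy two-feature training data (made up for illustration only).
tp_rows = [(0.90, 0.80), (0.80, 0.70), (0.85, 0.90), (0.95, 0.75)]
fp_rows = [(0.40, 0.30), (0.50, 0.20), (0.30, 0.40), (0.45, 0.35)]

tp_model = fit_class_model(tp_rows)
fp_model = fit_class_model(fp_rows)
```

Working in the log domain turns the product of per-feature likelihood ratios into a sum, so a threshold of 0 corresponds to posterior odds of 1. In practice the threshold would be tuned on held-out detections to trade off false accepts against false rejects.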