Sharlene A. Liu, S. Doyle, Allen Morris, Farzad Ehsani
5th International Conference on Spoken Language Processing (ICSLP 1998), published 1998-11-30. DOI: 10.21437/ICSLP.1998-761 (https://doi.org/10.21437/ICSLP.1998-761)
The effect of fundamental frequency on Mandarin speech recognition
We study the effects of modeling tone in Mandarin speech recognition. Mandarin has five tones, including the neutral tone, and these tones are syllable-level phenomena. A direct acoustic manifestation of tone is the fundamental frequency (f0). We report on the effect of f0 on the acoustic recognition accuracy of a Mandarin recognizer. In particular, we put f0, its first derivative (f0'), and its second derivative (f0'') in separate streams of the feature vector. Stream weights are adjusted to investigate the individual effects of f0, f0', and f0'' on recognition accuracy. Our results show that incorporating the f0 feature negatively impacted accuracy, whereas f0' increased accuracy and f0'' seemed to have no effect.
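The stream arrangement described in the abstract can be illustrated with a minimal sketch. It rests on two assumptions that the abstract does not state: that the derivatives are estimated with a regression-based delta over neighbouring frames, and that per-stream log-likelihoods are combined as a weighted sum (a common multi-stream HMM formulation). The function names `deltas` and `combined_log_likelihood`, the window width, and all numbers are hypothetical and chosen only for illustration.

```python
def deltas(x, width=2):
    """Regression-based time derivative of a per-frame track (assumed method).

    Frames beyond either end are replaced by the nearest edge frame.
    """
    n = len(x)
    denom = 2 * sum(k * k for k in range(1, width + 1))
    out = []
    for t in range(n):
        num = 0.0
        for k in range(1, width + 1):
            right = x[min(t + k, n - 1)]
            left = x[max(t - k, 0)]
            num += k * (right - left)
        out.append(num / denom)
    return out


def combined_log_likelihood(stream_log_likes, stream_weights):
    """Weighted sum of per-stream log-likelihoods (assumed combination rule).

    A weight of 0.0 removes a stream's influence, which is one way to
    isolate the contribution of f0, f0', or f0''.
    """
    return sum(w * ll for w, ll in zip(stream_weights, stream_log_likes))


# Hypothetical f0 track in Hz: a steadily rising contour.
f0 = [100.0, 110.0, 120.0, 130.0, 140.0]
f0_d1 = deltas(f0)      # first derivative, f0'
f0_d2 = deltas(f0_d1)   # second derivative, f0''

# Hypothetical per-stream log-likelihoods for one frame, in the order
# [spectral features, f0, f0', f0''], with f0 and f0'' switched off.
score = combined_log_likelihood([-10.0, -2.0, -3.0, -4.0],
                                [1.0, 0.0, 0.5, 0.0])
```

Setting individual weights to zero, as in the last call, mirrors the idea of adjusting stream weights to study each feature separately; the actual weights and recognizer used in the paper are not given in the abstract.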