Classified conditional entropy coding of LSP parameters
Authors: Junchen Du, S.P. Kim
Published in: Proceedings DCC '95 Data Compression Conference, 1995-03-28
DOI: 10.1109/DCC.1995.515545 (https://doi.org/10.1109/DCC.1995.515545)
Citations: 2
Abstract
Summary form only given. A new compression scheme for LSP speech parameters is proposed that exploits conditional probability information through classification. Efficient compression of speech LSP parameter vectors requires exploiting higher-order correlations, but the use of conditional probability information has been hindered by its high complexity. For example, an LSP vector has a 34-bit representation in 4.8 kbps CELP coding (FS1016 standard). Using first-order probability information directly is impractical, since 2^34 ≈ 1.7 × 10^10 probability tables would be required and training them would be practically impossible. To reduce the complexity, we reduce the input alphabet size by classifying the LSP vectors according to their phonetic relevance. In other words, speech LSP parameters are classified into groups representing various loosely defined phonemes. The number of phoneme groups used was 32, chosen in consideration of the ambiguity of similar phonemes and of background noise. Conditional probability tables are then constructed for each class by training. To further reduce the complexity, split-VQ is employed. The classification is achieved through vector quantization with a mean squared distortion measure in the LSP domain.
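The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the conditioning context is the class of the previous frame, and the two-entry codebook, the toy frames, and the symbol indices are all hypothetical values chosen so the arithmetic is easy to follow. The function names (`nearest_class`, `train_tables`, `conditional_entropy`, `entropy`) are likewise illustrative. Split-VQ is not modelled here.

```python
import math
from collections import Counter, defaultdict

# Table counts from the abstract: one table per 34-bit context versus one per class.
FULL_CONTEXT_TABLES = 2 ** 34   # about 1.7e10 tables
CLASS_TABLES = 32               # one table per phoneme-like class


def nearest_class(vector, codebook):
    """Return the index of the codeword with the smallest squared error."""
    def sq_err(codeword):
        return sum((v - c) ** 2 for v, c in zip(vector, codeword))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))


def train_tables(classes, symbols):
    """Count each symbol given the class of the previous frame (assumed context)."""
    tables = defaultdict(Counter)
    for prev_class, symbol in zip(classes[:-1], symbols[1:]):
        tables[prev_class][symbol] += 1
    return tables


def conditional_entropy(tables):
    """Empirical H(symbol | class) in bits per symbol."""
    total = sum(sum(t.values()) for t in tables.values())
    h = 0.0
    for table in tables.values():
        n = sum(table.values())
        for count in table.values():
            h -= (count / total) * math.log2(count / n)
    return h


def entropy(symbols):
    """Empirical unconditional entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical toy data: 2 classes instead of 32, 2-D vectors instead of LSP vectors.
codebook = [(0.0, 0.0), (1.0, 1.0)]
frames = [(0.1, 0.0), (0.9, 1.1), (0.0, 0.2), (1.0, 0.8), (0.1, 0.1)]
symbols = [5, 7, 5, 7, 5]  # made-up quantizer indices, one per frame

classes = [nearest_class(f, codebook) for f in frames]
tables = train_tables(classes, symbols)

h_plain = entropy(symbols[1:])          # coding cost without any context
h_classified = conditional_entropy(tables)  # coding cost given the previous class
```

In this toy sequence the previous frame's class fully determines the next symbol, so the class-conditional entropy drops to zero while the unconditional entropy stays at one bit per symbol. Real LSP data would show a smaller gain, and the size of that gain is what determines whether 32 classes capture enough of the inter-frame correlation.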