High-fidelity multichannel audio coding with Karhunen-Loeve transform
Dai Yang, H. Ai, C. Kyriakakis, C.-C. Jay Kuo
IEEE Trans. Speech Audio Process., pp. 365-380. Published 2003-07-28.
DOI: 10.1109/TSA.2003.814375 (https://doi.org/10.1109/TSA.2003.814375)
Citation count: 34
Abstract
This paper presents a quality-scalable, high-fidelity multichannel audio compression algorithm based on MPEG-2 Advanced Audio Coding (AAC). In the preprocessing stage, the Karhunen-Loeve transform (KLT) is applied to the multichannel audio signals to remove interchannel redundancy. The decorrelated channels are then compressed by a modified AAC main profile encoder. Finally, a channel transmission control mechanism reorganizes the bitstream so that the multichannel audio is quality-scalable when transmitted over a heterogeneous network. Experimental results show that, at the regular bit rate of 64 kbit/s/ch, the proposed algorithm outperforms AAC while maintaining similar computational complexity. When the bitstream is transmitted to narrowband end users at a lower bit rate, packets in some channels can be dropped, and slightly degraded yet full-channel audio can still be reconstructed at no additional computational cost.
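The interchannel decorrelation step and the effect of dropping channels can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the AAC encoding stage is omitted entirely, the function names `klt_forward` and `klt_inverse` are hypothetical, the five-channel test signal is synthetic, and estimating a single covariance matrix over the whole signal is an assumption made for brevity.

```python
import numpy as np

def klt_forward(x):
    """Decorrelate channels. x has shape (channels, samples)."""
    mean = x.mean(axis=1, keepdims=True)
    centered = x - mean
    # Interchannel covariance matrix (channels x channels).
    cov = centered @ centered.T / x.shape[1]
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]      # strongest component first
    basis = eigvecs[:, order].T            # rows are eigenvectors
    y = basis @ centered                   # decorrelated "eigen-channels"
    return y, basis, mean

def klt_inverse(y, basis, mean, keep=None):
    """Rebuild all original channels; optionally use only the first `keep` eigen-channels."""
    if keep is not None:
        y = y.copy()
        y[keep:] = 0.0                     # simulate dropped channels
    # The basis is orthonormal, so its transpose is its inverse.
    return basis.T @ y + mean

# Synthetic, highly correlated 5-channel signal (assumption for illustration).
rng = np.random.default_rng(0)
common = rng.standard_normal(4096)
gains = (1.0, 0.8, 0.6, 0.9, 0.7)
x = np.stack([g * common + 0.05 * rng.standard_normal(4096) for g in gains])

y, basis, mean = klt_forward(x)
full = klt_inverse(y, basis, mean)             # all eigen-channels received
reduced = klt_inverse(y, basis, mean, keep=2)  # only two eigen-channels received

rel_err = np.linalg.norm(reduced - x) / np.linalg.norm(x)
print(f"relative error with 2 of 5 eigen-channels: {rel_err:.3f}")
```

Because the transform concentrates energy in the first few eigen-channels, discarding the weakest ones still yields a reconstruction of every original channel, and the inverse is the same matrix multiplication either way. This mirrors, in a simplified form, the behavior the abstract describes for narrowband users.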