Title: A Lightweight Learning Framework for Packet Loss Concealment and Speech Enhancement
Authors: Syu-Siang Wang; Chen-Chih Tsai; Wei-Cheng Yu; Shih-Hau Fang
Journal: IEEE Transactions on Cognitive Communications and Networking, vol. 11, no. 3, pp. 2043-2053
Published: 2024-10-17
DOI: 10.1109/TCCN.2024.3482355
URL: https://ieeexplore.ieee.org/document/10720812/
Journal metrics: impact factor 7.0; JCR Q1 (Telecommunications)
Open access: no
Citations: 0
Abstract
Voice-related online communication applications are vulnerable to disruptions from complex environments, such as packet loss in IP-switch channels and ambient noise, which hinder communication efficiency. Various packet loss concealment (PLC) and speech enhancement (SE) techniques have been developed to enhance speech quality and to improve user experience. However, generating high-quality speech under noisy conditions with low computational cost remains a significant challenge. This study introduces a lightweight FCN-IPF approach that integrates a fully convolutional network (FCN) with an interpolation-based post-filter (IPF) to reduce signal processing time while improving sound quality and intelligibility. The proposed FCN-IPF is evaluated on system robustness, computational cost, and performance under packet loss and ambient noise conditions. Results show that FCN-IPF-processed speech achieves improvements of 17.65% in sound quality and 12.00% in intelligibility compared to the packet loss input. Under challenging conditions, sound quality and intelligibility are improved by 12.84% and 10.45%, respectively. Additionally, compared to the conventional CRN_FC method, FCN-IPF reduces model processing time per frame from 60.52 ms to 42.69 ms and decreases memory usage from 162 MB to 6.51 MB. These enhancements lower communication latency and boost noise reduction capabilities, making FCN-IPF well-suited for communication in challenging environments.
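The efficiency figures quoted in the abstract can be restated as relative reductions. The short sketch below only does arithmetic on the numbers given above (60.52 ms to 42.69 ms per frame, 162 MB to 6.51 MB); the helper name `relative_reduction` and the variable names are illustrative and do not come from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction going from `before` to `after`."""
    return (before - after) / before


# Figures quoted in the abstract: CRN_FC baseline first, FCN-IPF second.
time_ms = (60.52, 42.69)    # model processing time per frame, in ms
memory_mb = (162.0, 6.51)   # memory usage, in MB

time_cut = relative_reduction(*time_ms)
memory_cut = relative_reduction(*memory_mb)
memory_ratio = memory_mb[0] / memory_mb[1]

print(f"Per-frame time reduction: {time_cut:.1%}")
print(f"Memory reduction: {memory_cut:.1%}")
print(f"Memory ratio (baseline / FCN-IPF): {memory_ratio:.1f}x")
```

Under these figures, per-frame processing time drops by roughly 29% and memory usage by about 96%, meaning the baseline uses close to 25 times as much memory as FCN-IPF.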
About the journal:
The IEEE Transactions on Cognitive Communications and Networking (TCCN) aims to publish high-quality manuscripts that push the boundaries of cognitive communications and networking research. Cognitive, in this context, refers to the application of perception, learning, reasoning, memory, and adaptive approaches in communication system design. The journal welcomes submissions that explore various aspects of cognitive communications and networks, focusing on innovative and holistic approaches to complex system design. Key topics include architecture, protocols, cross-layer design, and cognition cycle design for cognitive networks. Research on machine learning, artificial intelligence, end-to-end and distributed intelligence, software-defined networking, cognitive radios, spectrum sharing, and security and privacy issues in cognitive networks is also of interest. The journal also encourages papers addressing novel services and applications enabled by these cognitive concepts.