Low-Resource Speech Recognition of Radiotelephony Communications Based on Continuous Learning of In-Domain and Out-of-Domain Knowledge
Authors: Guimin Jia; Dong He; Xilong Zhou
DOI: 10.1109/LSP.2025.3545955
Journal: IEEE Signal Processing Letters, vol. 32, pp. 1136-1140
Publication date: 2025-02-26 (Journal Article)
Impact factor: 3.2, JCR Q2 (Engineering, Electrical & Electronic)
URL: https://ieeexplore.ieee.org/document/10904317/
Citation count: 0
Abstract
Automatic speech recognition (ASR) in air traffic control (ATC) is a low-resource task with limited data and difficult annotation. Fine-tuning self-supervised pre-trained models is a potential solution, but it is time-consuming and computationally expensive, and may degrade the model's ability to extract robust features. Therefore, we propose a continuous learning approach for end-to-end ASR to maintain performance in both new and original tasks. To address catastrophic forgetting in continuous learning for ASR, we propose a knowledge distillation-based method combined with stochastic encoder-layer fine-tuning. This approach efficiently retains knowledge from previous tasks with limited training data, reducing the need for extensive joint training. Experiments on open-source ATC datasets show that our method effectively reduces forgetting and outperforms existing techniques.
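The abstract names two ingredients, knowledge distillation to counter catastrophic forgetting and stochastic encoder-layer fine-tuning, without giving their form. A minimal sketch of how such a combination is typically set up is below; the function names, the temperature `T`, and the weight `lam` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a 1-D logit vector."""
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between the (frozen) teacher's and the student's
    temperature-softened output distributions; this is the usual
    distillation penalty that discourages drifting from old-task behavior."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q))) * T * T)

def total_loss(task_loss, student_logits, teacher_logits, lam=0.5, T=2.0):
    """New-task loss plus a distillation term weighted by lam (assumed hyperparameter)."""
    return task_loss + lam * kd_loss(student_logits, teacher_logits, T)

def sample_trainable_layers(num_layers, k, rng):
    """Stochastic encoder-layer fine-tuning: at each step, randomly pick k of
    num_layers encoder layers to update while the rest stay frozen."""
    return sorted(rng.choice(num_layers, size=k, replace=False).tolist())
```

For example, with identical student and teacher logits the distillation term vanishes and only the new-task loss remains, which is the intended behavior when the student has not drifted from the teacher.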
About the journal:
The IEEE Signal Processing Letters is a monthly archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language, and audio processing. Papers published in the Letters can be presented within one year of their appearance at signal processing conferences such as ICASSP, GlobalSIP, and ICIP, as well as at several workshops organized by the Signal Processing Society.