Speech/pause detection algorithm based on the adaptive method of complementary decomposition and energy assessment of intrinsic mode functions
A. Alimuradov, A. Tychkov, A. Ageykin, P. Churakov, Yury S. Kvitka, A. Zaretskiy
2017 XX IEEE International Conference on Soft Computing and Measurements (SCM), published 2017-05-01
DOI: 10.1109/SCM.2017.7970665 (https://doi.org/10.1109/SCM.2017.7970665)
Citations: 6
Abstract
Speech/pause detection is an important task in speech signal processing. Its effectiveness depends on how accurately the amplitude, time, frequency, and energy characteristics of speech signals are measured. Large measurement errors stem mainly from the use of non-adaptive processing methods. The goal of this work is to develop an effective speech/pause detection algorithm based on the adaptive method of complementary ensemble empirical mode decomposition (CEEMD). The method is adaptive in that the basis functions used in the decomposition are extracted from the original speech signal itself, so that only its inherent features (hidden modulation, energy concentration regions, etc.) are taken into account. The algorithm was studied using the MATLAB mathematical modeling software package. A speech/pause detection algorithm based on complementary decomposition and energy estimation of empirical modes is developed, and its block diagram is presented with a detailed mathematical description. The advantages of the developed algorithm over widely used known analogs (STE + ZCR, IE, and MFCC) are indicated. The developed algorithm increases the correct speech/pause detection rate by an average of 6%. Comparison with these analogs suggests that the developed algorithm can be recommended for practical use in voice control systems (VCS).
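The abstract does not state the decision rule, so the following is only a minimal sketch of the general idea of energy assessment over mode functions, not the authors' algorithm. It assumes the intrinsic mode functions (IMFs) have already been obtained from a CEEMD-style decomposition performed elsewhere. The function names, the frame length, the hop size, and the relative threshold `rel_threshold` are all hypothetical choices made for illustration. The authors worked in MATLAB; Python is used here only to keep the sketch self-contained.

```python
import math


def frame_energies(x, frame_len, hop):
    """Short-time energy (sum of squares) of each full frame of a sequence."""
    return [
        sum(v * v for v in x[start:start + frame_len])
        for start in range(0, len(x) - frame_len + 1, hop)
    ]


def detect_speech(imfs, frame_len=160, hop=80, rel_threshold=0.1):
    """Flag frames whose summed IMF energy exceeds a fraction of the peak.

    imfs: list of equal-length sequences, assumed to be IMFs produced by
          a CEEMD (or other EMD-family) decomposition done elsewhere.
    rel_threshold: hypothetical tuning parameter, not taken from the paper.
    Returns one boolean per frame: True for speech, False for pause.
    """
    per_imf = [frame_energies(imf, frame_len, hop) for imf in imfs]
    # Add the energies of all IMFs frame by frame.
    total = [sum(frame) for frame in zip(*per_imf)]
    peak = max(total)
    if peak == 0:
        return [False] * len(total)
    return [e > rel_threshold * peak for e in total]


# Synthetic stand-in for two IMFs: silence, a tonal burst, silence.
fs = 8000                      # assumed sampling rate in Hz
n = 2400                       # total number of samples
burst = range(800, 1600)       # samples where the "speech" burst is active
imf1 = [math.sin(2 * math.pi * 400 * t / fs) if t in burst else 0.0
        for t in range(n)]
imf2 = [0.5 * math.sin(2 * math.pi * 150 * t / fs) if t in burst else 0.0
        for t in range(n)]

flags = detect_speech([imf1, imf2])
speech_frames = [i for i, is_speech in enumerate(flags) if is_speech]
```

Here `speech_frames` lists the indices of the frames that overlap the burst. A threshold relative to the peak energy is the simplest possible rule; a practical detector would more likely estimate a noise floor and might weight individual IMFs differently, but the abstract gives no detail on either point.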