Pub Date: 2022-11-26 DOI: 10.1109/ICICSP55539.2022.10050592
Gang Yang, Xiaolei Wang, Lulu Wang, Yi Zhang, Yung-Su Han, Xin Tan, Shang Yong Zhang
Convolutional neural network (CNN) models are highly vulnerable to adversarial samples, which poses a serious challenge to their security. For the task of CNN-based modulation recognition of communication signals, we propose a white-box attack algorithm, the shortest distance attack method (SD-Alg), which generates extremely small perturbations while greatly reducing the classification performance of the model. Experiments show that our algorithm excels among algorithms of the same type in attack success rate, running time, and adversarial perturbation size.
{"title":"Adversarial Attack on Communication Signal Modulation Recognition","authors":"Gang Yang, Xiaolei Wang, Lulu Wang, Yi Zhang, Yung-Su Han, Xin Tan, Shang Yong Zhang","doi":"10.1109/ICICSP55539.2022.10050592","DOIUrl":"https://doi.org/10.1109/ICICSP55539.2022.10050592","url":null,"abstract":"Convolutional network models (CNN) are very vulnerable to adversarial samples, which poses a serious challenge to the security of CNN models. Based on the task of CNN's modulation and identification of communication signals, we propose a white-box attack algorithm, the shortest distance attack method (SD-Alg), which can generate extremely small disturbances and greatly reduce the classification performance of the model. Experiments show that our algorithm excels in attack success rate, running time and adversarial perturbation size among the same type of algorithms.","PeriodicalId":281095,"journal":{"name":"2022 5th International Conference on Information Communication and Signal Processing (ICICSP)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127483282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
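The notion of a shortest-distance perturbation in the abstract above can be illustrated on a toy case. The sketch below is not SD-Alg, whose details the abstract does not give. It assumes a hypothetical linear binary classifier f(x) = w·x + b, for which the smallest L2 perturbation that reaches the decision boundary has a closed form. The names `minimal_perturbation`, `w`, `b`, and `overshoot` are illustrative assumptions.

```python
import numpy as np

def minimal_perturbation(x, w, b, overshoot=1e-3):
    """Smallest L2 perturbation that moves x across the boundary w.x + b = 0.

    Toy illustration for a linear classifier only; not the paper's SD-Alg.
    """
    f = float(np.dot(w, x) + b)      # signed score of the clean sample
    r = -f * w / np.dot(w, w)        # orthogonal step onto the decision boundary
    return (1.0 + overshoot) * r     # slight overshoot so the predicted label flips

# Hypothetical 2-D example (all values are made up for illustration)
w = np.array([3.0, 4.0])
b = -5.0
x = np.array([3.0, 4.0])             # clean score: 9 + 16 - 5 = 20 > 0
r = minimal_perturbation(x, w, b)
x_adv = x + r                        # adversarial sample just across the boundary
```

For a nonlinear CNN the boundary has no closed form, so attacks of this family typically linearize the model around the current sample and iterate; whether SD-Alg does so is not stated in the abstract.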
Pub Date: 2022-11-26 DOI: 10.1109/ICICSP55539.2022.10050600
Yuyang Shao, Hui Ma, Hongzhi Liu
Direction-of-arrival (DOA) estimation is a typical sparse parameter estimation problem. Solution methods include greedy algorithms, norm minimization, and Bayesian estimation. Bayesian methods are superior in estimation accuracy, but their heavy computational load has become the bottleneck. This paper analyzes and compares the computational complexity of sparse Bayesian learning (SBL), multi-task sparse Bayesian learning (MSBL), and inverse-free sparse Bayesian learning (IFSBL) in DOA estimation. Simulations are also provided, and they show that IFSBL is much more computationally efficient than SBL and MSBL.
{"title":"A Study and Comparison of Different Sparse Bayesian Learning Algorithms in DOA Estimation","authors":"Yuyang Shao, Hui Ma, Hongzhi Liu","doi":"10.1109/ICICSP55539.2022.10050600","DOIUrl":"https://doi.org/10.1109/ICICSP55539.2022.10050600","url":null,"abstract":"The direction of arrival (DOA) is a typical sparse parameter estimation problem. Its solution methods include greedy algorithm, norm minimization method and Bayesian estimation, in which the Bayesian methods are superior in estimation accuracy, but huge amount of computation has become the bottle-neck. This paper analyzes and compares the computation complexity of sparse Bayesian learning (SBL), multi-task sparse Bayesian learning (MSBL) and inverse-free sparse Bayesian learning (IFSBL) in DOA estimation. Simulations are also provided and prove that IFSBL is much better than SBL and MSBL in operational efficiency.","PeriodicalId":281095,"journal":{"name":"2022 5th International Conference on Information Communication and Signal Processing (ICICSP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128942391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
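To make the computational bottleneck mentioned in the abstract above concrete, here is a minimal sketch of a basic EM-style SBL iteration. It is not the paper's implementation: it assumes a real-valued, single-snapshot model y = Φx + noise with a known noise variance, whereas DOA estimation would normally use a complex dictionary of steering vectors. The function name `sbl_em` and all test values are hypothetical.

```python
import numpy as np

def sbl_em(Phi, y, sigma2, n_iter=100):
    """Basic EM-style sparse Bayesian learning for y = Phi @ x + noise.

    Real-valued, single snapshot, known noise variance sigma2.
    Returns the posterior mean mu and the prior variances gamma.
    """
    m, n = Phi.shape
    gamma = np.ones(n)                    # prior variance of each coefficient
    mu = np.zeros(n)
    for _ in range(n_iter):
        # Marginal covariance of y: sigma2*I + Phi diag(gamma) Phi^T (m x m)
        Sigma_y = sigma2 * np.eye(m) + (Phi * gamma) @ Phi.T
        B = np.linalg.solve(Sigma_y, Phi)             # Sigma_y^{-1} Phi
        mu = gamma * (B.T @ y)                        # posterior mean
        q = np.sum(Phi * B, axis=0)                   # phi_i^T Sigma_y^{-1} phi_i
        post_var = np.maximum(gamma - gamma**2 * q, 0.0)  # posterior variances
        gamma = mu**2 + post_var                      # EM hyperparameter update
    return mu, gamma

# Hypothetical sparse recovery problem: 3 nonzero coefficients out of 50
rng = np.random.default_rng(0)
Phi = rng.standard_normal((40, 50))
x_true = np.zeros(50)
x_true[[5, 20, 33]] = [1.0, -1.5, 2.0]
y = Phi @ x_true + 0.01 * rng.standard_normal(40)
mu, gamma = sbl_em(Phi, y, sigma2=1e-4)
support = set(np.argsort(np.abs(mu))[-3:].tolist())  # indices of the 3 largest |mu|
```

Each iteration solves an m × m linear system, which is the kind of per-iteration cost that, as the name suggests, inverse-free variants aim to avoid.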
Pub Date: 2022-11-26 DOI: 10.1109/ICICSP55539.2022.10050581
Xuhua Huang, Weize Sun, Lei Huang, Shaowu Chen, Qiang Li
Radar target recognition is an important problem in radar systems and has been widely studied in the past decades. In this paper, we first build a radar target recognition system that classifies the target based on $i$ successive frames of echo values. A baseline model using VGG-16 as the backbone is then introduced. To perform target recognition under different values of $i$, that is, different numbers of frames, two neural networks are then proposed: the Independent Self-supplementary Radar Target recognition Network (ISRTNet) and the Share Self-supplementary Radar Target recognition Network (SSRTNet). Both networks take two input data samples of different frames of echo values and learn the similarities between the two samples in order to achieve better recognition results. The ISRTNet, furthermore, can reduce the total number of network parameters and is suitable for deployment when the system is required to perform target recognition under different numbers of frames. Experimental results show that the proposed models achieve outstanding recognition performance compared with the baseline model.
{"title":"Radar Target Recognition with Self-Supplementary Neural Network","authors":"Xuhua Huang, Weize Sun, Lei Huang, Shaowu Chen, Qiang Li","doi":"10.1109/ICICSP55539.2022.10050581","DOIUrl":"https://doi.org/10.1109/ICICSP55539.2022.10050581","url":null,"abstract":"Radar target recognition is an important problem in radar system and had been widely studied in the pass decades. In this paper, we first build a radar target recognition system that classifies the target based on $i$ successive frames of echo values. A baseline model using the VGG-16 as backbone is then introduced. In order to perform target recognition under different values of $i$ or number of frames, two neural networks, referred to as Independent Self-supplementary Radar Target recognition Network (ISRTNet) and Share Self-supplementary Radar Target recognition Network (SSRTNet), are then proposed. Both networks will feed two input data samples of different frames of echo values to the networks, and study the similarities between the two samples in order to achieve better recognition result. The ISRTNet, furthermore, can reduce the total number of network parameters and is suitable to be deployed in the system when it is required to perform the target recognition under different number of frames. 
Experimental results show that the proposed models can achieve outstanding recognition performance comparing to the baseline model.","PeriodicalId":281095,"journal":{"name":"2022 5th International Conference on Information Communication and Signal Processing (ICICSP)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125548787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-11-26 DOI: 10.1109/ICICSP55539.2022.10050636
Y. Qi, S. Zhou, Changpeng Liu, Jincong Dun, Lei Zhou
Source localization can be achieved by matching the arrival angles of the direct (D) path and the surface-reflected (SR) path with a vertical line array (VLA) in deep water. This source localization method works in regions where the multipath arrivals can be resolved separately, but fails in regions where they are indistinguishable. The separability of multipath arrivals by conventional beamforming (CBF) and the applicable region of the source localization method are analyzed for one experimental configuration (water depth of 1507 m, VLA deployed at depths from 1335.5 m to 1448 m), based on the approximate expression of the half-power (−3 dB) beamwidth of CBF. Experimental results for explosive sources with nominal depths of 200 m and 300 m are also presented.
{"title":"Separability of Multipath Arrivals by Conventional Beamforming and Its Use for Source Localization","authors":"Y. Qi, S. Zhou, Changpeng Liu, Jincong Dun, Lei Zhou","doi":"10.1109/ICICSP55539.2022.10050636","DOIUrl":"https://doi.org/10.1109/ICICSP55539.2022.10050636","url":null,"abstract":"Source localization can be achieved by matching the arrival angles of the direct (D) path and surface-reflected (SR) path with a vertical line array (VLA) in deep water. This source localization method works at regions where multipath arrivals are resolved as separate, while it fails at regions where multipath arrivals are indistinguishable. The separability of multipath arrivals by conventional beamforming (CBF) and the applicable region of the source localization method under one experimental configuration with water depth equal to 1507 m and a VLA deployed at depth from 1335.5 m to 1448 m are analyzed, based on the approximate expression of the half-power (−3 dB) beamwidth of CBF. Experimental results of explosive sources with nominal depth of 200 m and 300 m are also presented.","PeriodicalId":281095,"journal":{"name":"2022 5th International Conference on Information Communication and Signal Processing (ICICSP)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123393092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
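The separability argument in the abstract above rests on the half-power beamwidth of CBF. The sketch below uses a common textbook approximation for a uniformly weighted line array, 0.886 λ / (L cos θ) with θ measured from broadside. This may differ from the expression used in the paper. The sound speed of 1500 m/s, the 200 Hz frequency, the example angles, and the function names are assumptions for illustration; only the array depths come from the abstract.

```python
import math

def cbf_half_power_beamwidth(freq_hz, aperture_m, steer_deg=0.0, c=1500.0):
    """Approximate -3 dB beamwidth (degrees) of a uniform line array.

    Uses 0.886 * lambda / (L * cos(theta)), theta measured from broadside;
    the approximation is only valid away from endfire.
    """
    wavelength = c / freq_hz
    bw_rad = 0.886 * wavelength / (aperture_m * math.cos(math.radians(steer_deg)))
    return math.degrees(bw_rad)

def arrivals_resolved(angle1_deg, angle2_deg, freq_hz, aperture_m, c=1500.0):
    """Crude criterion: two arrivals are separable if their angular spacing
    exceeds the beamwidth evaluated at their mean angle."""
    mean_angle = 0.5 * (angle1_deg + angle2_deg)
    bw = cbf_half_power_beamwidth(freq_hz, aperture_m, mean_angle, c)
    return abs(angle1_deg - angle2_deg) > bw

aperture = 1448.0 - 1335.5                      # VLA span from the abstract: 112.5 m
bw = cbf_half_power_beamwidth(200.0, aperture)  # 200 Hz is an assumed frequency
```

Because the beamwidth scales with wavelength, the region where the D and SR arrivals can be separated shrinks at lower frequencies and for closely spaced arrival angles.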
Pub Date: 2022-11-26 DOI: 10.1109/ICICSP55539.2022.10050619
Chao Peng, Yiwen Wang, Xihong Wu, T. Qu
This paper presents a multi-channel speech separation system for an unknown number of speakers. Using iterative speech separation based on beam signals, a single model can be applied to cases with different numbers of speakers. The system first determines the spatial directions in which the speakers are located (direction of arrival, DOA); deep neural networks then obtain the beam signal in each direction from spectral, spatial, and directional features. Finally, iterative speech separation is performed on the basis of the beam signals. Experimental evaluations show that the proposed method outperforms multi-channel Permutation Invariant Training (PIT) and Deep Clustering (DPCL) for an unknown number of speakers, as well as the one-and-rest speech separation method. Moreover, the system maintains relatively good separation performance even when the number of speakers is increased to 9.
{"title":"A Multi-channel Speech Separation System for Unknown Number of Multiple Speakers","authors":"Chao Peng, Yiwen Wang, Xihong Wu, T. Qu","doi":"10.1109/ICICSP55539.2022.10050619","DOIUrl":"https://doi.org/10.1109/ICICSP55539.2022.10050619","url":null,"abstract":"This paper presents a multi-channel speech separation system for an unknown number of speakers. It can be applied to cases with a different number of speakers using a single model by iterative speech separation based on beam signal. It first determines the spatial directions where speakers are located (Direction of Arrival, DOA), and then the beam signals in each direction are obtained with spectral features, spatial features, and directional features by deep neural networks. Finally, the iterative speech separation is performed on the basis of the beam signals. Experimental evaluations show that the proposed method is better than the multi-channel Permutation Invariant Training (PIT) and Deep Clustering (DPCL) for an unknown number of speakers and the one-and-rest speech separation method. Besides, the system can still keep a relatively good separation performance even though the number of speakers is enlarged to 9.","PeriodicalId":281095,"journal":{"name":"2022 5th International Conference on Information Communication and Signal Processing (ICICSP)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115315231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-11-26 DOI: 10.1109/ICICSP55539.2022.10050648
Long Chen, Lei Huang, Guitong Chen, Weize Sun
The working frequency range and the scale of the scanning area of a microphone array are typically limited by the array geometry. For service robots, achieving a wider working frequency range with a 3-dimensional global view requires a virtually larger and denser 3-dimensional array; because the robot can move, such an array can be realised by non-synchronous measurements beamforming with a movable prototype microphone array. However, even with the state-of-the-art method, it is challenging to localise multiple broadband sources, owing to the difficulty of selecting an appropriate operating frequency without any prior information about the target signal. Therefore, this paper proposes a tensor-completion-based non-synchronous measurement method for broadband multiple-sound-source localisation. The tensor data structure of the broadband signal is analysed, and an alternating direction method based on multiplier optimisation with a tensor multi-norm constraint is proposed. This algorithm can provide a sound map with a distinct 3-dimensional global view of different speech signal sources with high accuracy via a 16-channel planar microphone array. Compared with the matrix-based optimisation method, the proposed method can significantly reduce the mean square error of the estimated source location.
{"title":"A Large Scale 3D Sound Source Localisation Approach Achieved via Small Size Microphone Array for Service Robots","authors":"Long Chen, Lei Huang, Guitong Chen, Weize Sun","doi":"10.1109/ICICSP55539.2022.10050648","DOIUrl":"https://doi.org/10.1109/ICICSP55539.2022.10050648","url":null,"abstract":"Ahstract-The working frequency range and the scale of the scanning area of a microphone array are typically limited by the array geometry. Owing to its movable feature, for the service robots, achieving a wider working frequency range with a 3-dimension global view requires a virtually larger and denser 3-dimension array, which can be realised by using non-synchronous measurements beamforming with a movable microphone prototype array. However, even when using the state-of-the-art method, it is challenging to localise multiple broadband sources, owing to the difficulty in selecting an appropriate operating frequency without any prior information about the target signal. Therefore, this paper proposes a tensor-completion-based non-synchronous measurement method for broadband multiple-sound-source localisation. The tensor data structure of the broadband signal is analysed, and an alternating direction method based on multiplier optimisation with a tensor multi-norm constraint is proposed. This algorithm can provide a sound map with a distinct 3-dimension global view of different speech signal sources with high accuracy via a 16-channel planar microphone array. 
Compared with the matrix-based optimisation method, the proposed method can significantly reduce the mean square error of the estimated source location.","PeriodicalId":281095,"journal":{"name":"2022 5th International Conference on Information Communication and Signal Processing (ICICSP)","volume":"133 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132788777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-11-26 DOI: 10.1109/ICICSP55539.2022.10050573
Qixin Guo, Liang Yu, Rui Wang, Ran Wang, Weikang Jiang, Wancheng Ge
A layered medium is studied in this paper. The medium is divided into four layers in the vertical direction according to the variation of the sound speed, which is either constant or changes linearly within each layer. The sound speed does not change (or changes slowly) in the horizontal direction. The wave equation has no analytical solution owing to the irregular variation of the sound speed. In the forward model, the spectral element method is used to obtain a numerical solution of the wave equation by performing a full-wave simulation in the time domain. In the inverse problem, the iterative Bayesian focusing algorithm is applied to perform acoustic imaging. Finally, a numerical simulation is designed to validate the four-layer model and the acoustic imaging algorithm. The simulation results, including a comparison with the conventional beamforming algorithm, demonstrate that the iterative Bayesian focusing algorithm achieves accurate localization and reconstruction in the four-layer model.
{"title":"Acoustic Imaging in Four-layer Medium by Iterative Bayesian Focusing Algorithm","authors":"Qixin Guo, Liang Yu, Rui Wang, Ran Wang, Weikang Jiang, Wancheng Ge","doi":"10.1109/ICICSP55539.2022.10050573","DOIUrl":"https://doi.org/10.1109/ICICSP55539.2022.10050573","url":null,"abstract":"A layered medium is studied in this paper. The considered layered medium is divided into four layers in the vertical direction according to the variation of the sound speed, which is a constant or changes linearly in each layer. The sound speed does not change (or changes slowly) in the horizontal direction. The wave equation has no analytical solution owing to the irregular variation of the sound speed. In the forward model, the spectral element method is used for numerical simulations to get the numerical solution of the wave equation, which is implemented by performing a full-wave simulation in the time domain. The iterative Bayesian focusing algorithm is applied to implement acoustic imaging in the inverse problem. Finally, a numerical simulation is designed to validate the four-layer model and the acoustic imaging algorithm. 
The simulation results demonstrate that the iterative Bayesian focusing algorithm has accurate localization effects and reconstruction results in the four-layer model and its comparison with the conventional beamforming algorithm.","PeriodicalId":281095,"journal":{"name":"2022 5th International Conference on Information Communication and Signal Processing (ICICSP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131810105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Smart noise convolution jamming based on digital radio frequency memory (DRFM) has both oppressive and deceptive jamming effects. It is difficult to suppress effectively with traditional anti-jamming methods and seriously degrades the performance of modern radar systems. In this paper, a noise convolution jamming suppression method for multiple-input multiple-output (MIMO) radar with a frequency diverse array (FDA) is proposed. The jamming signal is modulated by the jammer and emitted across the pulse repetition time (PRT), so it differs from the target echo in transmit spatial frequency and Doppler frequency. On this basis, jamming samples are first selected using the range-dependent transmit steering vector of the FDA-MIMO radar. A filter is then designed in the transmit spatial frequency domain, and the jamming is suppressed owing to the range mismatch. Simulation results demonstrate the effectiveness of the proposed approach.
{"title":"A Method to Suppress the Noise Convolution Jamming in FDA-MIMO Radar","authors":"Yanxing Wang, Shengqi Zhu, Ximin Li, Jingwei Xu, Lan Lan, Zhuochen Chen","doi":"10.1109/ICICSP55539.2022.10050647","DOIUrl":"https://doi.org/10.1109/ICICSP55539.2022.10050647","url":null,"abstract":"Smart noise convolution jamming based on digital radio frequency memory (DRFM) has both oppressive jamming effect and deceptive jamming effect, which is difficult to be effectively suppressed by traditional anti-jamming methods and is a serious effect on the performance of modern radar systems. In this paper, a noise convolution jamming suppression method for multiple-input-mutiple-output (MIMO) radar with frequency diverse array (FDA) is proposed. The jamming signal is modulated by the jammer and emitted across the pulse repetition period (PRT), and is different from the target echo in the transmit spatial frequency and the Doppler frequency. On this basis, jamming samples are selected based on the range-dependent transmit steering vector of the FDA-MIMO radar firstly. The filter is subsequently designed in the transmitting spatial frequency domain, and the jamming is suppressed due to the range mismatch. Simulation results demonstrate the effectiveness of the proposed approach.","PeriodicalId":281095,"journal":{"name":"2022 5th International Conference on Information Communication and Signal Processing (ICICSP)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132196746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-11-26 DOI: 10.1109/ICICSP55539.2022.10050696
Lirong Li, Bing Mei, Peng Chen, Liang Yu, Pengcheng Gong
In this paper, we propose a positioning algorithm based on sonar images. It is mainly intended for underwater robots, so that a robot can obtain its position in real time when operating underwater and avoid colliding with the underwater walls. First, the underwater space is scanned with multibeam sonar, and the sonar images of the walls of the underwater space are found to have line segment characteristics. Composite denoising, threshold segmentation, and Canny edge detection are applied to the sonar image to extract the contours of the walls. The characteristic line segments of the walls are then detected with the LSD (Line Segment Detector) algorithm. For line segment classification, a method is proposed that effectively classifies line segments using the origin of the sonar image and the slope of each detected line segment. To further demonstrate the effectiveness of the localization algorithm, its specific steps are illustrated with the example of an underwater rectangular space.
{"title":"Localization Method of Underwater Robot Based on Sonar Image","authors":"Lirong Li, Bing Mei, Peng Chen, Liang Yu, Pengcheng Gong","doi":"10.1109/ICICSP55539.2022.10050696","DOIUrl":"https://doi.org/10.1109/ICICSP55539.2022.10050696","url":null,"abstract":"In this paper, we propose a positioning algorithm based on sonar images, which is mainly used for the positioning of underwater robots, so that the robots can obtain the position information in real time when operating underwater and avoid colliding with the underwater walls. First, the underwater space was detected with multibeam sonar, and the sonar images of the underwater space wall were found to have line segment characteristics; Composite denoising, threshold segmentation and Canny edge detection are applied to the sonar image to extract the contours of the underwater spatial wall. Then the characteristic line segments of the underwater spatial wall are detected based on the LSD (Line Segment Detector) line segment detection algorithm. In terms of line segment classification, a method is proposed to effectively classify line segments using the origin of the sonar image and the slope of the detected line segment. 
To further demonstrate the effectiveness of the localization algorithm in this paper, the specific steps of the algorithm are illustrated with the example of underwater rectangular space.","PeriodicalId":281095,"journal":{"name":"2022 5th International Conference on Information Communication and Signal Processing (ICICSP)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133034089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
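The classification step described in the abstract above, which uses the image origin and the slope of each detected segment, can be sketched as follows. This is a hypothetical illustration and not the authors' method. It assumes segments are given as `(x1, y1, x2, y2)` tuples in a frame whose origin is the sonar position, and that the walls of a rectangular space fall into two roughly perpendicular families. The function names, the 15° tolerance, and the sample segments are all assumptions.

```python
import math

def describe_segment(seg, origin=(0.0, 0.0)):
    """Return (orientation in degrees within [0, 180), distance from origin to the line)."""
    x1, y1, x2, y2 = seg
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
    ox, oy = origin
    length = math.hypot(x2 - x1, y2 - y1)
    # Perpendicular distance from the origin to the infinite line through the segment
    dist = abs((x2 - x1) * (y1 - oy) - (x1 - ox) * (y2 - y1)) / length
    return angle, dist

def classify_segments(segments, ref_angle, tol_deg=15.0):
    """Split segments into those roughly parallel to ref_angle,
    those roughly perpendicular to it, and the rest."""
    parallel, perpendicular, other = [], [], []
    for seg in segments:
        angle, _ = describe_segment(seg)
        diff = abs(angle - ref_angle) % 180.0
        diff = min(diff, 180.0 - diff)        # angular difference folded into [0, 90]
        if diff <= tol_deg:
            parallel.append(seg)
        elif diff >= 90.0 - tol_deg:
            perpendicular.append(seg)
        else:
            other.append(seg)
    return parallel, perpendicular, other

# Hypothetical detected segments (units arbitrary)
segments = [(-5.0, 4.0, 5.0, 4.0), (3.0, -2.0, 3.0, 6.0),
            (-5.0, -3.0, 4.0, -3.2), (0.0, 0.0, 1.0, 1.0)]
par, perp, other = classify_segments(segments, ref_angle=0.0)
```

In a rectangular space, the perpendicular distances returned by `describe_segment` for the two wall families would give the robot's offsets from the walls, which is one plausible way such a classification could feed into localization.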
Pub Date: 2022-11-26 DOI: 10.1109/ICICSP55539.2022.10050618
Sijun Bi, Liang Xu, Shenghui Zhao, Jing Wang
Air-conducted (AC) sound is usually used for acoustic scene classification (ASC). Compared with AC sound, bone-conducted (BC) sound has the unique advantage of shielding background noise. However, BC sound contains far less information than AC sound because of its limited frequency bandwidth. In this paper, an acoustic scene classification method for BC sound is proposed that works with a small BC dataset. First, prosodic features are combined with spectral features through feature fusion to capture more information. Second, to deal with the small BC dataset, transfer learning from a large AC dataset is used. Finally, a deep learning network based on local residual learning is proposed. The experimental results show that the proposed method achieves superior performance over the reference models.
{"title":"Acoustic Scene Classification for Bone-Conducted Sound Using Transfer Learning and Feature Fusion","authors":"Sijun Bi, Liang Xu, Shenghui Zhao, Jing Wang","doi":"10.1109/ICICSP55539.2022.10050618","DOIUrl":"https://doi.org/10.1109/ICICSP55539.2022.10050618","url":null,"abstract":"The air-conducted (AC) sound is usually used in the task of acoustic scene classification (ASC). Compared with the AC sound, bone-conducted (BC) sound has the unique advantage of shielding background noise. However, the amount of information contained in BC sound is far less than that in the AC sound due to its limited frequency bandwidth. In this paper, an acoustic scene classification method for BC sound is proposed with a small BC dataset. Firstly, the prosodic features are combined with the spectral features to capture more information, and feature fusion is adopted. Secondly, in order to deal with the small BC dataset, transfer learning is used with a large AC dataset. Finally, a deep learning network based on local residual learning is proposed. The experimental results show that the proposed method achieves the superior performance over the reference models.","PeriodicalId":281095,"journal":{"name":"2022 5th International Conference on Information Communication and Signal Processing (ICICSP)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114321002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}