Safe importance sampling based on partial posteriors and neural variational approximations
F. Llorente, E. Curbelo, L. Martino, P. Olmos, D. Delgado
Pub Date: 2022-08-29 | DOI: 10.23919/eusipco55093.2022.9909576
In this work, we present two novel importance sampling (IS) methods that can be considered safe in the sense that they avoid catastrophic scenarios in which the IS estimators could have infinite variance. This is achieved by using a population of proposal densities, each wider than the posterior distribution. Specifically, we consider partial posterior distributions (i.e., posteriors conditioned on a smaller number of data) as proposal densities. Neural variational approximations are also discussed. The experimental results show the benefits of the proposed schemes.
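The safe-IS idea, making every proposal strictly wider than the target so the weight function stays bounded, can be sketched in a few lines. This is a generic illustration on a toy Gaussian "posterior", not the authors' partial-posterior construction:

```python
import numpy as np

# Toy self-normalized importance sampling with a deliberately wider proposal,
# so the weight function target/proposal is bounded and the estimator cannot
# have infinite variance. All densities here are illustrative stand-ins.
rng = np.random.default_rng(0)

def log_target(x):
    # toy "posterior": N(1, 0.5^2), up to an additive constant
    return -0.5 * ((x - 1.0) / 0.5) ** 2

def log_proposal(x, scale=2.0):
    # wider proposal: N(1, (0.5*scale)^2), mimicking a partial posterior
    return -0.5 * ((x - 1.0) / (0.5 * scale)) ** 2

scale = 2.0
x = rng.normal(1.0, 0.5 * scale, size=200_000)   # draw from the proposal
logw = log_target(x) - log_proposal(x, scale)    # unnormalized log-weights
w = np.exp(logw - logw.max())
w /= w.sum()                                     # self-normalization cancels constants
post_mean = np.sum(w * x)                        # IS estimate of E[x | data]
```

Because the proposal's tails dominate the target's, the largest weight is finite by construction; with a narrower-than-target proposal the same estimator could blow up.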
Estimating and Reproducing Ambience in Ambisonic Recordings
L. McCormack, A. Politis
Pub Date: 2022-08-29 | DOI: 10.23919/eusipco55093.2022.9909850
Spatial audio coding and reproduction methods are often based on the estimation of primary directional and secondary ambience components. This paper details a study into the estimation and subsequent reproduction of the ambient components found in Ambisonic sound scenes. More specifically, two ambience estimation approaches are investigated. The first estimates the ambient Ambisonic signals through a source-separation and spatial-subtraction approach, and therefore requires an estimate of both the number of sources and their directions. The second requires only the number of sources to be known, and employs a multi-channel Wiener filter (MWF) to obtain the estimated ambient signals. One approach for reproducing estimated ambient signals is a signal processing chain of plane-wave decomposition, signal decorrelation, and subsequent spatialisation for the target playback setup. However, this reproduction approach may be sensitive to spatial and signal fidelity degradations incurred during the beamforming and decorrelation operations. Therefore, an optimal-mixing alternative is proposed for this reproduction task, which achieves spatially incoherent rendering of ambience directly for the target playback setup, bypassing intermediate plane-wave decomposition and excessive decorrelation. Listening tests indicate improved perceived quality when using the proposed reproduction method in conjunction with both tested ambience estimation approaches.
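The MWF route to ambience estimation can be illustrated on synthetic multichannel data. The signals and covariance models below are toy stand-ins of my own, not the paper's Ambisonic pipeline:

```python
import numpy as np

# Multi-channel Wiener filter for the ambient component: given covariance
# models for the directional part (rank-1, one source) and the spatially
# incoherent ambience, the MWF estimate is  C_amb (C_dir + C_amb)^{-1} x.
rng = np.random.default_rng(1)
M = 4                                     # number of channels
a = rng.normal(size=(M, 1))
a = 2.0 * a / np.linalg.norm(a)           # steering vector of one source
s = rng.normal(size=(1, 1000))            # directional source signal
amb = 0.3 * rng.normal(size=(M, 1000))    # spatially incoherent ambience
x = a @ s + amb                           # observed mixture

C_dir = a @ a.T                           # directional covariance model
C_amb = 0.09 * np.eye(M)                  # ambience covariance model
W = C_amb @ np.linalg.inv(C_dir + C_amb)  # MWF for the ambient component
amb_hat = W @ x

err_mwf = np.mean((amb_hat - amb) ** 2)
err_raw = np.mean((x - amb) ** 2)         # doing nothing leaves the source in
```

The filter suppresses the directional subspace while passing the diffuse part, which is why only the number of sources (here, the rank of C_dir) needs to be known.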
Multiroom Speech Emotion Recognition
Erez Shalev, I. Cohen
Pub Date: 2022-08-29 | DOI: 10.23919/eusipco55093.2022.9909798
Automated audio systems, such as speech emotion recognition, can benefit from the ability to work from another room. No research has yet been conducted on the effectiveness of such systems when the sound source originates in a different room than the target system and the sound has to travel between the rooms through the wall. New advancements in room-impulse-response generators enable large-scale simulation of audio sources in adjacent rooms and their integration into a training dataset. Such a capability improves the performance of data-driven methods such as deep learning. This paper presents the first evaluation of multiroom speech emotion recognition systems. The isolation policies due to COVID-19 left many isolated individuals suffering emotional difficulties, for whom such capabilities would be very beneficial. We perform training, with and without an audio simulation generator, and compare the results of three different models on real data recorded in a real multiroom audio scene. We show that models trained without the new generator achieve poor results when presented with multiroom data. We then show that augmentation using the new generator improves the performance of all three models. Our results demonstrate the advantage of using such a generator. Furthermore, testing with two different deep learning architectures shows that the generator improves the results independently of the given architecture.
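The augmentation recipe, convolving clean utterances with generated impulse responses before training, can be sketched as follows. Here `simulate_rir` is a crude hypothetical stand-in for a real room-impulse-response generator:

```python
import numpy as np

# Data augmentation for "through-the-wall" acoustics: each clean training
# utterance is convolved with a simulated impulse response, so the model
# sees reverberant, attenuated versions of the source at training time.
rng = np.random.default_rng(2)

def simulate_rir(length=256, decay=0.01, rng=rng):
    # exponentially decaying noise tail as a placeholder for a proper
    # room-impulse-response generator
    t = np.arange(length)
    return rng.normal(size=length) * np.exp(-decay * t)

def augment(utterance, rir):
    y = np.convolve(utterance, rir)[: len(utterance)]  # keep original length
    return y / (np.max(np.abs(y)) + 1e-12)             # renormalize level

clean = rng.normal(size=16000)                         # 1 s of toy audio
augmented = augment(clean, simulate_rir())
```

In practice one would draw a fresh impulse response per utterance (or per epoch) so the network never sees the same acoustic condition twice.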
Multiple Sound Source Localization Based on Stochastic Modeling of Spatial Gradient Spectra
Natsuki Ueno, H. Kameoka
Pub Date: 2022-08-29 | DOI: 10.23919/eusipco55093.2022.9909524
We propose source localization methods for multiple sound sources. The proposed methods require only an observation of the sound pressure and its spatial gradient at one fixed point, which can be realized with a small microphone array. The key idea is to utilize the partial differential equation relating the observed signals to the source position, which was originally proposed for the direct method for the single-source localization problem. We extend this framework using stochastic modeling and propose a method for multiple-source localization in the presence of noise. Two source localization methods are proposed: one is an expectation-maximization algorithm for a given number of sources, and the other is variational Bayesian inference for an unknown number of sources. In numerical experiments, the localization accuracies of the two proposed methods are compared with a baseline method.
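As background for the pressure-plus-gradient idea: for a single plane wave p(r) = exp(-jk d·r), the quantity -Im{p* ∇p} equals k·d and so points along the propagation axis, meaning one observation point already determines a direction. A toy single-source check (my own illustration, not the paper's stochastic model):

```python
import numpy as np

# Direction from one point observation of pressure and its spatial gradient.
# For the plane wave p(r) = exp(-j k d.r):  grad p = -j k d p,  so
# -Im{ conj(p) * grad p } = k |p|^2 d, which recovers the unit direction d.
k = 2 * np.pi * 1000 / 343.0                 # wavenumber at 1 kHz, c = 343 m/s
doa = np.array([np.cos(0.7), np.sin(0.7)])   # true propagation direction (2-D)

p = 1.0 + 0.0j                               # pressure at the origin
grad_p = -1j * k * doa * p                   # its spatial gradient

intensity = -np.imag(np.conj(p) * grad_p)    # proportional to k * doa
doa_hat = intensity / np.linalg.norm(intensity)
```

With several simultaneous sources the observed gradient is a superposition of such terms, which is what motivates the stochastic (EM / variational Bayes) treatment rather than this direct division.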
Few-shot learning for E2E speech recognition: architectural variants for support set generation
Dhanya Eledath, Narasimha Rao Thurlapati, V. Pavithra, Tirthankar Banerjee, V. Ramasubramanian
Pub Date: 2022-08-29 | DOI: 10.23919/eusipco55093.2022.9909613
In this paper, we propose two architectural variants of our recent adaptation of the ‘few-shot learning’ (FSL) framework ‘Matching Networks’ (MN) to end-to-end (E2E) continuous speech recognition (CSR), in a formulation termed ‘MN-CTC’, which involves CTC-loss-based end-to-end episodic training of MN and an associated CTC-based decoding of continuous speech. An important component of the MN theory is the labelled support set used during training and inference. The architectural variants proposed and studied here for E2E CSR, namely the ‘Uncoupled MN-CTC’ and the ‘Coupled MN-CTC’, address the problem of generating supervised support sets from continuous speech. While the ‘Uncoupled MN-CTC’ generates the support sets ‘outside’ the MN architecture, the ‘Coupled MN-CTC’ variant is a derivative framework that generates the support set ‘within’ the MN architecture through a multi-task formulation coupling the support-set generation loss and the main MN-CTC loss, jointly optimizing the support sets and the embedding functions of MN. On the TIMIT and Librispeech datasets, we establish the few-shot effectiveness of the proposed variants via PER and LER performance, and also demonstrate the cross-domain applicability of the MN-CTC formulation: a Librispeech-trained ‘Coupled MN-CTC’ variant inferencing on the low-resource TIMIT target corpus yields an 8% (absolute) LER advantage over a single-domain (TIMIT-only) scenario.
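The Matching Networks step at the heart of MN-CTC, attention over a labelled support set, can be sketched generically for a single query embedding (the CTC coupling and embedding networks are omitted; all vectors below are made up):

```python
import numpy as np

# Matching-Networks-style classification: score a query embedding by an
# attention-weighted vote over labelled support-set embeddings, using
# cosine similarity followed by a softmax over the support items.
def matching_predict(query, support, labels, n_classes):
    sim = support @ query / (
        np.linalg.norm(support, axis=1) * np.linalg.norm(query) + 1e-12)
    att = np.exp(sim - sim.max())
    att /= att.sum()                      # softmax attention over support
    scores = np.zeros(n_classes)
    for a, y in zip(att, labels):
        scores[y] += a                    # accumulate attention per class
    return scores.argmax()

support = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])  # toy embeddings
labels = [0, 0, 1]                                        # their labels
pred = matching_predict(np.array([0.95, 0.05]), support, labels, 2)
```

The two proposed variants differ only in where `support` comes from: generated outside the network (Uncoupled) or jointly learned with the embeddings (Coupled).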
A Comprehensive Exploration of Noise Robustness and Noise Compensation in ResNet and TDNN-based Speaker Recognition Systems
Mohammad MohammadAmini, D. Matrouf, J. Bonastre, Sandipana Dowerah, R. Serizel, D. Jouvet
Pub Date: 2022-08-29 | DOI: 10.23919/eusipco55093.2022.9909726
In this paper, a comprehensive exploration of the noise robustness and noise compensation of ResNet and TDNN speaker recognition systems is presented. First, the robustness of the TDNN and ResNet systems in the presence of noise, reverberation, and both distortions together is explored. Our experimental results show that in all cases the ResNet system is more robust than the TDNN. Next, noise compensation is performed with a denoising autoencoder (DAE) on the x-vectors extracted from both systems. We explore two scenarios: 1) compensation of artificial noise with artificial data, and 2) compensation of real noise with artificial data. The second is the most desirable scenario, because it makes noise compensation affordable without requiring real data to train the denoising techniques. The experimental results show that in the first scenario noise compensation yields a significant improvement with the TDNN, while the improvement with the ResNet is not significant. In the second scenario, we achieve a 15% improvement in EER on the VoiCes Eval challenge with both the TDNN and ResNet systems. In most cases, the performance of the ResNet without compensation is superior to that of the TDNN with noise compensation.
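The compensation step can be illustrated with a linear least-squares map standing in for the neural DAE; the vectors below are synthetic stand-ins for x-vectors, not real embeddings:

```python
import numpy as np

# Toy x-vector compensation: learn an affine map from noisy embeddings to
# their clean counterparts on paired training data, then check that the
# mapped vectors are closer to the clean ones than the noisy inputs were.
rng = np.random.default_rng(3)
D, N = 32, 2000
clean = rng.normal(size=(N, D))                  # "clean" x-vectors
noisy = clean + 0.5 * rng.normal(size=(N, D)) + 0.2  # noise + bias

A = np.hstack([noisy, np.ones((N, 1))])          # affine features
W, *_ = np.linalg.lstsq(A, clean, rcond=None)    # fit the compensation map
denoised = A @ W

err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((denoised - clean) ** 2)
```

A DAE replaces the affine map with a nonlinear encoder-decoder, but the training signal is the same: pairs of distorted and clean embeddings, which is why artificially corrupted data suffices in the second scenario.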
Sparse-Aware Approach for Covariance Conversion in FDD Systems
C. López, J. Riba
Pub Date: 2022-08-29 | DOI: 10.23919/eusipco55093.2022.9909956
This paper proposes a practical way to solve the Uplink-Downlink Covariance Conversion (UDCC) problem in a Frequency Division Duplex (FDD) communication system. The UDCC problem consists of estimating the Downlink (DL) spatial covariance matrix from prior knowledge of the Uplink (UL) spatial covariance matrix, without the need for feedback transmission from the User Equipment (UE) to the Base Station (BS). Estimating the DL sample spatial covariance matrix directly is unfeasible in current massive Multiple-Input Multiple-Output (MIMO) deployments in frequency-selective or fast-fading channels, due to the large training overhead required. Our method applies sparse filtering ideas to the estimation of a quantized version of the so-called Angular Power Spectrum (APS), which is the common factor between the UL and DL spatial channel covariance matrices.
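The APS idea can be sketched on a toy uniform linear array: the angular power spectrum is frequency-independent, so estimating it from the UL covariance lets one rebuild the DL covariance at the other carrier. This simplified version uses plain least squares on a coarse angular grid in place of the paper's sparse-aware estimator:

```python
import numpy as np

# UL -> DL covariance conversion through a quantized angular power spectrum:
# R = sum_i p_i a(theta_i) a(theta_i)^H, with the same p at both carriers.
def steering(n_ant, spacing_wl, thetas):
    # ULA response; spacing_wl = element spacing in carrier wavelengths,
    # which scales with frequency for a fixed physical array
    n = np.arange(n_ant)[:, None]
    return np.exp(2j * np.pi * spacing_wl * n * np.sin(thetas)[None, :])

n_ant, n_grid = 8, 12
grid = np.linspace(-1.2, 1.2, n_grid)         # coarse quantized APS grid
A_ul = steering(n_ant, 0.50, grid)            # UL carrier
A_dl = steering(n_ant, 0.55, grid)            # same array at the DL carrier

p_true = np.zeros(n_grid)
p_true[[3, 8]] = [1.0, 0.5]                   # two active directions
R_ul = (A_ul * p_true) @ A_ul.conj().T        # observed UL covariance

# solve R_ul = sum_i p_i a_i a_i^H for the APS p (plain least squares here)
G = np.stack([np.outer(A_ul[:, i], A_ul[:, i].conj()).ravel()
              for i in range(n_grid)], axis=1)
p_hat, *_ = np.linalg.lstsq(G, R_ul.ravel(), rcond=None)
p_hat = np.maximum(p_hat.real, 0.0)           # power is non-negative

R_dl_hat = (A_dl * p_hat) @ A_dl.conj().T     # converted DL covariance
R_dl_true = (A_dl * p_true) @ A_dl.conj().T   # ground truth, for checking
```

On a fine grid this inverse problem becomes badly underdetermined, which is exactly where the sparsity of the APS has to be exploited.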
Gridless Joint Delay-DoA-Doppler Estimation Using OFDM Signals: A Multilevel Hankel Matrix Approach
Ziyu Zhou, Wei Dai
Pub Date: 2022-08-29 | DOI: 10.23919/eusipco55093.2022.9909935
This paper investigates the problem of joint estimation of delay, direction of arrival (DoA), and Doppler when an orthogonal frequency-division multiplexing (OFDM) signal is used for probing. A gridless approach is taken, in which the three parameters live on a continuous space rather than a discrete grid. A low-rank multilevel Hankel matrix is used to capture the underlying structure of the back-scattered signals. A convex optimization, termed Hankel nuclear norm minimization (HNNM), is developed for denoising and parameter estimation, and solved by the alternating direction method of multipliers (ADMM). Simulations demonstrate that HNNM is robust to noise and can go beyond the minimum separation bound required by atomic norm minimization, another gridless method.
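The structural fact HNNM exploits is that a signal composed of K complex exponentials yields a Hankel matrix of rank K, which is what makes nuclear-norm (low-rank) denoising meaningful. A minimal single-level check (the multilevel construction and the ADMM solver are omitted):

```python
import numpy as np

# A sum of K = 2 complex exponentials, arranged into a Hankel matrix,
# has rank exactly K regardless of how finely the frequencies fall
# between grid points -- the basis of gridless low-rank methods.
N = 64
n = np.arange(N)
x = np.exp(2j * np.pi * 0.11 * n) + 0.5 * np.exp(2j * np.pi * 0.31 * n)

L = 32                                           # Hankel pencil parameter
H = np.array([x[i:i + L] for i in range(N - L + 1)]).T  # H[a, b] = x[a + b]

rank = np.linalg.matrix_rank(H, tol=1e-8)
```

Minimizing the nuclear norm of such a (multilevel) Hankel matrix subject to data fidelity then denoises the signal while keeping the parameters on a continuum.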
Dynamic Graph Topology Learning with Non-Convex Penalties
Reza Mirzaeifard, Vinay Chakravarthi Gogineni, Naveen K. D. Venkategowda, Stefan Werner
Pub Date: 2022-08-29 | DOI: 10.23919/eusipco55093.2022.9909609
This paper presents a majorization-minimization-based framework for learning time-varying graphs from spatial-temporal measurements with non-convex penalties. The proposed approach infers time-varying graphs by using the log-likelihood function in conjunction with two non-convex regularizers. Using the log-likelihood function under a total positivity constraint, we can construct the Laplacian matrix from the off-diagonal elements of the precision matrix. Furthermore, we employ non-convex regularizer functions to constrain the changes in graph topology and the associated weight evolution to be sparse. The experimental results demonstrate that our proposed method outperforms state-of-the-art methods in both sparse and non-sparse situations.
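The Laplacian construction mentioned above can be shown on a toy precision matrix. This is a generic illustration of the attractive-GMRF convention (non-positive off-diagonal precision entries become edge weights), not the proposed MM solver:

```python
import numpy as np

# Under a total positivity (attractive GMRF) constraint, the off-diagonal
# precision entries are non-positive; negating them gives edge weights, and
# setting the degrees so rows sum to zero gives the combinatorial Laplacian.
Theta = np.array([[ 1.5, -0.5,  0.0],
                  [-0.5,  1.2, -0.7],
                  [ 0.0, -0.7,  0.9]])    # toy precision matrix

W = -Theta.copy()
np.fill_diagonal(W, 0.0)                  # weights = negated off-diagonals
W = np.maximum(W, 0.0)                    # enforce non-negative weights
L_graph = np.diag(W.sum(axis=1)) - W      # combinatorial graph Laplacian
```

In the time-varying setting, the non-convex penalties then act on the differences of successive `W` matrices, encouraging sparse changes in topology and weights.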
Transfer learning for human activity classification in multiple radar setups
Jérémy Fix, Israel David Hinostroza Sáenz, Chengfang Ren, G. Manfredi, T. Letertre
Pub Date: 2022-08-29 | DOI: 10.23919/eusipco55093.2022.9909851
Deep learning techniques require vast amounts of data for proper training. In human activity classification using radar signals, data acquisition can be very expensive and time-consuming, but radar databases are starting to become publicly available. In this work, we show that these available radar databases can be used to pretrain a neural network that then finishes its training on the final radar data, even though the radar configuration differs (geometry and carrier frequency).
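The pretrain-then-fine-tune recipe can be sketched with a logistic regression standing in for the neural network; the "source" and "target" sets below are synthetic, with the target distribution shifted to mimic a different radar configuration:

```python
import numpy as np

# Transfer learning in miniature: fit on a large source-domain dataset,
# then continue training the same weights on a small, shifted target set,
# instead of training on the target data from scratch.
rng = np.random.default_rng(4)

def train(X, y, w, lr=0.1, steps=300):
    # plain gradient descent on the logistic loss
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

w_star = rng.normal(size=5)                   # shared underlying concept
Xs = rng.normal(size=(1000, 5))               # large source-domain set
ys = (Xs @ w_star > 0).astype(float)
Xt = rng.normal(size=(20, 5)) + 0.3           # small, shifted target set
yt = (Xt @ w_star > 0).astype(float)

w_pre = train(Xs, ys, np.zeros(5))            # "pretraining"
w_ft = train(Xt, yt, w_pre.copy(), steps=50)  # fine-tune on target data
acc = np.mean(((Xt @ w_ft) > 0) == (yt > 0.5))
```

The warm start from `w_pre` is what lets the few target-domain samples suffice, mirroring the paper's use of public radar databases before adapting to the final setup.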