F. Carvalho, M. P. Sousa, J. V. S. Filho, J. S. Rocha, W. Lopes, M. Alencar
Cognitive radio is one of the most promising techniques in wireless communications, owing to its many applications. Cognitive networks can bring together different cognitive users via cooperative spectrum sensing. Examples of cognitive networks can be found in a range of important applications, such as digital television and wireless sensor networks. The objective of this paper is to analyze how signal processing techniques are used to provide reliable performance in such networks. Applications of signal processing in cognitive networks are presented and detailed.
{"title":"Signal processing applications for cognitive networks: State of the art","authors":"F. Carvalho, M. P. Sousa, J. V. S. Filho, J. S. Rocha, W. Lopes, M. Alencar","doi":"10.5281/ZENODO.54514","DOIUrl":"https://doi.org/10.5281/ZENODO.54514","url":null,"abstract":"Cognitive radio is one of the most promising techniques of wireless communications, due to its many applications. Cognitive networks have the capability to congregate different cognitive users via cooperative spectrum sensing. Examples of cognitive networks can be found in important and different applications, such as digital television and wireless sensor networks. The objective of this paper is to analyze how signal processing techniques are used to provide reliable performance in such networks. Applications of signal processing in cognitive networks are presented and detailed.","PeriodicalId":198408,"journal":{"name":"2014 22nd European Signal Processing Conference (EUSIPCO)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115705354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Active cancellation systems rely on destructive interference to reject unwanted disturbances entering the system of interest. Typical practical applications of this method employ a simple single-input, single-output arrangement. However, when a spatial wavefield (e.g. acoustic noise or vibration) needs to be controlled, multichannel active cancellation systems arise naturally. Among these, the so-called overdetermined control configuration, which employs more measurement outputs than control inputs, is often found to provide superior performance. The paper proposes an extension of a recently introduced control scheme, called the self-optimizing narrowband interference canceller (SONIC), to the overdetermined case. The extension employs a novel variant of the extremum-seeking adaptation loop which uses random, rather than sinusoidal, probing signals. This modification simplifies the design of the controller and improves its convergence. Simulations, performed using a realistic model of the plant, demonstrate improved properties of the new controller.
{"title":"Merging extremum seeking and self-optimizing narrowband interference canceller - overdetermined case","authors":"M. Meller","doi":"10.5281/ZENODO.43812","DOIUrl":"https://doi.org/10.5281/ZENODO.43812","url":null,"abstract":"Active cancellation systems rely on destructive interference to achieve rejection of unwanted disturbances entering the system of interest. Typical practical applications of this method employ a simple single input, single output arrangement. However, when a spatial wavefield (e.g. acoustic noise or vibration) needs to be controlled, multichannel active cancellation systems arise naturally. Among these, the so-called overdetermined control configuration, which employs more measurement outputs than control inputs, is often found to provide superior performance. The paper proposes an extension of the recently introduced control scheme, called self-optimizing narrowband interference canceller (SONIC), to the overdetermined case. The extension employs a novel variant of the extremum-seeking adaptation loop which uses random, rather than sinusoidal, probing signals. This modification simplifies design of the controller and improves its convergence. Simulations, performed using a realistic model of the plant, demonstrate improved properties of the new controller.","PeriodicalId":198408,"journal":{"name":"2014 22nd European Signal Processing Conference (EUSIPCO)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114409926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shahram Kalantari, David Dean, S. Sridharan, R. Wallace
This paper investigates the effect of topic dependent language models (TDLM) on phonetic spoken term detection (STD) using dynamic match lattice spotting (DMLS). Phonetic STD consists of two steps: indexing and search. The accuracy of indexing audio segments into phone sequences using phone recognition methods directly affects the accuracy of the final STD system. If the topic of a document is known, recognizing the spoken words and indexing them to an intermediate representation is an easier task, and consequently detecting a search word in it will be more accurate and robust. In this paper, we propose the use of TDLMs in the indexing stage to improve the accuracy of STD in situations where the topic of the audio document is known in advance. It is shown that using TDLMs instead of the traditional general language model (GLM) improves STD performance according to the figure of merit (FOM) criterion.
{"title":"Topic dependent language modelling for spoken term detection","authors":"Shahram Kalantari, David Dean, S. Sridharan, R. Wallace","doi":"10.5281/ZENODO.44201","DOIUrl":"https://doi.org/10.5281/ZENODO.44201","url":null,"abstract":"This paper investigates the effect of topic dependent language models (TDLM) on phonetic spoken term detection (STD) using dynamic match lattice spotting (DMLS). Phonetic STD consists of two steps: indexing and search. The accuracy of indexing audio segments into phone sequences using phone recognition methods directly affects the accuracy of the final STD system. If the topic of a document in known, recognizing the spoken words and indexing them to an intermediate representation is an easier task and consequently, detecting a search word in it will be more accurate and robust. In this paper, we propose the use of TDLMs in the indexing stage to improve the accuracy of STD in situations where the topic of the audio document is known in advance. It is shown that using TDLMs instead of the traditional general language model (GLM) improves STD performance according to figure of merit (FOM) criteria.","PeriodicalId":198408,"journal":{"name":"2014 22nd European Signal Processing Conference (EUSIPCO)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114765466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Smart rooms are used for a growing number of practical applications. They are often equipped with microphones and cameras allowing acoustic and visual tracking of persons. For that, the geometry of the sensors has to be calibrated. In this paper, a method is introduced that calibrates the microphone arrays by using the visual localization of a speaker at a small number of fixed positions. By matching the positions to the direction of arrival (DoA) estimates of the microphone arrays, their absolute position and orientation are derived. Data from a reverberant smart room is used to show that the proposed method can estimate the absolute geometry with about 0.1 m and 2° precision. The calibration is good enough for acoustic and multimodal tracking applications and eliminates the need for dedicated calibration measures by using the tracking data itself.
{"title":"Geometry calibration of distributed microphone arrays exploiting audio-visual correspondences","authors":"A. Plinge, G. Fink","doi":"10.5281/ZENODO.44001","DOIUrl":"https://doi.org/10.5281/ZENODO.44001","url":null,"abstract":"Smart rooms are used for a growing number of practical applications. They are often equipped with microphones and cameras allowing acoustic and visual tracking of persons. For that, the geometry of the sensors has to be calibrated. In this paper, a method is introduced that calibrates the microphone arrays by using the visual localization of a speaker at a small number of fixed positions. By matching the positions to the direction of arrival (DoA) estimates of the microphone arrays, their absolute position and orientation are derived. Data from a reverberant smart room is used to show that the proposed method can estimate the absolute geometry with about 0.1m and 2° precision. The calibration is good enough for acoustic and multi modal tracking applications and eliminates the need for dedicated calibration measures by using the tracking data itself.","PeriodicalId":198408,"journal":{"name":"2014 22nd European Signal Processing Conference (EUSIPCO)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117045344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper we propose a method for learning and recognizing human actions on dynamic binary volumetric (voxel-based) or 3D mesh movement data. The orientation of the human body in each 3D posture is estimated by detecting its feet, and this information is used to orient all postures in a consistent manner. K-means is applied to the 3D posture space of the training data to discover characteristic movement patterns, namely 3D dynemes. Subsequently, fuzzy vector quantization (FVQ) is used to represent each 3D posture in the 3D dyneme space, and information from all time instances is then combined to represent the entire action sequence. Linear discriminant analysis (LDA) is then applied. The actual classification step uses support vector machines (SVM). Results on a 3D action database verify that the method can achieve good performance.
{"title":"Human action recognition in 3D motion sequences","authors":"Konstantinos Kelgeorgiadis, N. Nikolaidis","doi":"10.5281/ZENODO.43769","DOIUrl":"https://doi.org/10.5281/ZENODO.43769","url":null,"abstract":"In this paper we propose a method for learning and recognizing human actions on dynamic binary volumetric (voxel-based) or 3D mesh movement data. The orientation of the human body in each 3D posture is estimated by detecting its feet and this information is used to orient all postures in a consistent manner. K-means is applied on the 3D postures space of the training data to discover characteristic movement patterns namely 3D dynemes. Subsequently, fuzzy vector quantization (FVQ) is utilized to represent each 3D posture in the 3D dynemes space and then information from all time instances is combined to represent the entire action sequence. Linear discriminant analysis (LDA) is then applied. The actual classification step utilizes support vector machines (SVM). Results on a 3D action database verified that the method can achieve good performance.","PeriodicalId":198408,"journal":{"name":"2014 22nd European Signal Processing Conference (EUSIPCO)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127204471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
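The fuzzy vector quantization step described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the membership formula, so this assumes the common fuzzy c-means form, in which the membership of a posture in dyneme k is proportional to ||x - c_k||^(-2/(m-1)). The function names, the fuzzification parameter `m`, and the toy 2D "postures" are hypothetical and chosen only for illustration; the K-means, LDA and SVM stages are omitted.

```python
def fuzzy_memberships(x, centres, m=2.0):
    """Fuzzy membership of vector x in each centre (assumed fuzzy c-means form)."""
    # Squared Euclidean distance from x to every centre.
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centres]
    # If x coincides with one or more centres, share full membership among them.
    if any(v == 0.0 for v in d2):
        hits = [1.0 if v == 0.0 else 0.0 for v in d2]
        total = sum(hits)
        return [h / total for h in hits]
    # (d^2)^(-1/(m-1)) equals d^(-2/(m-1)); normalise so memberships sum to 1.
    inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
    total = sum(inv)
    return [v / total for v in inv]


def action_vector(postures, centres, m=2.0):
    """Average the per-posture memberships over time to describe a whole sequence."""
    rows = [fuzzy_memberships(p, centres, m) for p in postures]
    return [sum(col) / len(rows) for col in zip(*rows)]


# Toy example: two hypothetical "dynemes" and a two-frame sequence.
centres = [(0.0, 0.0), (4.0, 0.0)]
u = fuzzy_memberships((1.0, 0.0), centres)
s = action_vector([(1.0, 0.0), (3.0, 0.0)], centres)
```

Averaging the memberships gives a fixed-length descriptor regardless of sequence duration, which is what allows a standard classifier to be applied afterwards.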
We propose a new matrix recovery framework to partition brain activity using time series of resting-state functional Magnetic Resonance Imaging (fMRI). Spatial clusters are obtained with a new low-rank factorization algorithm that offers the ability to add different types of constraints. As an example, we add a total-variation-type cost function in order to exploit neighborhood constraints. We first validate the performance of our algorithm on simulated data, which allows us to show that the neighborhood constraint improves the recovery in noisy or undersampled set-ups. Then we conduct experiments on real-world data, where we simulate an accelerated acquisition by randomly undersampling the time series. The obtained parcellations are reproducible when analysing data from different sets of individuals, and the estimation is robust to undersampling.
{"title":"A spatially constrained low-rank matrix factorization for the functional parcellation of the brain","authors":"Alexis Benichoux, T. Blumensath","doi":"10.5281/ZENODO.44129","DOIUrl":"https://doi.org/10.5281/ZENODO.44129","url":null,"abstract":"We propose a new matrix recovery framework to partition brain activity using time series of resting-state functional Magnetic Resonance Imaging (fMRI). Spatial clusters are obtained with a new low-rank factorization algorithm that offers the ability to add different types of constraints. As an example we add a total variation type cost function in order to exploit neighborhood constraints. We first validate the performance of our algorithm on simulated data, which allows us to show that the neighborhood constraint improves the recovery in noisy or undersampled set-ups. Then we conduct experiments on real-world data, where we simulated an accelerated acquisition by randomly undersampling the time series. The obtained parcellation are reproducible when analysing data from different sets of individuals, and the estimation is robust to undersampling.","PeriodicalId":198408,"journal":{"name":"2014 22nd European Signal Processing Conference (EUSIPCO)","volume":"2011 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127357768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive filters in noise control applications have to approximate the primary path and compensate for the secondary path. This work shows that the primary- and secondary-path variations of noise control headphones depend above all on the direction of incident noise and the tightness of the ear-cups. Both kinds of variation are investigated by preliminary measurements, and it is further shown that the measured variations can be approximated by a linear combination of only a few prototype filters. Thus, a parallel adaptive linear combiner is suggested instead of the typical adaptive transversal filter. Theoretical considerations and experimental results reveal that the parallel structure performs equally well, converges even faster, and requires fewer adaptation weights.
{"title":"Least-mean-square weighted parallel IIR filters in active-noise-control headphones","authors":"Markus Guldenschuh","doi":"10.5281/ZENODO.43764","DOIUrl":"https://doi.org/10.5281/ZENODO.43764","url":null,"abstract":"Adaptive filters in noise control applications have to approximate the primary path and compensate for the secondary-path. This work shows that the primary- and secondary-path variations of noise control headphones depend above all on the direction of incident noise and the tightness of the ear-cups. Both kind of variations are investigated by preliminary measurements, and it is further shown that the measured variations can be approximated with the linear combination of only a few prototype filters. Thus, a parallel adaptive linear combiner is suggested instead of the typical adaptive transversal-filter. Theoretical considerations and experimental results reveal that the parallel structure performs equally well, converges even faster, and requires fewer adaptation weights.","PeriodicalId":198408,"journal":{"name":"2014 22nd European Signal Processing Conference (EUSIPCO)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132899607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
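The parallel adaptive linear combiner described in the abstract above can be sketched as follows: the input is passed through a few fixed prototype filters, and only one scalar weight per branch is adapted with the LMS rule. This is a simplified illustration, not the author's implementation. The prototype filters here are short hypothetical FIR filters (the paper title refers to IIR prototypes), the secondary path is ignored, and the step size `mu` and the "true" weights are made-up values.

```python
import random


def fir(h, x):
    """Filter signal x with the FIR impulse response h (zero initial state)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def lms_combiner(branches, d, mu):
    """Adapt one weight per branch so the weighted branch sum tracks d."""
    w = [0.0] * len(branches)
    for n in range(len(d)):
        u = [b[n] for b in branches]                       # branch outputs at time n
        e = d[n] - sum(wk * uk for wk, uk in zip(w, u))    # a-priori error
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]     # LMS weight update
    return w


random.seed(0)
x = [random.uniform(-1.0, 1.0) for _ in range(3000)]

# Hypothetical prototype filters and the (unknown) weights that mix them.
prototypes = [[1.0, 0.5], [0.0, 1.0, -0.5]]
true_w = [0.8, -0.3]

branches = [fir(h, x) for h in prototypes]
d = [sum(wk * b[n] for wk, b in zip(true_w, branches)) for n in range(len(x))]

w = lms_combiner(branches, d, mu=0.1)
```

In this toy set-up only two weights are adapted, whereas a transversal filter modelling the same path directly would need one weight per tap; that is the saving in adaptation weights the abstract refers to.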
J. A. Belloch, Julian Parker, L. Savioja, Alberto González, V. Välimäki
Maximising the loudness of audio signals by restricting their dynamic range has become an important issue in audio signal processing. Previous work indicates that an allpass filter chain can reduce the peak amplitude of an audio signal without introducing the distortion associated with traditional non-linear techniques. Because of the large search space and the resulting computational demand, the previous work selected the delay-line lengths randomly and fixed the filter coefficient values. In this work, we run multiple allpass filter chains in parallel on a GPU accelerator, covering all relevant delay-line lengths and performing a wide search over possible coefficient values in order to get closer to the optimal choice. Our most exhaustive method, which tests about 29 million parameter combinations, reduced the amplitude of test signals by 23% to 31%, whereas the previous work could only achieve a reduction of 23% at best.
{"title":"Dynamic range reduction of audio signals using multiple allpass filters on a GPU accelerator","authors":"J. A. Belloch, Julian Parker, L. Savioja, Alberto González, V. Välimäki","doi":"10.5281/ZENODO.43816","DOIUrl":"https://doi.org/10.5281/ZENODO.43816","url":null,"abstract":"Maximising loudness of audio signals by restricting their dynamic range has become an important issue in audio signal processing. Previous works indicate that an allpass filter chain can reduce the peak amplitude of an audio signal, without introducing the distortion associated with traditional non-linear techniques. Because of large search space and the consequential demand of the computational needs, the previous work selected randomly the delay-line lengths and fixed the filter coefficient values. In this work, we run on a GPU accelerator multiple allpass filter chains in parallel that cover all relevant delay-line lengths and perform a wide search on possible coefficient values in order to get closer to the optimal choice. Our most exhaustive method, which tests about 29 million parameter combinations, reduced the amplitude of test signals by 23% to 31%, whereas the previous work could only achieve a reduction of 23% at best.","PeriodicalId":198408,"journal":{"name":"2014 22nd European Signal Processing Conference (EUSIPCO)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130407482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
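The idea of searching allpass parameters for the lowest peak amplitude, as described in the abstract above, can be sketched on a very small scale. The abstract does not specify the filter structure, so this assumes a single Schroeder-type allpass section, y[n] = -g·x[n] + x[n-M] + g·y[n-M], and a tiny sequential grid search in place of the paper's chained filters and GPU-parallel search. The function names, the test signal, and the candidate grids are hypothetical.

```python
def schroeder_allpass(x, delay, g):
    """One allpass section: y[n] = -g*x[n] + x[n-delay] + g*y[n-delay]."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        x_del = x[n - delay] if n >= delay else 0.0
        y_del = y[n - delay] if n >= delay else 0.0
        y[n] = -g * x[n] + x_del + g * y_del
    return y


def peak(x):
    """Peak absolute amplitude of a signal."""
    return max(abs(v) for v in x)


def search_allpass(x, delays, gains):
    """Try every (delay, gain) pair and keep the one with the lowest peak.

    Returns (best_peak, best_delay, best_gain); the delay and gain are None
    if no candidate improves on the unfiltered signal.
    """
    best = (peak(x), None, None)
    for d in delays:
        for g in gains:
            p = peak(schroeder_allpass(x, d, g))
            if p < best[0]:
                best = (p, d, g)
    return best


# A unit impulse is the most "peaky" test signal of a given energy.
x = [1.0] + [0.0] * 63
best_peak, best_delay, best_gain = search_allpass(x, [1, 2, 3], [0.3, 0.5, 0.7])
y = schroeder_allpass(x, best_delay, best_gain)
```

Because an allpass filter changes only the phase, the energy of the signal is preserved while the peak is spread out in time, which is why the peak can fall without the distortion of a non-linear limiter. Each (delay, gain) candidate is independent of the others, which makes the search straightforward to parallelise.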
G. Cao, I. Ahmad, Honglei Zhang, Weiyi Xie, M. Gabbouj
We propose a distributed learning-to-rank method and demonstrate its effectiveness in web-scale image retrieval. With the increasing amount of data, it is not feasible to train a centralized ranking model for large-scale learning problems. In distributed learning, the discrepancy between the training subsets and the whole data set when building the models is non-trivial but has been overlooked in previous work. In this paper, we first add a cost factor to boosting algorithms to balance the individual models toward the whole data. Then, we propose to decompose the original algorithm into multiple layers, whose aggregation forms a superior ranker that can be easily scaled up to billions of images. Extensive experiments show that the proposed method outperforms the straightforward aggregation of boosting algorithms.
{"title":"Balance learning to rank in big data","authors":"G. Cao, I. Ahmad, Honglei Zhang, Weiyi Xie, M. Gabbouj","doi":"10.5281/ZENODO.44026","DOIUrl":"https://doi.org/10.5281/ZENODO.44026","url":null,"abstract":"We propose a distributed learning to rank method, and demonstrate its effectiveness in web-scale image retrieval. With the increasing amount of data, it is not applicable to train a centralized ranking model for any large scale learning problems. In distributed learning, the discrepancy between the training subsets and the whole when building the models are non-trivial but overlooked in the previous work. In this paper, we firstly include a cost factor to boosting algorithms to balance the individual models toward the whole data. Then, we propose to decompose the original algorithm to multiple layers, and their aggregation forms a superior ranker which can be easily scaled up to billions of images. The extensive experiments show the proposed method outperforms the straightforward aggregation of boosting algorithms.","PeriodicalId":198408,"journal":{"name":"2014 22nd European Signal Processing Conference (EUSIPCO)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131668660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Head-related transfer functions (HRTFs) describe the acoustic filtering of incoming sounds by the human morphology. We propose three algorithms for representing HRTFs in subbands, i.e., as an analysis filterbank (FB) followed by a transfer matrix and a synthesis FB. These algorithms can be combined to achieve different design objectives. In the first algorithm, the choice of FBs is fixed, and a sparse approximation procedure minimizes the complexity of the transfer matrix associated with each HRTF. The other two algorithms jointly optimize the FBs and transfer matrices. The first variant aims at minimizing the complexity of the transfer matrices, while the second minimizes the complexity of the FBs. Numerical experiments show that the proposed methods offer significant computational savings when compared with other available approaches.
{"title":"Efficient representation of head-related transfer functions in subbands","authors":"D. Marelli, Robert Baumgartner, P. Majdak","doi":"10.5281/ZENODO.43899","DOIUrl":"https://doi.org/10.5281/ZENODO.43899","url":null,"abstract":"Head-related transfer functions (HRTFs) describe the acoustic filtering of incoming sounds by the human morphology. We propose three algorithms for representing HRTFs in subbands, i.e., as an analysis filterbank (FB) followed by a transfer matrix and a synthesis FB. These algorithms can be combined to achieve different design objectives. In the first algorithm, the choice of FBs is fixed, and a sparse approximation procedure minimizes the complexity of the transfer matrix associated to each HRTF. The other two algorithms jointly optimize the FBs and transfer matrices. The first variant aims at minimizing the complexity of the transfer matrices, while the second one does it for the FBs. Numerical experiments show that the proposed methods offer significant computational savings when compared with other available approaches.","PeriodicalId":198408,"journal":{"name":"2014 22nd European Signal Processing Conference (EUSIPCO)","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126207832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}