Pub Date: 2017-09-01 | DOI: 10.1109/MLSP.2017.8168196
A. Ozerov, Srdan Kitic, P. Pérez
We consider example-guided audio source separation approaches, where the audio mixture to be separated is supplied with source examples that are assumed to match the sources in the mixture in both frequency and time. These approaches have been successfully applied to tasks such as source separation by humming, score-informed music source separation, and music source separation guided by covers. Most of the proposed methods are based on nonnegative matrix factorization (NMF) and its variants, including methods that use NMF models pre-trained from examples as an initialization of the mixture NMF decomposition, methods that use those models as hyperparameters of priors for the mixture NMF decomposition, and methods based on coupled NMF models. Moreover, these methods differ in the choice of the NMF divergence and the NMF prior. However, there has been no systematic comparison of all these methods. In this work, we compare existing methods and some new variants on the score-informed and cover-guided source separation tasks.
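As a hedged illustration of the first family of methods this abstract mentions (NMF models pre-trained from examples used to initialize the mixture decomposition), here is a minimal numpy sketch. The KL multiplicative updates are standard, but the toy data, ranks, and function names are illustrative, not the authors' setup.

```python
import numpy as np

def nmf_kl(V, W, H, n_iter=100, update_W=True):
    """Multiplicative-update NMF minimizing the KL divergence D(V || WH)."""
    eps = 1e-9
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        if update_W:
            WH = W @ H + eps
            W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

rng = np.random.default_rng(0)
# Toy "spectrograms": two example sources and a mixture of matching excerpts.
ex1, ex2 = rng.random((20, 30)), rng.random((20, 30))
mix = ex1[:, :15] + ex2[:, :15]

# Pre-train one basis matrix per example.
W1, _ = nmf_kl(ex1, rng.random((20, 4)), rng.random((4, 30)))
W2, _ = nmf_kl(ex2, rng.random((20, 4)), rng.random((4, 30)))

# Concatenated example bases initialize (and here: fix) the mixture decomposition.
W = np.concatenate([W1, W2], axis=1)
_, H = nmf_kl(mix, W, rng.random((8, 15)), update_W=False)
est1 = W1 @ H[:4]  # estimated magnitude contribution of source 1
```

Keeping `W` fixed corresponds to the "pre-trained model as initialization" variant; letting it adapt, or penalizing its deviation from the example bases, gives the prior-based variants the abstract contrasts.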
{"title":"A comparative study of example-guided audio source separation approaches based on nonnegative matrix factorization","authors":"A. Ozerov, Srdan Kitic, P. Pérez","doi":"10.1109/MLSP.2017.8168196","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168196","url":null,"abstract":"We consider example-guided audio source separation approaches, where the audio mixture to be separated is supplied with source examples that are assumed matching the sources in the mixture both in frequency and time. These approaches were successfully applied to the tasks such as source separation by humming, score-informed music source separation, and music source separation guided by covers. Most of proposed methods are based on nonnegative matrix factorization (NMF) and its variants, including methods using NMF models pre-trained from examples as an initialization of mixture NMF decomposition, methods using those models as hyperparameters of priors of mixture NMF decomposition, and methods using coupled NMF models. Moreover, those methods differ by the choice of the NMF divergence and the NMF prior. However, there is no systematic comparison of all these methods. In this work, we compare existing methods and some new variants on the score-informed and cover-guided source separation tasks.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"18 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81634470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-09-01 | DOI: 10.1109/MLSP.2017.8168111
Subhajit Chaudhury, Sakyasingha Dasgupta, Asim Munawar, Md. A. Salam Khan, Ryuki Tachibana
We present a conditional generative method that maps low-dimensional embeddings of images and natural language to a common latent space, thereby extracting semantic relationships between them. The embedding specific to each modality is first extracted, and a constrained optimization procedure is then performed to project the two embedding spaces onto a common manifold. Based on this, we present a method to learn the conditional probability distribution of the two embedding spaces: first, by mapping them to a shared latent space and generating back the individual embeddings from this common space. However, to enable independent conditional inference, i.e., separately extracting the corresponding embeddings from the common latent representation, we deploy a proxy variable trick wherein the single shared latent space is replaced by two separate latent spaces. We design an objective function such that, during training, we can force these separate spaces to lie close to each other by minimizing the Euclidean distance between their distribution functions. Experimental results demonstrate that the learned joint model can generalize to learning concepts of double MNIST digits with additional color attributes, thereby enabling the generation of specific colored images from the corresponding text data.
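A minimal numpy sketch of the Euclidean alignment idea in this abstract: two linear encoders (stand-ins for the paper's embedding networks) are driven so that their latent projections of paired image/text embeddings coincide. The data, dimensions, and the omission of the reconstruction terms are simplifications for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Paired toy embeddings: X (image) and T (text) describe the same concepts.
X = rng.normal(size=(200, 16))
T = X @ rng.normal(size=(16, 12)) + 0.1 * rng.normal(size=(200, 12))

# Two linear encoders into separate latent spaces (the "proxy" spaces).
d = 8
Wx = 0.1 * rng.normal(size=(16, d))
Wt = 0.1 * rng.normal(size=(12, d))

def alignment_loss(Wx, Wt):
    """Mean squared Euclidean distance between the two latent projections."""
    return np.mean((X @ Wx - T @ Wt) ** 2)

init_loss = alignment_loss(Wx, Wt)
# Gradient descent on the alignment term alone; the paper additionally keeps
# per-modality reconstruction terms, which prevent trivial collapse to zero.
lr = 0.01
for _ in range(300):
    G = 2 * (X @ Wx - T @ Wt) / (200 * d)
    Wx -= lr * (X.T @ G)
    Wt += lr * (T.T @ G)
final_loss = alignment_loss(Wx, Wt)
```

After training, text can be encoded with `Wt` and decoded from the image-side latent space, which is the cross-modal generation path the abstract describes.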
{"title":"Text to image generative model using constrained embedding space mapping","authors":"Subhajit Chaudhury, Sakyasingha Dasgupta, Asim Munawar, Md. A. Salam Khan, Ryuki Tachibana","doi":"10.1109/MLSP.2017.8168111","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168111","url":null,"abstract":"We present a conditional generative method that maps low-dimensional embeddings of image and natural language to a common latent space hence extracting semantic relationships between them. The embedding specific to a modality is first extracted and subsequently a constrained optimization procedure is performed to project the two embedding spaces to a common manifold. Based on this, we present a method to learn the conditional probability distribution of the two embedding spaces; first, by mapping them to a shared latent space and generating back the individual embeddings from this common space. However, in order to enable independent conditional inference for separately extracting the corresponding embeddings from the common latent space representation, we deploy a proxy variable trick — wherein, the single shared latent space is replaced by two separate latent spaces. We design an objective function, such that, during training we can force these separate spaces to lie close to each other, by minimizing the Euclidean distance between their distribution functions. Experimental results demonstrate that the learned joint model can generalize to learning concepts of double MNIST digits with additional attributes of colors, thereby enabling the generation of specific colored images from the respective text data.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"62 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85222872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-09-01 | DOI: 10.1109/MLSP.2017.8168173
F. Huang, P. Balázs
This paper presents an automatic approach to parameter training for a previously published sparsity-based pitch estimation method. For this pitch estimation method, the harmonic dictionary is a key parameter that must be carefully prepared beforehand. In the original method, extensive human supervision and involvement are required to construct and label the dictionary. In this study, we propose to employ dictionary learning algorithms to learn the dictionary directly from training data. We apply and compare three typical dictionary learning algorithms, namely the method of optimized directions (MOD), K-SVD, and online dictionary learning (ODL), and propose a post-processing method to label and adapt a learned dictionary for pitch estimation. Results show that MOD and properly initialized ODL (pi-ODL) can yield dictionaries that exhibit the desired harmonic structures for pitch estimation, and that the post-processing method can significantly improve the performance of the learned dictionaries. The dictionary obtained with pi-ODL and post-processing attained pitch estimation accuracy close to the optimal performance of the manual dictionary. These results show that dictionary learning is feasible and promising for this application.
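To make the dictionary-learning step concrete, here is a hedged numpy sketch of MOD, one of the three algorithms compared: alternate a sparse coding step (here a deliberately crude top-k rule) with the closed-form least-squares dictionary update D = XC^T(CC^T)^{-1}. Sizes and the coding rule are illustrative, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(2)

def sparse_code(X, D, k=3):
    """Crude sparse coding: keep the k largest-magnitude coefficients per signal."""
    C = np.linalg.pinv(D) @ X
    drop = np.argsort(-np.abs(C), axis=0)[k:]
    np.put_along_axis(C, drop, 0.0, axis=0)
    return C

def mod(X, n_atoms=8, n_iter=20):
    """Method of optimized directions (MOD)."""
    D = rng.normal(size=(X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        C = sparse_code(X, D)
        # Closed-form least-squares dictionary update: D = X C^T (C C^T)^{-1}.
        D = X @ C.T @ np.linalg.pinv(C @ C.T)
        # Renormalize atoms, rescaling codes so the product D @ C is unchanged.
        norms = np.linalg.norm(D, axis=0) + 1e-12
        D /= norms
        C *= norms[:, None]
    return D, C

X = rng.normal(size=(16, 100))
D, C = mod(X)
rel_err = np.linalg.norm(X - D @ C) / np.linalg.norm(X)
```

For pitch estimation, the training matrix `X` would hold speech spectra, and the post-processing step described in the abstract would then label each learned atom with its fundamental frequency.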
{"title":"Dictionary learning for pitch estimation in speech signals","authors":"F. Huang, P. Balázs","doi":"10.1109/MLSP.2017.8168173","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168173","url":null,"abstract":"This paper presents an automatic approach for parameter training for a sparsity-based pitch estimation method that has been previously published. For this pitch estimation method, the harmonic dictionary is a key parameter that needs to be carefully prepared beforehand. In the original method, extensive human supervision and involvement are required to construct and label the dictionary. In this study, we propose to employ dictionary learning algorithms to learn the dictionary directly from training data. We apply and compare 3 typical dictionary learning algorithms, i.e., the method of optimized directions (MOD), K-SVD and online dictionary learning (ODL), and propose a post-processing method to label and adapt a learned dictionary for pitch estimation. Results show that MOD and properly initialized ODL (pi-ODL) can lead to dictionaries that exhibit the desired harmonic structures for pitch estimation, and the post-processing method can significantly improve performance of the learned dictionaries in pitch estimation. The dictionary obtained with pi-ODL and post-processing attained pitch estimation accuracy close to the optimal performance of the manual dictionary. It is positively shown that dictionary learning is feasible and promising for this application.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"49 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88947433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-09-01 | DOI: 10.1109/MLSP.2017.8168147
Kan Li, Ying Ma, J. Príncipe
In this paper, we propose a novel approach to automatically identifying plant species using the dynamics of plant growth and development, i.e., a spatiotemporal evolution model (STEM). The online kernel adaptive autoregressive-moving-average (KAARMA) algorithm, a discrete-time dynamical system in a reproducing kernel Hilbert space (RKHS), is used to learn plant-development syntactic patterns from feature-vector sequences automatically extracted from 2D plant images generated by stochastic L-systems. Results show that the multiclass KAARMA STEM can automatically identify plant species based on growth patterns. Furthermore, finite state machines extracted from the trained KAARMA STEM retain competitive performance and are robust to noise. Automatically constructing an L-system or formal grammar to replicate a spatiotemporal structure is an open problem; this work is an important first step not only toward identifying plants but also toward generating realistic plant models automatically from observations.
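Since the plant images here are generated by stochastic L-systems, a minimal Python sketch of such a grammar may help. The axiom, rules, and probabilities below are illustrative (a stochastic variant of a classic branching grammar), not the paper's actual models.

```python
import random

def l_system(axiom, rules, n_steps, rng):
    """Expand an axiom with stochastic production rules; each symbol maps to a
    list of (probability, replacement) pairs."""
    s = axiom
    for _ in range(n_steps):
        out = []
        for ch in s:
            if ch in rules:
                probs, reps = zip(*rules[ch])
                out.append(rng.choices(reps, weights=probs, k=1)[0])
            else:
                out.append(ch)
        s = "".join(out)
    return s

# F draws one of two branching patterns at random; "[" / "]" push and pop the
# turtle state, "+" / "-" turn, when the string is rendered as a 2D image.
rules = {"F": [(0.5, "F[+F]F[-F]F"), (0.5, "F[+F]F")]}
rng = random.Random(0)
plant = l_system("F", rules, 3, rng)
```

Each derived string, rendered by turtle graphics, yields one plant image; the per-species rule sets induce the growth-pattern statistics that the KAARMA classifier learns.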
{"title":"Automatic plant identification using stem automata","authors":"Kan Li, Ying Ma, J. Príncipe","doi":"10.1109/MLSP.2017.8168147","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168147","url":null,"abstract":"In this paper, we propose a novel approach to automatically identify plant species using dynamics of plant growth and development or spatiotemporal evolution model (STEM). The online kernel adaptive autoregressive-moving-average (KAARMA) algorithm, a discrete-time dynamical system in the kernel reproducing Hilbert space (RKHS), is used to learn plant-development syntactic patterns from feature-vector sequences automatically extracted from 2D plant images, generated by stochastic L-systems. Results show multiclass KAARMA STEM can automatically identify plant species based on growth patterns. Furthermore, finite state machines extracted from trained KAARMA STEM retains competitive performance and are robust to noise. Automatically constructing an L-system or formal grammar to replicate a spatiotemporal structure is an open problem. This is an important first step to not only identify plants but also to generate realistic plant models automatically from observations.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"21 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78059955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-09-01 | DOI: 10.1109/MLSP.2017.8168115
E. Nadimi, J. Herp, M. M. Buijs, V. Blanes-Vidal
We studied the problem of classifying textured materials from their appearance in a single image, under general viewing and illumination conditions, using the theory of random matrices. To evaluate the performance of our algorithm, two distinct image databases were used: the CUReT database and our database of colorectal polyp images collected from patients undergoing colon capsule endoscopy for early cancer detection. During the learning stage, our classifier established the universality laws for the empirical spectral density of the largest singular value and the normalized largest singular value of the image intensity matrix, adapted to the eigenvalues of the information-plus-noise model. We showed that these two densities converge to the generalized extreme value (GEV-Fréchet) and Gaussian G1 distributions, respectively, with rate O(N^{1/2}). To validate the algorithm, we presented a set of unseen images to it. A misclassification rate of approximately 1%–6%, depending on the database, was obtained, which is superior to the values of 5%–45% reported in previous studies.
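The key statistic in this abstract is the largest singular value of the image intensity matrix. As a hedged illustration (not the authors' GEV/Gaussian universality machinery), the numpy sketch below samples that statistic over random patches of two synthetic textures and shows that its empirical distribution already separates them.

```python
import numpy as np

rng = np.random.default_rng(3)

def largest_sv_samples(image, patch=8, n=200):
    """Sample the largest singular value of random patches of an intensity matrix."""
    H, W = image.shape
    out = np.empty(n)
    for i in range(n):
        r = rng.integers(0, H - patch)
        c = rng.integers(0, W - patch)
        out[i] = np.linalg.svd(image[r:r + patch, c:c + patch],
                               compute_uv=False)[0]
    return out

# Two synthetic "textures" with the same mean but different noise levels,
# i.e., different information-plus-noise structure.
tex_a = rng.normal(0.5, 0.05, size=(64, 64))
tex_b = rng.normal(0.5, 0.20, size=(64, 64))

sv_a = largest_sv_samples(tex_a)
sv_b = largest_sv_samples(tex_b)
# A decision rule on the largest-SV statistic separates the two textures.
```

The paper goes further by characterizing the limiting law (GEV-Fréchet / Gaussian) of these samples and classifying against the fitted densities rather than raw statistics.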
{"title":"Texture classification from single uncalibrated images: Random matrix theory approach","authors":"E. Nadimi, J. Herp, M. M. Buijs, V. Blanes-Vidal","doi":"10.1109/MLSP.2017.8168115","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168115","url":null,"abstract":"We studied the problem of classifying textured-materials from their single-imaged appearance, under general viewing and illumination conditions, using the theory of random matrices. To evaluate the performance of our algorithm, two distinct databases of images were used: The CUReT database and our database of colorectal polyp images collected from patients undergoing colon capsule endoscopy for early cancer detection. During the learning stage, our classifier algorithm established the universality laws for the empirical spectral density of the largest singular value and normalized largest singular value of the image intensity matrix adapted to the eigenvalues of the information-plus-noise model. We showed that these two densities converge to the generalized extreme value (GEV-Frechet) and Gaussian G1 distribution with rate O(N1/2), respectively. To validate the algorithm, we introduced a set of unseen images to the algorithm. Misclassification rate of approximately 1%–6%, depending on the database, was obtained, which is superior to the reported values of 5%–45% in previous research studies.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"34 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87945035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-09-01 | DOI: 10.1109/MLSP.2017.8168132
J. Dong, Badong Chen, N. Lu, Haixian Wang, Nanning Zheng
Common spatial patterns (CSP) is a widely used method in the field of electroencephalogram (EEG) signal processing. The goal of CSP is to find spatial filters that maximize the ratio between the variances of two classes. Conventional CSP is, however, sensitive to outliers because it is based on the L2-norm. Inspired by the correntropy induced metric (CIM), we propose in this work a new algorithm, called CIM-based CSP (CSP-CIM), to improve the robustness of CSP to outliers. CSP-CIM searches for the optimal solution with a simple gradient-based iterative algorithm. A toy example and a real EEG dataset are used to demonstrate the desirable performance of the new method.
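For reference, the conventional L2 CSP baseline that CSP-CIM makes robust can be sketched in a few lines of numpy via a whitening-plus-eigendecomposition solution; the two-class toy data below are illustrative, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(4)

def csp_filters(X1, X2):
    """Classical L2 CSP: generalized eigenvectors of the two class covariances.
    X1, X2: (trials, channels, samples). Returns filters, rows sorted by
    decreasing class-1 variance ratio."""
    C1 = np.mean([x @ x.T / np.trace(x @ x.T) for x in X1], axis=0)
    C2 = np.mean([x @ x.T / np.trace(x @ x.T) for x in X2], axis=0)
    # Whiten the composite covariance, then diagonalize class 1 in that space.
    evals, evecs = np.linalg.eigh(C1 + C2)
    P = evecs @ np.diag(evals ** -0.5) @ evecs.T
    d, B = np.linalg.eigh(P @ C1 @ P.T)
    return (B.T @ P)[np.argsort(d)[::-1]]

# Toy two-class EEG: class 1 has extra variance on channel 0.
X1 = rng.normal(size=(30, 4, 100)); X1[:, 0] *= 3.0
X2 = rng.normal(size=(30, 4, 100))
W = csp_filters(X1, X2)
w = W[0]  # filter maximizing class-1 variance relative to class 2
v1 = np.mean([np.var(w @ x) for x in X1])
v2 = np.mean([np.var(w @ x) for x in X2])
```

CSP-CIM keeps this objective but replaces the L2-norm variance terms with the correntropy induced metric and optimizes by gradient iterations, which is what suppresses the influence of outlier trials.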
{"title":"Correntropy induced metric based common spatial patterns","authors":"J. Dong, Badong Chen, N. Lu, Haixian Wang, Nanning Zheng","doi":"10.1109/MLSP.2017.8168132","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168132","url":null,"abstract":"Common spatial patterns (CSP) is a widely used method in the field of electroencephalogram (EEG) signal processing. The goal of CSP is to find spatial filters that maximize the ratio between the variances of two classes. The conventional CSP is however sensitive to outliers because it is based on the L2-norm. Inspired by the correntropy induced metric (CIM), we propose in this work a new algorithm, called CIM based CSP (CSP-CIM), to improve the robustness of CSP with respect to outliers. The CSP-CIM searches the optimal solution by a simple gradient based iterative algorithm. A toy example and a real EEG dataset are used to demonstrate the desirable performance of the new method.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"14 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81882011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-09-01 | DOI: 10.1109/MLSP.2017.8168168
J. Stuchi, M. A. Angeloni, R. F. Pereira, L. Boccato, G. Folego, Paulo V. S. Prado, R. Attux
Machine learning has seen increasingly widespread use in recent years. Great improvements, especially in deep neural networks, have helped boost the achievable performance in computer vision and signal processing applications. Although many techniques have been applied in deep architectures, the frequency domain has not been thoroughly explored in this field. In this context, this paper presents a new method for extracting discriminative features based on Fourier analysis. The proposed frequency extractor layer can be combined with deep architectures to improve image classification. Computational experiments were performed on the face liveness detection problem, yielding better results than those reported in the literature for the grandtest protocol of the Replay-Attack database. This paper also aims to open the discussion on how frequency domain layers can be used in deep architectures to further improve network performance.
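A hedged sketch of what a frequency-domain feature layer can look like: energies of concentric bands of the centered 2D Fourier spectrum. The ring-energy design below is an illustrative stand-in, not the paper's exact layer.

```python
import numpy as np

def frequency_layer(img, n_rings=4):
    """Fourier-domain feature extractor: normalized energy in concentric
    frequency bands of the centered 2D magnitude spectrum."""
    F = np.fft.fftshift(np.abs(np.fft.fft2(img)))
    h, w = F.shape
    y, x = np.ogrid[:h, :w]
    r = np.hypot(y - h / 2, x - w / 2)
    edges = np.linspace(0, r.max() + 1e-9, n_rings + 1)
    feats = np.array([F[(r >= lo) & (r < hi)].sum()
                      for lo, hi in zip(edges[:-1], edges[1:])])
    return feats / (feats.sum() + 1e-12)

rng = np.random.default_rng(5)
# A smooth (low-frequency heavy) image versus broadband noise.
smooth = rng.normal(size=(32, 32)).cumsum(axis=0).cumsum(axis=1)
noisy = rng.normal(size=(32, 32))
f_smooth = frequency_layer(smooth)
f_noisy = frequency_layer(noisy)
```

Such a layer is differentiable-free feature extraction: its output vector can be concatenated with, or fed into, a conventional deep classifier, which is the combination the paper investigates for liveness detection.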
{"title":"Improving image classification with frequency domain layers for feature extraction","authors":"J. Stuchi, M. A. Angeloni, R. F. Pereira, L. Boccato, G. Folego, Paulo V. S. Prado, R. Attux","doi":"10.1109/MLSP.2017.8168168","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168168","url":null,"abstract":"Machine learning has been increasingly used in current days. Great improvements, especially in deep neural networks, helped to boost the achievable performance in computer vision and signal processing applications. Although different techniques were applied for deep architectures, the frequency domain has not been thoroughly explored in this field. In this context, this paper presents a new method for extracting discriminative features according to the Fourier analysis. The proposed frequency extractor layer can be combined with deep architectures in order to improve image classification. Computational experiments were performed on face liveness detection problem, yielding better results than those presented in the literature for the grandtest protocol of Replay-Attack Database. This paper also aims to raise the discussion on how frequency domain layers can be used in deep architectures to further improve the network performance.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"1 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90542996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-09-01 | DOI: 10.1109/MLSP.2017.8168191
Masaya Wake, Yoshiaki Bando, M. Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara
This paper describes a semi-blind speech enhancement method using a semi-blind recurrent neural network (SB-RNN) for human-robot speech interaction. When a robot interacts with a human using speech, its microphone records not only the human's voice but also the speech signals played by the robot itself, which can be exploited for semi-blind speech enhancement. The SB-RNN consists of two cascaded modules: a semi-blind source separation module and a blind dereverberation module. Each module has a recurrent layer to capture the temporal correlations of speech signals. The SB-RNN is trained in a multi-task learning manner: isolated echoic speech signals are used as teacher signals for the output of the separation module, in addition to isolated unechoic signals for the output of the dereverberation module. Experimental results showed that the source-to-distortion ratio was improved by 2.30 dB on average compared to a conventional method based on semi-blind independent component analysis. The results also showed the effectiveness of the modularization of the network, multi-task learning, the recurrent structure, and semi-blind source separation.
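The semi-blind idea (the robot knows its own playback signal) can be illustrated with a classical NLMS echo canceller in numpy; this is a linear stand-in for the SB-RNN's separation module, with an invented toy echo path, not the paper's network.

```python
import numpy as np

rng = np.random.default_rng(6)

# Known robot playback (the "semi-blind" side information) and a human voice.
robot = rng.normal(size=5000)
human = np.sin(0.05 * np.arange(5000))
echo_path = np.array([0.6, 0.3, 0.1])  # unknown room response (toy)
mic = np.convolve(robot, echo_path, mode="full")[:5000] + human

# NLMS adaptive filter: estimate the echo path from the known robot signal
# and subtract its echo from the microphone signal.
L, mu = 8, 0.5
w = np.zeros(L)
out = np.zeros(5000)
for n in range(L, 5000):
    x = robot[n - L + 1:n + 1][::-1]   # most recent robot samples first
    e = mic[n] - w @ x                 # echo-cancelled output sample
    w += mu * e * x / (x @ x + 1e-9)
    out[n] = e

err_raw = np.mean((mic[2500:] - human[2500:]) ** 2)
err_nlms = np.mean((out[2500:] - human[2500:]) ** 2)
```

The SB-RNN replaces this linear filter with a recurrent nonlinear module (and adds a dereverberation stage), but the exploitation of the known robot signal is the same.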
{"title":"Semi-Blind speech enhancement basedon recurrent neural network for source separation and dereverberation","authors":"Masaya Wake, Yoshiaki Bando, M. Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara","doi":"10.1109/MLSP.2017.8168191","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168191","url":null,"abstract":"This paper describes a semi-blind speech enhancement method using a semi-blind recurrent neural network (SB-RNN) for human-robot speech interaction. When a robot interacts with a human using speech signals, the robot inputs not only audio signals recorded by its own microphone but also speech signals made by the robot itself, which can be used for semi-blind speech enhancement. The SB-RNN consists of cascaded two modules: a semi-blind source separation module and a blind dereverberation module. Each module has a recurrent layer to capture the temporal correlations of speech signals. The SB-RNN is trained in a manner of multi-task learning, i.e., isolated echoic speech signals are used as teacher signals for the output of the separation module in addition to isolated unechoic signals for the output of the dereverberation module. Experimental results showed that the source to distortion ratio was improved by 2.30 dB on average compared to a conventional method based on a semi-blind independent component analysis. The results also showed the effectiveness of modularization of the network, multi-task learning, the recurrent structure, and semi-blind source separation.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"98 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76982569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-09-01 | DOI: 10.1109/MLSP.2017.8168151
Maria Scalabrin, Matteo Gadaleta, Riccardo Bonetto, M. Rossi
In this paper, we are concerned with the automated, runtime analysis of vehicular data from large-scale traffic monitoring networks. This problem is tackled through localized, small-size Bayesian networks (BNs), which are utilized to capture the spatio-temporal relationships underpinning traffic data from nearby road links. A dedicated BN is set up, trained, and tested for each road in the monitored geographical map. The joint probability distribution between the cause nodes and the effect node in the BN is tracked through a Gaussian mixture model (GMM), whose parameters are estimated via Bayesian variational inference (BVI). Forecasting and anomaly detection are performed on statistical measures derived at runtime from the trained GMMs. Our design choices lead to several advantages: the approach is scalable, as a small-size BN is associated with, and independently trained for, each road; moreover, the localized nature of the framework allows flagging atypical behaviors at their point of origin in the monitored geographical map. The effectiveness of the proposed framework is tested on a large dataset from a real network deployment, comparing its prediction performance with that of selected regression algorithms from the literature and quantifying its anomaly detection capabilities.
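As a hedged, much-simplified stand-in for the GMM tracking described here (plain EM on a spherical mixture rather than the paper's variational inference, with invented two-regime "traffic" data), the sketch below fits a mixture and uses its log-likelihood as the runtime anomaly score.

```python
import numpy as np

rng = np.random.default_rng(7)

def fit_gmm(X, k=2, n_iter=60):
    """Plain EM for a spherical GMM (stand-in for the paper's variational fit)."""
    n, d = X.shape
    mu = X[np.linspace(0, n - 1, k).astype(int)].copy()
    var = np.ones(k)
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities under spherical Gaussians (log-domain).
        d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
        logp = np.log(pi) - 0.5 * d * np.log(2 * np.pi * var) - d2 / (2 * var)
        logp -= logp.max(axis=1, keepdims=True)
        R = np.exp(logp)
        R /= R.sum(axis=1, keepdims=True)
        # M-step: weights, means, then variances around the new means.
        Nk = R.sum(axis=0) + 1e-9
        mu = (R.T @ X) / Nk[:, None]
        d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
        var = (R * d2).sum(axis=0) / (d * Nk) + 1e-6
        pi = Nk / n
    return pi, mu, var

def log_lik(x, pi, mu, var):
    d = x.size
    d2 = ((mu - x) ** 2).sum(axis=1)
    comp = np.log(pi) - 0.5 * d * np.log(2 * np.pi * var) - d2 / (2 * var)
    m = comp.max()
    return m + np.log(np.exp(comp - m).sum())

# Two regimes: free flow (speed 60, occupancy 20%) and congestion (15, 90%).
X = np.vstack([rng.normal([60.0, 20.0], 2.0, size=(300, 2)),
               rng.normal([15.0, 90.0], 2.0, size=(300, 2))])
pi, mu, var = fit_gmm(X)
score_normal = log_lik(np.array([58.0, 22.0]), pi, mu, var)
score_anomaly = log_lik(np.array([60.0, 90.0]), pi, mu, var)  # regime mismatch
```

In the paper's framework the mixture models the joint density of cause and effect nodes of each road's BN, so the same likelihood also yields a conditional forecast of the effect variable.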
{"title":"A Bayesian forecasting and anomaly detection framework for vehicular monitoring networks","authors":"Maria Scalabrin, Matteo Gadaleta, Riccardo Bonetto, M. Rossi","doi":"10.1109/MLSP.2017.8168151","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168151","url":null,"abstract":"In this paper, we are concerned with the automated and runtime analysis of vehicular data from large scale traffic monitoring networks. This problem is tackled through localized and small-size Bayesian networks (BNs), which are utilized to capture the spatio-temporal relationships underpinning traffic data from nearby road links. A dedicated BN is set up, trained, and tested for each road in the monitored geographical map. The joint probability distribution between the cause nodes and the effect node in the BN is tracked through a Gaussian Mixture Model (GMM), whose parameters are estimated via Bayesian Variational Inference (BVI). Forecasting and anomaly detection are performed on statistical measures derived at runtime by the trained GMMs. Our design choices lead to several advantages: the approach is scalable as a small-size BN is associated with and independently trained for each road and the localized nature of the framework allows flagging atypical behaviors at their point of origin in the monitored geographical map. The effectiveness of the proposed framework is tested using a large dataset from a real network deployment, comparing its prediction performance with that of selected regression algorithms from the literature, while also quantifying its anomaly detection capabilities.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"123 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83473001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-09-01 | DOI: 10.1109/MLSP.2017.8168131
Cuong D. Tran, Ognjen Rudovic, V. Pavlovic
We study the task of unsupervised domain adaptation, where no labeled data from the target domain are provided at training time. To deal with the potential discrepancy between the source and target distributions, in both features and labels, we exploit a copula-based regression framework. The benefits of this approach are twofold: (a) it allows us to model a broader range of conditional predictive densities than the common exponential family; (b) we show how to leverage Sklar's theorem, the essence of the copula formulation relating the joint density to the copula dependency functions, to find effective feature mappings that mitigate the domain mismatch. By transforming the data to a copula domain, we show on a number of benchmark datasets (including human emotion estimation), and using different regression models for prediction, that we can achieve more robust and accurate estimation of target labels compared to recently proposed feature transformation (adaptation) methods.
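A hedged sketch of the copula-domain transformation underlying this approach: the empirical probability integral transform maps each feature to uniform margins, so marginal shifts between source and target disappear while the dependence (copula) structure is retained. The data and the rank-based estimator below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)

def to_copula_domain(X):
    """Empirical probability integral transform: map each feature to its
    normalized rank, giving uniform marginals (the copula domain)."""
    n = X.shape[0]
    U = np.empty_like(X, dtype=float)
    for j in range(X.shape[1]):
        U[:, j] = (np.argsort(np.argsort(X[:, j])) + 1) / (n + 1)
    return U

# Source and target share the dependence structure but differ in marginals.
z_s = rng.normal(size=(500, 1))
src = np.hstack([z_s, z_s + 0.1 * rng.normal(size=(500, 1))])
z_t = rng.normal(size=(500, 1))
tgt = np.hstack([5 + 2 * z_t, np.exp(z_t + 0.1 * rng.normal(size=(500, 1)))])

U_src, U_tgt = to_copula_domain(src), to_copula_domain(tgt)
gap_raw = abs(src.mean() - tgt.mean())  # marginal mismatch in feature space
gap_cop = abs(U_src.mean() - U_tgt.mean())  # vanishes in the copula domain
```

A regressor trained on `U_src` and applied to `U_tgt` thus sees aligned inputs, which is the intuition behind using Sklar's factorization for domain adaptation.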
{"title":"Unsupervised domain adaptation with copula models","authors":"Cuong D. Tran, Ognjen Rudovic, V. Pavlovic","doi":"10.1109/MLSP.2017.8168131","DOIUrl":"https://doi.org/10.1109/MLSP.2017.8168131","url":null,"abstract":"We study the task of unsupervised domain adaptation, where no labeled data from the target domain is provided during training time. To deal with the potential discrepancy between the source and target distributions, both in features and labels, we exploit a copula-based regression framework. The benefits of this approach are two-fold: (a) it allows us to model a broader range of conditional predictive densities beyond the common exponential family; (b) we show how to leverage Sklar's theorem, the essence of the copula formulation relating the joint density to the copula dependency functions, to find effective feature mappings that mitigate the domain mismatch. By transforming the data to a copula domain, we show on a number of benchmark datasets (including human emotion estimation), and using different regression models for prediction, that we can achieve a more robust and accurate estimation of target labels, compared to recently proposed feature transformation (adaptation) methods.","PeriodicalId":6542,"journal":{"name":"2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP)","volume":"64 1","pages":"1-6"},"PeriodicalIF":0.0,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76760070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}