Pub Date: 2003-09-17 DOI: 10.1109/NNSP.2003.1318037
H. Saruwatari, H. Yamajo, T. Takatani, T. Nishikawa, K. Shikano
We propose a new two-stage blind separation and deconvolution algorithm for a multiple-input multiple-output (MIMO) FIR system driven by colored sound sources. The algorithm combines a new single-input multiple-output (SIMO)-model-based ICA (SIMO-ICA) with blind multichannel inverse filtering. SIMO-ICA separates the mixed signals not into monaural source signals but into SIMO-model-based signals from independent sources. After SIMO-ICA, a simple blind deconvolution technique for the SIMO model can be applied even when each source signal is temporally correlated. Simulation results show that the proposed algorithm successfully separates and deconvolves a convolutive mixture of speech.
{"title":"Blind separation and deconvolution of MIMO system driven by colored inputs using SIMO-model-based ICA with information-geometric learning","authors":"H. Saruwatari, H. Yamajo, T. Takatani, T. Nishikawa, K. Shikano","doi":"10.1109/NNSP.2003.1318037","DOIUrl":"https://doi.org/10.1109/NNSP.2003.1318037","journal":{"name":"2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718)"},"publicationDate":"2003-09-17","platform":"Semanticscholar","paperid":"121139304"}
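The distinction between monaural source signals and SIMO-model-based signals in the abstract above can be illustrated with a toy convolutive mixing model. The sketch below is not the SIMO-ICA algorithm itself; it only builds the signals that SIMO-ICA is described as recovering, namely the image of each source at each microphone. The function names and the filter taps are hypothetical and chosen for illustration.

```python
def fir_filter(h, s):
    """Causal FIR filtering: y[t] = sum_n h[n] * s[t - n], zero initial state."""
    return [
        sum(h[n] * s[t - n] for n in range(len(h)) if t - n >= 0)
        for t in range(len(s))
    ]


def simo_components(filters, sources):
    """filters[k][l] is the FIR path from source l to microphone k.

    Returns comp[l][k], the image of source l at microphone k. The signals
    comp[l][0], comp[l][1], ... together form the SIMO model of source l.
    """
    n_mics, n_src = len(filters), len(sources)
    return [
        [fir_filter(filters[k][l], sources[l]) for k in range(n_mics)]
        for l in range(n_src)
    ]


def mix(components):
    """Each microphone observes the sum of all source images at that microphone."""
    n_src, n_mics = len(components), len(components[0])
    n = len(components[0][0])
    return [
        [sum(components[l][k][t] for l in range(n_src)) for t in range(n)]
        for k in range(n_mics)
    ]


# Toy 2-source, 2-microphone example with made-up filter taps.
sources = [[1.0, 0.0, -1.0, 0.5], [0.0, 2.0, 0.0, 1.0]]
filters = [
    [[1.0, 0.5], [0.3, 0.0]],   # paths into microphone 0
    [[0.2, 0.1], [1.0, -0.4]],  # paths into microphone 1
]
components = simo_components(filters, sources)
mixtures = mix(components)
```

In this picture, a first stage that outputs `components[l]` for each source leaves a single-source, multichannel deconvolution problem for the second stage, which matches the two-stage structure the abstract describes.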
Pub Date: 2003-09-17 DOI: 10.1109/NNSP.2003.1318028
Samuel Kaski, J. Sinkkonen, Arto Klami
A generative distributional clustering model for continuous data is reviewed, and methods for optimizing and regularizing it are introduced and compared. Based on pairs of auxiliary and primary data, the primary data space is partitioned into Voronoi regions that are maximally homogeneous in terms of auxiliary data. As a result, only variation in the primary data that is associated with variation in the auxiliary data influences the clusters. Because the whole primary space is partitioned, new samples can be easily clustered in terms of primary data alone. In experiments, the approach is shown to produce more homogeneous clusters than alternative methods. Two regularization methods are demonstrated to further improve the results: an entropy-type penalty for unequal cluster sizes, and the inclusion of a K-means component in the model. The latter can alternatively be interpreted as a special kind of joint distribution modeling in which the emphasis between discrimination and unsupervised modeling of primary data can be tuned.
{"title":"Regularized discriminative clustering","authors":"Samuel Kaski, J. Sinkkonen, Arto Klami","doi":"10.1109/NNSP.2003.1318028","DOIUrl":"https://doi.org/10.1109/NNSP.2003.1318028","journal":{"name":"2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718)"},"publicationDate":"2003-09-17","platform":"Semanticscholar","paperid":"128079079"}
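The Voronoi partition described in the clustering abstract above can be sketched as a nearest-prototype assignment: a new sample is clustered from its primary data alone. The conditional entropy used here to score homogeneity is an illustrative stand-in and is not claimed to be the objective optimized in the paper; the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def nearest_prototype(x, prototypes):
    """Index of the prototype closest to x in squared Euclidean distance,
    i.e. the Voronoi region that x falls into."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, m)) for m in prototypes]
    return dists.index(min(dists))


def conditional_entropy(primary, auxiliary, prototypes):
    """Entropy (in nats) of the auxiliary label given the cluster index,
    estimated from counts. Lower values mean more homogeneous clusters."""
    counts = defaultdict(Counter)
    for x, c in zip(primary, auxiliary):
        counts[nearest_prototype(x, prototypes)][c] += 1
    n = len(primary)
    h = 0.0
    for label_counts in counts.values():
        size = sum(label_counts.values())
        for k in label_counts.values():
            h -= (k / n) * math.log(k / size)
    return h


# Toy primary samples with one auxiliary label each.
primary = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
auxiliary = ["a", "a", "b", "b"]
good = conditional_entropy(primary, auxiliary, [(0.0, 0.5), (5.0, 5.5)])
poor = conditional_entropy(primary, auxiliary, [(0.0, 0.0), (100.0, 100.0)])
```

With the first pair of prototypes each Voronoi region contains a single auxiliary label, so the score is zero. With the second pair all samples fall into one region and the score rises to log 2.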
Pub Date: 2003-09-17 DOI: 10.1109/NNSP.2003.1318017
Keisuke Yamazaki, Sumio Watanabe
Hidden Markov models (HMMs) are now used in many fields, such as speech recognition and natural language processing. However, the mathematical foundation for analyzing these models has not yet been established, since HMMs are non-identifiable. In recent years, we have developed an algebraic geometrical method that allows us to analyze non-regular and non-identifiable models. In this paper, we apply this method to the HMM and reveal the asymptotic order of its stochastic complexity in a mathematically rigorous way.
{"title":"Stochastic complexities of hidden Markov models","authors":"Keisuke Yamazaki, Sumio Watanabe","doi":"10.1109/NNSP.2003.1318017","DOIUrl":"https://doi.org/10.1109/NNSP.2003.1318017","journal":{"name":"2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718)"},"publicationDate":"2003-09-17","platform":"Semanticscholar","paperid":"131848121"}
Pub Date: 2003-09-17 DOI: 10.1109/NNSP.2003.1318046
M. Barnard, J. Odobez, Samy Bengio
The recognition of events within multi-modal data is a challenging problem. In this paper we focus on the recognition of events by using both audio and video data. We investigate the use of data fusion techniques to recognise these sequences within the framework of hidden Markov models (HMMs), which are used to model the audio and video data sequences. Specifically, we look at the recognition of play and break sequences in football and the segmentation of football games based on these two events. Recognising relatively simple semantic events such as these is an important step towards fully automatic indexing of such video material. The experiments were done using approximately 3 hours of data from two games of the Euro96 competition. We propose that modelling the audio and video streams separately for each sequence and fusing the decisions from each stream should yield an accurate and robust method of segmenting multi-modal data.
{"title":"Multi-modal audio-visual event recognition for football analysis","authors":"M. Barnard, J. Odobez, Samy Bengio","doi":"10.1109/NNSP.2003.1318046","DOIUrl":"https://doi.org/10.1109/NNSP.2003.1318046","journal":{"name":"2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718)"},"publicationDate":"2003-09-17","platform":"Semanticscholar","paperid":"133104277"}
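The idea of modelling the audio and video streams separately and fusing their decisions, as stated in the abstract above, can be sketched as a late-fusion rule. The abstract does not state which fusion rule is used, so the weighted sum of per-class log-likelihoods below is only one common choice, assumed for illustration. The scores are made-up numbers standing in for the outputs of separately trained audio and video HMMs.

```python
def fuse_decisions(audio_loglik, video_loglik, audio_weight=0.5):
    """Weighted sum of per-class log-likelihoods from the two streams.

    Returns the class with the highest fused score and all fused scores.
    """
    fused = {
        label: audio_weight * audio_loglik[label]
        + (1.0 - audio_weight) * video_loglik[label]
        for label in audio_loglik
    }
    return max(fused, key=fused.get), fused


# Hypothetical log-likelihoods of one segment under each stream's models.
audio = {"play": -120.0, "break": -110.0}
video = {"play": -200.0, "break": -230.0}

equal_label, _ = fuse_decisions(audio, video)                    # equal trust
audio_label, _ = fuse_decisions(audio, video, audio_weight=0.9)  # trust audio
```

Here the two streams disagree, and the fused decision depends on the weight: equal weights follow the video stream, while a weight of 0.9 on audio follows the audio stream.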
Pub Date: 2003-09-17 DOI: 10.1109/NNSP.2003.1318047
A. Heittmann, U. Ramacher
Feature extraction and detection in visual scenes form the basis for robust image processing and scene analysis. While the receptive fields of simple cells in the visual cortex are modeled by Gabor functions, simple cells are commonly treated as linear filters. In this paper, we demonstrate how non-linear operations on pulses, such as correlation, synchronization and detection of decorrelation, can be used to implement feature detectors. Using essentially two data-driven adaptation rules that depend on dendritic currents and on membrane potentials, linear detection of intensity gradients can be realized. As a technical application, a feature detector sensitive to orientation is presented.
{"title":"Correlation-based feature detection using pulsed neural networks","authors":"A. Heittmann, U. Ramacher","doi":"10.1109/NNSP.2003.1318047","DOIUrl":"https://doi.org/10.1109/NNSP.2003.1318047","journal":{"name":"2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718)"},"publicationDate":"2003-09-17","platform":"Semanticscholar","paperid":"114622496"}
Pub Date: 2003-09-17 DOI: 10.1109/NNSP.2003.1318012
J. Farges, P. Fabiani, S. L. Ménec
A neural-network solution to the problem of choosing the control mode of a missile is proposed and implemented. A test on 7000 interceptions shows that this approach reduces the number of failures (miss distance larger than 5 meters) compared with the use of expert rules.
{"title":"Blending of missile control modes with neural networks","authors":"J. Farges, P. Fabiani, S. L. Ménec","doi":"10.1109/NNSP.2003.1318012","DOIUrl":"https://doi.org/10.1109/NNSP.2003.1318012","journal":{"name":"2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718)"},"publicationDate":"2003-09-17","platform":"Semanticscholar","paperid":"128643840"}
Pub Date: 2003-09-17 DOI: 10.1109/NNSP.2003.1318020
V. Asirvadam, S. McLoone, G. Irwin
Novel fast and efficient sequential learning algorithms are proposed for direct-link radial basis function (DRBF) networks. The dynamic DRBF network is trained using the recently proposed decomposed/parallel recursive Levenberg-Marquardt (PRLM) algorithm by neglecting the interneuron weight interactions. The resulting sequential learning approach enables weights to be updated in an efficient parallel manner and facilitates a minimal update extension for real-time applications. Simulation results for two benchmark problems show the feasibility of the new training algorithms.
{"title":"Fast and efficient sequential learning algorithms using direct-link RBF networks","authors":"V. Asirvadam, S. McLoone, G. Irwin","doi":"10.1109/NNSP.2003.1318020","DOIUrl":"https://doi.org/10.1109/NNSP.2003.1318020","journal":{"name":"2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718)"},"publicationDate":"2003-09-17","platform":"Semanticscholar","paperid":"123661463"}
Pub Date: 2003-09-17 DOI: 10.1109/NNSP.2003.1318035
F. Vrins, J. Lee, M. Verleysen, V. Vigneron, C. Jutten
Blind source separation (BSS) consists of recovering unobserved signals from observed mixtures of them. In most cases, the whole set of mixtures is used for the separation, possibly after dimension reduction by PCA. This paper aims to show that in many applications the quality of the separation can be improved by first selecting a subset of the available mixtures, possibly using an information content criterion, and performing PCA and BSS afterwards. The benefit of this procedure is shown on simulated electrocardiographic data by extracting the fetal electrocardiogram signal from mixtures recorded on the abdomen of a pregnant woman.
{"title":"Improving independent component analysis performances by variable selection","authors":"F. Vrins, J. Lee, M. Verleysen, V. Vigneron, C. Jutten","doi":"10.1109/NNSP.2003.1318035","DOIUrl":"https://doi.org/10.1109/NNSP.2003.1318035","journal":{"name":"2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718)"},"publicationDate":"2003-09-17","platform":"Semanticscholar","paperid":"125120946"}
Pub Date: 2003-09-17 DOI: 10.1109/NNSP.2003.1318027
D. Luengo, I. Santamaría, L. Vielva, C. Pantaleón
We consider the underdetermined blind source separation problem with linear instantaneous and convolutive mixtures when the input signals are sparse, or have been rendered sparse. In the underdetermined case the problem requires solving three sub-problems: detecting the number of sources, estimating the mixing matrix, and finding an adequate inversion strategy to obtain the sources. This paper solves the first two problems. We assume that the number of sources is unknown, and estimate it by means of an information theoretic criterion (MDL). Then the mixing matrix is expressed in spherical coordinates, and we sequentially estimate the angles and amplitudes of each column, and their order. The performance of the method is illustrated through simulations.
{"title":"Underdetermined blind separation of sparse sources with instantaneous and convolutive mixtures","authors":"D. Luengo, I. Santamaría, L. Vielva, C. Pantaleón","doi":"10.1109/NNSP.2003.1318027","DOIUrl":"https://doi.org/10.1109/NNSP.2003.1318027","journal":{"name":"2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718)"},"publicationDate":"2003-09-17","platform":"Semanticscholar","paperid":"123953690"}
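For the sparse-source abstract above, the reason angles are a natural parameterization can be seen in the simplest instantaneous case with two mixtures: when only one source is active at a time, every observation lies along one column of the mixing matrix, so the observation angles cluster around the column directions. The sketch below assumes that idealized setting with noise-free toy data. It counts well-populated histogram bins as a crude stand-in for the MDL criterion mentioned in the abstract, and it does not reproduce the paper's sequential estimation procedure; all names and parameter values are hypothetical.

```python
import math


def estimate_directions(samples, n_bins=180, min_count=20, min_norm=1e-3):
    """Histogram the angles of 2-D observations (folded into [0, pi)) and
    return the mean angle of every well-populated bin."""
    width = math.pi / n_bins
    bins = [[] for _ in range(n_bins)]
    for x1, x2 in samples:
        if math.hypot(x1, x2) < min_norm:
            continue  # near-silent samples carry no reliable direction
        angle = math.atan2(x2, x1) % math.pi
        bins[min(n_bins - 1, int(angle / width))].append(angle)
    return [sum(b) / len(b) for b in bins if len(b) >= min_count]


# Toy data: 3 sources, 2 mixtures, exactly one source active per sample.
true_angles = [0.3, 1.2, 2.4]
samples = []
for t in range(300):
    theta = true_angles[t % 3]
    amp = (-1) ** t * (0.5 + 0.01 * t)
    samples.append((amp * math.cos(theta), amp * math.sin(theta)))

angles = estimate_directions(samples)
# Unit-norm estimates of the mixing-matrix columns, one per detected source.
mixing_columns = [(math.cos(a), math.sin(a)) for a in angles]
```

The number of returned angles serves as the estimated number of sources, and each angle gives one column of the mixing matrix up to scale and sign. Recovering the sources themselves is the third sub-problem, which the abstract leaves out of scope.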
Pub Date: 2003-09-17 DOI: 10.1109/NNSP.2003.1318055
G. Costantini, A. Rizzi, D. Casali
The correct classification of single musical sources is a relevant aspect of the source separation task and the automatic transcription of polyphonic music. In this paper, we deal with a classification problem concerning the recognition of six different musical instruments: violin, clarinet, flute, oboe, saxophone and piano. A satisfactory solution of such a recognition problem depends mainly on both the preprocessing procedure (the set of features extracted from raw data) and the adopted classification system. For feature extraction, a suitable signal preprocessing based on FFT, QFT (Q-constant frequency transform) and cepstrum coefficients is employed. We adopt min-max neurofuzzy networks as the classification model, in both their classical and generalized versions. The synthesis of these classifiers is performed by the adaptive resolution training technique (ARC, PARC and GPARC algorithms), since it ensures good performance and a high degree of automation.
{"title":"Recognition of musical instruments by generalized min-max classifiers","authors":"G. Costantini, A. Rizzi, D. Casali","doi":"10.1109/NNSP.2003.1318055","DOIUrl":"https://doi.org/10.1109/NNSP.2003.1318055","journal":{"name":"2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718)"},"publicationDate":"2003-09-17","platform":"Semanticscholar","paperid":"126386103"}
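As a rough illustration of the classical min-max classification model named in the abstract above, the sketch below evaluates a hyperbox membership function of the form commonly attributed to Simpson's fuzzy min-max classifier and assigns a feature vector to the class of the best-matching hyperbox. It assumes features normalized to [0, 1]. The generalized version and the ARC, PARC and GPARC training algorithms are not shown, and the hyperbox corners, the sensitivity `gamma`, and the two-dimensional features are made-up values.

```python
def hyperbox_membership(x, v, w, gamma=4.0):
    """Membership of point x in the hyperbox with min corner v and max
    corner w; equals 1.0 inside the box and decays with distance outside."""
    n = len(x)
    total = 0.0
    for xi, vi, wi in zip(x, v, w):
        above = max(0.0, 1.0 - max(0.0, gamma * min(1.0, xi - wi)))
        below = max(0.0, 1.0 - max(0.0, gamma * min(1.0, vi - xi)))
        total += above + below
    return total / (2 * n)


def classify(x, hyperboxes):
    """hyperboxes is a list of (label, v, w) tuples. Returns the label of
    the hyperbox with the highest membership for x."""
    return max(hyperboxes, key=lambda hb: hyperbox_membership(x, hb[1], hb[2]))[0]


# Two hypothetical hyperboxes in a normalized 2-D feature space.
boxes = [
    ("violin", [0.1, 0.1], [0.4, 0.4]),
    ("piano", [0.6, 0.6], [0.9, 0.9]),
]
inside = classify([0.2, 0.3], boxes)    # lies inside the violin box
nearby = classify([0.55, 0.7], boxes)   # just outside the piano box
```

A point inside a hyperbox receives full membership, while a point slightly outside still receives a high but reduced score. This graded response is what lets the classifier label feature vectors that fall between the boxes learned during training.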