Pub Date: 2015-10-01 | DOI: 10.1109/IJCNN.2015.7280729
C. Vineyard, Stephen J Verzi, C. James, J. Aimone, G. Heileman
The field of machine learning strives to develop algorithms that, through learning, lead to generalization; that is, the ability of a machine to perform a task it was not explicitly trained for. An added challenge arises when the problem domain is dynamic or non-stationary, with the data distributions or categorizations changing over time; this phenomenon is known as concept drift. Game-theoretic algorithms are often iterative by nature, consisting of repeated game play rather than a single interaction. Thus, rather than requiring extensive retraining to update a learning model, a game-theoretic approach can adjust its strategies through repeated play, offering a novel way to address concept drift. In this paper we present a variant of our Support Vector Machine (SVM) Game classifier that can be used adaptively, with repeated play, to address concept drift, and we show results of applying this algorithm to synthetic as well as real data.
Title: Repeated play of the SVM game as a means of adaptive classification (2015 International Joint Conference on Neural Networks (IJCNN), pp. 1-8)
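The abstract does not spell out the SVM Game's update rule, so the sketch below only illustrates the underlying idea — strategies adjusting over repeated play rather than a single interaction — using fictitious play on matching pennies; the game and payoffs are illustrative assumptions, not the paper's classifier.

```python
# Hedged sketch: fictitious play, a classic repeated-play strategy-adjustment
# scheme; each player best-responds to the opponent's empirical action mix.
def fictitious_play(payoff, rounds=5000):
    """payoff[i][j]: row player's payoff for actions (i, j) in a zero-sum game."""
    counts = [[1, 1], [1, 1]]  # smoothed action counts per player
    for _ in range(rounds):
        p_col = [c / sum(counts[1]) for c in counts[1]]  # column's empirical mix
        row = max((0, 1), key=lambda i: sum(payoff[i][j] * p_col[j] for j in (0, 1)))
        p_row = [c / sum(counts[0]) for c in counts[0]]  # row's empirical mix
        col = max((0, 1), key=lambda j: -sum(payoff[i][j] * p_row[i] for i in (0, 1)))
        counts[0][row] += 1
        counts[1][col] += 1
    return [c / sum(counts[0]) for c in counts[0]]

# matching pennies: the only equilibrium is the mixed strategy (0.5, 0.5)
pennies = [[1, -1], [-1, 1]]
mix = fictitious_play(pennies)
```

No retraining step is ever run; the empirical strategy simply drifts toward equilibrium as play is repeated, which is the adaptation mechanism the abstract alludes to.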
Pub Date: 2015-10-01 | DOI: 10.1109/IJCNN.2015.7280763
U. Johansson, Cecilia Sönströd, H. Linusson
Conformal predictors use machine learning models to output prediction sets. For regression, a prediction set is simply a prediction interval. All conformal predictors are valid, meaning that the error rate on novel data is bounded by a preset significance level. The key performance metric for conformal predictors is their efficiency, i.e., the size of the prediction sets. Inductive conformal predictors utilize real-valued functions, called nonconformity functions, and a calibration set, i.e., a set of labeled instances not used for the model training, to obtain the prediction regions. In state-of-the-art conformal regressors, the nonconformity functions are normalized, i.e., they include a component estimating the difficulty of each instance. In this study, conformal regressors are built on top of ensembles of bagged neural networks, and several nonconformity functions are evaluated. In addition, the option to calibrate on out-of-bag instances instead of setting aside a calibration set is investigated. The experiments, using 33 publicly available data sets, show that normalized nonconformity functions can produce smaller prediction sets, but the efficiency is highly dependent on the quality of the difficulty estimation. Specifically, in this study, the most efficient normalized nonconformity function estimated the difficulty of an instance by calculating the average error of neighboring instances. These results are consistent with previous studies using random forests as underlying models. Calibrating on out-of-bag instances did, however, lead to more efficient conformal predictors only on smaller data sets, in sharp contrast to the random forest study, where out-of-bag calibration was significantly better overall.
Title: Efficient conformal regressors using bagged neural nets (IJCNN 2015, pp. 1-8)
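The inductive conformal step described above can be sketched in a few lines. This is a minimal illustration assuming a normalized nonconformity score alpha_i = |y_i - yhat_i| / (sigma_i + beta); the underlying model, the difficulty estimate sigma and the constant beta are illustrative stand-ins, not the paper's bagged neural network setup.

```python
import math

def conformal_half_width(cal_residuals, cal_sigmas, sigma_new,
                         significance=0.1, beta=0.01):
    # normalized nonconformity scores on the calibration set
    alphas = sorted(abs(r) / (s + beta)
                    for r, s in zip(cal_residuals, cal_sigmas))
    # index of the smallest alpha exceeded by at most `significance` of the
    # calibration scores (the usual (1 - eps)(n + 1) quantile)
    k = math.ceil((1 - significance) * (len(alphas) + 1)) - 1
    alpha_s = alphas[min(k, len(alphas) - 1)]
    # the prediction interval is yhat_new +/- the returned half-width
    return alpha_s * (sigma_new + beta)
```

A difficult test instance (large sigma_new) receives a proportionally wider interval; tightening intervals on the easy instances is how normalization buys efficiency, provided the difficulty estimate is good.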
Pub Date: 2015-07-17 | DOI: 10.1109/IJCNN.2015.7280446
Zhile Yang, Kang Li, Qun Niu, A. Foley
Electric vehicles provide an opportunity to reduce fossil fuel consumption and to decrease emissions of greenhouse gases and air pollutants from the transport sector. The adoption of large numbers of plug-in electric vehicles, however, imposes significant impacts on power system operation due to uncertain charging and discharging patterns. In this paper, multiple charging and discharging scenarios of electric vehicles, together with the grid integration of renewable energy sources, are examined and evaluated within the unit commitment problem. A quantum-inspired binary particle swarm optimization method is employed to determine the on/off status of each unit. Comparative studies show that the off-peak charging and peak discharging scenario is a viable option to significantly reduce the economic cost and to complement renewable energy generation.
Title: Unit commitment considering multiple charging and discharging scenarios of plug-in electric vehicles (IJCNN 2015, pp. 1-8)
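The quantum-inspired variant is not detailed in the abstract, so the sketch below shows the plain binary PSO it builds on — the sigmoid velocity-to-bit mapping used to decide each unit's on/off status — on a toy commitment cost. The swarm parameters and the cost function are illustrative assumptions, not the paper's formulation.

```python
import math, random

def binary_pso(cost, n_bits, n_particles=30, iters=300, seed=0):
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    gbest = min(pbest, key=cost)[:]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] += 2 * r1 * (pbest[i][d] - pos[i][d]) \
                           + 2 * r2 * (gbest[d] - pos[i][d])
                vel[i][d] = max(-4.0, min(4.0, vel[i][d]))  # clamp velocity
                # sigmoid maps velocity to the probability that the bit is 1
                pos[i][d] = 1 if rng.random() < 1 / (1 + math.exp(-vel[i][d])) else 0
            if cost(pos[i]) < cost(pbest[i]):
                pbest[i] = pos[i][:]
                if cost(pbest[i]) < cost(gbest):
                    gbest = pbest[i][:]
    return gbest

# toy instance: each "on" unit supplies 5 MW against a 10 MW demand, and the
# per-unit running costs make units 0 and 2 the cheapest feasible pair
def toy_cost(units):
    return abs(5 * sum(units) - 10) * 100 + sum(c * b for c, b in zip([1, 9, 2], units))

best = binary_pso(toy_cost, 3)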
Pub Date: 2015-07-12 | DOI: 10.1109/IJCNN.2015.7280727
M. Maniadakis, P. Trahanias
Time perception is a fundamental component of intelligence that structures the way humans act in various contexts. As action evolves over time, timing is necessary to appreciate environmental contingencies, estimate relations between events and predict the effects of our actions at future moments. Despite the fundamental role of time in human cognition, it remains largely unexplored in the field of artificial cognitive systems. The present work makes concrete steps towards making artificial systems treat time as a unique entity that can be processed in its own right. To this end, we evolve artificial neural networks to perceive the flow of time and to accomplish three different duration-processing tasks. Subsequently we study the internal dynamics of the neural networks to obtain insight into the representation and processing mechanisms of time. The self-organized neural network solutions exhibit important brain-like properties and suggest directions for extending existing theories in timing neuropsychology.
Title: Artificial agents perceiving and processing time (IJCNN 2015, pp. 1-8)
Pub Date: 2015-07-12 | DOI: 10.1109/IJCNN.2015.7280440
Pravin Chandra, Udayan Ghose, A. Sood
For a single-hidden-layer feedforward artificial neural network to possess the universal approximation property, it is sufficient that the hidden-layer activation functions are continuous and non-polynomial; it is not required that the activation function be sigmoidal. In this paper a simple continuous, bounded, non-constant, differentiable, non-sigmoid and non-polynomial function is proposed for use as the activation function at the hidden-layer nodes. The proposed activation function does not require the computation of an exponential function, and thus is computationally less intensive than either the log-sigmoid or the hyperbolic tangent function. On a set of 10 function approximation tasks we demonstrate the efficiency and efficacy of the proposed activation function. The results show that, at least on these 10 tasks, in equal epochs of training the networks using the proposed activation function reach deeper minima of the error functional, generalize better in most cases, and are statistically as good as, if not better than, networks using the logistic function as the hidden-node activation function.
Title: A non-sigmoidal activation function for feedforward artificial neural networks (IJCNN 2015, pp. 1-8)
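The abstract does not state the paper's actual function, so the example below is a hypothetical function with the listed properties — continuous, bounded, non-constant, differentiable, non-polynomial, non-sigmoidal, and free of exponentials: f(x) = x / (1 + x²).

```python
# Illustrative stand-in, NOT the paper's function: bounded in [-0.5, 0.5],
# smooth, rational (so no exp() evaluation), and non-monotone -- which is
# exactly what makes it non-sigmoidal.
def f(x):
    return x / (1.0 + x * x)

def f_prime(x):
    # derivative (1 - x^2) / (1 + x^2)^2, needed for backpropagation
    return (1.0 - x * x) / (1.0 + x * x) ** 2

# extrema at x = +/-1 where f(+/-1) = +/-0.5; f -> 0 as |x| grows
```

Any function of this rational form costs only a few multiplications and one division per evaluation, versus the exponential required by log-sigmoid or tanh.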
Pub Date: 2015-07-12 | DOI: 10.1109/IJCNN.2015.7280371
Qiuwen Chen, Qing Wu, Morgan Bishop, R. Linderman, Qinru Qiu
Inference models such as the confabulation network are particularly useful in anomaly detection applications because they allow introspection into the decision process. However, building such a network model typically requires expert knowledge. In this paper, we present a self-structuring technique that learns the structure of a confabulation network from unlabeled data. Without any assumption about the distribution of the data, we leverage the mutual information between features to learn a succinct network configuration, and enable fast incremental learning to refine the knowledge bases from continuous data streams. Compared to several existing anomaly detection methods, the proposed approach provides higher detection performance and excellent reasoning capability. We also exploit the massive parallelism inherent to the inference model and accelerate the detection process using GPUs. Experimental results show significant speedups and the potential to be applied to real-time applications with high-volume data streams.
Title: Self-structured confabulation network for fast anomaly detection and reasoning (IJCNN 2015, pp. 1-8)
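The quantity the abstract says guides the self-structuring step — mutual information between discrete feature columns — can be computed directly from co-occurrence counts. This sketch shows only that computation; the thresholds and the actual confabulation wiring are not given in the abstract.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """MI (in nats) between two equal-length discrete feature columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# identical columns share log(2) nats; independent columns share none
xs = [0, 1, 0, 1]
mi_same = mutual_information(xs, xs)
mi_indep = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Feature pairs with high MI are the natural candidates for a link in a learned network structure, since knowing one feature predicts the other.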
Pub Date: 2015-07-12 | DOI: 10.1109/IJCNN.2015.7280743
A. Taherkhani, A. Belatreche, Yuhua Li, L. Maguire
Spikes are an important part of information transmission between neurons in the biological brain. Biological evidence shows that information is carried in the timing of individual action potentials, rather than only the firing rate. Spiking neural networks are devised to capture more biological characteristics of the brain in order to construct more powerful intelligent systems. In this paper, we extend our recently proposed supervised learning algorithm, DL-ReSuMe (Delay Learning Remote Supervised Method), to train multiple neurons to classify spatiotemporal spiking patterns. In this method, a number of neurons, instead of a single neuron, are trained to perform the classification task. The simulation results show that a population of neurons has significantly higher processing ability than a single neuron. It is also shown that the performance of Multi-DL-ReSuMe (Multiple DL-ReSuMe) improves when the number of desired spikes in the desired spike trains is increased to an appropriate level.
Title: Multi-DL-ReSuMe: Multiple neurons Delay Learning Remote Supervised Method (IJCNN 2015, pp. 1-7)
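The abstract does not restate the update rule, so the sketch below shows a generic ReSuMe-style weight update in discrete time: the weight change follows the difference between desired and actual output spikes, scaled by a constant plus an exponentially decaying trace of the input spikes. The constants (lr, a, decay) and the single-trace simplification are illustrative assumptions, and the delay-learning part of DL-ReSuMe is omitted.

```python
def resume_update(w, input_spikes, output_spikes, desired_spikes,
                  lr=0.1, a=0.05, decay=0.8):
    """One pass of a ReSuMe-like rule over binary spike trains (one bit per step)."""
    trace = 0.0
    for s_in, s_out, s_d in zip(input_spikes, output_spikes, desired_spikes):
        trace = decay * trace + s_in           # input eligibility trace
        w += lr * (s_d - s_out) * (a + trace)  # push output toward desired timing
    return w

# missing output spike where one is desired -> weight is strengthened
w_up = resume_update(1.0, [1, 0, 0], [0, 0, 0], [0, 1, 0])
# spurious output spike -> weight is weakened
w_down = resume_update(1.0, [1, 0, 0], [0, 1, 0], [0, 0, 0])
```

In the multi-neuron extension, each neuron in the trained population runs this same kind of update against its own desired spike train.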
Pub Date: 2015-07-12 | DOI: 10.1109/IJCNN.2015.7280569
M. Rashid, I. Gondal, J. Kamruzzaman
In machine health monitoring, identifying the fault frequencies of potential bearing faults is essential for the reliable operation of a given system. In this paper, we propose a data-mining-based scheme for fault frequency identification from bearing data. In this scheme, we propose a compact tree called the SAP-tree (sliding window associated frequency pattern tree), built upon an analysis of the frequency-domain characteristics of machine vibration data. Using this tree, we devise a sliding-window-based associated frequency pattern mining technique, called the SAP algorithm, that mines for the frequencies relevant to machine faults. The SAP algorithm mines associated frequency patterns in the current window with an FP-growth-like pattern-growth method and uses these patterns to identify the fault frequency. Extensive experimental analyses show that our technique is very efficient in identifying fault frequencies over vibration data streams.
Title: Condition monitoring through mining fault frequency from machine vibration data (IJCNN 2015, pp. 1-8)
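The SAP-tree itself is not described in the abstract, so this sketch shows only the surrounding idea: mining frequency components that frequently co-occur within a sliding window over a stream of spectra. The window size, support threshold, and toy frequency-bin labels are illustrative assumptions.

```python
from collections import Counter, deque
from itertools import combinations

def frequent_pairs(stream, window=3, min_support=2):
    """Pairs of frequency bins co-occurring >= min_support times in the last `window` spectra."""
    win = deque(maxlen=window)           # retains only the most recent transactions
    for transaction in stream:
        win.append(frozenset(transaction))
    counts = Counter()
    for t in win:
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    return {p for p, c in counts.items() if c >= min_support}

# each transaction = dominant frequency bins of one vibration spectrum (toy data)
stream = [{'f1', 'f3'}, {'f1', 'f3', 'f7'}, {'f1', 'f3'}, {'f2'}]
fault_pairs = frequent_pairs(stream, window=3, min_support=2)
```

A tree-based method like FP-growth reaches the same frequent patterns without materializing every candidate pair, which is what makes the SAP-tree approach efficient on streams.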
Pub Date: 2015-07-12 | DOI: 10.1109/IJCNN.2015.7280825
Laurence Morissette, S. Chartier
In this paper we present a model of saliency as the driving force behind endogenous attention in auditory processing, using a competitive winner-take-all process. The model binds frequency, amplitude and spatial location together through temporal correlations in an oscillatory network to create consistent, unified perceptual objects. The model also implements the interaction with exogenous attention.
Title: Saliency model of auditory attention based on frequency, amplitude and spatial location (IJCNN 2015, pp. 1-5)
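The competitive winner-take-all step can be sketched as mutual inhibition between candidate auditory objects: each object's activity is suppressed in proportion to the total activity of its competitors until only the most salient survives. The inhibition strength and the saliency scores below are illustrative assumptions, not the paper's oscillatory-network parameters.

```python
def winner_take_all(saliency, inhibition=0.2, steps=50):
    """Iterate mutual inhibition; returns final activities (losers driven to 0)."""
    acts = list(saliency)
    for _ in range(steps):
        total = sum(acts)
        # each unit is inhibited by the summed activity of all other units
        acts = [max(0.0, a - inhibition * (total - a)) for a in acts]
    return acts

# three candidate objects with combined frequency/amplitude/location saliency
acts = winner_take_all([0.9, 0.7, 0.3])
# only the most salient object stays active
```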
Pub Date: 2015-07-12 | DOI: 10.1109/IJCNN.2015.7280589
Sangwook Kim, Minho Lee, Jixiang Shen
Deep learning methods allow a classifier to learn features automatically through multiple layers of training. In a deep learning process, low-level features are abstracted into high-level features. In this paper, we propose a new probabilistic deep learning method that combines a discriminative model, the Support Vector Machine (SVM), with a generative model, the Gaussian Mixture Model (GMM). By combining the SVM with the GMM, we can re-represent uncertain data in the current layer as new input features for training a deeper layer. Bayes' rule is used to re-represent the output of the previous layer's SVM with the GMM, so that it serves as the input to the next deep layer. As a result, deep features are reliably extracted, without additional feature extraction effort, using multiple layers of the SVM with GMM. Experimental results show that the proposed deep structure model allows for easier classification of uncertain data through multiple-layer training and gives more accurate results.
Title: A novel deep learning by combining discriminative model with generative model (IJCNN 2015, pp. 1-6)
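The Bayes-rule re-representation step can be sketched as follows: fit a class-conditional density to the previous layer's SVM decision values, then convert a new score into class posteriors that feed the next layer. For simplicity this stand-in uses a single 1-D Gaussian per class rather than a full multi-component GMM, and the means, variances and priors are illustrative assumptions.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def posteriors(score, class_stats, priors):
    """Bayes' rule over class-conditional Gaussians fitted to SVM scores.

    class_stats: {label: (mean, var)} estimated from each class's scores.
    """
    likes = {c: priors[c] * gaussian_pdf(score, m, v)
             for c, (m, v) in class_stats.items()}
    z = sum(likes.values())
    return {c: l / z for c, l in likes.items()}

# hypothetical per-class statistics of the previous layer's decision values
stats = {'pos': (1.0, 0.25), 'neg': (-1.0, 0.25)}
p = posteriors(0.8, stats, {'pos': 0.5, 'neg': 0.5})
# the posterior vector p becomes the next layer's input feature
```

Scores near the decision boundary yield soft posteriors close to 0.5, so the uncertainty of an instance is preserved in the feature handed to the deeper layer rather than being discarded by a hard label.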