Robust neural predictor for noisy chaotic time series prediction
Pub Date: 2013-08-01 | DOI: 10.1109/IJCNN.2013.6706996
Min Han, Xinying Wang
A robust neural predictor is designed for noisy chaotic time series prediction in this paper. The main idea is to account for the bounded uncertainty in the predictor input, which makes this a typical errors-in-variables problem. The robust design is based on the linear-in-parameters ESN (Echo State Network) model. By minimizing the worst-case residual induced by the bounded perturbations in the echo state variables, the robust predictor copes with the uncertainty in the noisy time series. In the experiments, the classical Mackey-Glass 84-step benchmark prediction task is investigated, and the prediction performance of the nominal and robust ESN predictor designs is compared.
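For Frobenius-norm-bounded perturbations of the regressor matrix, the worst-case residual admits a well-known closed form, min_w ||Hw - y|| + rho*||w|| (El Ghaoui and Lebret, 1997), which a robust readout can minimize directly. A minimal numpy/scipy sketch on synthetic data; the echo state matrix H, the bound rho, and the norm choice are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
H = rng.standard_normal((200, 30))                  # hypothetical echo state matrix
y = H @ rng.standard_normal(30) + 0.1 * rng.standard_normal(200)
rho = 0.5                                           # assumed perturbation bound

def worst_case_residual(w):
    # max over ||E||_F <= rho of ||(H + E) w - y|| collapses to this closed form
    return np.linalg.norm(H @ w - y) + rho * np.linalg.norm(w)

w_nominal = np.linalg.lstsq(H, y, rcond=None)[0]    # ordinary least-squares readout
w_robust = minimize(worst_case_residual, w_nominal).x
```

The robust solution shrinks the readout weights relative to the nominal one, which is what buys insensitivity to perturbed echo states.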
{"title":"Robust neural predictor for noisy chaotic time series prediction","authors":"Min Han, Xinying Wang","doi":"10.1109/IJCNN.2013.6706996","DOIUrl":"https://doi.org/10.1109/IJCNN.2013.6706996","url":null,"abstract":"A robust neural predictor is designed for noisy chaotic time series prediction in this paper. The main idea is based on the consideration of the bounded uncertainty in predictor input, and it is a typical Errors-in-Variables problem. The robust design is based on the linear-in-parameters ESN (Echo State Network) model. By minimizing the worst-case residual induced by the bounded perturbations in the echo state variables, the robust predictor is obtained in coping with the uncertainty in the noisy time series. In the experiment, the classical Mackey-Glass 84-step benchmark prediction task is investigated. The prediction performance is studied for the nominal and robust design of ESN predictors.","PeriodicalId":376975,"journal":{"name":"The 2013 International Joint Conference on Neural Networks (IJCNN)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122650832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cellular neural network based situational awareness system for power grids
Pub Date: 2013-08-01 | DOI: 10.1109/IJCNN.2013.6707108
K. Balasubramaniam, G. Venayagamoorthy, N. Watson
Situational awareness (SA), in simple terms, is understanding the current state of a system and, based on that understanding, predicting how the system states will evolve over time. Predictive modeling of power systems using conventional methods is time consuming and hence not well suited for real-time operation. In this study, a neural network (NN) based nonlinear predictor is used to predict the states of a power system at a future time instant. The required control signals are computed from the predicted state variables and the control set points. To reduce computation, the problem is decoupled and solved in a cellular array of NNs. The cellular neural network (CNN) framework allows for accurate prediction with only minimal information exchange between neighboring predictors. The predicted states are then used to compute stability metrics that give the proximity to the point of instability. The situational awareness platform developed with the CNN framework extracts information from data for the next time instance, i.e., a step ahead of time, and maps these data onto the geographical coordinates of power system components. The geographic information system (GIS) provides a visual indication of the operating status of individual components as well as of the entire system.
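To make the decoupling concrete, here is a toy sketch of a cellular array of one-step-ahead predictors in which each cell sees only its own state and its immediate neighbors' states. The ring topology, the linear per-cell predictors, and the synthetic state trajectories are all assumptions standing in for the paper's power-system NNs:

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_cells = 500, 10
states = np.cumsum(rng.standard_normal((T, n_cells)), axis=0)  # synthetic bus states

def neighbor_inputs(x, i):
    # each cell sees only itself and its immediate neighbors (ring topology assumed)
    return x[:, [(i - 1) % n_cells, i, (i + 1) % n_cells]]

predictors = []
for i in range(n_cells):
    X = neighbor_inputs(states[:-1], i)          # local inputs at time t
    y = states[1:, i]                            # this cell's state at t+1
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    predictors.append(w)

# one-step-ahead prediction for the whole grid, one small predictor per cell
x_t = states[-1:]
x_next = np.array([neighbor_inputs(x_t, i) @ predictors[i]
                   for i in range(n_cells)]).ravel()
```

The point of the structure is that each predictor's input dimension stays fixed as the grid grows, which is what keeps the array cheap to evaluate in real time.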
{"title":"Cellular neural network based situational awareness system for power grids","authors":"K. Balasubramaniam, G. Venayagamoorthy, N. Watson","doi":"10.1109/IJCNN.2013.6707108","DOIUrl":"https://doi.org/10.1109/IJCNN.2013.6707108","url":null,"abstract":"Situational awareness (SA) in simple terms is to understand the current state of the system and based on that understanding predict how system states are to evolve over time. Predictive modeling of power systems using conventional methods is time consuming and hence not well suited for real-time operation. In this study, neural network (NN) based non-linear predictor is used to predict states of power system for future time instance. Required control signals are computed based on predicted state variables and control set points. In order to reduce computation the problem is decoupled and solved in a cellular array of NNs. The cellular neural network (CNN) framework allows for accurate prediction with only minimal information exchange between neighboring predictors. The predicted states are then used in computing stability metrics that give proximity to point of instability. The situational awareness platform developed using CNN framework extracts information from data for the next time instance i.e. a step ahead of time and maps this data with geographical coordinates of power system components. The geographic information system (GIS) provides a visual indication of operating status of individual components as well as that of the entire system.","PeriodicalId":376975,"journal":{"name":"The 2013 International Joint Conference on Neural Networks (IJCNN)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122685482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chunk incremental IDR/QR LDA learning
Pub Date: 2013-08-01 | DOI: 10.1109/IJCNN.2013.6707018
Yiming Peng, Shaoning Pang, Gang Chen, A. Sarrafzadeh, Tao Ban, D. Inoue
Training data in the real world are often presented in random chunks, yet the existing sequential incremental IDR/QR LDA (s-QR/IncLDA) can only process data one sample at a time. This paper proposes a constructive chunk incremental IDR/QR LDA (c-QR/IncLDA) for incremental learning from multiple data samples. Given a chunk of s samples for incremental learning, the proposed c-QR/IncLDA updates the current discriminant model Ω by computing on the compressed residue matrix Δ ∈ R^(d×η) instead of the entire incoming data chunk X ∈ R^(d×s), where η ≤ s holds. Meanwhile, we derive a more accurate reduced within-class scatter matrix W to minimize the loss of discriminative information at every incremental learning cycle. The computational complexity of c-QR/IncLDA can be higher than that of s-QR/IncLDA for single-sample processing; for multiple-sample processing, however, the computational efficiency of c-QR/IncLDA deterministically surpasses s-QR/IncLDA when the chunk size is large, i.e., when s ≫ η holds. Moreover, the experimental evaluation shows that the proposed c-QR/IncLDA achieves an accuracy level that is competitive with batch QR/LDA and consistently higher than that of s-QR/IncLDA.
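The cost saving comes from factorizing only the part of the incoming chunk that is new relative to the current subspace. A minimal numpy sketch of that residue compression, assuming an orthonormal basis Q for the current model; an SVD is used here for reliable rank detection, whereas the paper updates a QR factorization:

```python
import numpy as np

def chunk_basis_update(Q, X, tol=1e-10):
    """Extend an orthonormal basis Q (d x k) with the novel directions of chunk X (d x s)."""
    # Residue of the chunk w.r.t. the current subspace span(Q)
    Delta = X - Q @ (Q.T @ X)
    # Factorize only the residue; its rank eta <= s drives the cost,
    # not the full chunk size s.
    U, sv, _ = np.linalg.svd(Delta, full_matrices=False)
    return np.hstack([Q, U[:, sv > tol]])          # k + eta columns
```

When the chunk lies mostly inside the current subspace (eta much smaller than s), almost nothing new has to be factorized, which is exactly the regime where the chunk method wins.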
{"title":"Chunk incremental IDR/QR LDA learning","authors":"Yiming Peng, Shaoning Pang, Gang Chen, A. Sarrafzadeh, Tao Ban, D. Inoue","doi":"10.1109/IJCNN.2013.6707018","DOIUrl":"https://doi.org/10.1109/IJCNN.2013.6707018","url":null,"abstract":"Training data in real world is often presented in random chunks. Yet existing sequential Incremental IDR/QR LDA (s-QR/IncLDA) can only process data one sample after another. This paper proposes a constructive chunk Incremental IDR/QR LDA (c-QR/IncLDA) for multiple data samples incremental learning. Given a chunk of s samples for incremental learning, the proposed c-QR/IncLDA increments current discriminant model Ω, by implementing computation on the compressed the residue matrix Δ ϵ Rd×n, instead of the entire incoming data chunk X ϵ Rd×s, where η ≤ s holds. Meanwhile, we derive a more accurate reduced within-class scatter matrix W to minimize the discriminative information loss at every incremental learning cycle. It is noted that the computational complexity of c-QR/IncLDA can be more expensive than s-QR/IncLDA for single sample processing. However, for multiple samples processing, the computational efficiency of c-QR/IncLDA deterministically surpasses s-QR/IncLDA when the chunk size is large, i.e., s ≫ η holds. Moreover, experiments evaluation shows that the proposed c-QR/IncLDA can achieve an accuracy level that is competitive to batch QR/LDA and is consistently higher than s-QR/IncLDA.","PeriodicalId":376975,"journal":{"name":"The 2013 International Joint Conference on Neural Networks (IJCNN)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122920618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A comparative analysis of dissimilarity measures for clustering categorical data
Pub Date: 2013-08-01 | DOI: 10.1109/IJCNN.2013.6707039
J. C. Xavier, A. Canuto, N. D. Almeida, L. Gonçalves
Similarity and dissimilarity (distance) between objects is an important aspect that must be considered when clustering data. When clustering categorical data, in particular, these distance (similarity or dissimilarity) measures need to properly address the real particularities of categorical data. In this paper, we perform a comparative analysis of four different dissimilarity measures used as distance metrics for clustering categorical data. The first is the Simple Matching Dissimilarity Measure (SMDM), one of the simplest and most widely used metrics for categorical attributes. The next two are context-based approaches (DIstance Learning in Categorical Attributes, DILCA, and Domain Value Dissimilarity, DVD), and the last is an extension of SMDM proposed in this paper. All four dissimilarities are applied as distance metrics in two well-known clustering algorithms: k-means and agglomerative hierarchical clustering. In this analysis, we also use internal and external cluster validity measures, aiming to compare the effectiveness of all four distance measures in both clustering algorithms.
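For reference, the simple matching dissimilarity between two categorical objects is just the count of attributes on which they disagree, often normalized by the number of attributes. A minimal sketch of the normalized variant:

```python
def simple_matching_dissimilarity(x, y):
    """Fraction of categorical attributes on which two objects disagree."""
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y)) / len(x)

# Example: the objects differ on one of three attributes -> 1/3
simple_matching_dissimilarity(["red", "small", "round"],
                              ["red", "large", "round"])
```

Its weakness, which motivates context-based measures such as DILCA and DVD, is that it treats every mismatch as equally severe regardless of how the values are distributed in the data.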
{"title":"A comparative analysis of dissimilarity measures for clustering categorical data","authors":"J. C. Xavier, A. Canuto, N. D. Almeida, L. Gonçalves","doi":"10.1109/IJCNN.2013.6707039","DOIUrl":"https://doi.org/10.1109/IJCNN.2013.6707039","url":null,"abstract":"Similarity and dissimilarity (distance) between objects is an important aspect that must be considered when clustering data. When clustering categorical data, for instance, these distance (similarity or dissimilarity) measures need to address properly the real particularities of categorical data. In this paper, we perform a comparative analysis with four different dissimilarity measures used as a distance metric for clustering categorical data. The first one is the Simple Matching Dissimilarity Measure (SMDM), which is one of the simplest and the most used metric for categorical attribute. The other two are context-based approaches (DIstance Learning in Categorical Attributes - DILCA and Domain Value Dissimilarity-DVD), and the last one is an extension of the SMDM, which is proposed in this paper. All four dissimilarities are applied as distance metrics in two well known clustering algorithms, k-means and agglomerative hierarchical clustering algorithms. In this analysis, we also use internal and external cluster validity measures, aiming to compare the effectiveness of all four distance measures in both clustering algorithms.","PeriodicalId":376975,"journal":{"name":"The 2013 International Joint Conference on Neural Networks (IJCNN)","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122438294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A first analysis of the effect of local and global optimization weights methods in the cooperative-competitive design of RBFN for imbalanced environments
Pub Date: 2013-08-01 | DOI: 10.1109/IJCNN.2013.6706973
M. D. Pérez-Godoy, A. J. Rivera, M. J. Jesús, F. Martínez
Many real applications involve data sets in which the distribution of the classes is significantly different; such data sets are commonly known as imbalanced data sets. Approaches that address this problem can be categorized into two types: data-based approaches, which resample the problem data in a preprocessing phase, and algorithm-based approaches, which modify or create new methods to address the imbalance. In this paper, CO2RBFN, a cooperative-competitive design method for Radial Basis Function Networks that has previously demonstrated good behaviour on imbalanced data sets, is tested with two different weight-training algorithms, one local and one global, in order to gain knowledge about this problem. We conclude that the more global weight-optimization algorithm obtains worse results.
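One plausible reading of the local/global distinction, sketched below on toy data: the global algorithm optimizes all RBFN output weights jointly by least squares, while the local one fits each weight from its own neuron's activations alone. The data, centers, and width are illustrative, not the paper's CO2RBFN setup:

```python
import numpy as np

def rbf_design(X, centers, width):
    # Gaussian activations of every basis function for every sample
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, (300, 2))
y = np.sign(X[:, 0] * X[:, 1])                       # toy two-class targets
centers = X[rng.choice(len(X), 10, replace=False)]
Phi = rbf_design(X, centers, width=0.5)

# global: all output weights optimized jointly over the whole design matrix
w_global, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# local: each weight fitted from its own neuron's activations, ignoring the rest
w_local = (Phi * y[:, None]).sum(0) / (Phi ** 2).sum(0)
```

The local rule keeps each neuron's weight interpretable in terms of the samples it covers, which fits the cooperative-competitive design philosophy better than a fully coupled global fit.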
{"title":"A first analysis of the effect of local and global optimization weights methods in the cooperative-competitive design of RBFN for imbalanced environments","authors":"M. D. Pérez-Godoy, A. J. Rivera, M. J. Jesús, F. Martínez","doi":"10.1109/IJCNN.2013.6706973","DOIUrl":"https://doi.org/10.1109/IJCNN.2013.6706973","url":null,"abstract":"Many real applications are composed of data sets where the distribution of the classes is significantly different. These data sets are commonly known as imbalanced data sets. Proposed approaches that address this problem can be categorized into two types: data-based, which resample problem data in a preprocessing phase and algorithm-based which modify or create new methods to address the imbalance problem. In this paper, CO2 RBFN a cooperative-competitive design method for Radial Basis Function Networks that has previously demonstrated a good behaviour tackling imbalanced data sets, is tested using two different training weights algorithms, local and global, in order to gain knowledge about this problem. As conclusions we can outline that a more global optimizer training algorithm obtains worse results.","PeriodicalId":376975,"journal":{"name":"The 2013 International Joint Conference on Neural Networks (IJCNN)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122487957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A robot on-line area coverage approach based on the probabilistic Lloyd method
Pub Date: 2013-08-01 | DOI: 10.1109/IJCNN.2013.6707007
M. R. Batista, R. Calvo, R. Romero
Area coverage is a standard problem to which robotics techniques can be applied. One approach solves it with techniques based on Centroidal Voronoi Tessellations (CVT), where each robot is a generator used to build the Voronoi polygons. In this work, a new approach named the Sample Lloyd Area Coverage System (SLACS) is proposed which, building on the probabilistic Lloyd method, estimates a Voronoi polygon's centroid without explicitly constructing the diagram. In addition, a method is proposed for closing Voronoi diagrams so that a classic Lloyd CVT procedure can be applied. Both approaches are compared in empty and room-like environments in simulated tests using the Player interface and the Stage simulator. The results show that the proposed approach is well suited to solving the area coverage problem via mobile sensor deployment and is a simple and effective substitute for a Lloyd CVT method.
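The probabilistic Lloyd idea fits in a few lines: sample random points in the environment, assign each to its nearest robot, and move every robot toward the sample mean of the points it owns, so the centroid is estimated without ever constructing the Voronoi diagram. A minimal sketch, assuming a unit-square environment and Euclidean assignment:

```python
import numpy as np

def sampled_lloyd_step(generators, n_samples=10_000, rng=np.random.default_rng(3)):
    """One Lloyd iteration with Monte Carlo centroid estimation."""
    pts = rng.uniform(0.0, 1.0, (n_samples, 2))      # unit-square environment assumed
    # nearest-generator assignment plays the role of the Voronoi partition
    owner = np.argmin(((pts[:, None, :] - generators[None, :, :]) ** 2).sum(-1), axis=1)
    new = generators.copy()
    for i in range(len(generators)):
        cell = pts[owner == i]
        if len(cell):
            new[i] = cell.mean(axis=0)               # centroid estimate of cell i
    return new

robots = np.random.default_rng(4).uniform(0, 1, (5, 2))
for _ in range(20):                                  # robots spread out over the square
    robots = sampled_lloyd_step(robots)
```

Obstacles or room walls would be handled by rejecting samples that fall outside the free space, which is what makes the sampling formulation attractive for on-line coverage.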
{"title":"A robot on-line area coverage approach based on the probabilistic Lloyd method","authors":"M. R. Batista, R. Calvo, R. Romero","doi":"10.1109/IJCNN.2013.6707007","DOIUrl":"https://doi.org/10.1109/IJCNN.2013.6707007","url":null,"abstract":"Area Coverage is a standard problem in which Robotics techniques can be applied. An approach to solve this problem is through techniques based on Centroidal Voronoi Tesselations (CVT), considering that each robot is a generator used to build Voronoi polygons. In this work, a new approach named by Sample Lloyd Area Coverage System (SLACS), is proposed that does not need of the explicit building of the diagram based in the Probabilistic Lloyd method to estimate a Voronoi polygon's centroid. In addition, it is proposed a method to close Voronoi diagrams to apply in a classic Lloyd CVT procedure. Both approaches are compared in empty and roomlike environments done in simulated tests using both Player interface and Stage simulator. Results obtained show that the proposed approach is well suited to solve the area coverage problem via mobile sensor deployment and it is a simple and effective substitute to a Lloyd CVT method.","PeriodicalId":376975,"journal":{"name":"The 2013 International Joint Conference on Neural Networks (IJCNN)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114186720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Some results about the Vapnik-Chervonenkis entropy and the Rademacher complexity
Pub Date: 2013-08-01 | DOI: 10.1109/IJCNN.2013.6706943
D. Anguita, A. Ghio, L. Oneto, S. Ridella
This paper deals with the problem of identifying a connection between the Vapnik-Chervonenkis (VC) Entropy, a notion of complexity introduced by Vapnik in his seminal work, and the Rademacher Complexity, a more powerful notion of complexity that has been in the limelight of several recent works in the Machine Learning literature. In order to establish this connection, we refine some previously known relationships and derive a new result. Our proposal allows computing an admissible range for the Rademacher Complexity given a value of the VC-Entropy, and vice versa, thereby opening appealing new research perspectives in the field of assessing the complexity of a hypothesis space.
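As a reminder of the quantity involved, the empirical Rademacher complexity of a hypothesis class H on a sample of size n is R_n(H) = E_sigma sup_{h in H} (1/n) sum_i sigma_i h(x_i), with i.i.d. uniform sigma_i in {-1, +1}. A Monte Carlo sketch for a finite class given as a matrix of its predictions; the finite-class setting is an illustrative simplification:

```python
import numpy as np

def empirical_rademacher(predictions, n_draws=2000, rng=np.random.default_rng(5)):
    """Monte Carlo estimate of the empirical Rademacher complexity.

    `predictions` is an (|H|, n) array holding h(x_i) in {-1, +1}
    for every hypothesis h in a finite class H.
    """
    H, n = predictions.shape
    sigma = rng.choice([-1.0, 1.0], size=(n_draws, n))   # random Rademacher signs
    # sup over h of the average correlation with the signs, averaged over draws
    return (sigma @ predictions.T / n).max(axis=1).mean()
```

The richer the class, the better some hypothesis can correlate with pure noise, which is why this quantity upper-bounds the gap between training and test error.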
{"title":"Some results about the Vapnik-Chervonenkis entropy and the rademacher complexity","authors":"D. Anguita, A. Ghio, L. Oneto, S. Ridella","doi":"10.1109/IJCNN.2013.6706943","DOIUrl":"https://doi.org/10.1109/IJCNN.2013.6706943","url":null,"abstract":"This paper deals with the problem of identifying a connection between the Vapnik-Chervonenkis (VC) Entropy, a notion of complexity introduced by Vapnik in his seminal work, and the Rademacher Complexity, a more powerful notion of complexity, which has been in the limelight of several works in the recent Machine Learning literature. In order to establish this connection, we refine some previously known relationships and derive a new result. Our proposal allows computing an admissible range for the Rademacher Complexity, given a value of the VC-Entropy, and vice versa, therefore opening new appealing research perspectives in the field of assessing the complexity of an hypothesis space.","PeriodicalId":376975,"journal":{"name":"The 2013 International Joint Conference on Neural Networks (IJCNN)","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114264424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Learning population of spiking neural networks with perturbation of conductances
Pub Date: 2013-08-01 | DOI: 10.1109/IJCNN.2013.6706756
Piotr Suszynski, Pawel Wawrzynski
In this paper a method is presented for learning in spiking neural networks. It is based on perturbation of synaptic conductances. While this approach is known to be model-free, it is also known to be slow, because it relies on improvement-direction estimates with large variance. Two ideas are analysed to alleviate this problem: first, learning many networks at the same time instead of one; second, autocorrelation of the perturbations in time. In the experimental study the method is validated on three learning tasks in which information is conveyed by firing frequency and by spike timing.
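A generic sketch of the population idea: perturb the parameter vector of many copies of the network at once and move against the loss-weighted perturbations. This is plain weight perturbation on an abstract parameter vector, not the paper's conductance-level spiking formulation, and sigma, lr, and n_networks are illustrative:

```python
import numpy as np

def perturbation_step(theta, loss, sigma=0.01, lr=0.1, n_networks=32,
                      rng=np.random.default_rng(6)):
    """One update of parallel perturbation learning.

    theta : flat 1-D parameter vector (e.g. synaptic conductances)
    loss  : any scalar function of such a vector
    """
    eps = sigma * rng.standard_normal((n_networks, theta.size))
    losses = np.array([loss(theta + e) for e in eps])    # evaluate the population
    baseline = losses.mean()                             # variance reduction
    # average the perturbations weighted by how much they changed the loss
    grad_est = ((losses - baseline)[:, None] * eps).mean(0) / sigma ** 2
    return theta - lr * grad_est
```

Averaging over a population of n_networks perturbed copies divides the variance of the direction estimate by roughly n_networks, which is the speed-up the first idea targets; the second idea (temporally autocorrelated perturbations) would replace the i.i.d. draws of eps with a correlated noise process.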
{"title":"Learning population of spiking neural networks with perturbation of conductances","authors":"Piotr Suszynski, Pawel Wawrzynski","doi":"10.1109/IJCNN.2013.6706756","DOIUrl":"https://doi.org/10.1109/IJCNN.2013.6706756","url":null,"abstract":"In this paper a method is presented for learning of spiking neural networks. It is based on perturbation of synaptic conductances. While this approach is known to be model-free, it is also known to be slow, because it applies improvement direction estimates with large variance. Two ideas are analysed to alleviate this problem: First, learning of many networks at the same time instead of one. Second, autocorrelation of perturbations in time. In the experimental study the method is validated on three learning tasks in which information is conveyed with frequency and spike timing.","PeriodicalId":376975,"journal":{"name":"The 2013 International Joint Conference on Neural Networks (IJCNN)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114601319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cognitive computing programming paradigm: A Corelet Language for composing networks of neurosynaptic cores
Pub Date: 2013-08-01 | DOI: 10.1109/IJCNN.2013.6707078
A. Amir, Pallab Datta, W. Risk, A. Cassidy, J. Kusnitz, Steven K. Esser, Alexander Andreopoulos, T. Wong, M. Flickner, Rodrigo Alvarez-Icaza, E. McQuinn, Ben Shaw, Norm Pass, D. Modha
Marching along the DARPA SyNAPSE roadmap, IBM unveils a trilogy of innovations towards the TrueNorth cognitive computing system inspired by the brain's function and efficiency. The sequential programming paradigm of the von Neumann architecture is wholly unsuited for TrueNorth. Therefore, as our main contribution, we develop a new programming paradigm that permits construction of complex cognitive algorithms and applications while being efficient for TrueNorth and effective for programmer productivity. The programming paradigm consists of (a) an abstraction for a TrueNorth program, named Corelet, for representing a network of neurosynaptic cores that encapsulates all details except external inputs and outputs; (b) an object-oriented Corelet Language for creating, composing, and decomposing corelets; (c) a Corelet Library that acts as an ever-growing repository of reusable corelets from which programmers compose new corelets; and (d) an end-to-end Corelet Laboratory that is a programming environment which integrates with the TrueNorth architectural simulator, Compass, to support all aspects of the programming cycle from design, through development, debugging, and up to deployment. The new paradigm seamlessly scales from a handful of synapses and neurons to networks of neurosynaptic cores of progressively increasing size and complexity. The utility of the new programming paradigm is underscored by the fact that we have designed and implemented more than 100 algorithms as corelets for TrueNorth in a very short time span.
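To illustrate the encapsulation-and-composition idea only, here is a toy Python sketch of corelet-like composition. This is emphatically not IBM's Corelet Language or API; every class, method, and name below is hypothetical:

```python
class Corelet:
    """Toy stand-in for a corelet: a named block of neurosynaptic cores that
    hides its internals and exposes only input and output connectors."""

    def __init__(self, name, inputs, outputs, cores=1):
        self.name, self.inputs, self.outputs, self.cores = name, inputs, outputs, cores

    def compose(self, other, wiring):
        """Wire this corelet's outputs to another's inputs.

        Both corelets stay encapsulated; only the pin mapping is exposed,
        and the result is itself a corelet that can be composed further.
        """
        assert all(o in self.outputs and i in other.inputs
                   for o, i in wiring.items())
        return Corelet(f"{self.name}->{other.name}",
                       self.inputs, other.outputs,
                       cores=self.cores + other.cores)

# hypothetical library corelets composed into a larger one
edge = Corelet("edge_detect", inputs=["pixels"], outputs=["edges"])
motion = Corelet("motion", inputs=["edges"], outputs=["flow"])
pipeline = edge.compose(motion, {"edges": "edges"})
```

The key property the sketch tries to capture is that composition is closed: composing corelets yields another corelet, which is what lets the paradigm scale from single cores to large networks.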
{"title":"Cognitive computing programming paradigm: A Corelet Language for composing networks of neurosynaptic cores","authors":"A. Amir, Pallab Datta, W. Risk, A. Cassidy, J. Kusnitz, Steven K. Esser, Alexander Andreopoulos, T. Wong, M. Flickner, Rodrigo Alvarez-Icaza, E. McQuinn, Ben Shaw, Norm Pass, D. Modha","doi":"10.1109/IJCNN.2013.6707078","DOIUrl":"https://doi.org/10.1109/IJCNN.2013.6707078","url":null,"abstract":"Marching along the DARPA SyNAPSE roadmap, IBM unveils a trilogy of innovations towards the TrueNorth cognitive computing system inspired by the brain's function and efficiency. The sequential programming paradigm of the von Neumann architecture is wholly unsuited for TrueNorth. Therefore, as our main contribution, we develop a new programming paradigm that permits construction of complex cognitive algorithms and applications while being efficient for TrueNorth and effective for programmer productivity. The programming paradigm consists of (a) an abstraction for a TrueNorth program, named Corelet, for representing a network of neurosynaptic cores that encapsulates all details except external inputs and outputs; (b) an object-oriented Corelet Language for creating, composing, and decomposing corelets; (c) a Corelet Library that acts as an ever-growing repository of reusable corelets from which programmers compose new corelets; and (d) an end-to-end Corelet Laboratory that is a programming environment which integrates with the TrueNorth architectural simulator, Compass, to support all aspects of the programming cycle from design, through development, debugging, and up to deployment. The new paradigm seamlessly scales from a handful of synapses and neurons to networks of neurosynaptic cores of progressively increasing size and complexity. The utility of the new programming paradigm is underscored by the fact that we have designed and implemented more than 100 algorithms as corelets for TrueNorth in a very short time span.","PeriodicalId":376975,"journal":{"name":"The 2013 International Joint Conference on Neural Networks (IJCNN)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122079403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gaussian-Bernoulli deep Boltzmann machine
Pub Date: 2013-08-01 | DOI: 10.1109/IJCNN.2013.6706831
Kyunghyun Cho, T. Raiko, A. Ilin
In this paper, we study a model that we call the Gaussian-Bernoulli deep Boltzmann machine (GDBM) and discuss potential improvements in training it. The GDBM is designed to be applicable to continuous data and is constructed from the Gaussian-Bernoulli restricted Boltzmann machine (GRBM) by adding multiple layers of binary hidden neurons. The studied improvements to the GDBM learning algorithm include parallel tempering, the enhanced gradient, an adaptive learning rate, and layer-wise pretraining. We show empirically that they help avoid some of the common difficulties in training deep Boltzmann machines, such as divergence of learning, the difficulty of choosing the right learning-rate schedule, and the existence of meaningless higher layers.
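For context, the commonly used energy of a two-hidden-layer Gaussian-Bernoulli DBM, with Gaussian visible units v, binary hidden layers h^(1) and h^(2), and per-visible variances sigma_i^2; the paper's exact parameterization may differ in details:

```latex
E(\mathbf{v},\mathbf{h}^{(1)},\mathbf{h}^{(2)}) =
    \sum_i \frac{(v_i - b_i)^2}{2\sigma_i^2}
  - \sum_{i,j} \frac{v_i}{\sigma_i^2}\, W^{(1)}_{ij} h^{(1)}_j
  - \sum_{j,k} h^{(1)}_j W^{(2)}_{jk} h^{(2)}_k
  - \sum_j c^{(1)}_j h^{(1)}_j
  - \sum_k c^{(2)}_k h^{(2)}_k
```

The first term is what distinguishes the model from a purely binary DBM: it lets the visible layer represent continuous data, while the remaining bilinear terms are the standard layer-to-layer couplings that the deeper binary layers add.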
{"title":"Gaussian-Bernoulli deep Boltzmann machine","authors":"Kyunghyun Cho, T. Raiko, A. Ilin","doi":"10.1109/IJCNN.2013.6706831","DOIUrl":"https://doi.org/10.1109/IJCNN.2013.6706831","url":null,"abstract":"In this paper, we study a model that we call Gaussian-Bernoulli deep Boltzmann machine (GDBM) and discuss potential improvements in training the model. GDBM is designed to be applicable to continuous data and it is constructed from Gaussian-Bernoulli restricted Boltzmann machine (GRBM) by adding multiple layers of binary hidden neurons. The studied improvements of the learning algorithm for GDBM include parallel tempering, enhanced gradient, adaptive learning rate and layer-wise pretraining. We empirically show that they help avoid some of the common difficulties found in training deep Boltzmann machines such as divergence of learning, the difficulty in choosing right learning rate scheduling, and the existence of meaningless higher layers.","PeriodicalId":376975,"journal":{"name":"The 2013 International Joint Conference on Neural Networks (IJCNN)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122120981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}