We study a game-theoretic model of how individuals learn by observing others' actions, and how (causal) knowledge grows in communities as a result. We devise a cooperative solution to this game, which motivates a new recommendation system in which causality (not correlation) is the central concept. We use the system in low-income communities, where individuals make judgments about the effectiveness of educational activities ("if I take course x, I will get a job"). We show that, uncoordinated, individuals easily "herd" on visible but ineffectual actions - and, in turn, that, coordinated, individuals become massively more responsive, with the intelligence to quickly discern errors, mark them, share them, and move away from them toward "what really works."
{"title":"A Model of Joint Learning in Poverty: Coordination and Recommendation Systems in Low-Income Communities","authors":"Andre Ribeiro","doi":"10.1109/ICMLA.2011.15","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.15","url":null,"abstract":"We study a game-theoretic model of how individuals learn by observing others' actions, and how (causal) knowledge grows in communities as a result. We devise a cooperative solution to this game, which motivates a new recommendation system in which causality (not correlation) is the central concept. We use the system in low-income communities, where individuals make judgments about the effectiveness of educational activities (\"if I take course x, I will get a job\"). We show that, uncoordinated, individuals easily \"herd\" on visible but ineffectual actions - and, in turn, that, coordinated, individuals become massively more responsive, with the intelligence to quickly discern errors, mark them, share them, and move away from them toward \"what really works.\"","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120980063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enis Bayramoglu, N. Andersen, Ole Ravn, N. K. Poulsen
The paper focuses on nonlinear state estimation assuming non-Gaussian distributions of the states and the disturbances. The a priori and the a posteriori distributions are described by a chosen family of parametric distributions. The state transformation then results in a transformation of the parameters of the distribution. This transformation is approximated by a neural network using offline training, which is based on Monte Carlo sampling. The paper also presents a method to construct flexible distributions well suited for covering the effect of the nonlinearities. The method can also be used to improve other parametric methods around regions with strong nonlinearities by including them inside the network.
{"title":"Pre-trained Neural Networks Used for Non-linear State Estimation","authors":"Enis Bayramoglu, N. Andersen, Ole Ravn, N. K. Poulsen","doi":"10.1109/ICMLA.2011.118","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.118","url":null,"abstract":"The paper focuses on nonlinear state estimation assuming non-Gaussian distributions of the states and the disturbances. The a priori and the a posteriori distributions are described by a chosen family of parametric distributions. The state transformation then results in a transformation of the parameters of the distribution. This transformation is approximated by a neural network using offline training, which is based on Monte Carlo sampling. The paper also presents a method to construct flexible distributions well suited for covering the effect of the nonlinearities. The method can also be used to improve other parametric methods around regions with strong nonlinearities by including them inside the network.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121292695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Metaheuristics form a framework of practical methods for global optimization problems. We recently proposed a new metaheuristic inspired by spiral phenomena in nature, called spiral optimization. However, spiral optimization was restricted to 2-dimensional continuous optimization problems. In this paper, we develop a spiral optimization method for n-dimensional continuous optimization problems by constructing an n-dimensional spiral model. The n-dimensional spiral model is designed using rotation matrices in n-dimensional space. Simulation results for different benchmark problems show the effectiveness of our proposal compared to PSO and DE.
{"title":"Spiral Multipoint Search for Global Optimization","authors":"K. Tamura, K. Yasuda","doi":"10.1109/ICMLA.2011.131","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.131","url":null,"abstract":"Metaheuristics form a framework of practical methods for global optimization problems. We recently proposed a new metaheuristic inspired by spiral phenomena in nature, called spiral optimization. However, spiral optimization was restricted to 2-dimensional continuous optimization problems. In this paper, we develop a spiral optimization method for n-dimensional continuous optimization problems by constructing an n-dimensional spiral model. The n-dimensional spiral model is designed using rotation matrices in n-dimensional space. Simulation results for different benchmark problems show the effectiveness of our proposal compared to PSO and DE.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122663208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
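The n-dimensional spiral model above can be sketched with composed plane (Givens) rotations. The following toy implementation is not the authors' code: the search bounds, population size, step count, and contraction rate r are illustrative choices. It rotates and contracts a population of points around the current best point:

```python
import numpy as np

def rotation_matrix(n, i, j, theta):
    """Givens rotation by theta in the (i, j) coordinate plane of R^n."""
    R = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    R[i, i] = c; R[j, j] = c
    R[i, j] = -s; R[j, i] = s
    return R

def spiral_matrix(n, r, theta):
    """Compose all pairwise plane rotations, then contract by r < 1."""
    S = np.eye(n)
    for i in range(n - 1):
        for j in range(i + 1, n):
            S = S @ rotation_matrix(n, i, j, theta)
    return r * S

def spiral_optimize(f, n, m=30, r=0.95, theta=np.pi / 4, steps=200, seed=0):
    """Multipoint spiral search: m points spiral toward the current best."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5, 5, size=(m, n))
    S = spiral_matrix(n, r, theta)
    best = X[np.argmin([f(x) for x in X])]
    for _ in range(steps):
        X = (X - best) @ S.T + best        # rotate/contract around the best point
        cand = X[np.argmin([f(x) for x in X])]
        if f(cand) < f(best):
            best = cand
    return best

# Example: minimise the sphere function in 5 dimensions.
sol = spiral_optimize(lambda x: np.sum(x**2), n=5)
```

The update `(X - best) @ S.T + best` is the row-vector form of x(k+1) = S(x(k) - x*) + x*, so every point orbits the incumbent best while its radius shrinks geometrically.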
In this research, we consider the supervised learning problem of seismic phase classification. In seismology, knowledge of the seismic activity arrival time and phase leads to epicenter localization and surface velocity estimates useful in developing seismic early warning systems and detecting man-made seismic events. Formally, the activity arrival time refers to the moment at which a seismic wave is first detected and the seismic phase classifies the physics of the wave. We propose a new perspective for the classification of seismic phases in three-channel seismic data collected within a network of regional recording stations. Our method extends current techniques and incorporates concepts from machine learning. Machine learning techniques attempt to leverage the concept of "learning'' the patterns associated with different types of data characteristics. In this case, the data characteristics are the seismic phases. This concept makes sense because the characteristics of the phase types are dictated by the physics of wave propagation. Thus by "learning'' a signature for each type of phase, we can apply classification algorithms to identify the phase of incoming data from a database of known phases observed over the recording network. Our method first uses a multi-scale feature extraction technique for clustering seismic data on low-dimensional manifolds. We then apply kernel ridge regression on each feature manifold for phase classification. In addition, we have designed an information theoretic measure used to merge regression scores across the multi-scale feature manifolds. Our approach complements current methods in seismic phase classification and brings to light machine learning techniques not yet fully examined in the context of seismology. We have applied our technique to a seismic data set from the Idaho, Montana, Wyoming, and Utah regions collected during 2005 and 2006. This data set contained compression wave and surface wave seismic phases. 
Through cross-validation, our method achieves a 74.6% average correct classification rate when compared to analyst classifications.
{"title":"Machine Learning for Seismic Signal Processing: Phase Classification on a Manifold","authors":"J. Ramirez, François G. Meyer","doi":"10.1109/ICMLA.2011.91","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.91","url":null,"abstract":"In this research, we consider the supervised learning problem of seismic phase classification. In seismology, knowledge of the seismic activity arrival time and phase leads to epicenter localization and surface velocity estimates useful in developing seismic early warning systems and detecting man-made seismic events. Formally, the activity arrival time refers to the moment at which a seismic wave is first detected and the seismic phase classifies the physics of the wave. We propose a new perspective for the classification of seismic phases in three-channel seismic data collected within a network of regional recording stations. Our method extends current techniques and incorporates concepts from machine learning. Machine learning techniques attempt to leverage the concept of \"learning'' the patterns associated with different types of data characteristics. In this case, the data characteristics are the seismic phases. This concept makes sense because the characteristics of the phase types are dictated by the physics of wave propagation. Thus by \"learning'' a signature for each type of phase, we can apply classification algorithms to identify the phase of incoming data from a database of known phases observed over the recording network. Our method first uses a multi-scale feature extraction technique for clustering seismic data on low-dimensional manifolds. We then apply kernel ridge regression on each feature manifold for phase classification. In addition, we have designed an information theoretic measure used to merge regression scores across the multi-scale feature manifolds. 
Our approach complements current methods in seismic phase classification and brings to light machine learning techniques not yet fully examined in the context of seismology. We have applied our technique to a seismic data set from the Idaho, Montana, Wyoming, and Utah regions collected during 2005 and 2006. This data set contained compression wave and surface wave seismic phases. Through cross-validation, our method achieves a 74.6% average correct classification rate when compared to analyst classifications.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122655332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
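The kernel ridge regression step used above for phase classification can be sketched generically: fit with one-hot class targets in closed form and classify by the largest regression score. The RBF kernel, gamma, lambda, and the synthetic two-blob data below are placeholder assumptions, not the paper's settings:

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian RBF kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def krr_fit(X, y_onehot, lam=1e-2, gamma=0.5):
    """Closed-form kernel ridge regression: alpha = (K + lam I)^-1 Y."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y_onehot)

def krr_predict(X_train, alpha, X_test, gamma=0.5):
    scores = rbf_kernel(X_test, X_train, gamma) @ alpha
    return scores.argmax(axis=1)   # class with the largest regression score

# Two toy "phases" as well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 0.3, (40, 2)),
               rng.normal([2, 2], 0.3, (40, 2))])
Y = np.zeros((80, 2)); Y[:40, 0] = 1; Y[40:, 1] = 1
alpha = krr_fit(X, Y)
pred = krr_predict(X, alpha, X)
```

The paper additionally merges such regression scores across multiple feature manifolds; this sketch shows only the single-manifold classification step.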
This paper describes an optimization problem with one target function to be optimized and several supporting functions that can be used to speed up the optimization process. A method based on reinforcement learning is proposed for choosing a good supporting function during optimization with a genetic algorithm. Results of applying this method to a model problem are presented.
{"title":"Choosing Best Fitness Function with Reinforcement Learning","authors":"Arina Buzdalova, M. Buzdalov","doi":"10.1109/ICMLA.2011.163","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.163","url":null,"abstract":"This paper describes an optimization problem with one target function to be optimized and several supporting functions that can be used to speed up the optimization process. A method based on reinforcement learning is proposed for choosing a good supporting function during optimization with a genetic algorithm. Results of applying this method to a model problem are presented.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115685488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
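The idea can be illustrated with a minimal sketch: a stateless Q-learning agent chooses which supporting function ranks offspring in a (1+1) evolutionary algorithm, rewarded by the change in the target fitness. The OneMax task and both supporting functions below are invented for this sketch and are not the paper's setup:

```python
import random

target = lambda bits: sum(bits)             # OneMax: number of ones
helpers = [
    lambda bits: sum(bits),                 # helpful: agrees with target
    lambda bits: -sum(bits),                # misleading: inverted
]

def mutate(bits, rng):
    """Flip one random bit."""
    i = rng.randrange(len(bits))
    child = bits[:]
    child[i] ^= 1
    return child

def optimize(n=50, steps=3000, eps=0.1, alpha=0.1, seed=1):
    rng = random.Random(seed)
    q = [0.0, 0.0]                          # one Q-value per supporting function
    x = [0] * n
    for _ in range(steps):
        # epsilon-greedy choice of the supporting function
        a = rng.randrange(2) if rng.random() < eps else max(range(2), key=lambda i: q[i])
        child = mutate(x, rng)
        before = target(x)
        if helpers[a](child) >= helpers[a](x):   # selection by the chosen helper
            x = child
        reward = target(x) - before              # reward: change in target fitness
        q[a] += alpha * (reward - q[a])          # stateless Q-learning update
    return x, q

x, q = optimize()
```

The helpful function accumulates non-negative rewards and the misleading one non-positive rewards, so the agent learns to rank offspring with the helper that actually advances the target.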
Toshiaki Takano, H. Takase, H. Kawanaka, S. Tsuruoka
We aim to accelerate learning processes in reinforcement learning by transfer learning. The concept is that knowledge gained from solving similar tasks accelerates the learning process of a target task. We previously proposed a basic transfer method based on a forbidden rule set, a set of rules that cause immediate failure of a target task. However, the basic method works poorly for the "Same Transition Model," which has the same state transition probabilities but a different goal. In this article, we propose an effective transfer learning method for the same transition model. In detail, it consists of two strategies: (1) approaching the goal of the selected source task quickly, and (2) preferentially exploring states around the goal.
{"title":"Transfer Method for Reinforcement Learning in Same Transition Model -- Quick Approach and Preferential Exploration","authors":"Toshiaki Takano, H. Takase, H. Kawanaka, S. Tsuruoka","doi":"10.1109/ICMLA.2011.148","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.148","url":null,"abstract":"We aim to accelerate learning processes in reinforcement learning by transfer learning. The concept is that knowledge gained from solving similar tasks accelerates the learning process of a target task. We previously proposed a basic transfer method based on a forbidden rule set, a set of rules that cause immediate failure of a target task. However, the basic method works poorly for the \"Same Transition Model,\" which has the same state transition probabilities but a different goal. In this article, we propose an effective transfer learning method for the same transition model. In detail, it consists of two strategies: (1) approaching the goal of the selected source task quickly, and (2) preferentially exploring states around the goal.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132405232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this study, a motion-compensated X-ray CT algorithm based on a statistical model is proposed. The important feature of our motion-compensated X-ray CT algorithm is that the target object is assumed to move or deform over time. The projections of the deforming target object are then described by a state-space model. The deformation is described by motion vectors, one attached to each pixel. To reduce the ill-posedness, we incorporate into the prior distribution our a priori knowledge that the target object is composed of a restricted number of materials whose X-ray absorption coefficients are roughly known. To perform Bayesian inference based on our statistical model, the posterior distribution is approximated by a computationally tractable distribution so as to minimize the Kullback-Leibler (KL) divergence between the posterior and the tractable distributions. Computer simulations using phantom images show the effectiveness of our CT algorithm, suggesting that the state-space model works even when the target object is deforming.
{"title":"Motion Compensated X-ray CT Algorithm for Moving Objects","authors":"Takumi Tanaka, S. Maeda, S. Ishii","doi":"10.1109/ICMLA.2011.97","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.97","url":null,"abstract":"In this study, a motion-compensated X-ray CT algorithm based on a statistical model is proposed. The important feature of our motion-compensated X-ray CT algorithm is that the target object is assumed to move or deform over time. The projections of the deforming target object are then described by a state-space model. The deformation is described by motion vectors, one attached to each pixel. To reduce the ill-posedness, we incorporate into the prior distribution our a priori knowledge that the target object is composed of a restricted number of materials whose X-ray absorption coefficients are roughly known. To perform Bayesian inference based on our statistical model, the posterior distribution is approximated by a computationally tractable distribution so as to minimize the Kullback-Leibler (KL) divergence between the posterior and the tractable distributions. Computer simulations using phantom images show the effectiveness of our CT algorithm, suggesting that the state-space model works even when the target object is deforming.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132472556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Takeshi Yamamoto, Katsuhiro Honda, A. Notsu, H. Ichihashi
Relational clustering is actively studied in data mining, in which intrinsic data structure is summarized into cluster structure. A linear fuzzy clustering model based on Fuzzy c-Medoids (FCMdd) is proposed for extracting intrinsic local linear substructures from relational data. Alternative Fuzzy c-Means (AFCM) is an extension of Fuzzy c-Means in which a modified distance measure, instead of the conventional Euclidean distance, is used based on the robust M-estimation concept. In this paper, the FCMdd-based linear clustering model is further modified in order to extract linear substructures from relational data including outliers, using a pseudo-M-estimation procedure with a weight function for the modified distance measure in AFCM.
{"title":"Robust FCMdd-based Linear Clustering for Relational Data with Alternative c-Means Criterion","authors":"Takeshi Yamamoto, Katsuhiro Honda, A. Notsu, H. Ichihashi","doi":"10.1109/ICMLA.2011.164","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.164","url":null,"abstract":"Relational clustering is actively studied in data mining, in which intrinsic data structure is summarized into cluster structure. A linear fuzzy clustering model based on Fuzzy c-Medoids (FCMdd) is proposed for extracting intrinsic local linear substructures from relational data. Alternative Fuzzy c-Means (AFCM) is an extension of Fuzzy c-Means in which a modified distance measure, instead of the conventional Euclidean distance, is used based on the robust M-estimation concept. In this paper, the FCMdd-based linear clustering model is further modified in order to extract linear substructures from relational data including outliers, using a pseudo-M-estimation procedure with a weight function for the modified distance measure in AFCM.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"233 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131564516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A new statistical pattern classifying system is proposed to solve the problem of the "peaking phenomenon," in which the accuracy of a pattern classifier peaks as the number of features increases for a fixed number of training samples. Instead of estimating the distribution of class objects, the system generates a region in the feature space in which a certain rate of class objects is included. The pattern classifier identifies the class if the object belongs to the coverage region of only one class, but answers "unable to detect" if the object belongs to the coverage regions of more than one class or to none. Here, the coverage region is simply produced from the coverage regions of each feature and then extended if necessary. Unlike the Naive Bayes classifier, the independence of the features is not assumed. In tests of the system on character classification, performance does not significantly decrease as the number of features increases, unless apparently useless features are added.
{"title":"A Pattern Classifying System Based on the Coverage Regions of Objects","authors":"Izumi Suzuki","doi":"10.1109/ICMLA.2011.20","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.20","url":null,"abstract":"A new statistical pattern classifying system is proposed to solve the problem of the \"peaking phenomenon,\" in which the accuracy of a pattern classifier peaks as the number of features increases for a fixed number of training samples. Instead of estimating the distribution of class objects, the system generates a region in the feature space in which a certain rate of class objects is included. The pattern classifier identifies the class if the object belongs to the coverage region of only one class, but answers \"unable to detect\" if the object belongs to the coverage regions of more than one class or to none. Here, the coverage region is simply produced from the coverage regions of each feature and then extended if necessary. Unlike the Naive Bayes classifier, the independence of the features is not assumed. In tests of the system on character classification, performance does not significantly decrease as the number of features increases, unless apparently useless features are added.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130958324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
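A minimal sketch of the coverage-region idea: a per-class region built as the product of per-feature intervals that each cover a chosen rate of that class's samples, with "unable to detect" for ambiguous or uncovered points. The quantile-interval construction and the Gaussian toy data are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def fit_regions(X_by_class, rate=0.95):
    """Per class, one interval per feature covering `rate` of its samples."""
    lo_q, hi_q = (1 - rate) / 2, 1 - (1 - rate) / 2
    return {c: (np.quantile(X, lo_q, axis=0), np.quantile(X, hi_q, axis=0))
            for c, X in X_by_class.items()}

def classify(regions, x):
    """Answer a class only when x lies in exactly one class's region."""
    hits = [c for c, (lo, hi) in regions.items()
            if np.all((lo <= x) & (x <= hi))]
    return hits[0] if len(hits) == 1 else "unable to detect"

rng = np.random.default_rng(0)
data = {"A": rng.normal(0.0, 1.0, (200, 4)),
        "B": rng.normal(5.0, 1.0, (200, 4))}
regions = fit_regions(data)
print(classify(regions, np.zeros(4)))    # inside A's region only
```

Note that, as in the abstract, no feature independence is assumed at classification time: membership simply requires every feature to fall inside the class's interval.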
Many problems in machine learning involve variable-size structured data, such as sets, sequences, trees, and graphs. Generative (i.e. model-based) kernels are well suited for handling structured data since they are able to capture their underlying structure by allowing the inclusion of prior information via specification of the source models. In this paper we focus on marginalisation kernels for variable-length sequences generated by hidden Markov models. In particular, we propose a new class of generative embeddings, obtained through a nonlinear transformation of the original marginalisation mappings. This allows us to embed the input data into a new feature space where a better separation can be achieved, and leads to a new kernel defined as the inner product in the transformed feature space. Different nonlinear transformations are proposed, and two different ways of applying these transformations to the original mappings are considered. The main contribution of this paper is the proof that the proposed nonlinear transformations increase the margin of the optimal hyperplane of an SVM classifier, thus enhancing the classification performance. The proposed mappings are tested on two different sequence classification problems, with satisfying results that outperform state-of-the-art methods.
{"title":"Nonlinear Transformations of Marginalisation Mappings for Kernels on Hidden Markov Models","authors":"A. C. Carli, Francesca P. Carli","doi":"10.1109/ICMLA.2011.106","DOIUrl":"https://doi.org/10.1109/ICMLA.2011.106","url":null,"abstract":"Many problems in machine learning involve variable-size structured data, such as sets, sequences, trees, and graphs. Generative (i.e. model-based) kernels are well suited for handling structured data since they are able to capture their underlying structure by allowing the inclusion of prior information via specification of the source models. In this paper we focus on marginalisation kernels for variable-length sequences generated by hidden Markov models. In particular, we propose a new class of generative embeddings, obtained through a nonlinear transformation of the original marginalisation mappings. This allows us to embed the input data into a new feature space where a better separation can be achieved, and leads to a new kernel defined as the inner product in the transformed feature space. Different nonlinear transformations are proposed, and two different ways of applying these transformations to the original mappings are considered. The main contribution of this paper is the proof that the proposed nonlinear transformations increase the margin of the optimal hyperplane of an SVM classifier, thus enhancing the classification performance. The proposed mappings are tested on two different sequence classification problems, with satisfying results that outperform state-of-the-art methods.","PeriodicalId":439926,"journal":{"name":"2011 10th International Conference on Machine Learning and Applications and Workshops","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115652409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
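The nonlinear-transformation idea can be illustrated in a few lines. The mapping vectors below are made-up stand-ins for HMM marginalisation mappings (in practice they would come from the forward-backward algorithm), and the elementwise power transform with rho = 0.5 is just one plausible transformation family, not necessarily one from the paper:

```python
import numpy as np

# Hypothetical marginalisation mappings: one row per sequence, e.g.
# expected state-occupancy counts under an HMM posterior.
Phi = np.array([[4.0, 1.0, 0.0],
                [3.5, 1.5, 0.0],
                [0.5, 0.5, 4.0]])

def transformed_kernel(Phi, rho=0.5):
    """Kernel = inner product taken after an elementwise power transform."""
    Psi = np.sign(Phi) * np.abs(Phi) ** rho   # nonlinear map of the mappings
    return Psi @ Psi.T                        # Gram matrix in the new space

K = transformed_kernel(Phi)
```

Because the kernel is an explicit inner product in the transformed space, it is symmetric and positive semidefinite by construction and can be fed directly to any kernel machine such as an SVM.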