Co-clustering reorganizes a data matrix into homogeneous blocks by simultaneously considering the sets of rows and columns. Within model-based clustering, adapted latent block models have been proposed for binary data and co-occurrence matrices. For continuous data, the latent block model is not appropriate in many cases: like non-negative matrix factorization, it treats the two sets symmetrically, and the estimation of its parameters requires a variational approximation. In this paper we focus on continuous data matrices without restriction to non-negative matrices. We propose a parsimonious mixture model that overcomes the limits of the latent block model.
{"title":"Model-Based Co-clustering for Continuous Data","authors":"M. Nadif, G. Govaert","doi":"10.1109/ICMLA.2010.33","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.33","url":null,"abstract":"The co-clustering consists in reorganizing a data matrix into homogeneous blocks by considering simultaneously the sets of rows and columns. Setting this aim in model-based clustering, adapted block latent models were proposed for binary data and co-occurrence matrix. Regarding continuous data, the latent block model is not appropriated in many cases. As non-negative matrix factorization, it treats symmetrically the two sets, and the estimation of associated parameters requires a variational approximation. In this paper we focus on continuous data matrix without restriction to non negative matrix. We propose a parsimonious mixture model allowing to overcome the limits of the latent block model.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114785971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eigen-decomposition is a key step in spectral clustering and some kernel methods. The Nyström method is often used to speed up the decomposition of kernel matrices. However, it cannot efficiently update the eigenvectors when the dataset grows over time. In this paper, we propose an incremental Nyström method for dynamic learning. Experimental results demonstrate the feasibility and effectiveness of the proposed method.
{"title":"Incremental Nyström Low-Rank Decomposition for Dynamic Learning","authors":"Lin Zhang, Hongyu Li","doi":"10.1109/ICMLA.2010.87","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.87","url":null,"abstract":"Eigen-decomposition is a key step in spectral clustering and some kernel methods. The Nyström method is often used to speed up kernel matrix decomposition. However, it cannot effectively update eigenvectors of matrices when datasets dynamically increase with time. In this paper, we propose an incremental Nyström method for dynamic learning. Experimental results demonstrate the feasibility and effectiveness of the proposed method.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114889210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
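For reference, the batch Nyström construction that an incremental scheme builds on can be sketched as follows. Function names are ours, and the paper's incremental eigenvector update is not reproduced:

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_approx(X, idx, gamma=0.5):
    """Nyström approximation K ~ C W^+ C^T from the landmark rows `idx`:
    only the n x m cross-kernel and the m x m landmark kernel are formed."""
    C = rbf(X, X[idx], gamma)          # n x m cross-kernel
    W = rbf(X[idx], X[idx], gamma)     # m x m landmark kernel
    return C @ np.linalg.pinv(W) @ C.T
```

With all points as landmarks the approximation is exact (by the Moore-Penrose identity K K⁺ K = K); with m ≪ n landmarks, the cost of the decomposition drops from O(n³) to roughly O(nm² + m³).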
J. Faddoul, Boris Chidlovskii, Fabien Torre, Rémi Gilleron
Learning multiple related tasks simultaneously from data can improve predictive performance relative to learning these tasks independently. In this paper we propose a novel multi-task learning algorithm called MT-Adaboost: it extends the AdaBoost algorithm to the multi-task setting, using a multi-task decision stump as the multi-task weak classifier. This makes it possible to learn different dependencies between tasks in different regions of the learning space. Thus, we relax the conventional hypothesis that tasks behave similarly across the whole learning space. Moreover, MT-Adaboost can learn multiple tasks without imposing the constraint that tasks share the same label set and/or examples. A theoretical analysis is derived from the analysis of the original AdaBoost. Experiments on multiple tasks over large-scale textual datasets with social context (Enron and Tobacco) yield very promising results.
{"title":"Boosting Multi-Task Weak Learners with Applications to Textual and Social Data","authors":"J. Faddoul, Boris Chidlovskii, Fabien Torre, Rémi Gilleron","doi":"10.1109/ICMLA.2010.61","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.61","url":null,"abstract":"Learning multiple related tasks from data simultaneously can improve predictive performance relative to learning these tasks independently. In this paper we propose a novel multi-task learning algorithm called MT-Adaboost: it extends Adaboost algorithm Freund1999Short to the multi-task setting, it uses as multi-task weak classifier a multi-task decision stump. This allows to learn different dependencies between tasks for different regions of the learning space. Thus, we relax the conventional hypothesis that tasks behave similarly in the whole learning space. Moreover, MT-Adaboost can learn multiple tasks without imposing the constraint of sharing the same label set and/or examples between tasks. A theoretical analysis is derived from the analysis of the original Adaboost. Experiments for multiple tasks over large scale textual data sets with social context (Enron and Tobacco) give rise to very promising results.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115352482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
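The boosting loop that MT-Adaboost inherits can be sketched with ordinary single-task decision stumps; the paper's contribution is to substitute a multi-task stump as the weak learner while keeping this reweighting scheme. A minimal sketch of the base loop (our illustration):

```python
import numpy as np

def adaboost_stumps(X, y, T=20):
    """AdaBoost with axis-aligned decision stumps for labels in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # example weights
    ensemble = []
    for _ in range(T):
        best = None
        # exhaustive stump search: feature, threshold, polarity
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = sign * np.where(X[:, j] <= thr, 1, -1)
                    err = w[pred != y].sum()     # weighted error
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        err, j, thr, sign = best
        err = max(err, 1e-12)                    # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)    # stump weight
        pred = sign * np.where(X[:, j] <= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)           # upweight mistakes
        w /= w.sum()
        ensemble.append((alpha, j, thr, sign))
    return ensemble

def predict(ensemble, X):
    """Weighted vote of the stumps."""
    s = sum(a * sg * np.where(X[:, j] <= t, 1, -1) for a, j, t, sg in ensemble)
    return np.sign(s)
```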
Emerging general-purpose Graphics Processing Units (GPUs) provide a multi-core platform for a wide range of applications, including machine learning algorithms. In this paper, we propose several techniques to accelerate Support Vector Machines (SVMs) on GPUs. A sparse matrix format is introduced into the parallel SVM to achieve better performance. Experimental results show that a speedup of 55x–133.8x over LIBSVM can be achieved in the training process on an NVIDIA GeForce GTX 470.
{"title":"Support Vector Machines on GPU with Sparse Matrix Format","authors":"Tsung-Kai Lin, Shao-Yi Chien","doi":"10.1109/ICMLA.2010.53","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.53","url":null,"abstract":"Emerging general-purpose Graphics Processing Unit (GPU) provides a multi-core platform for wide applications, including machine learning algorithms. In this paper, we proposed several techniques to accelerate Support Vector Machines (SVM) on GPUs. Sparse matrix format is introduced into parallel SVM to achieve better performance. Experimental results show that the speedup of 55x–133.8x over LIBSVM can be achieved in training process on NVIDIA GeForce GTX470.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"124 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115918205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
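A common sparse layout for this purpose is compressed sparse row (CSR). A CPU-side sketch of the storage and of the per-row products that SVM kernel evaluations reduce to (our illustration, not the authors' GPU code; in the parallel setting each row is typically handled by one thread):

```python
import numpy as np

def to_csr(X):
    """Convert a dense matrix to CSR arrays: non-zero values, their column
    indices, and row pointers delimiting each row's slice."""
    data, indices, indptr = [], [], [0]
    for row in X:
        nz = np.nonzero(row)[0]
        data.extend(row[nz])
        indices.extend(nz)
        indptr.append(len(indices))
    return np.array(data), np.array(indices), np.array(indptr)

def csr_matvec(data, indices, indptr, v):
    """y = X v touching only the stored non-zeros."""
    y = np.empty(len(indptr) - 1)
    for i in range(len(y)):
        s, e = indptr[i], indptr[i + 1]
        y[i] = data[s:e] @ v[indices[s:e]]
    return y
```

For text-like data with few non-zeros per row, this layout cuts both memory traffic and arithmetic relative to dense storage, which is where the reported GPU speedups come from.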
Ulrich Weiss, P. Biber, Stefan Laible, K. Bohlmann, A. Zell
In the domain of agricultural robotics, one major application is crop scouting, e.g., for the task of weed control. A key enabler for this task is robust detection and classification of plants and species. Automatically distinguishing between plant species is challenging, because some species look very similar. It is also difficult to translate the symbolic, high-level descriptions of plant appearance and of the differences between plants that humans use into a formal, computer-understandable form. Moreover, structures such as leaves and branches cannot be reliably detected in the 3D data provided by our sensor. One approach to this problem is to learn how to classify the species from a set of example plants using machine learning methods. In this paper we introduce a method for distinguishing plant species using a 3D LIDAR sensor and supervised learning. To this end we have developed a set of size- and rotation-invariant features and experimentally evaluated which are the most descriptive. We have also compared different learning methods using the Weka toolbox. The best methods for our application turned out to be simple logistic regression functions, support vector machines, and neural networks. In our experiments we used six plant species, typically available at common nurseries, with about 20 examples of each species. In the laboratory we were able to identify over 98% of these plants correctly.
{"title":"Plant Species Classification Using a 3D LIDAR Sensor and Machine Learning","authors":"Ulrich Weiss, P. Biber, Stefan Laible, K. Bohlmann, A. Zell","doi":"10.1109/ICMLA.2010.57","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.57","url":null,"abstract":"In the domain of agricultural robotics, one major application is crop scouting, e.g., for the task of weed control. For this task a key enabler is a robust detection and classification of the plant and species. Automatically distinguishing between plant species is a challenging task, because some species look very similar. It is also difficult to translate the symbolic high level description of the appearances and the differences between the plants used by humans, into a formal, computer understandable form. Also it is not possible to reliably detect structures, like leaves and branches in 3D data provided by our sensor. One approach to solve this problem is to learn how to classify the species by using a set of example plants and machine learning methods. In this paper we are introducing a method for distinguishing plant species using a 3D LIDAR sensor and supervised learning. For that we have developed a set of size and rotation invariant features and evaluated experimentally which are the most descriptive ones. Besides these features we have also compared different learning methods using the toolbox Weka. It turned out that the best methods for our application are simple logistic regression functions, support vector machines and neural networks. In our experiments we used six different plant species, typically available at common nurseries, and about 20 examples of each species. In the laboratory we were able to identify over 98% of these plants correctly.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115581041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
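Size- and rotation-invariant features of the kind described can be derived, for example, from the eigenvalues of the point cloud's covariance matrix; rotation leaves the eigenvalues unchanged and normalizing them removes scale. A generic sketch in this spirit (the paper's actual feature set is not reproduced here):

```python
import numpy as np

def shape_features(points):
    """Rotation- and size-invariant shape statistics of a 3D point cloud,
    from the sorted, normalized covariance eigenvalues l1 >= l2 >= l3."""
    c = points - points.mean(0)
    evals = np.sort(np.linalg.eigvalsh(np.cov(c.T)))[::-1]
    evals = evals / evals.sum()            # removes overall scale
    l1, l2, l3 = evals
    return {"linearity": (l1 - l2) / l1,   # ~1 for stem-like clouds
            "planarity": (l2 - l3) / l1,   # ~1 for leaf-like clouds
            "sphericity": l3 / l1}         # ~1 for compact clouds
```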
In this paper we present a new clustering method that provides a self-organizing hierarchical clustering. The method represents large datasets as a forest of trees, which are projected onto a simple 2D layout using a treemap representation. The obtained partition is represented by a map of treemaps, which defines a tree of data. We provide the rules that build a tree of nodes/data, using distances between data points to decide where to connect nodes. Visual and empirical results on both synthetic and real datasets from the UCI repository are given and discussed.
{"title":"Map-TreeMaps: A New Approach for Hierarchical and Topological Clustering","authors":"Hanene Azzag, M. Lebbah, A. Arfaoui","doi":"10.1109/ICMLA.2010.136","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.136","url":null,"abstract":"We present in this paper a new clustering method which provides self-organization of hierarchical clustering. This method represents large datasets on a forest of original trees which are projected on a simple 2D geometric relationship using tree map representation. The obtained partition is represented by a map of tree maps, which define a tree of data. In this paper, we provide the rules that build a tree of node/data by using distance between data in order to decide where connect nodes. Visual and empirical results based on both synthetic and real datasets from the UCI repository, are given and discussed.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114831481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
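The treemap projection step can be illustrated with the classic slice-and-dice layout, which splits a rectangle in proportion to subtree weights. This is a generic sketch of treemapping, not the authors' exact layout rules:

```python
def treemap(weights, x, y, w, h, horizontal=True):
    """Slice-and-dice treemap: partition the rectangle (x, y, w, h) along
    one axis into slices whose areas are proportional to `weights`.
    Returns a list of (x, y, w, h) rectangles."""
    rects, total, off = [], float(sum(weights)), 0.0
    for wt in weights:
        frac = wt / total
        if horizontal:
            rects.append((x + off, y, w * frac, h))
            off += w * frac
        else:
            rects.append((x, y + off, w, h * frac))
        if not horizontal:
            off += h * frac
    return rects
```

Nesting this call recursively (alternating `horizontal`) lays out a whole tree, each node's rectangle subdivided among its children, which is how a hierarchy of clusters becomes a 2D map.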
We present a method for the fully automated selection of treatment beam ensembles for external radiation therapy. We reformulate the beam angle selection problem as a clustering problem over locally ideal beam orientations distributed on the unit sphere. For this purpose we construct an infinite mixture of von Mises-Fisher distributions, which is suited in general for density estimation from data on the D-dimensional sphere. Using a nonparametric Dirichlet process prior, our model infers probability distributions over both the number of clusters and their parameter values. We describe an efficient Markov chain Monte Carlo algorithm for posterior inference from experimental data in this model. The performance of the suggested beam angle selection framework is illustrated for one intra-cranial, one pancreas, and one prostate case. The infinite von Mises-Fisher mixture model (iMFMM) creates between 18 and 32 clusters, depending on the patient anatomy. This suggests using the iMFMM directly for beam ensemble selection in robotic radiosurgery, or for generating low-dimensional input both for the subsequent optimization of trajectories in arc therapy and for beam ensemble selection in conventional radiation therapy.
{"title":"Using an Infinite Von Mises-Fisher Mixture Model to Cluster Treatment Beam Directions in External Radiation Therapy","authors":"M. Bangert, Philipp Hennig, U. Oelfke","doi":"10.1109/ICMLA.2010.114","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.114","url":null,"abstract":"We present a method for fully automated selection of treatment beam ensembles for external radiation therapy. We reformulate the beam angle selection problem as a clustering problem of locally ideal beam orientations distributed on the unit sphere. For this purpose we construct an infinite mixture of von Mises-Fisher distributions, which is suited in general for density estimation from data on the D-dimensional sphere. Using a nonparametric Dirichlet process prior, our model infers probability distributions over both the number of clusters and their parameter values. We describe an efficient Markov chain Monte Carlo inference algorithm for posterior inference from experimental data in this model. The performance of the suggested beam angle selection framework is illustrated for one intra-cranial, pancreas, and prostate case each. The infinite von Mises-Fisher mixture model (iMFMM) creates between 18 and 32 clusters, depending on the patient anatomy. This suggests to use the iMFMM directly for beam ensemble selection in robotic radio surgery, or to generate low-dimensional input for both subsequent optimization of trajectories for arc therapy and beam ensemble selection for conventional radiation therapy.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121090506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
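The mixture's building block is easy to write down for D = 3, where the von Mises-Fisher normaliser has a closed form. A sketch of the component density only, not of the Dirichlet-process mixture or its MCMC sampler:

```python
import numpy as np

def vmf_pdf3(x, mu, kappa):
    """von Mises-Fisher density on the unit sphere in R^3:
    f(x) = C(kappa) * exp(kappa * mu.x), with mean direction mu (unit
    vector), concentration kappa > 0, and C(kappa) = kappa / (4*pi*sinh(kappa))."""
    c = kappa / (4.0 * np.pi * np.sinh(kappa))
    return c * np.exp(kappa * (x @ mu))
```

Each mixture component then models one bundle of beam orientations concentrated around a mean direction `mu`, with `kappa` controlling how tight the bundle is.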
Many sequential decision-making problems require an agent to balance exploration and exploitation to maximise long-term reward. Existing policies that address this tradeoff typically have parameters that are set a priori to control the amount of exploration. In finite-time problems, the optimal values of these parameters are highly dependent on the problem faced. In this paper, we propose adapting the amount of exploration on-line, as information is gathered by the agent. To this end we introduce a novel algorithm, e-ADAPT, which has no free parameters. The algorithm adapts as it plays, sequentially choosing whether to explore or exploit, driven by the amount of uncertainty in the system. We provide simulation results for the one-armed bandit with covariates problem, which demonstrate that e-ADAPT correctly controls the amount of exploration in finite-time problems and yields rewards close to those of optimally tuned off-line policies. Furthermore, we show that e-ADAPT is robust to high-dimensional covariates as well as misspecified models. Finally, we describe how our methods could be extended to other sequential decision-making problems, such as dynamic bandit problems with changing reward structures.
{"title":"On-Line Adaptation of Exploration in the One-Armed Bandit with Covariates Problem","authors":"A. Sykulski, N. Adams, N. Jennings","doi":"10.1109/ICMLA.2010.74","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.74","url":null,"abstract":"Many sequential decision making problems require an agent to balance exploration and exploitation to maximise long-term reward. Existing policies that address this tradeoff typically have parameters that are set a priori to control the amount of exploration. In finite-time problems, the optimal values of these parameters are highly dependent on the problem faced. In this paper, we propose adapting the amount of exploration performed on-line, as information is gathered by the agent. To this end we introduce a novel algorithm, e-ADAPT, which has no free parameters. The algorithm adapts as it plays and sequentially chooses whether to explore or exploit, driven by the amount of uncertainty in the system. We provide simulation results for the one armed bandit with covariates problem, which demonstrate the effectiveness of e-ADAPT to correctly control the amount of exploration in finite-time problems and yield rewards that are close to optimally tuned off-line policies. Furthermore, we show that e-ADAPT is robust to a high-dimensional covariate, as well as misspecified models. Finally, we describe how our methods could be extended to other sequential decision making problems, such as dynamic bandit problems with changing reward structures.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127545346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
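Uncertainty-driven exploration with no free parameters can be illustrated on a plain multi-armed bandit, where the exploration probability is tied to the standard error of the current value estimates and therefore shrinks as data accumulates. This is our illustration of the idea, not the actual e-ADAPT rule or the covariate setting:

```python
import numpy as np

def adaptive_bandit(arm_means, T=3000, seed=1):
    """Play T rounds; explore with probability equal to the (capped) mean
    standard-error proxy of the arm estimates, otherwise exploit."""
    rng = np.random.default_rng(seed)
    k = len(arm_means)
    counts = np.zeros(k)
    sums = np.zeros(k)
    total = 0.0
    for t in range(T):
        if t < k:
            a = t                                        # pull each arm once
        else:
            estimates = sums / counts
            uncertainty = (1.0 / np.sqrt(counts)).mean() # standard-error proxy
            eps = min(1.0, uncertainty)                  # explore more when uncertain
            if rng.random() < eps:
                a = int(rng.integers(k))                 # explore uniformly
            else:
                a = int(np.argmax(estimates))            # exploit best estimate
        reward = rng.normal(arm_means[a], 1.0)           # unit-variance Gaussian arms
        counts[a] += 1
        sums[a] += reward
        total += reward
    return total / T, counts
```

Early on the uncertainty term is large and the agent explores heavily; as counts grow the policy converges to exploitation without any tuning parameter being set in advance.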
Time series data naturally arise in many domains, such as industrial process control, robotics, finance, medicine, and climatology. In many cases, variables known to be causally relevant cannot be measured directly, or the existence of such variables is unknown. This paper presents an extension of the LO-net neural network architecture [1] for inferring both the existence and the values of hidden variables in streaming multivariate time series, leading to a deeper understanding of the domain and more accurate prediction. The core idea is to first make predictions with one network (the observable, or O, net) based on a time delay embedding. The temporal scope of the embedding is then gradually reduced, forcing a second network (the latent, or L, net) to learn to approximate the value of a single hidden variable, which is then provided as input to the O net based on the original time delay embedding. Experiments show that the architecture efficiently and accurately identifies the number of hidden variables and their values over time.
{"title":"Discovering and Characterizing Hidden Variables in Streaming Multivariate Time Series","authors":"Soumi Ray, T. Oates","doi":"10.1109/ICMLA.2010.144","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.144","url":null,"abstract":"Time series data naturally arises in many domains, such as industrial process control, robotics, finance, medicine, climatology, and numerous others. In many cases variables known to be causally relevant cannot be measured directly or the existence of such variables is unknown. This paper presents an extension of the neural network architecture, called the LO-net [1], for inferring both the existence and values of hidden variables in streaming multivariate time series, leading to deeper understanding of the domain and more accurate prediction. The core idea is to initially make predictions with one network (the observable or O net) based on a time delay embedding, following this with a gradual reduction in the temporal scope of the embedding that forces a second network (the latent or L net) to learn to approximate the value of a single hidden variable, which is then input to the O net based on the original time delay embedding. Experiments show that the architecture efficiently and accurately identifies the number of hidden variables and their values over time.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127949894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
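The time delay embedding that the O net consumes is a standard construction: each input row stacks the current observation with its d−1 predecessors. A minimal sketch (names are ours):

```python
import numpy as np

def delay_embed(x, d):
    """Time-delay embedding of a 1D series: row t is
    (x[t], x[t-1], ..., x[t-d+1]), yielding len(x)-d+1 rows."""
    n = len(x) - d + 1
    return np.stack([x[i:i + d][::-1] for i in range(n)])
```

Shrinking `d` removes history from the input; any predictive power the history carried must then be supplied elsewhere, which is what forces the L net to synthesize the hidden variable.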
Most technical and manufacturing processes are based on an empirical process understanding for which only very incomplete formal relations exist. To establish a process model, identification of the underlying process is essential. In addition, this process model must be of sufficient quality to enable forward-looking capabilities such as an online prediction mode. This report argues that agent-based identification is appropriate for this modelling problem. Although many earlier approaches have tried to design formal models of manufacturing processes, all of them fell short of the data-based identification of complex systems such as paper manufacturing: systems consisting of continuous and discrete parts, called hybrid manufacturing systems. This paper focuses on system identification with agent-based evolutionary computation using a local optimization kernel. It presents the system architecture and introduces a data-based identification method with different local optimization algorithms. Finally, we consider the characteristics of an identification framework with large-scale data processing. We close with identification results for the two-step optimization algorithm.
{"title":"System Identification with Multi-Agent-based Evolutionary Computation Using a Local Optimization Kernel","authors":"S. Bohlmann, V. Klinger, H. Szczerbicka","doi":"10.1109/ICMLA.2010.130","DOIUrl":"https://doi.org/10.1109/ICMLA.2010.130","url":null,"abstract":"Most technical and manufacturing processes are based on an empiric process understanding, there only very incomplete formal relations exist. To establish a process model, the identification of the appropriate process is essential. In addition, this process model has to feature a quality of execution to enable forward-looking properties like an online prediction mode. This report argues that the agent-based identification is appropriate to this modelling issue. Although there were many predecessor approaches, which tried to design formal models of manufacturing processes, all of them fell short of the data based identification of complex systems, like paper manufacturing: complex systems consisting of continuous and discrete parts, called hybrid manufacturing systems. This paper focuses on the system identification with agent based evolutionary computation using a local optimization kernel. It presents the system architecture and introduces a data based identification method with different local optimization lgorithms. Finally we consider the characteristics of an identification framework with large-scale data processing. We close with identification results related to the 2-step optimization algorithm.","PeriodicalId":336514,"journal":{"name":"2010 Ninth International Conference on Machine Learning and Applications","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130351522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}