P. Kirchner, Matthias Boehm, B. Reinwald, D. Sow, J. M. Schmidt, D. Turaga, A. Biem
Title: Large Scale Discriminative Metric Learning
Venue: 2014 IEEE International Parallel & Distributed Processing Symposium Workshops
Published: 2014-05-19
DOI: 10.1109/IPDPSW.2014.181 (https://doi.org/10.1109/IPDPSW.2014.181)
Citations: 4
Abstract
We consider learning a distance metric, using the Localized Supervised Metric Learning (LSML) scheme, that discriminates between entities characterized by high-dimensional feature attributes with respect to the labels assigned to each entity. LSML is a supervised learning scheme that learns a Mahalanobis distance that groups together features with the same label and pushes apart features with different labels. In this paper, we propose an efficient and scalable implementation of LSML that can process large data sets, both in terms of dimensions and instances. This implementation is programmed in SystemML, which has an R-like syntax, and is compiled, optimized, and executed on Hadoop. We also propose experimental approaches for tuning LSML parameters that yield significant analytical and empirical improvements in discriminative measures such as label prediction accuracy. We present experimental results on both synthetic and real-world data (feature vectors representing patients in an Intensive Care Unit, with labels corresponding to different conditions), assessing respectively how well the algorithm scales and how well it works on real-world prediction problems.
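To make the notion of a Mahalanobis distance in the abstract concrete, here is a minimal Python sketch. It is not the authors' SystemML implementation and does not perform any LSML learning. It only evaluates the distance sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M, built here as M = W^T W from a projection W chosen by hand for illustration. The function names, the toy points, and the matrix W are all assumptions introduced for this example.

```python
import math


def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a symmetric PSD matrix M."""
    d = [a - b for a, b in zip(x, y)]
    # Compute M d, then the quadratic form d^T (M d).
    Md = [sum(m_ij * d_j for m_ij, d_j in zip(row, d)) for row in M]
    q = sum(d_i * md_i for d_i, md_i in zip(d, Md))
    return math.sqrt(max(q, 0.0))  # guard against tiny negative round-off


def metric_from_projection(W):
    """Build M = W^T W from a k x d projection W (PSD by construction)."""
    k, dim = len(W), len(W[0])
    return [[sum(W[r][i] * W[r][j] for r in range(k)) for j in range(dim)]
            for i in range(dim)]


# Toy data (hypothetical): feature 0 separates the labels,
# feature 1 is large noise unrelated to the label.
a1, a2 = [0.0, 0.0], [0.1, 5.0]  # two points with label A
b1 = [1.0, 0.2]                  # one point with label B

identity = [[1.0, 0.0], [0.0, 1.0]]  # M = I gives the Euclidean distance
W = [[1.0, 0.0]]                     # hand-picked, NOT learned: keep feature 0
M = metric_from_projection(W)

euclid_same = mahalanobis(a1, a2, identity)
euclid_diff = mahalanobis(a1, b1, identity)
learned_same = mahalanobis(a1, a2, M)
learned_diff = mahalanobis(a1, b1, M)
```

Under the Euclidean metric the two label-A points are farther from each other than from the label-B point, while under the hand-picked M the same-label pair is closer than the different-label pair. A supervised scheme such as LSML would aim to obtain a matrix with this grouping effect from labeled data rather than by hand.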