Implementing a Gaussian process learning algorithm in mixed parallel environment

ACM SIGPLAN Symposium on Scala. Pub Date: 2011-11-14. DOI: 10.1145/2133173.2133176

V. Chandola, Ranga Raju Vatsavai


Citations: 2

Abstract




In this paper, we present a scalability analysis of a parallel Gaussian process training algorithm that simultaneously analyzes a massive number of time series. We study three parallel implementations: one using threads, one using MPI, and a hybrid using both threads and MPI. We compare the scalability of the multi-threaded implementation on three hardware platforms: a Mac desktop with two quad-core Intel Xeon processors (16 virtual cores), a Linux cluster node with four quad-core 2.3 GHz AMD Opteron processors, and an SGI Altix ICE 8200 cluster node with two quad-core Intel Xeon processors (16 virtual cores). We also study the scalability of the MPI-based and the hybrid MPI-and-thread implementations on the SGI cluster with 128 nodes (2048 cores). Experimental results show that the hybrid implementation scales better than the multi-threaded and MPI-based implementations. We demonstrate the algorithm on massive remote sensing observation data: the hybrid implementation, using 1536 cores, analyzes a data set with over 4 million time series in nearly 5 seconds, while the serial algorithm takes nearly 12 hours to process the same data set.
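The abstract does not spell out the algorithm, so the following is only a minimal sketch of the two-level work split it describes, under several assumptions: each time series is trained independently, the model is a zero-mean Gaussian process with a squared-exponential kernel, and "training" is a grid search over the length-scale that minimizes the negative log marginal likelihood. All function names (`train_one`, `partition`, `train_hybrid`) are hypothetical. The outer loop over blocks stands in for MPI ranks and a thread pool handles the series inside each block; this is not the authors' implementation.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def rbf_kernel(n, length_scale, noise):
    """Covariance of n evenly spaced time points: squared-exponential plus noise."""
    return [[math.exp(-0.5 * ((i - j) / length_scale) ** 2) + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def neg_log_marginal_likelihood(y, length_scale, noise=0.1):
    """0.5 * y^T K^-1 y + 0.5 * log|K| + (n/2) * log(2*pi) for a zero-mean GP."""
    n = len(y)
    low = cholesky(rbf_kernel(n, length_scale, noise))
    z = []  # forward substitution: solve L z = y, so that y^T K^-1 y = z^T z
    for i in range(n):
        z.append((y[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    half_log_det = sum(math.log(low[i][i]) for i in range(n))
    return 0.5 * sum(v * v for v in z) + half_log_det + 0.5 * n * math.log(2 * math.pi)


def train_one(y, grid=(0.5, 1.0, 2.0, 4.0)):
    """Assumed 'training': pick the grid length-scale with the lowest NLML."""
    return min(grid, key=lambda ls: neg_log_marginal_likelihood(y, ls))


def partition(items, n_parts):
    """Split items into n_parts contiguous blocks whose sizes differ by at most one."""
    q, r = divmod(len(items), n_parts)
    blocks, start = [], 0
    for p in range(n_parts):
        size = q + (1 if p < r else 0)
        blocks.append(items[start:start + size])
        start += size
    return blocks


def train_hybrid(series, n_ranks=2, threads_per_rank=4):
    """Outer split mimics MPI ranks; a thread pool works through each rank's block."""
    results = []
    for block in partition(series, n_ranks):  # in a real MPI job, one block per rank
        with ThreadPoolExecutor(max_workers=threads_per_rank) as pool:
            results.extend(pool.map(train_one, block))  # map preserves input order
    return results


if __name__ == "__main__":
    demo = [[math.sin(0.3 * t + s) for t in range(12)] for s in range(5)]
    print(train_hybrid(demo, n_ranks=2, threads_per_rank=2))
```

The sketch shows only the structure of the decomposition. In CPython the global interpreter lock prevents threads from speeding up CPU-bound pure-Python work, and the outer loop here runs the blocks one after another, so no speedup should be expected from it; a real hybrid version would launch one process per block with an MPI library and use native threads inside each process.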

