Benchmarking data analysis and machine learning applications on the Intel KNL many-core processor

C. Byun, J. Kepner, W. Arcand, David Bestor, Bill Bergeron, V. Gadepally, Michael Houle, M. Hubbell, Michael Jones, Anna Klein, P. Michaleas, Lauren Milechin, J. Mullen, Andrew Prout, Antonio Rosa, S. Samsi, Charles Yee, A. Reuther
{"title":"在Intel KNL多核处理器上对数据分析和机器学习应用进行基准测试","authors":"C. Byun, J. Kepner, W. Arcand, David Bestor, Bill Bergeron, V. Gadepally, Michael Houle, M. Hubbell, Michael Jones, Anna Klein, P. Michaleas, Lauren Milechin, J. Mullen, Andrew Prout, Antonio Rosa, S. Samsi, Charles Yee, A. Reuther","doi":"10.1109/HPEC.2017.8091067","DOIUrl":null,"url":null,"abstract":"Knights Landing (KNL) is the code name for the second-generation Intel Xeon Phi product family. KNL has generated significant interest in the data analysis and machine learning communities because its new many-core architecture targets both of these workloads. The KNL many-core vector processor design enables it to exploit much higher levels of parallelism. At the Lincoln Laboratory Supercomputing Center (LLSC), the majority of users are running data analysis applications such as MATLAB and Octave. More recently, machine learning applications, such as the UC Berkeley Caffe deep learning framework, have become increasingly important to LLSC users. Thus, the performance of these applications on KNL systems is of high interest to LLSC users and the broader data analysis and machine learning communities. Our data analysis benchmarks of these application on the Intel KNL processor indicate that single-core double-precision generalized matrix multiply (DGEMM) performance on KNL systems has improved by ∼3.5× compared to prior Intel Xeon technologies. Our data analysis applications also achieved ∼60% of the theoretical peak performance. Also a performance comparison of a machine learning application, Caffe, between the two different Intel CPUs, Xeon E5 v3 and Xeon Phi 7210, demonstrated a 2.7× improvement on a KNL node.","PeriodicalId":364903,"journal":{"name":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"Benchmarking data analysis and machine learning applications on the Intel KNL many-core processor\",\"authors\":\"C. Byun, J. Kepner, W. Arcand, David Bestor, Bill Bergeron, V. Gadepally, Michael Houle, M. Hubbell, Michael Jones, Anna Klein, P. Michaleas, Lauren Milechin, J. Mullen, Andrew Prout, Antonio Rosa, S. Samsi, Charles Yee, A. Reuther\",\"doi\":\"10.1109/HPEC.2017.8091067\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Knights Landing (KNL) is the code name for the second-generation Intel Xeon Phi product family. KNL has generated significant interest in the data analysis and machine learning communities because its new many-core architecture targets both of these workloads. The KNL many-core vector processor design enables it to exploit much higher levels of parallelism. At the Lincoln Laboratory Supercomputing Center (LLSC), the majority of users are running data analysis applications such as MATLAB and Octave. More recently, machine learning applications, such as the UC Berkeley Caffe deep learning framework, have become increasingly important to LLSC users. Thus, the performance of these applications on KNL systems is of high interest to LLSC users and the broader data analysis and machine learning communities. Our data analysis benchmarks of these application on the Intel KNL processor indicate that single-core double-precision generalized matrix multiply (DGEMM) performance on KNL systems has improved by ∼3.5× compared to prior Intel Xeon technologies. 
Our data analysis applications also achieved ∼60% of the theoretical peak performance. Also a performance comparison of a machine learning application, Caffe, between the two different Intel CPUs, Xeon E5 v3 and Xeon Phi 7210, demonstrated a 2.7× improvement on a KNL node.\",\"PeriodicalId\":364903,\"journal\":{\"name\":\"2017 IEEE High Performance Extreme Computing Conference (HPEC)\",\"volume\":\"25 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-07-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE High Performance Extreme Computing Conference (HPEC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPEC.2017.8091067\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC.2017.8091067","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 15

Abstract

Knights Landing (KNL) is the code name for the second-generation Intel Xeon Phi product family. KNL has generated significant interest in the data analysis and machine learning communities because its new many-core architecture targets both of these workloads. The KNL many-core vector processor design enables it to exploit much higher levels of parallelism. At the Lincoln Laboratory Supercomputing Center (LLSC), the majority of users are running data analysis applications such as MATLAB and Octave. More recently, machine learning applications, such as the UC Berkeley Caffe deep learning framework, have become increasingly important to LLSC users. Thus, the performance of these applications on KNL systems is of high interest to LLSC users and the broader data analysis and machine learning communities. Our data analysis benchmarks of these applications on the Intel KNL processor indicate that single-core double-precision generalized matrix multiply (DGEMM) performance on KNL systems has improved by ∼3.5× compared to prior Intel Xeon technologies. Our data analysis applications also achieved ∼60% of the theoretical peak performance. In addition, a performance comparison of a machine learning application, Caffe, between two different Intel CPUs, the Xeon E5 v3 and the Xeon Phi 7210, demonstrated a 2.7× improvement on a KNL node.
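The abstract's headline numbers follow the standard dense linear-algebra benchmarking recipe: time a double-precision matrix multiply (DGEMM), convert the run time to GFLOP/s, and compare against the processor's theoretical peak. As a rough illustration only, and not the authors' actual benchmark code, the sketch below times an n×n DGEMM through NumPy's BLAS binding and reports the achieved fraction of an assumed single-core peak; the clock rate and FLOPs-per-cycle constants are placeholder assumptions, not figures from the paper.

```python
import time
import numpy as np

# Assumed single-core peak, for illustration only; the real Xeon Phi 7210 figure
# depends on the AVX-512 clock and the two FMA units per core.
CLOCK_GHZ = 1.3        # assumed AVX-512 base clock (placeholder)
FLOPS_PER_CYCLE = 32   # 2 VPUs x 8 double-precision lanes x 2 ops per FMA
PEAK_GFLOPS = CLOCK_GHZ * FLOPS_PER_CYCLE

def dgemm_gflops(n, repeats=3):
    """Time C = A @ B for n x n double-precision matrices; return best GFLOP/s."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)   # dispatched to the BLAS DGEMM that NumPy is linked against
        best = min(best, time.perf_counter() - start)
    return 2.0 * n ** 3 / best / 1e9   # DGEMM performs ~2*n^3 floating-point ops

if __name__ == "__main__":
    achieved = dgemm_gflops(4096)
    print(f"{achieved:.1f} GFLOP/s, "
          f"{100 * achieved / PEAK_GFLOPS:.0f}% of the assumed single-core peak")
```

To approximate a single-core measurement, the BLAS thread count would also need to be pinned to one (for example, setting OMP_NUM_THREADS=1 or MKL_NUM_THREADS=1 with an MKL-backed NumPy) before launching the script.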