Characterizing Data Analytics Workloads on Intel Xeon Phi

Biwei Xie, Xu Liu, Jianfeng Zhan, Zhen Jia, Yuqing Zhu, Lei Wang, Lixin Zhang
{"title":"Characterizing Data Analytics Workloads on Intel Xeon Phi","authors":"Biwei Xie, Xu Liu, Jianfeng Zhan, Zhen Jia, Yuqing Zhu, Lei Wang, Lixin Zhang","doi":"10.1109/IISWC.2015.20","DOIUrl":null,"url":null,"abstract":"With the growing computation demands of data analytics, heterogeneous architectures become popular for their support of high parallelism. Intel Xeon Phi, a many-core coprocessor originally designed for high performance computing applications, is promising for data analytics workloads. However, to the best of knowledge, there is no prior work systematically characterizing the performance of data analytics workloads on Xeon Phi. It is difficult to design a benchmark suite to represent the behavior of data analytics workloads on Xeon Phi. The main challenge resides in fully exploiting Xeon Phi's features, such as long SIMD instruction, simultaneous multithreading, and complex memory hierarchy. To address this issue, we develop Big Data Bench-Phi, which consists of seven representative data analytics workloads. All of these benchmarks are optimized for Xeon Phi and able to characterize Xeon Phi's support for data analytics workloads. Compared with a 24-core Xeon E5-2620 machine, Big Data Bench-Phi achieves reasonable speedups for most of its benchmarks, ranging from 1.5 to 23.4X. Our experiments show that workloads working on high-dimensional matrices can significantly benefit from instruction- and thread-level parallelism on Xeon Phi.","PeriodicalId":142698,"journal":{"name":"2015 IEEE International Symposium on Workload Characterization","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Symposium on Workload Characterization","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IISWC.2015.20","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

Abstract

With the growing computation demands of data analytics, heterogeneous architectures become popular for their support of high parallelism. Intel Xeon Phi, a many-core coprocessor originally designed for high performance computing applications, is promising for data analytics workloads. However, to the best of knowledge, there is no prior work systematically characterizing the performance of data analytics workloads on Xeon Phi. It is difficult to design a benchmark suite to represent the behavior of data analytics workloads on Xeon Phi. The main challenge resides in fully exploiting Xeon Phi's features, such as long SIMD instruction, simultaneous multithreading, and complex memory hierarchy. To address this issue, we develop Big Data Bench-Phi, which consists of seven representative data analytics workloads. All of these benchmarks are optimized for Xeon Phi and able to characterize Xeon Phi's support for data analytics workloads. Compared with a 24-core Xeon E5-2620 machine, Big Data Bench-Phi achieves reasonable speedups for most of its benchmarks, ranging from 1.5 to 23.4X. Our experiments show that workloads working on high-dimensional matrices can significantly benefit from instruction- and thread-level parallelism on Xeon Phi.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
在Intel Xeon Phi处理器上表征数据分析工作负载
随着数据分析计算需求的不断增长,异构架构因其对高并行性的支持而受到欢迎。英特尔至强协处理器是一款多核协处理器,最初是为高性能计算应用而设计的,有望用于数据分析工作负载。然而,据我所知,目前还没有研究系统地描述Xeon Phi协处理器上数据分析工作负载的性能。很难设计一个基准套件来表示Xeon Phi处理器上数据分析工作负载的行为。主要的挑战在于充分利用Xeon Phi处理器的特性,如长SIMD指令、同时多线程和复杂的内存层次结构。为了解决这个问题,我们开发了Big Data Bench-Phi,它由七个代表性的数据分析工作负载组成。所有这些基准测试都针对至强协处理器进行了优化,并能够表征至强协处理器对数据分析工作负载的支持。与24核至强E5-2620机器相比,大数据Bench-Phi在大多数基准测试中都达到了合理的速度,范围从1.5到23.4倍。我们的实验表明,处理高维矩阵的工作负载可以显著受益于Xeon Phi处理器上的指令级和线程级并行性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Fast Computational GPU Design with GT-Pin On Power-Performance Characterization of Concurrent Throughput Kernels CRONO: A Benchmark Suite for Multithreaded Graph Algorithms Executing on Futuristic Multicores Exploring Parallel Programming Models for Heterogeneous Computing Systems Revealing Critical Loads and Hidden Data Locality in GPGPU Applications
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1