Cache Performance Optimization for Processing XML-Based Application Data on Multi-core Processors

Rajdeep Bhowmik, M. Govindaraju
Published in: 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing
Publication date: 2010-05-17
DOI: 10.1109/CCGRID.2010.122 (https://doi.org/10.1109/CCGRID.2010.122)
Citations: 3

Abstract

There is a critical need to develop new programming paradigms for grid middleware tools and applications to harness the opportunities presented by emerging multi-core processors. Implementations of grid middleware and applications that do not adapt to the programming paradigm when executing on emerging processors can severely impact the overall performance. We focus on the utilization of the L2 cache, which is a critical shared resource on Chip Multiprocessors. The access pattern of the shared L2 cache, which depends on how the application schedules and assigns processing work to each thread, can either enhance or undermine the ability to hide memory latency on a multi-core processor. None of the current grid simulators and emulators provides the feedback and fine-grained performance data that are essential for a detailed analysis. Using the feedback from an emulation framework, we present performance analysis and provide recommendations on how processing threads can be scheduled on multi-core nodes to enhance the performance of a class of grid applications that requires processing of large-scale XML data. In particular, we discuss the gains associated with the use of the adaptations we have made to the Cache-Affinity and Balanced-Set scheduling algorithms to improve L2 cache performance, and hence the overall application execution time.
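The abstract does not spell out the adapted scheduling algorithms, so the following is only a generic sketch of the cache-affinity idea it refers to: a task is preferably placed on a core that shares an L2 cache with the core it last ran on, so that data it touched earlier may still be cached. The function name, the `l2_groups` layout, and the greedy tie-breaking rule are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def assign_with_cache_affinity(tasks, l2_groups, last_core):
    """Greedy, illustrative cache-affinity placement (hypothetical helper).

    tasks:     list of task ids, in arrival order
    l2_groups: list of lists of core ids; cores in one inner list share an L2
    last_core: dict mapping task id -> core id it last ran on (may be partial)
    Returns a dict mapping core id -> list of task ids placed on that core.
    """
    # Map every core to the index of the L2 group it belongs to.
    core_to_group = {
        core: group for group, cores in enumerate(l2_groups) for core in cores
    }
    load = {core: 0 for core in core_to_group}
    placement = defaultdict(list)

    for task in tasks:
        prev = last_core.get(task)
        if prev in core_to_group:
            # Restrict the choice to cores sharing an L2 with the previous core.
            candidates = l2_groups[core_to_group[prev]]
        else:
            # No history for this task: any core is acceptable.
            candidates = list(load)
        # Least-loaded core first; on ties prefer the previous core, then lowest id.
        core = min(candidates, key=lambda c: (load[c], c != prev, c))
        placement[core].append(task)
        load[core] += 1

    return dict(placement)


if __name__ == "__main__":
    # Two L2 caches, each shared by two cores (an assumed topology).
    print(assign_with_cache_affinity(
        ["a", "b", "c", "d"], [[0, 1], [2, 3]], {"a": 0, "b": 0, "c": 3}
    ))
```

In this sketch, load balancing only happens within an L2 group once a task has history, which trades some balance for locality; a real scheduler would also need a policy for migrating tasks away from an overloaded group.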