Predicting Application Behavior in Large Scale Shared-Memory Multiprocessors

Proceedings of the IEEE/ACM SC95 Conference. Pub Date: 1995-12-08. DOI: 10.1145/224170.224356

Karim Harzallah, K. Sevcik


Citations: 5

Abstract

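In this paper we present an analytical framework for parallel program performance prediction. The main thrust of this work is to provide a means for treating realistic applications within a single unified framework. Our approach is based upon the specification of a set of non-linear equations which describe the application, processor configuration, network and memory operations. These equations are solved iteratively, since the application execution rate depends on the communication latencies, which in turn depend on the execution rate. The iterative solution technique is found to be efficient, as it typically requires only a few iterations to reach convergence. Our modeling methodology achieves a good balance between abstraction and accuracy. This is attained by accounting for both the time and space dimensions of memory references, while maintaining a simple description of the workload. We demonstrate both the practicality and the accuracy of our approach by comparing predicted results with measurements taken on a commercial multiprocessor system. We found the model to be faithful in reflecting changes in processor speed, and changes in the number and placement of allocated processors.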
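The circular dependence between execution rate and communication latency can be illustrated with a small fixed-point iteration. The sketch below is not the authors' model: it assumes a hypothetical toy system in which `num_procs` identical processors each compute for `compute_time` between memory requests, and a single shared memory module with mean `service_time` is approximated by an M/M/1-style delay formula. All names and parameter values are illustrative assumptions.

```python
def solve_fixed_point(num_procs, compute_time, service_time,
                      tol=1e-9, max_iter=100):
    """Iterate between request rate and memory latency until they agree.

    Toy assumptions (not taken from the paper):
      rate    = 1 / (compute_time + latency)         per-processor request rate
      latency = service_time / (1 - utilization)     M/M/1-style memory delay
    """
    latency = service_time  # initial guess: no contention at the memory
    for iteration in range(1, max_iter + 1):
        # Execution rate depends on the current latency estimate.
        rate = 1.0 / (compute_time + latency)
        # Latency depends on the load that this rate places on the memory.
        utilization = num_procs * rate * service_time
        if utilization >= 1.0:
            raise ValueError("memory saturated; the toy formula does not apply")
        new_latency = service_time / (1.0 - utilization)
        if abs(new_latency - latency) < tol:
            return rate, new_latency, iteration
        latency = new_latency
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    rate, latency, iterations = solve_fixed_point(
        num_procs=4, compute_time=10.0, service_time=1.0)
    print(f"rate={rate:.6f} latency={latency:.6f} iterations={iterations}")
```

For lightly loaded configurations such as the one in the usage example, the update is strongly contracting, so the loop settles within a handful of iterations. The paper's actual equations also cover the network and the placement of processors, which this sketch omits.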


