Using idle workstations to implement predictive prefetching

Jasmine Y. Q. Wang, J. Ong, Y. Coady, M. Feeley
{"title":"利用空闲工作站实现预测性预取","authors":"Jasmine Y. Q. Wang, J. Ong, Y. Coady, M. Feeley","doi":"10.1109/HPDC.2000.868638","DOIUrl":null,"url":null,"abstract":"The benefits of Markov-based predictive prefetching have been largely overshadowed by the overhead required to produce high-quality predictions. While both theoretical and simulation results for prediction algorithms appear promising, substantial limitations exist in practice. This outcome can be partially attributed to the fact that practical implementations ultimately make compromises in order to reduce overhead. These compromises limit the level of algorithm complexity, the variety of access patterns and the granularity of trace data that the implementation supports. This paper describes the design and implementation of GMS-3P (Global Memory System with Parallel Predictive Prefetching), an operating system kernel extension that offloads prediction overhead to idle network nodes. GMS-3P builds on the GMS global memory system, which pages to and from remote workstation memory. In GMS-3P, the target node sends an online trace of an application's page faults to an idle node that is running a Markov-based prediction algorithm. The prediction node then uses GMS to prefetch pages to the target node from the memory of other workstations in the network. 
Our preliminary results show that predictive prefetching can reduce the remote-memory page fault time by 60% or more and that, by offloading prediction overhead to an idle node, GMS-3P can reduce this improved latency by between 24% and 44%, depending on the Markov model order.","PeriodicalId":400728,"journal":{"name":"Proceedings the Ninth International Symposium on High-Performance Distributed Computing","volume":"248 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2000-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Using idle workstations to implement predictive prefetching\",\"authors\":\"Jasmine Y. Q. Wang, J. Ong, Y. Coady, M. Feeley\",\"doi\":\"10.1109/HPDC.2000.868638\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The benefits of Markov-based predictive prefetching have been largely overshadowed by the overhead required to produce high-quality predictions. While both theoretical and simulation results for prediction algorithms appear promising, substantial limitations exist in practice. This outcome can be partially attributed to the fact that practical implementations ultimately make compromises in order to reduce overhead. These compromises limit the level of algorithm complexity, the variety of access patterns and the granularity of trace data that the implementation supports. This paper describes the design and implementation of GMS-3P (Global Memory System with Parallel Predictive Prefetching), an operating system kernel extension that offloads prediction overhead to idle network nodes. GMS-3P builds on the GMS global memory system, which pages to and from remote workstation memory. In GMS-3P, the target node sends an online trace of an application's page faults to an idle node that is running a Markov-based prediction algorithm. 
The prediction node then uses GMS to prefetch pages to the target node from the memory of other workstations in the network. Our preliminary results show that predictive prefetching can reduce the remote-memory page fault time by 60% or more and that, by offloading prediction overhead to an idle node, GMS-3P can reduce this improved latency by between 24% and 44%, depending on the Markov model order.\",\"PeriodicalId\":400728,\"journal\":{\"name\":\"Proceedings the Ninth International Symposium on High-Performance Distributed Computing\",\"volume\":\"248 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2000-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings the Ninth International Symposium on High-Performance Distributed Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPDC.2000.868638\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings the Ninth International Symposium on High-Performance Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPDC.2000.868638","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 8

Abstract

The benefits of Markov-based predictive prefetching have been largely overshadowed by the overhead required to produce high-quality predictions. While both theoretical and simulation results for prediction algorithms appear promising, substantial limitations exist in practice. This outcome can be partially attributed to the fact that practical implementations ultimately make compromises in order to reduce overhead. These compromises limit the level of algorithm complexity, the variety of access patterns and the granularity of trace data that the implementation supports. This paper describes the design and implementation of GMS-3P (Global Memory System with Parallel Predictive Prefetching), an operating system kernel extension that offloads prediction overhead to idle network nodes. GMS-3P builds on the GMS global memory system, which pages to and from remote workstation memory. In GMS-3P, the target node sends an online trace of an application's page faults to an idle node that is running a Markov-based prediction algorithm. The prediction node then uses GMS to prefetch pages to the target node from the memory of other workstations in the network. Our preliminary results show that predictive prefetching can reduce the remote-memory page fault time by 60% or more and that, by offloading prediction overhead to an idle node, GMS-3P can reduce this improved latency by between 24% and 44%, depending on the Markov model order.
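The paper does not include source for GMS-3P, but the core idea — an idle node consuming an online page-fault trace and answering with likely next pages — can be illustrated with a minimal first-order Markov predictor. This is a sketch under assumptions: the class name and methods are hypothetical, the real system supports higher-order models, and the prediction logic would run on a separate idle node rather than in-process.

```python
from collections import defaultdict, Counter

class MarkovPredictor:
    """First-order Markov model over a page-fault trace (illustrative only)."""

    def __init__(self):
        # transitions[p] counts which page faulted immediately after page p
        self.transitions = defaultdict(Counter)
        self.prev = None

    def record_fault(self, page):
        """Feed one page-fault event from the online trace."""
        if self.prev is not None:
            self.transitions[self.prev][page] += 1
        self.prev = page

    def predict(self, page, n=1):
        """Return up to n most likely successor pages, worth prefetching."""
        followers = self.transitions.get(page)
        if not followers:
            return []
        return [p for p, _ in followers.most_common(n)]

# Example: a trace with a repeating access pattern
predictor = MarkovPredictor()
for page in [1, 2, 3, 1, 2, 3, 1, 2, 4]:
    predictor.record_fault(page)

print(predictor.predict(1))  # page 2 always follows page 1 in this trace
```

In GMS-3P the analogous prediction, being too expensive to run on the target node, is shipped to an idle workstation; only the resulting prefetch requests flow back through GMS.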