使用随机共享内存的大块同步并行模型优雅地降级系统

Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers Pub Date : 1995-06-27 DOI:10.1109/FTCS.1995.466969

Andreas G. Savva, T. Nanya

{"title":"使用随机共享内存的大块同步并行模型优雅地降级系统","authors":"Andreas G. Savva, T. Nanya","doi":"10.1109/FTCS.1995.466969","DOIUrl":null,"url":null,"abstract":"The bulk-synchronous parallel model (BSPM) was proposed as a bridging model for parallel computation by Valiant (1990). By using randomised shared memory (RSM), this model offers an asymptotically optimal emulation of the PRAM. By using the BSPM with RSM, we show how a gracefully degrading massively parallel system can be obtained through: memory duplication to ensure global memory integrity, and to speed up the reconfiguration; a global reconfiguration method that restores the logical properties of the system, after a fault occurs. We assume fail-stop processors, single faults, no spare processors, and no significant loss of network throughput as a result of faults. Work done during reconfiguration is shared equally among the live processors, with minimal coordination. The overhead of the scheme and the graceful degradation achieved depend on the program being executed. We evaluate the reconfiguration, overhead, and graceful degradation of the system experimentally.<<ETX>>","PeriodicalId":309075,"journal":{"name":"Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1995-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Gracefully degrading systems using the bulk-synchronous parallel model with randomised shared memory\",\"authors\":\"Andreas G. Savva, T. Nanya\",\"doi\":\"10.1109/FTCS.1995.466969\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The bulk-synchronous parallel model (BSPM) was proposed as a bridging model for parallel computation by Valiant (1990). By using randomised shared memory (RSM), this model offers an asymptotically optimal emulation of the PRAM. By using the BSPM with RSM, we show how a gracefully degrading massively parallel system can be obtained through: memory duplication to ensure global memory integrity, and to speed up the reconfiguration; a global reconfiguration method that restores the logical properties of the system, after a fault occurs. We assume fail-stop processors, single faults, no spare processors, and no significant loss of network throughput as a result of faults. Work done during reconfiguration is shared equally among the live processors, with minimal coordination. The overhead of the scheme and the graceful degradation achieved depend on the program being executed. We evaluate the reconfiguration, overhead, and graceful degradation of the system experimentally.<<ETX>>\",\"PeriodicalId\":309075,\"journal\":{\"name\":\"Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers\",\"volume\":\"6 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1995-06-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FTCS.1995.466969\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FTCS.1995.466969","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 4

摘要

块同步并行模型(BSPM)是Valiant(1990)提出的并行计算桥接模型。通过使用随机共享内存(RSM)，该模型提供了PRAM的渐近最优仿真。通过使用BSPM和RSM，我们展示了如何通过内存复制来获得优雅退化的大规模并行系统，以确保全局内存完整性，并加快重构;一种在故障发生后恢复系统逻辑属性的全局重新配置方法。我们假设有故障停止处理器、单个故障、没有备用处理器，并且没有由于故障而导致的网络吞吐量的重大损失。在重新配置过程中完成的工作在活动处理器之间平等地共享，很少进行协调。方案的开销和实现的优雅降级取决于正在执行的程序。我们通过实验评估了系统的重构、开销和优雅退化。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Gracefully degrading systems using the bulk-synchronous parallel model with randomised shared memory

The bulk-synchronous parallel model (BSPM) was proposed as a bridging model for parallel computation by Valiant (1990). By using randomised shared memory (RSM), this model offers an asymptotically optimal emulation of the PRAM. By using the BSPM with RSM, we show how a gracefully degrading massively parallel system can be obtained through: memory duplication to ensure global memory integrity, and to speed up the reconfiguration; a global reconfiguration method that restores the logical properties of the system, after a fault occurs. We assume fail-stop processors, single faults, no spare processors, and no significant loss of network throughput as a result of faults. Work done during reconfiguration is shared equally among the live processors, with minimal coordination. The overhead of the scheme and the graceful degradation achieved depend on the program being executed. We evaluate the reconfiguration, overhead, and graceful degradation of the system experimentally.<>

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers

自引率

0.00%

发文量

期刊最新文献

Design verification of a super-scalar RISC processor ARMOR: analyzer for reducing module operational risk Evaluation of software dependability based on stability test data Modeling and testing a critical fault-tolerant multi-process system Measuring robustness of a fault tolerant aerospace system