
[1990] Proceedings. The 17th Annual International Symposium on Computer Architecture: Latest Papers

Memory consistency and event ordering in scalable shared-memory multiprocessors
K. Gharachorloo, D. Lenoski, J. Laudon, Phillip B. Gibbons, Anoop Gupta, J. Hennessy
A new model of memory consistency, called release consistency, that allows for more buffering and pipelining than previously proposed models is introduced. A framework for classifying shared accesses and reasoning about event ordering is developed. The release consistency model is shown to be equivalent to the sequential consistency model for parallel programs with sufficient synchronization. Possible performance gains from the less strict constraints of the release consistency model are explored. Finally, practical implementation issues are discussed, with the discussion concentrating on issues relevant to scalable architectures.
DOI: https://doi.org/10.1145/285930.285997
Citations: 451
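The abstract above says release consistency permits more buffering and pipelining than earlier models. A minimal sketch of why, under stated assumptions: accesses are labeled ordinary, acquire, or release; ordinary accesses wait for earlier acquires; a release waits for earlier ordinary accesses; and special accesses are assumed here to stay in program order among themselves (the stricter variant of the model). The names `directly_ordered`, `ORDINARY`, `ACQUIRE`, and `RELEASE` are hypothetical, and same-address data dependencies are deliberately ignored.

```python
# Toy model of the per-processor ordering constraints of release consistency.
# Labels and function name are illustrative assumptions, not from the paper.
ORDINARY, ACQUIRE, RELEASE = "ordinary", "acquire", "release"


def directly_ordered(earlier, later):
    """Return True if `later` must wait for `earlier` (both issued by one
    processor, `earlier` first in program order).

    Assumes special accesses (acquire/release) keep program order among
    themselves and ignores same-address dependencies.
    """
    if earlier == ACQUIRE:
        return True   # nothing after an acquire may perform before it
    if later == RELEASE:
        return True   # a release waits for everything issued before it
    if earlier != ORDINARY and later != ORDINARY:
        return True   # special accesses stay ordered with each other
    return False      # all other pairs may be buffered or pipelined


# A critical section: acquire lock, two data accesses, release lock.
program = [ACQUIRE, ORDINARY, ORDINARY, RELEASE]
for i in range(len(program)):
    for j in range(i + 1, len(program)):
        verdict = "ordered" if directly_ordered(program[i], program[j]) else "free"
        print(f"{i}:{program[i]} -> {j}:{program[j]}: {verdict}")
```

In this sketch the two ordinary accesses inside the critical section are not ordered with respect to each other, which is where the extra overlap comes from; only the acquire and release act as one-directional fences.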
Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers
N. Jouppi
Hardware techniques for improving the performance of caches are presented. Miss caching places a small, fully associative cache between a cache and its refill path. Misses in the cache that hit in the miss cache have only a 1-cycle miss penalty. Small miss caches of 2 to 5 entries are shown to be very effective in removing mapping conflict misses in first-level direct-mapped caches. Victim caching is an improvement to miss caching in that it loads the small fully associative cache with the victim of a miss and not the requested line. Small victim caches of 1 to 5 entries are even more effective at removing conflict misses than miss caching. Stream buffers prefetch cache lines starting at a cache miss address. The prefetched data are placed in the buffer and not in the cache. Stream buffers are useful in removing capacity and compulsory cache misses, as well as some instruction cache conflict misses. An extension to the basic stream buffer, called a multiway stream buffer, is introduced.
DOI: https://doi.org/10.1109/ISCA.1990.134547
Citations: 1447
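The victim-caching idea in the abstract above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's simulator: addresses are already cache-line numbers, the victim cache uses LRU replacement, and a victim-cache hit swaps the two lines. The class name `DirectMappedWithVictim` and its fields are hypothetical.

```python
from collections import OrderedDict


class DirectMappedWithVictim:
    """Direct-mapped cache backed by a small fully associative victim cache.

    Illustrative model only: tracks which line numbers are resident and
    counts outcomes; it does not model timing or data.
    """

    def __init__(self, num_sets, victim_entries):
        self.num_sets = num_sets
        self.lines = [None] * num_sets        # one resident line per set
        self.victim = OrderedDict()           # evicted lines, oldest first
        self.victim_entries = victim_entries
        self.hits = self.victim_hits = self.misses = 0

    def access(self, line_addr):
        idx = line_addr % self.num_sets
        resident = self.lines[idx]
        if resident == line_addr:
            self.hits += 1
            return "hit"
        if line_addr in self.victim:
            # Requested line was evicted recently: pull it back in.
            del self.victim[line_addr]
            self.victim_hits += 1
            outcome = "victim_hit"
        else:
            self.misses += 1
            outcome = "miss"
        self.lines[idx] = line_addr
        # The displaced line (the victim) goes into the victim cache.
        if resident is not None and self.victim_entries > 0:
            self.victim[resident] = True
            if len(self.victim) > self.victim_entries:
                self.victim.popitem(last=False)   # drop least recently added
        return outcome


# Lines 0 and 8 map to the same set of an 8-set cache and alternate.
trace = [0, 8, 0, 8, 0, 8]
plain = DirectMappedWithVictim(num_sets=8, victim_entries=0)
with_victim = DirectMappedWithVictim(num_sets=8, victim_entries=1)
for addr in trace:
    plain.access(addr)
    with_victim.access(addr)
print("misses without victim cache:", plain.misses)
print("misses with 1-entry victim cache:", with_victim.misses)
```

On this ping-pong trace, the plain direct-mapped cache misses on every access, while a single victim entry turns all but the two compulsory misses into victim-cache hits.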
A new approach to fast control of $r^2 \times r^2$ 3-stage Benes networks of $r \times r$ crossbar switches
A. Youssef, B. Arden
The authors introduce an approach to fast control of $N \times N$ three-stage Benes networks built from $r \times r$ crossbar switches. The approach consists of setting the leftmost column of switches to an appropriately chosen configuration so that the network becomes self-routed while still able to realize a given family of permutations. This approach requires that, for any given family of permutations, a configuration for the leftmost column be found. Such a family is called compatible, and the configuration of the leftmost column is called the compatibility factor. Compatibility is characterized, and a technique to determine compatibility and the compatibility factor is developed and applied to Omega-realizable permutations, the permutations needed to emulate a hypercube, and the families of permutations required by FFT, bitonic sorting, tree computations, multidimensional mesh and torus computations, and multigrid computations. An $O(\log^2 N)$ time routing algorithm for the three-stage Benes network is also developed. Finally, since only three compatibility factors are required by the preceding families of permutations, it is proposed that the first column be replaced by three multiplexed connections, yielding a self-routing network with strong communication capabilities.
DOI: https://doi.org/10.1109/ISCA.1990.134507
Citations: 0
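The abstract above refers to Omega-realizable permutations and to self-routing. As background only, the sketch below tests whether a permutation passes through a standard Omega network of 2 x 2 switches using destination-tag self-routing. It is not the authors' compatibility technique for $r \times r$ Benes networks; the function name `omega_realizable` and the choice of a power-of-two network with 2 x 2 switches are assumptions made for illustration.

```python
def omega_realizable(perm):
    """Return True if `perm` (perm[src] = dst) can be routed without
    conflicts through an Omega network of 2x2 switches.

    Each stage applies a perfect shuffle (rotate the address left by one
    bit) and then each switch sends the packet to the output selected by
    the next destination bit, most significant bit first.
    """
    size = len(perm)
    n = size.bit_length() - 1
    if size < 2 or (1 << n) != size:
        raise ValueError("network size must be a power of two, at least 2")
    mask = size - 1
    pos = list(range(size))                  # current line of each packet
    for stage in range(n):
        bit_idx = n - 1 - stage              # destination bit used this stage
        new_pos = []
        for src in range(size):
            p = pos[src]
            shuffled = ((p << 1) | (p >> (n - 1))) & mask   # perfect shuffle
            dest_bit = (perm[src] >> bit_idx) & 1
            new_pos.append((shuffled & ~1) | dest_bit)      # switch output
        if len(set(new_pos)) != size:
            return False                     # two packets want one output
        pos = new_pos
    return True


identity = list(range(8))
shift_by_one = [(i + 1) % 8 for i in range(8)]
bit_reversal = [int(format(i, "03b")[::-1], 2) for i in range(8)]
print(omega_realizable(identity))
print(omega_realizable(shift_by_one))
print(omega_realizable(bit_reversal))
```

A permutation that fails this test needs a richer network, such as a Benes network, to be realized in one pass, which is the setting the paper addresses.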