An Efficient, Self-Contained, On-chip Directory: DIR1-SISD

Mahdad Davari, Alberto Ros, Erik Hagersten, S. Kaxiras
Published in: 2015 International Conference on Parallel Architecture and Compilation (PACT)
Publication date: 2015-10-18
DOI: 10.1109/PACT.2015.23 (https://doi.org/10.1109/PACT.2015.23)
Citations: 14

Abstract

Directory-based cache coherence is the de facto standard for scalable shared-memory multi/many-cores, and significant effort is invested in reducing its overhead. However, directory area and complexity optimizations are often antithetical to each other. Novel directory-less coherence schemes have been introduced to remove the complexity and cost associated with directories in their entirety. However, such schemes introduce new challenges by transferring some of the directory complexity and functionality to the OS and by using the page table and the TLBs to store data classification information. In this work we bridge the gap between directory-based and directory-less coherence schemes and propose a hybrid scheme called DIR1-SISD, which employs self-invalidation and self-downgrade as directory policies for the shared entries. DIR1-SISD allows simultaneous optimizations in area and complexity without relying on the OS. DIR1-SISD either keeps track of a single (private) owner or allows multiple readers and multiple writers to exist simultaneously by transferring the responsibility for their coherence to the corresponding cores. A DIR1-SISD self-contained directory cache has a unique ability to minimize eviction-induced complexities by allowing directory entries to be evicted without maintaining inclusion with the cached data (thus avoiding the complexities of broadcasts) and without the need for a backing store. Using simulation we show that a small, self-contained DIR1-SISD cache outperforms a traditional DIR16-NB MESI protocol with a directory cache embedded in the LLC (8% in execution time and 15% in traffic) and, further, outperforms a SISD protocol that relies on the OS to provide a persistent page-based directory (4% in execution time and 20% in traffic).
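
To make the entry format described in the abstract concrete, the following is a minimal toy sketch of a directory cache whose entries hold at most one owner pointer and otherwise fall back to a shared state with no sharer list. It is not the authors' implementation. The class name `Dir1SisdSketch`, the LRU replacement, the handling of a directory miss (allocate the block as private to the requester), and the single notification sent to the owner when a private entry is evicted are all assumptions made for illustration; the abstract specifies none of them.

```python
from collections import OrderedDict


class Dir1SisdSketch:
    """Toy directory cache (hypothetical): one owner pointer per entry, no sharer list."""

    def __init__(self, capacity):
        self.capacity = capacity
        # block address -> owner core id, or None once the block is shared.
        self.entries = OrderedDict()
        # Log of (core, block) point-to-point notifications; nothing is broadcast.
        self.messages = []

    def access(self, core, block):
        """Return the classification the requesting core should apply to the block."""
        if block not in self.entries:
            # Assumption: a directory miss allocates the block as private.
            self._make_room()
            self.entries[block] = core
            return "private"
        self.entries.move_to_end(block)  # LRU bookkeeping (assumed policy)
        owner = self.entries[block]
        if owner == core:
            return "private"
        if owner is not None:
            # A second core arrived: notify the old owner once, then stop tracking.
            self.messages.append((owner, block))
            self.entries[block] = None
        # Shared blocks are kept coherent by the cores themselves through
        # self-invalidation and self-downgrade, so no sharer list is stored.
        return "shared"

    def _make_room(self):
        """Evict without a backing store and without invalidating cached data."""
        while len(self.entries) >= self.capacity:
            block, owner = self.entries.popitem(last=False)
            if owner is not None:
                # Assumption: the lone owner is told to treat the block as shared.
                self.messages.append((owner, block))
            # A shared entry is simply dropped. A real protocol must define how
            # such a block is reclassified later; this sketch restarts it as private.


directory = Dir1SisdSketch(capacity=2)
first = directory.access(core=0, block=0xA)   # core 0 becomes the private owner
second = directory.access(core=1, block=0xA)  # block 0xA turns shared
```

The point of the sketch is the storage cost: each entry needs one core identifier rather than a 16-pointer or full bit-vector sharer set, and eviction sends at most one message because the directory never has to locate multiple sharers.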