MEDEA: a hybrid shared-memory/message-passing multiprocessor NoC-based architecture

S. Tota, M. Casu, M. R. Roch, Luca Rostagno, M. Zamboni
Published in: 2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)
Publication date: 2010-03-08
DOI: 10.1109/DATE.2010.5457237 (https://doi.org/10.1109/DATE.2010.5457237)
Citations: 37

Abstract

The shared-memory model has been adopted, both for data exchange and for synchronization using semaphores, in almost every on-chip multiprocessor implementation, ranging from general-purpose chip multiprocessors (CMPs) to domain-specific multi-core graphics processing units (GPUs). Low-latency synchronization is desirable but hard to achieve in practice because of the memory hierarchy. By contrast, an explicit exchange of synchronization tokens among the processing elements through dedicated on-chip links would benefit overall system performance. In this paper we propose the Medea NoC-based framework, a hybrid shared-memory/message-passing approach. Medea has been modeled with a fast, cycle-accurate SystemC implementation that enables rapid system exploration over several parameters, such as the number and types of cores, cache size and policy, and NoC features. In addition, every SystemC block has an RTL counterpart for physical implementation on FPGAs and ASICs. A parallel version of the Jacobi algorithm has been used as a test application to validate the methodology. Results confirm expectations about performance and about the effectiveness of system exploration and design.
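
The hybrid idea can be illustrated with a minimal Python sketch. This is not the Medea implementation and makes several assumptions: threads stand in for processing elements, two Python lists stand in for shared memory, and one `queue.Queue` per directed neighbour link stands in for a dedicated on-chip link carrying synchronization tokens. The function name `jacobi_hybrid`, the variables `grids` and `links`, and the 1-D Laplace test problem are all hypothetical choices made for illustration.

```python
import queue
import threading


def jacobi_hybrid(n_cells=8, n_workers=4, n_iters=500):
    """Jacobi iteration for u'' = 0 on (0, 1) with u(0) = 0 and u(1) = 1.

    Grid data lives in shared buffers; workers synchronize only by
    exchanging tokens with their neighbours (no shared semaphore).
    """
    # Shared memory: two buffers (read one, write the other each sweep).
    grids = [[0.0] * (n_cells + 2) for _ in range(2)]
    for g in grids:
        g[-1] = 1.0  # fixed right boundary; left boundary stays 0.0

    # Message passing: one token queue per directed neighbour link.
    links = {}
    for w in range(n_workers - 1):
        links[(w, w + 1)] = queue.Queue()
        links[(w + 1, w)] = queue.Queue()

    chunk = n_cells // n_workers

    def worker(w):
        lo = 1 + w * chunk
        # The last worker also takes any remainder cells.
        hi = n_cells + 1 if w == n_workers - 1 else lo + chunk
        neighbours = [v for v in (w - 1, w + 1) if 0 <= v < n_workers]
        for k in range(n_iters):
            old, new = grids[k % 2], grids[(k + 1) % 2]
            for i in range(lo, hi):
                new[i] = 0.5 * (old[i - 1] + old[i + 1])
            # Tell each neighbour that sweep k is finished ...
            for v in neighbours:
                links[(w, v)].put(k)
            # ... and wait until each neighbour has finished it too.
            for v in neighbours:
                links[(v, w)].get()

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The final sweep wrote into buffer n_iters % 2.
    return grids[n_iters % 2][1:-1]


if __name__ == "__main__":
    print(jacobi_hybrid())
```

A separate queue per link matters in this sketch: with a single inbox per worker, two tokens from a fast neighbour could be mistaken for one token from each neighbour. Each worker waits only on its immediate neighbours rather than on a global barrier, which mirrors the point-to-point token exchange described in the abstract.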