
2009 International Conference on Reconfigurable Computing and FPGAs: Latest Papers

Self-Adaptive Network Interface (SANI): Local Component of a NoC Configuration Manager
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.62
Rachid Dafali, J. Diguet
This paper presents our approach to meeting the reconfiguration needs of NoCs. We introduce our motivations and then detail our strategy, which is based on local (delegate) and global configuration managers. Finally, we describe an original self-adaptive Network Interface architecture that is part of the configuration manager and is in charge of run-time buffer sizing. The challenge is clearly a tradeoff between the complexity of implementing decisions and the expected gains in cost and performance. Our results, obtained on an FPGA within an emulator board, demonstrate the value of the proposed approach.
Citations: 6
Virtualization of Computing Resources in RCS for Multi-task Stream Applications
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.51
L. Kirischian, V. Dumitriu, P. Chun
The ability to distribute FPGA resources in the temporal domain for multi-modal and multi-task workloads conceptually allows virtualization of logic, communication, and input/output resources, similar to memory virtualization in advanced conventional computers (e.g. superscalar). This, in turn, can dramatically increase the cost-effectiveness of FPGA-based Reconfigurable Computing Systems (RCS). In the presented “proof-of-concept” research, the following topics have been investigated, developed, and tested: i) the architecture of a platform to support the dynamic allocation of Application Specific Virtual Processors (ASVP), ii) mechanisms for run-time on-chip assembly of ASVP from Virtual Hardware Components (VHC), and iii) mechanisms for run-time relocation of on-chip components (VHC) in predetermined regions of the FPGA device. These mechanisms have been implemented and tested on a specially developed platform: the Multi-task Adaptive Reconfigurable System (MARS) Platform. The actual application of MARS was prototyping a high-performance multi-mode stereo-vision system (200 fps) for the next generation of space-borne computing platforms.
Citations: 1
High-Level FPGA Programming through Mapping Process Networks to FPGA Resources
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.73
F. Mayer-Lindenberg
We describe a simple and fast approach to FPGA programming that makes it possible to efficiently exploit the numeric processing capabilities of recent FPGA chips. It basically consists of programming on top of a library of complex components for FPGA-based scalable processor networks and providing a high-level programming interface to it. The FPGA application is presented as a network of processes, which a compiler automatically transforms into a corresponding network of simple processor components. The compiler then generates individual program code for each of the simple processors. The coarse-grained processor network is eventually compiled into an FPGA configuration bitstream using standard FPGA tools at close-to-interactive speeds. Our approach has the additional benefit of being fully compatible with processor programming and extensible to mixed multi-component FPGA and processor systems. An experimental implementation of the process mapping scheme uses the p-Nets language, which provides convenient structures for presenting the application processes and supports composite targets, including processors linked to the FPGA chips. The evaluation of our concept on some FPGA chips includes an estimate of their floating-point processing performance.
Citations: 18
Lightweight Cryptography for FPGAs
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.54
P. Yalla, J. Kaps
The advent of new low-power Field Programmable Gate Arrays (FPGAs) for battery-powered devices opens a host of new applications to FPGAs. In order to provide security on resource-constrained devices, lightweight cryptographic algorithms have been developed. However, there has not been much research on porting these algorithms to FPGAs. In this paper we propose lightweight cryptography for FPGAs by introducing block-cipher-independent optimization techniques for Xilinx Spartan-3 FPGAs and applying them to the lightweight cryptographic algorithms HIGHT and PRESENT. Our implementations are the first reported implementations of these block ciphers on FPGAs. Furthermore, they are the smallest block cipher implementations on FPGAs, using only 117 and 91 slices respectively, which makes them comparable in size to stream cipher implementations. Both are less than half the size of the AES implementation by Chodowiec and Gaj without using block RAMs. PRESENT's throughput-over-area ratio of 240 Kbps/slice is similar to that of AES; however, HIGHT outperforms both by far with 720 Kbps/slice.
Citations: 123
Speeding up Fault Injection for Asynchronous Logic by FPGA-Based Emulation
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.35
M. Jeitler, J. Lechner
While the stability and robustness of synchronous circuits become increasingly problematic due to shrinking feature sizes, delay-insensitive asynchronous circuits are supposed to provide inherent protection against various fault types. However, results on experimental evaluation and analysis of these fault tolerance properties are scarce, mainly due to the lack of suitable prototyping platforms. Using a soft-core processor as an example, this paper shows how an off-the-shelf FPGA can be used for asynchronous Four State Logic designs, on which future fault injection experiments will be conducted.
Citations: 4
Tailoring a Reconfigurable Platform to SHA-256 and HMAC through Custom Instructions and Peripherals
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.40
M. Juliato, C. Gebotys
This paper introduces the specialization of a NIOS2 processor targeting the computation of message authentication codes and integrity checks in constrained environments. Several hardware/software partitioning levels are considered, ranging from simple functions implemented as custom instructions to complete algorithms implemented as peripherals. Our experimental results show that implementing the functions Sum, Sig, Ch, and Maj as custom instructions accelerates SHA-256 and HMAC by factors of 1.38 and 1.36 respectively, while keeping a small area footprint. If the entire SHA-256 algorithm is implemented as a peripheral, the hash computation is performed 11 times faster while the program size decreases by 16%. Furthermore, the HMAC/SHA-256 peripheral accelerates the computation of a message authentication code 19 times with a 26% smaller program. These results allow the computational platform of constrained embedded systems to be specialized to the processing requirements of cryptographic applications that perform message authentication and integrity checks.
Citations: 10
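To make the hardware/software partitioning in the SHA-256 and HMAC entry above more concrete, the following is a minimal software sketch of the four small functions that the abstract names as custom-instruction candidates, and of where they sit inside SHA-256 and HMAC. It follows the public SHA-256 and HMAC definitions, not the authors' code. The mapping of the abstract's "Sum" and "Sig" to the standard's upper-case and lower-case sigma functions is an assumption, and all Python names are illustrative.

```python
import hashlib
import hmac
import math

MASK = 0xFFFFFFFF  # SHA-256 works on 32-bit words


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


# The four function families named in the abstract (name mapping assumed).
def ch(x, y, z):
    return (x & y) ^ (~x & z & MASK)


def maj(x, y, z):
    return (x & y) ^ (x & z) ^ (y & z)


def big_sigma0(x):  # assumed to be one of the "Sum" functions
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)


def big_sigma1(x):
    return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)


def small_sigma0(x):  # assumed to be one of the "Sig" functions
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def small_sigma1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def _primes(n):
    found, c = [], 2
    while len(found) < n:
        if all(c % p for p in found):
            found.append(c)
        c += 1
    return found


def _icbrt(n):
    """Exact integer cube root by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


# Round constants and initial state: the first 32 fractional bits of the
# cube roots (first 64 primes) and square roots (first 8 primes).
K = [_icbrt(p << 96) & MASK for p in _primes(64)]
H0 = [math.isqrt(p << 64) & MASK for p in _primes(8)]


def sha256(message: bytes) -> bytes:
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += (len(message) * 8).to_bytes(8, "big")
    state = list(H0)
    for off in range(0, len(padded), 64):
        w = [int.from_bytes(padded[off + 4 * i: off + 4 * i + 4], "big")
             for i in range(16)]
        for t in range(16, 64):  # message schedule uses the small sigmas
            w.append((small_sigma1(w[t - 2]) + w[t - 7]
                      + small_sigma0(w[t - 15]) + w[t - 16]) & MASK)
        a, b, c, d, e, f, g, h = state
        for t in range(64):  # each round uses Ch, Maj and the big sigmas
            t1 = (h + big_sigma1(e) + ch(e, f, g) + K[t] + w[t]) & MASK
            t2 = (big_sigma0(a) + maj(a, b, c)) & MASK
            a, b, c, d, e, f, g, h = ((t1 + t2) & MASK, a, b, c,
                                      (d + t1) & MASK, e, f, g)
        state = [(s + v) & MASK
                 for s, v in zip(state, (a, b, c, d, e, f, g, h))]
    return b"".join(s.to_bytes(4, "big") for s in state)


def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    if len(key) > 64:  # keys longer than one block are hashed first
        key = sha256(key)
    key = key.ljust(64, b"\x00")
    ipad = bytes(k ^ 0x36 for k in key)
    opad = bytes(k ^ 0x5C for k in key)
    return sha256(opad + sha256(ipad + msg))


if __name__ == "__main__":
    # Cross-check against the standard library implementations.
    print(sha256(b"abc") == hashlib.sha256(b"abc").digest())
    print(hmac_sha256(b"key", b"msg")
          == hmac.new(b"key", b"msg", hashlib.sha256).digest())
```

In this sketch the six small functions are called on every round and every schedule step, which is consistent with them being natural candidates for custom instructions, while `sha256` and `hmac_sha256` correspond to the coarser partitioning levels the abstract describes as peripherals.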
Design and Performance of a Grid of Asynchronously Clocked Run-Time Reconfigurable Modules on a FPGA
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.24
Jochen Strunk, Toni Volkmer, W. Rehm, H. Schick
This paper examines the feasibility of utilizing a grid of asynchronously clocked run-time reconfigurable modules (RTRMs) on a dynamically and partially reconfigurable (DPR) FPGA. In contrast to the synchronously clocked grids studied in prior research, the design, implementation, performance, and resource utilization of an asynchronously clocked grid are presented. Such a run-time reconfigurable (RTR) grid on an FPGA can be utilized to dynamically offload compute functions on a host-coupled system, providing multi-user and multi-context execution in response to user demands. For embedded systems, it can be utilized as a highly dynamic platform that provides functional enhancement through module replacement at run time. The presented platform leverages synthesis and development constraints and is able to increase the overall throughput by allowing multiple clock domains within the grid. The performance and the additional resource utilization of handling multiple clock domains are compared to those of synchronously clocked grids. As a proof of concept, a case study with a grid of 47 RTRMs is conducted on state-of-the-art Virtex-5 FPGAs.
Citations: 0
Modeling and Analyzing of Blocking Time Effects on Power Consumption in Network-on-Chips
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.48
Arghavan Asad, A. E. Zonouz, M. Seyrafi, M. Soryani, M. Fathy
Network-on-Chip (NoC) has been proposed as the only efficient and scalable solution for providing global on-chip communications in any large VLSI design. Simultaneously, power dissipation issues have grown to such importance that they now constrain attainable performance. The large value of power consumption, relative to the active power, can therefore have serious implications for the feasibility of deploying NoCs. If NoCs are to be accepted, their full power implications need to be known. Moreover, these power characteristics must be accurately understood across the large possible design space of NoCs. Blocking time is one of the factors that affect NoC power consumption. In this paper we present a Markovian model for evaluating the amount of dissipated power that comes from packet blocking, and we show the effects of blocking time on the total power consumption of on-chip networks.
Citations: 4
On the Implementation of Central Pattern Generators for Periodic Rhythmic Locomotion
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.81
C. Torres-Huitzil
This paper presents a feasibility study of an efficient digital hardware implementation of a neural model that generates locomotion patterns of periodic rhythmic movements, inspired by the biological neural networks found in animal nervous systems called Central Pattern Generators (CPGs). The proposed implementation contains a dedicated digital module that mimics the functionality and organization of the fundamental Amari-Hopfield CPG. This module is attached to an embedded processor running the uClinux operating system. The present paper deals only with the implementation of the basic CPG component and with how to embed it under a System on a Chip (SoC) approach so that it can be controlled by external commands in a high-level, transparent way for application development. The system is implemented on a Field Programmable Gate Array (FPGA) device, providing a compact, flexible, and expandable solution for generating periodic rhythmic patterns in robot control applications. According to experimental results, the architecture can be used as a basis for a biomimetic intelligent embedded control platform for articulated autonomous robots.
Citations: 5
A Systolic Array Based Architecture for Implementing Multivariate Polynomial Interpolation Tasks
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.70
R. Arce-Nazario, E. Orozco, D. Bollman
Multivariate polynomial interpolation is a key computation for the reverse engineering of genetic networks modeled by finite fields. Faster implementations of such algorithms are needed to cope with the increasing quantity and complexity of genetic data. Our implementation of an interpolation methodology on an FPGA has led us to identify a systolic-array-based hardware architecture that is useful for performing at least three interpolation sub-tasks: Boolean cover, uniqueness, and multivariate polynomial addition. We present a generalization of these algorithms that simplifies mapping to the systolic-array structure, as well as control and storage considerations that guarantee correct results when the input sequence is longer than the processing array. The three interpolation sub-tasks were modeled and implemented on an FPGA using the proposed structure, obtaining speedups of up to 172x compared to a software implementation while achieving low resource utilization.
Citations: 1
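As a point of reference for the "multivariate polynomial addition" sub-task named in the systolic-array entry above, here is a minimal software sketch of adding two multivariate polynomials over a prime field GF(p). It is not the systolic architecture from the paper and not the authors' algorithm. The sparse dictionary representation, the restriction to prime fields, and the names `Poly` and `poly_add` are assumptions made purely for illustration.

```python
from typing import Dict, Tuple

# A monomial is a tuple of exponents, one per variable, e.g. (1, 0) is x1.
Monomial = Tuple[int, ...]
# A polynomial maps each monomial to a nonzero coefficient in 0..p-1.
Poly = Dict[Monomial, int]


def poly_add(f: Poly, g: Poly, p: int) -> Poly:
    """Add two sparse multivariate polynomials over GF(p), p prime.

    Inputs are assumed to be already reduced (coefficients in 1..p-1).
    Terms whose coefficients cancel modulo p are dropped from the result.
    """
    result = dict(f)
    for mono, coeff in g.items():
        total = (result.get(mono, 0) + coeff) % p
        if total:
            result[mono] = total
        else:
            result.pop(mono, None)
    return result


if __name__ == "__main__":
    # Over GF(2): (x1*x2 + x1) + (x1 + 1) = x1*x2 + 1
    f = {(1, 1): 1, (1, 0): 1}
    g = {(1, 0): 1, (0, 0): 1}
    print(poly_add(f, g, 2))
```

A dictionary lookup replaces the term matching that a hardware design has to perform explicitly, so this sketch only fixes the expected input/output behavior of the sub-task.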