2017 30th IEEE International System-on-Chip Conference (SOCC): Latest Papers

Router-level performance driven dynamic management in hierarchical networks-on-chip
Pub Date : 2017-09-01 DOI: 10.1109/SOCC.2017.8226067
Mingmin Bai, Dan Zhao, M. Bayoumi
As on-chip interconnection networks scale to integrate more processing elements, average end-to-end latency rises sharply because the average hop distance grows. Although almost all communication in large-scale networks takes place between nodes within a short range, the small portion of traffic exchanged between distant nodes consumes most of the network bandwidth. Hierarchical NoCs offer an attractive way to handle distant data transmission by exploiting the network hierarchy. However, they introduce a new and severe congestion challenge because traffic is distributed unevenly across the hierarchy. In previous work, we applied a detouring scheme to a layered hierarchical NoC: when congestion forms on the access link to the adjacent hierarchical layer, the scheme finds a nearby node and reroutes packets through it to reach that layer. That work revealed that the links that carry packets up to higher layers are especially important for distributing traffic and avoiding congestion between hierarchy levels. In this paper, we propose dynamic schemes to solve the congestion problem introduced by region-based hierarchical routing on a hierarchical NoC. The results show that the dynamic approaches manage congestion efficiently under heavier long-range traffic loads, significantly reducing average network latency and increasing throughput under mixed synthetic traffic patterns.
Citations: 3
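The detouring idea described in the abstract above (when the access link to the next layer is congested, reroute to a nearby node that can reach that layer) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 2D-mesh hop metric, the `uplink_load` occupancy value, the `threshold` and `max_detour` parameters, and the tie-breaking rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    x: int
    y: int
    uplink_load: float  # hypothetical occupancy of the node's access link, 0.0 to 1.0


def hops(a: Node, b: Node) -> int:
    """Manhattan hop distance on a 2D mesh (assumed topology)."""
    return abs(a.x - b.x) + abs(a.y - b.y)


def pick_access_node(src, neighbours, threshold=0.8, max_detour=2):
    """Choose the node whose access link is used to reach the next layer.

    Use the source's own link unless it is congested; otherwise detour to the
    closest neighbour (ties broken by lower load) whose link is below the
    threshold. If no such neighbour exists, keep the source's own link.
    """
    if src.uplink_load < threshold:
        return src
    usable = [n for n in neighbours
              if n.uplink_load < threshold and hops(src, n) <= max_detour]
    if not usable:
        return src
    return min(usable, key=lambda n: (hops(src, n), n.uplink_load))


src = Node(0, 0, 0.95)  # congested access link
nearby = [Node(1, 0, 0.90), Node(0, 2, 0.30), Node(1, 1, 0.50)]
print(pick_access_node(src, nearby))
```

In this toy case the one-hop neighbour is itself congested, so the choice falls on the less loaded of the two candidates that are two hops away.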
Generative adversarial network based scalable on-chip noise sensor placement
Pub Date : 2017-09-01 DOI: 10.1109/SOCC.2017.8226048
Jinglan Liu, Yukun Ding, Jianlei Yang, Ulf Schlichtmann, Yiyu Shi
The relentless efforts towards power reduction of integrated circuits have led to the prevalence of near-threshold computing paradigms. With the significantly reduced noise margin, it is therefore no longer possible to fully assure power integrity at design time. As a result, designers seek to contain noise violations, commonly known as voltage emergencies, through various runtime techniques. All these techniques require accurate capture of voltage emergencies through noise sensors. Although existing approaches have explored the optimal placement of noise sensors, they all rely on statistical modeling of noise, which requires a large number of samples in a high-dimensional space. For large-scale power grids, these techniques may not work because of the very long simulation time required to obtain the samples. In this paper, we explore a novel approach based on a generative adversarial network (GAN), which requires only a small number of samples to train. Experimental results show that, compared with a simple heuristic that takes in the same number of samples, our approach can reduce the miss rate of voltage emergency detection by up to 65.3% on an industrial design.
Citations: 7
Secure digital communication based on Lorenz stream cipher
Pub Date : 2017-09-01 DOI: 10.1109/SOCC.2017.8225999
A. Alshammari, M. Sobhy, P. Lee
A new cryptosystem approach based on Lorenz chaotic systems is presented for secure data transmission. The system uses a stream cipher in which the encryption key varies continuously. Furthermore, one or more of the parameters of the Lorenz generator are controlled by an auxiliary chaotic generator for increased security. The system is implemented using two separate Spartan 6 FPGA boards. Security analysis (Section VII) shows the system to have a high degree of security compared to other communication systems.
Citations: 2
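The general idea in the abstract above (a keystream drawn from a Lorenz system, with one Lorenz parameter driven by an auxiliary chaotic generator) can be sketched in software. This is only an illustration under stated assumptions, not the authors' FPGA design: the Euler integration step, the classic values sigma = 10 and beta = 8/3, the logistic map used as the auxiliary generator, the range over which `rho` is modulated, and the byte-extraction rule are all chosen here for the example. A floating-point toy like this has not been vetted as a secure cipher.

```python
def lorenz_keystream(n, key=(1.0, 1.0, 1.0), aux=0.3, dt=0.005, warmup=1000):
    """Return n keystream bytes from an Euler-integrated Lorenz system.

    key    -- initial (x, y, z) state, playing the role of the secret key
    aux    -- initial state of the auxiliary logistic map (assumed generator)
    warmup -- steps discarded so the output does not expose the initial state
    """
    x, y, z = key
    sigma, beta = 10.0, 8.0 / 3.0
    out = bytearray()
    for i in range(warmup + n):
        aux = 3.99 * aux * (1.0 - aux)   # logistic map, stays within [0, 1]
        rho = 27.0 + 2.0 * aux           # auxiliary generator modulates rho
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        if i >= warmup:
            out.append(int(abs(x) * 1e6) % 256)  # assumed byte-extraction rule
    return bytes(out)


def xor_cipher(data: bytes, **params) -> bytes:
    """Encrypt or decrypt by XOR with the keystream (the operation is symmetric)."""
    stream = lorenz_keystream(len(data), **params)
    return bytes(a ^ b for a, b in zip(data, stream))


message = b"SOCC 2017"
ciphertext = xor_cipher(message, key=(0.1, 0.2, 0.3))
recovered = xor_cipher(ciphertext, key=(0.1, 0.2, 0.3))
print(recovered == message)
```

Both ends must reproduce the same trajectory exactly, so transmitter and receiver need identical arithmetic; in hardware this is typically a reason to use fixed-point rather than floating-point state.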
Haar-based interconnect coding for energy effective medium/long range data transport
Pub Date : 2017-09-01 DOI: 10.1109/SOCC.2017.8226081
N. C. Laurenciu, S. Cotofana
In this paper we introduce and evaluate Haar-based codec-assisted medium- and long-range data transport structures (e.g., bus segments and Network-on-Chip interconnects) that can cope with technology-scaling-related phenomena (e.g., increased susceptibility to proximity coupling noise and transmission delay variability). They target energy savings at the expense of a reasonably small overhead: 1 extra wire, a 2-gate encoder, and a 2-gate decoder for every pair of uncoded wires. For practical evaluation we employed a 45 nm commercial CMOS technology and different random, uncorrelated workload profiles. For 5 mm and 10 mm long 8-bit buses (without repeaters), we obtain energy savings of 55% and 34% and transmission frequency increases of 35% and 41%, respectively, at the expense of less than 1% area overhead with respect to the reference system (i.e., an 8-wire synchronous uncoded bus), which demonstrates the energy and delay effectiveness of the approach. We further augment our proposal with a Single Error Correction and Double Error Detection (SECDED) scheme specifically adapted to its structure, in order to cope with transmission errors induced by very-deep-sub-micron noise (e.g., supply voltage variations, electromagnetic interference). When compared to the reference system (not SECDED protected), for 10 mm long buses, our Haar-tailored SECDED approach consumes 27% less energy at the expense of 2% area overhead.
Citations: 0
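The per-pair overhead quoted in the abstract above (1 extra wire, a 2-gate encoder, and a 2-gate decoder for every pair of uncoded wires) translates into wire and gate counts for a whole bus as follows. This is a small bookkeeping sketch only; the function name and the returned field names are made up for the example, and it does not model the Haar code itself.

```python
def coded_bus_overhead(uncoded_wires: int) -> dict:
    """Wire and gate counts implied by the per-pair overhead in the abstract."""
    if uncoded_wires % 2:
        raise ValueError("this sketch assumes an even number of uncoded wires")
    pairs = uncoded_wires // 2
    return {
        "pairs": pairs,
        "extra_wires": pairs,              # 1 extra wire per pair
        "total_wires": uncoded_wires + pairs,
        "encoder_gates": 2 * pairs,        # one 2-gate encoder per pair
        "decoder_gates": 2 * pairs,        # one 2-gate decoder per pair
    }


print(coded_bus_overhead(8))
```

For the 8-bit buses evaluated in the abstract this gives 4 pairs, hence 4 extra wires (12 in total) and 8 gates on each side of the link.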
Low-noise high input impedance 8-channels chopper-stabilized EEG acquisition system
Pub Date : 2017-09-01 DOI: 10.1109/SOCC.2017.8226005
Z. Yan, M. Atef, Guoxing Wang, Y. Lian
This paper presents the design and implementation of an 8-channel low-noise chopper-stabilized analog front-end (AFE) for an electroencephalogram (EEG) acquisition system. Each channel of the AFE is composed of an AC-coupled chopper instrumentation amplifier (ACCIA), a programmable gain amplifier (PGA), and a buffer. A positive feedback loop is adopted to boost its input impedance while the low-pass property suppresses the chopping ripple. The proposed AFE is implemented in 0.35 μm CMOS technology together with the ADC, MUX, digital part, and other control blocks. Post-layout simulation results show that the AFE achieves 46/52/58/64 dB programmable gain, 108 dB CMRR, and 0.32 μVrms input-referred noise for a bandwidth of 0.5–150 Hz. Each channel consumes 7.5 μA from a 3 V supply.
Citations: 2
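The figures in the abstract above can be turned into more familiar quantities with a short calculation: the programmable gain settings as linear voltage gain, and the supply current as power. The only inputs are the numbers quoted in the abstract; the assumption that the eight channels simply add (ignoring the ADC, MUX, and digital blocks) is made here and is not stated in the source.

```python
gains_db = [46, 52, 58, 64]          # programmable gain settings from the abstract
channel_current_a = 7.5e-6           # 7.5 uA per channel
supply_v = 3.0                       # 3 V supply
channels = 8

# Voltage gain in V/V: G = 10 ** (dB / 20)
gains_linear = [10 ** (g / 20) for g in gains_db]

per_channel_uw = channel_current_a * supply_v * 1e6
all_channels_uw = per_channel_uw * channels   # AFE channels only (assumption)

for g_db, g_lin in zip(gains_db, gains_linear):
    print(f"{g_db} dB = {g_lin:.1f} V/V")
print(f"per channel: {per_channel_uw:.1f} uW, {channels} channels: {all_channels_uw:.1f} uW")
```

Each 6 dB step roughly doubles the voltage gain, which is why the four settings span about a factor of eight.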
Magneto-electric magnetic tunnel junction based analog circuit options
Pub Date : 2017-09-01 DOI: 10.1109/SOCC.2017.8226032
N. Sharma, J. Bird, P. Dowben, A. Marshall
The magneto-electric magnetic tunnel junction (ME-MTJ) is a voltage-controlled beyond-CMOS device based on the principle of ME anti-ferromagnetic (AFM) exchange biasing of chromia (Cr2O3) and the tunneling magnetoresistance (TMR) of a magnetic tunnel junction (fixed/free ferromagnet (FM) stack). These devices have previously been demonstrated for digital logic and memory applications. Here we demonstrate their analog capabilities with a variety of analog functions adapted specifically to the characteristics of ME-MTJ-based devices. The novel circuit options proposed in this paper include an ME-MTJ-based analog comparator and two variations of an 8-level analog-to-digital converter (ADC) using serial and parallel ME-MTJ circuit configurations.
Citations: 0
A Linux-based support for developing real-time applications on heterogeneous platforms with dynamic FPGA reconfiguration
Pub Date : 2017-09-01 DOI: 10.1109/SOCC.2017.8226015
Marco Pagani, Alessio Balsini, Alessandro Biondi, Mauro Marinoni, G. Buttazzo
Heterogeneous computing platforms including both processors and field programmable gate arrays (FPGAs) represent an attractive solution for balancing software flexibility with high performance and energy efficiency of custom hardware modules. Furthermore, the dynamic partial reconfiguration (DPR) capabilities of modern FPGAs allow virtualizing the available area to support several hardware modules in time sharing, hence making them even more attractive. Such a feature is exploited by the FRED framework, recently proposed to support the development of real-time applications upon such platforms. This paper presents an implementation of the FRED framework for the Linux operating system over the Zynq-7000 platform produced by Xilinx. Design solutions for managing hardware accelerators are first discussed. Then, a software architecture for Linux is presented, which comprises (i) support for shared-memory communication with hardware accelerators, (ii) an improved driver to handle the FPGA reconfiguration and (iii) a scheduler for requests of hardware acceleration. The proposed solution allows exploiting the enormous number of software systems available for Linux (such as drivers, libraries, communication stacks, etc.) and the typical programming flexibility of software, while relying on predictable hardware acceleration of heavy computations.
Citations: 17
A low-pass continuous-time delta-sigma interface circuit for wideband MEMS gyroscope readout ASIC
Pub Date : 2017-09-01 DOI: 10.1109/SOCC.2017.8226002
Youngtae Yang, Jaehoon Jun, Suhwan Kim
This paper presents a low-noise, low-power CMOS interface circuit for a MEMS gyroscope readout ASIC. Our interface circuit is composed of a continuous-time delta-sigma modulator, an anti-aliasing filter, and an on-chip reference generator. Because a low-pass delta-sigma modulator is used instead of a band-pass one, no frequency matching circuit is needed, which enables wideband operation. A switched-capacitor resistor digital-to-analog converter is exploited to reduce the clock jitter sensitivity of the modulator. An anti-aliasing filter rejects out-of-band signals, and a low-noise on-chip reference generator is embedded for miniaturization. The proposed circuit is realized in a 0.18 μm CMOS process. It achieves a 70.3 dB signal-to-noise ratio in a signal bandwidth from 29.5 kHz to 30.5 kHz with only a 0.2 V differential peak-to-peak input. It dissipates 2.6 mW from a 3.3 V supply.
Citations: 0
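A few derived quantities follow directly from the numbers in the abstract above: the width of the signal band, the average supply current, and a rough resolution estimate. The resolution line uses the common rule of thumb ENOB = (SNDR - 1.76) / 6.02 with the quoted SNR substituted for SNDR, which is an assumption made here; the abstract reports SNR only, so the result is best read as an optimistic estimate.

```python
snr_db = 70.3                        # signal-to-noise ratio from the abstract
band_lo_hz, band_hi_hz = 29.5e3, 30.5e3
power_w = 2.6e-3                     # 2.6 mW
supply_v = 3.3                       # 3.3 V supply

bandwidth_hz = band_hi_hz - band_lo_hz
supply_current_ma = power_w / supply_v * 1e3

# Rule-of-thumb resolution estimate; SNR is used in place of SNDR (assumption).
enob_bits = (snr_db - 1.76) / 6.02

print(f"signal bandwidth: {bandwidth_hz:.0f} Hz")
print(f"supply current:   {supply_current_ma:.3f} mA")
print(f"ENOB estimate:    {enob_bits:.2f} bits")
```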
System management recovery protocol for MPSoCs
Pub Date : 2017-09-01 DOI: 10.1109/SOCC.2017.8226080
Vinicius Fochi, L. L. Caimi, Marcelo Ruaro, E. Wächter, F. Moraes
Advances in silicon technology have led to systems with hundreds of processors, the NoC-based MPSoCs. However, the higher fault probability in deep sub-micron technologies shortens the lifetime of integrated circuits. Operating systems make it possible to execute distributed applications on the MPSoC processing elements (PEs). Large systems require PEs dedicated to management purposes, for example, to execute task mapping, handle monitoring data, and run self-awareness adaptation. This paper addresses a hierarchically organized MPSoC: PEs with an embedded operating system execute the applications (SPE), and dedicated PEs manage the system resources at runtime (MPE). A rich literature presents fault-tolerant proposals for the hardware and software components of the MPSoC, but there is a significant gap in fault-tolerant approaches at the system level, i.e., approaches that cover the PEs whose function is to manage the system. Consider, for example, an MPE responsible for managing a set of SPEs. A fault in that MPE prevents access to its set of SPEs for executing new applications. The goal of this paper is to present a method to determine when an MPE has become faulty and to propose a protocol to migrate the management software safely to an SPE. The management data is preserved without saving the context in redundant structures. The proposal is transparent to the applications executing in the system, with a small execution overhead observed during the management migration, as presented in the results section.
Citations: 8
Robust throughput boosting for low latency dynamic partial reconfiguration
Pub Date : 2017-09-01 DOI: 10.1109/SOCC.2017.8226013
A. Nannarelli, M. Re, G. Cardarilli, L. Nunzio, M. Brunella, R. Fazzolari, F. Carbonari
Reducing the time needed to reconfigure portions of an FPGA at run time is crucial in contemporary FPGA-based accelerators. In this work, we propose a method to increase the throughput of FPGA dynamic partial reconfiguration using standard IP blocks. The throughput is increased by overclocking the configuration bitstream circuitry beyond the limits stated in the specifications of these standard blocks. The experimental results show that the most power-efficient implementation can reach a throughput of about 780 MB/s, corresponding to a configuration latency of about 670 microseconds for bitstreams of 1.2 MB. We also investigate alternatives for boosting the reconfiguration throughput and sketch a methodology for achieving the most power-efficient implementation of FPGA-based accelerators.
Citations: 0