2018 IEEE International Symposium on Circuits and Systems (ISCAS): Latest Papers

Efficient Fixed/Floating-Point Merged Mixed-Precision Multiply-Accumulate Unit for Deep Learning Processors
Pub Date : 2018-05-27 DOI: 10.1109/ISCAS.2018.8351354
H. Zhang, Hyuk-Jae Lee, S. Ko
Deep learning has attracted increasing attention in recent years, and many hardware architectures have been proposed for efficient implementation of deep neural networks. The arithmetic unit, as a core processing part of the hardware architecture, can determine the functionality of the whole architecture. In this paper, an efficient fixed/floating-point merged multiply-accumulate unit for deep learning processors is proposed. The proposed architecture supports 16-bit half-precision floating-point multiplication with 32-bit single-precision accumulation for the training operations of deep learning algorithms. In addition, within the same hardware, the proposed architecture supports two parallel 8-bit fixed-point multiplications and accumulates the products into a 32-bit fixed-point number. This enables higher throughput for the inference operations of deep learning algorithms. Compared to a half-precision multiply-accumulate unit (accumulating to single-precision), the proposed architecture has only 4.6% area overhead. With the proposed multiply-accumulate unit, a deep learning processor can support both training and high-throughput inference.
Citations: 18
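The two operating modes described in the abstract above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the authors' datapath: the function names are hypothetical, the 8-bit operands are assumed to be signed, both fixed-point products are assumed to be summed into a single accumulator, and overflow is assumed to wrap. The abstract does not confirm any of these details.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fp_mac(acc, a, b):
    """Training mode: FP16 x FP16 product accumulated into an FP32 value."""
    # The product of two half-precision numbers needs at most 22 significand
    # bits, so it is exact when computed in a Python float (a double).
    product = to_fp16(a) * to_fp16(b)
    return to_fp32(to_fp32(acc) + product)

def wrap_int32(v):
    """Wrap an integer into the signed 32-bit range (assumed overflow behaviour)."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v

def fixed_mac(acc, a0, b0, a1, b1):
    """Inference mode: two signed 8-bit products added to one 32-bit accumulator."""
    for v in (a0, b0, a1, b1):
        if not -128 <= v <= 127:
            raise ValueError("operands must fit in signed 8 bits")
    return wrap_int32(acc + a0 * b0 + a1 * b1)
```

In this model one call to `fixed_mac` consumes two operand pairs where `fp_mac` consumes one, which is the throughput gain the abstract attributes to the inference mode.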
A Systematic Method for Approximate Circuit Design Using Feature Selection
Pub Date : 2018-05-27 DOI: 10.1109/ISCAS.2018.8351167
Ling Qiu, Yingjie Lao
As technology scaling reaches the deep-nanometer regime, the improvements in area, power, and timing that scaling delivers have started to diminish. Alternative approaches that explore the design space to achieve energy-efficient digital systems have attracted great interest in recent years. Approximate computing in hardware design has emerged as a promising paradigm that trades accuracy requirements for reductions in power consumption and hardware cost. This paper presents a systematic and scalable method for approximate circuit design that employs data-driven feature selection techniques rather than statistical or theoretical analysis, which makes it well suited to larger-scale applications. A case study on an approximate multiplier is presented to demonstrate the proposed design flow. Our experimental results show that the proposed approach can achieve better area/power savings and error performance comparable to existing manual approximate multiplier designs, while greatly reducing the design workload and complexity.
Citations: 4
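The accuracy-for-cost trade-off behind approximate multipliers can be made concrete with a generic example. The sketch below is not the feature-selection flow of the paper, whose details the abstract does not give. It uses plain partial-product truncation, a common manual approximation, and the names `approx_mul`, `dropped_columns`, and `mean_error_distance` are hypothetical.

```python
def approx_mul(a, b, dropped_columns=4, width=8):
    """Unsigned multiply that discards partial-product bits below a column."""
    keep_mask = ~((1 << dropped_columns) - 1)
    result = 0
    for i in range(width):
        if (b >> i) & 1:
            # Bits of this partial product in the dropped columns are never
            # summed, which is where hardware would save adder cells.
            result += (a << i) & keep_mask
    return result

def mean_error_distance(dropped_columns, width=8):
    """Average (exact - approximate) over every pair of `width`-bit inputs.

    Truncation never overestimates, so the difference is already non-negative.
    """
    n = 1 << width
    total = sum(a * b - approx_mul(a, b, dropped_columns, width)
                for a in range(n) for b in range(n))
    return total / (n * n)
```

Sweeping `dropped_columns` and recording `mean_error_distance` gives the kind of exhaustive error data that a data-driven flow could consume in place of a closed-form error analysis.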
The EcoChip: A Wireless Multi-Sensor Platform for Comprehensive Environmental Monitoring
Pub Date : 2018-05-27 DOI: 10.1109/ISCAS.2018.8351654
M. Sylvain, Francis Lehoux, S. Morency, Felix Faucher, E. Bharucha, D. Tremblay, Denis Sarrazin, Sylvain Moineau, Michel Allard, Jacques Corbeil, Younès Messaddeq, Benoit Gosselin
This paper presents a new autonomous wireless sensor platform intended for monitoring microorganisms and molecules found in harsh environments, such as those of northern climates. The EcoChip includes a layered multiwell plate that allows single-strain microorganisms, isolated from environmental samples from Northern habitats, to grow within a well of the plate. It can be deployed in the field for continuous monitoring of microbiological growth within 96 individual wells through a multichannel electrochemical impedance monitoring circuit. Additional sensors are provided for monitoring luminosity, humidity, temperature, pH, and CO2 release. The embedded electronic board is equipped with a flash memory to accumulate and store sensor data for long periods of time, as well as a low-power microcontroller and a power management unit to control and supply all electronic building blocks. When a receiver is located within the transmission range of the EcoChip, a low-power wireless transceiver allows transmission of the sensor data stored in on-board memory. We report the measured performance of the system, and we present experimental results obtained in the field during a pilot study performed with the EcoChip deployed in the village of Kuujjuarapik, at a latitude of 55 degrees, in Northern Canada.
Citations: 1
Spatio-temporal compressed sensing for real-time wireless EEG monitoring
Pub Date : 2018-05-27 DOI: 10.1109/ISCAS.2018.8351863
Bathiya Senevirathna, P. Abshire
Wearable electronics capable of recording and transmitting biosignals can provide convenient and pervasive health monitoring. The wireless transmission bandwidth limits the number of recording sites that can be monitored at one time. Compressed sensing (CS) is a promising approach that uses computationally efficient encoding to reduce the number of samples that are transmitted wirelessly, allowing more channels to be monitored over a transmission channel. The rakeness CS approach shows improved performance for higher compression rates, but in prior work it has only been evaluated for single channel data. We analyze the fidelity tradeoffs for compressed sensing implemented on a mobile electroencephalography (EEG) system. We propose several methods for spatiotemporal encoding in rakeness CS and evaluate the performance using a spontaneous EEG dataset recorded during moderate movement. Reconstruction performance depends strongly on the compression ratio and weakly on the method of spatiotemporal encoding. This suggests weak spatial correlation between the different channels of EEG data, which were recorded in an experiment involving self-initiated movement.
Citations: 8
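The encoding side of compressed sensing, as described in the abstract above, reduces a window of n samples to m < n measurements through a linear projection. The sketch below uses a plain random ±1 sensing matrix, which is an assumption: it is not the rakeness-based design the paper evaluates, it covers a single channel only, and reconstruction at the receiver is omitted. All names and sizes are hypothetical.

```python
import random

def make_sensing_matrix(m, n, seed=0):
    """Build an m x n matrix of random +1/-1 entries (fixed seed, so the
    receiver can regenerate the same matrix)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]

def encode(matrix, window):
    """Project one window of samples onto each row: y = A x.

    With +1/-1 entries this needs only additions and subtractions, which is
    why the encoder is cheap enough for a wearable node.
    """
    return [sum(a * x for a, x in zip(row, window)) for row in matrix]

n, m = 64, 16                                      # samples in, measurements out
A = make_sensing_matrix(m, n)
window = [((7 * k) % 11) - 5 for k in range(n)]    # stand-in integer samples
y = encode(A, window)
compression_ratio = n / m                          # 4x fewer values to transmit
```

Only `y` is sent over the radio, so for a fixed link bandwidth a higher compression ratio leaves room for more recording channels, at the cost of reconstruction fidelity.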
What Architecture Should I Choose for my Continuous-Time Delta-Sigma Modulator?
Pub Date : 2018-05-27 DOI: 10.1109/ISCAS.2018.8351141
S. Pavan, Siddharth Baskaran
A novice continuous-time delta-sigma designer is faced with an admittedly complex maze of possible design choices. The right architecture often determines how efficiently the modulator can be implemented. This paper critically examines various popular delta-sigma architectures. It concludes that a single-bit modulator with FIR feedback is a prime candidate that enables a power-efficient implementation for a variety of specifications. To support this thesis, measurement results of an audio delta-sigma modulator designed in a 65 nm CMOS process are given. The modulator, which incorporates FIR feedback and chopping to reduce 1/f noise, achieves 98.6 dB peak SNDR in a 24 kHz bandwidth and consumes only 260 μW from a 1.2 V supply.
Citations: 0
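The measured figures in the abstract above can be put in context with the standard converter figures of merit. The abstract itself quotes no figure of merit, so the values computed here are derived only from the three numbers it does give (98.6 dB, 24 kHz, 260 μW). The function names are hypothetical, and the Walden figure assumes a Nyquist-equivalent rate of twice the bandwidth.

```python
import math

def schreier_fom_db(sndr_db, bandwidth_hz, power_w):
    """Schreier figure of merit: SNDR + 10*log10(bandwidth / power), in dB."""
    return sndr_db + 10.0 * math.log10(bandwidth_hz / power_w)

def enob(sndr_db):
    """Effective number of bits from SNDR: (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02

def walden_fom_j_per_step(sndr_db, bandwidth_hz, power_w):
    """Walden figure of merit: power / (2**ENOB * 2 * bandwidth), in joules."""
    return power_w / (2.0 ** enob(sndr_db) * 2.0 * bandwidth_hz)

SNDR_DB, BW_HZ, POWER_W = 98.6, 24e3, 260e-6       # values from the abstract

fom_s = schreier_fom_db(SNDR_DB, BW_HZ, POWER_W)   # about 178.3 dB
bits = enob(SNDR_DB)                               # about 16.1 bits
fom_w = walden_fom_j_per_step(SNDR_DB, BW_HZ, POWER_W)
```

The Schreier form is usually preferred for high-resolution, thermal-noise-limited converters such as audio modulators, because it rewards each extra decibel of SNDR in proportion to the power it costs.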
Design of a Low Latency 40 Gb/s Flow-Based Traffic Manager Using High-Level Synthesis
Pub Date : 2018-05-27 DOI: 10.1109/ISCAS.2018.8351332
Imad Benacer, F. Boyer, Y. Savaria
This paper presents a traffic manager architecture that targets today's networking requirements, especially reduced latency, and supports the upcoming 5G technology in the software-defined networking context. The proposed traffic manager functionalities are policing, scheduling, shaping, and queuing of incoming traffic (packets). The incoming traffic is assumed to be a set of flows in a network processing unit. Traffic management imposes constraints on outgoing packets so as to meet the allowed bandwidth quota for each flow and enforce the desired quality of service (QoS) targets. The FPGA-prototyped architecture is written in C++ and synthesized with the Vivado High-Level Synthesis (HLS) tool. The proposed traffic manager design supports 40 Gb/s per egress port for 64-byte packets, running at 80 MHz when implemented on a Xilinx ZC706 board. A throughput improvement of 4.0× over previously reported work is claimed.
Citations: 3
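Policing a flow against a bandwidth quota, one of the functions listed in the abstract above, is commonly done with a token bucket. The abstract does not say which policing algorithm the design uses, so the sketch below is only a generic software illustration. The class name, its parameters, and the drop-on-exceed policy are all assumptions.

```python
class TokenBucket:
    """Per-flow policer: a packet passes only if enough byte tokens are available."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8.0   # refill rate (the flow's quota)
        self.capacity = float(burst_bytes)       # largest burst that may pass
        self.tokens = float(burst_bytes)         # bucket starts full
        self.last_time = 0.0

    def allow(self, now, packet_bytes):
        """Return True if the packet conforms; `now` must be non-decreasing."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill for the elapsed time, never beyond the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False                             # non-conforming: drop or mark
```

A shaper would use the same bookkeeping but hold a non-conforming packet in a queue until enough tokens accumulate, rather than rejecting it.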
A Novel Convolution Computing Paradigm Based on NOR Flash Array with High Computing Speed and Energy Efficient
Pub Date : 2018-05-27 DOI: 10.1109/ISCAS.2018.8351030
Runze Han, P. Huang, Y. Xiang, C. Liu, Zhen Dong, Z. Su, Y. B. Liu, L. Liu, X. Liu, Jinfeng Kang
A novel convolution computing paradigm based on the NOR Flash array is proposed. Significant improvements in both computing speed and energy consumption are achieved compared to CMOS-based logic computing paradigms. For a feature extraction task on a 256×256 image, the proposed computing paradigm achieves a computing speed of 3.9×10^4 frames per second (fps) and an energy consumption of 0.057 nJ/pixel.
Citations: 7
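The abstract above does not describe how kernels are mapped onto the flash array, so the following is only a software reference for the operation being accelerated: a 2D convolution used for feature extraction. The function name and the sample kernel are hypothetical. As in most neural-network code, the kernel is not flipped, so this is strictly a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of rows) with no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            # One output pixel is one multiply-accumulate over the window.
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
diagonal_difference = [[1, 0],
                       [0, -1]]                  # simple diagonal-gradient kernel
features = conv2d_valid(image, diagonal_difference)   # [[-4, -4], [-4, -4]]
```

Every output pixel is an independent sum of products, which is the property that lets an array-based scheme evaluate many of them in parallel.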
An Indoor Solar Energy Harvester with Ultra-Low-Power Reconfigurable Power-On-Reset-Styled Voltage Detector
Pub Date : 2018-05-27 DOI: 10.1109/ISCAS.2018.8351096
Xiaodong Meng, Xing Li, Y. Yao, C. Tsui, W. Ki
An ultra-low-power reconfigurable voltage detector for indoor solar energy harvesters is presented. The voltage detector monitors the solar cell voltage and sends out a flag signal if the solar cell voltage surpasses the triggering threshold of the detector. Instead of using a traditional dynamic comparator, this design is based on a power-on-reset (POR) circuit. A POR circuit has ultra-low quiescent loss and a fixed triggering threshold, which is determined by its topology and process. Our improvement is to use a feedback loop that allows the triggering threshold to be reconfigurable. The average quiescent loss of this POR voltage detector circuit is 2.774 nW. Process and temperature variations can also be compensated by the feedback loop. The energy harvesting system is designed in a 0.18 μm CMOS process. Equipped with the proposed voltage detector, the whole system achieves a 93.21% peak efficiency at 200 μW input power.
Citations: 7
Drift-Invariant Detection for Multilevel Phase-Change Memory
Pub Date : 2018-05-27 DOI: 10.1109/ISCAS.2018.8351740
M. Stanisavljevic, T. Mittelholzer, N. Papandreou, Thomas Parnell, H. Pozidis
Next-generation memory (NGM) technologies present a major opportunity but also a significant challenge, due to their intricate reliability issues. In particular, multilevel-cell (MLC) storage is highly desirable for increasing storage capacity and lowering total cost-per-bit. In phase-change memory (PCM), MLC storage is hampered by sensitivity to temperature variations and resistance drift. A novel drift-invariant detection (DID) scheme that estimates variable read thresholds based on ordered statistics and clustering of the soft read-back signals from a small block of 32 cells has been developed and implemented in hardware to improve reliability and prolong data retention. A low-complexity implementation of the DID on an FPGA platform comprises 20,000 LUTs and 6,000 flip-flops and has a latency of 90 ns. We present results from an extensive performance verification that ascertains highly reliable data retrieval up to 13 orders of magnitude in time after programming. Such elevated reliability is necessary for the most anticipated application of NGM, namely persistent far-memory, where the NGM is used as a large memory pool, possibly together with DRAM.
Citations: 6
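The idea of deriving read thresholds from the read-back values of a block, rather than from fixed references, can be sketched with a simple sort-and-split rule. This is not the DID algorithm of the paper, whose details the abstract does not give. It is a minimal illustration that sorts the block, cuts it at the largest gaps, and places each threshold at a gap midpoint. The function name, the choice of four levels, and the assumption that drift preserves the ordering of cells are all hypothetical.

```python
def detect_levels(readback, num_levels=4):
    """Estimate thresholds from one block and return (thresholds, levels)."""
    sorted_vals = sorted(readback)
    # Gap between each pair of neighbours in sorted order, with its position.
    gaps = [(sorted_vals[k + 1] - sorted_vals[k], k)
            for k in range(len(sorted_vals) - 1)]
    # The num_levels - 1 widest gaps are taken as the cluster boundaries.
    cuts = sorted(k for _, k in sorted(gaps, reverse=True)[:num_levels - 1])
    thresholds = [(sorted_vals[k] + sorted_vals[k + 1]) / 2.0 for k in cuts]
    # A cell's level is the number of thresholds lying below its value.
    levels = [sum(value > t for t in thresholds) for value in readback]
    return thresholds, levels

# 32 cells, 8 per level, with a small deterministic spread inside each level.
written = [i % 4 for i in range(32)]
fresh = [1.0 + level + 0.01 * (i // 4) for i, level in enumerate(written)]
drifted = [1.5 * value for value in fresh]       # every cell has drifted upward

fresh_thresholds, fresh_levels = detect_levels(fresh)
drift_thresholds, drift_levels = detect_levels(drifted)
```

In this toy block both reads decode to the written pattern even though the thresholds move with the data. A detector with thresholds fixed at the fresh values would misread the drifted block.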
A Heterogeneous Cluster with Reconfigurable Accelerator for Energy Efficient Near-Sensor Data Analytics
Pub Date : 2018-05-27 DOI: 10.1109/ISCAS.2018.8351749
Satyajit Das, Kevin J. M. Martin, P. Coussy, D. Rossi
IoT end-nodes require high performance and extreme energy efficiency to cope with complex near-sensor data analytics algorithms. Processing on multiple programmable processors operating in near-threshold is emerging as a promising solution to exploit the energy boost given by low-voltage operation, while recovering the related frequency degradation with parallelism. In this work, we present a heterogeneous cluster architecture extending a traditional parallel processor cluster with a reconfigurable Integrated Programmable Array (IPA) accelerator. While programmable processors guarantee programming legacy to easily manage peripherals, radio software stacks as well as the global program flow, offloading data-intensive and control-intensive kernels to the IPA leads to much higher system level performance and energy-efficiency. Experimental results show that the proposed heterogeneous cluster outperforms an 8-core homogeneous architecture by up to 4.8× in performance and 4.5× in energy efficiency when executing a mix of control-intensive and data-intensive kernels typical of near-sensor data analytics applications.
Citations: 11
Journal
2018 IEEE International Symposium on Circuits and Systems (ISCAS)