Integration-The Vlsi Journal: Latest Articles

Physical design for microfluidic biochips considering actual volume management and channel storage
IF 1.9 | Zone 3, Engineering Technology | Q2 Engineering | Pub Date: 2024-06-07 | DOI: 10.1016/j.vlsi.2024.102228
Genggeng Liu, Zhengyang Chen, Zhisheng Chen, Bowen Liu, Yu Zhang, Xing Huang

In recent years, microfluidic biochips have been widely applied in many fields. Architecture-level design optimization for continuous-flow microfluidic biochips has been widely studied. However, most previous work is based on the traditional chip architecture with dedicated storage, which both limits biochip performance and increases manufacturing cost. To improve execution efficiency and reduce manufacturing cost, a distributed channel-storage architecture can be used to temporarily cache intermediate fluids in idle flow channels. Under this architecture, careful management of the volume of the fluid to be cached is a prerequisite for reliable bioassay results, yet existing work has not considered it in detail. As a result, the fluid volume may not match the capacity of the storage channel, which can contaminate other fluids and lead to incorrect bioassay results, or increase the manufacturing cost of biochips because of overly long storage channels. In this paper, we propose a physical design method for microfluidic biochips that considers the actual volume of fluid while utilizing distributed channel storage. We address this problem with a placement and routing co-design strategy applied throughout the iterative process of a simulated annealing algorithm. Experimental results on multiple benchmarks show that the proposed method effectively reduces the completion time of bioassays, reduces the flow path length, and decreases the number of intersections.
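The abstract names simulated annealing as the optimization loop but gives no cost function or move set. The following is a minimal sketch of generic simulated annealing on a toy grid placement, not the authors' method: the component names, the Manhattan wirelength cost, the swap/move rule, and all parameter values are assumptions made for illustration.

```python
import math
import random

def wirelength(placement, nets):
    """Sum of Manhattan distances over all connected component pairs."""
    return sum(abs(placement[a][0] - placement[b][0]) +
               abs(placement[a][1] - placement[b][1]) for a, b in nets)

def anneal(components, nets, grid=6, t0=5.0, alpha=0.95, steps=2000, seed=0):
    """Toy simulated annealing; returns (best placement, best cost, initial cost)."""
    rng = random.Random(seed)
    cells = [(x, y) for x in range(grid) for y in range(grid)]
    start = list(cells)
    rng.shuffle(start)
    placement = dict(zip(components, start))  # one distinct cell per component
    cost = initial_cost = wirelength(placement, nets)
    best, best_cost = dict(placement), cost
    t = t0
    for _ in range(steps):
        # Move: relocate one component; swap if the target cell is occupied.
        a = rng.choice(components)
        target = rng.choice(cells)
        old = placement[a]
        occupant = next((c for c in components
                         if c != a and placement[c] == target), None)
        placement[a] = target
        if occupant is not None:
            placement[occupant] = old
        new_cost = wirelength(placement, nets)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(placement), cost
        else:
            placement[a] = old  # undo the rejected move
            if occupant is not None:
                placement[occupant] = target
        t = max(t * alpha, 1e-3)  # geometric cooling with a floor
    return best, best_cost, initial_cost

# Hypothetical components and connections, for illustration only.
components = ["mixer", "heater", "detector", "filter"]
nets = [("mixer", "heater"), ("heater", "detector"), ("mixer", "filter")]
best, best_cost, initial_cost = anneal(components, nets)
```

A co-design flow in the sense of the abstract would also route channels and check cached fluid volume against channel capacity inside the cost evaluation; that part is not sketched here because the source does not describe it.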

Citations: 0
High-performance and low-power decoder circuits for SRAMs using mixed-logic scheme
IF 1.9 | Zone 3, Engineering Technology | Q2 Engineering | Pub Date: 2024-06-06 | DOI: 10.1016/j.vlsi.2024.102227
Donghao Xia, Yuejun Zhang, Yuanxin Tian, Mengfan Xu, Liang Wen

A mixed-logic design scheme that combines pass-transistor logic (PTL) and dual-value logic (DVL) with static CMOS logic is proposed for decoders in SRAMs. Using mixed-logic circuits, new n-transistor (T) NAND/AND structures are provided for the decoders; compared with traditional circuits, they use fewer transistors, run faster, and dissipate less power, while retaining full-swing capability and good noise immunity. Experiments on mixed-logic decoders in a TSMC 28 nm process show lower propagation delay and power dissipation than the corresponding conventional circuits: a mixed-logic 2-4 decoder exhibits a 36 % reduction in propagation delay and a 10 % improvement in power dissipation; a 3-8 decoder, 27 % and 5.5 %; a 4-16 decoder, 30 % and 5 %; and a 5-32 decoder, 34 % and 6.3 %.
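For readers unfamiliar with the decoder sizes quoted above, here is a behavioural sketch of an n-to-2^n line decoder. It models only the logic function; it says nothing about the transistor-level PTL/DVL/CMOS structures of the paper, and the function names are hypothetical.

```python
def decode(address, n_bits):
    """Behavioural n-to-2^n decoder: exactly one output line is 1."""
    if not 0 <= address < (1 << n_bits):
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(1 << n_bits)]

def decoder_2to4(a1, a0):
    """Gate-level view of the 2-4 case: each output is an AND of the
    true or complemented address bits (a1 is the most significant bit)."""
    n1, n0 = 1 - a1, 1 - a0
    return [n1 & n0, n1 & a0, a1 & n0, a1 & a0]
```

The 3-8, 4-16, and 5-32 decoders mentioned in the abstract correspond to `decode(address, 3)`, `decode(address, 4)`, and `decode(address, 5)` in this model.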

Citations: 0
Development and validation of a 64-channel ROIC prototype for SWIR line scan sensor applications
IF 1.9 | Zone 3, Engineering Technology | Q2 Engineering | Pub Date: 2024-06-06 | DOI: 10.1016/j.vlsi.2024.102226
Hyeon-June Kim, Dong-Yeon Lee, Min-Jun Park

This paper presents the development and validation of a 64-channel readout integrated circuit (ROIC) prototype engineered specifically for short-wave infrared (SWIR) line scan sensors. The prototype design is evaluated through comprehensive silicon-level testing to confirm robust performance across a variety of operational modes. Key features such as capacitive transimpedance amplifier (CTIA) gain control and sensitivity control are examined, demonstrating the prototype's ability to handle different input currents and capacitance values with precision. Fabricated in 0.18-μm CMOS technology, the ROIC is tailored for integration with indium gallium arsenide (InGaAs) pixels, facilitating high-resolution imaging. The prototype consumes 26.55 mW from a 3.3 V power supply. Measurements of the fabricated chip show a total random noise (RN) level of 128 μVrms and a column fixed pattern noise (FPN) of 0.16 mVrms.
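To make the CTIA gain control concrete, the sketch below uses the textbook ideal-integrator relation for a CTIA, output swing = photocurrent × integration time / integration capacitance. The numeric values are hypothetical and are not taken from the paper.

```python
def ctia_output_swing(i_photo_a, t_int_s, c_int_f):
    """Ideal CTIA output swing in volts: V = I * T_int / C_int.
    Ignores amplifier non-idealities, reset noise, and saturation."""
    return i_photo_a * t_int_s / c_int_f

# Hypothetical operating points: same 1 nA input, two selectable capacitors.
high_gain = ctia_output_swing(1e-9, 1e-3, 1e-12)   # smaller C_int, larger swing
low_gain = ctia_output_swing(1e-9, 1e-3, 10e-12)   # larger C_int, smaller swing
```

Switching to a larger integration capacitor lowers the conversion gain, which is one common way a readout circuit accommodates larger input currents without saturating.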

Citations: 0
A 10T SRAM architecture with 40 % enhanced throughput for IMC applications benchmarked with CIFAR-10 dataset
IF 1.9 | Zone 3, Engineering Technology | Q2 Engineering | Pub Date: 2024-06-05 | DOI: 10.1016/j.vlsi.2024.102225
Ravi S. Siddanath, Mohit Gupta, Chaitanya Joshi, Manish Goswami, Kavindra Kandpal

This research paper introduces a memory architecture that handles standard memory storage operations and enables in-memory computations, surpassing the capabilities of conventional SRAM bit-cells. The proposed architecture effectively eliminates read-disturb issues and supports bit-wise operations such as NAND, NOR, and XNOR without requiring intricate analog peripheral circuits. The proposed bit-cell architecture offers higher throughput than existing In-Memory Computing (IMC) bit-cell architectures, making it a more suitable design for IMC applications. The throughput gain comes from parallelism: the bit-cell architecture allows all the bit-wise operations to be performed simultaneously in a single cycle. The robustness of the architecture has been confirmed through Monte-Carlo variation analysis using UMC 28 nm PDK transistor models. Furthermore, the architecture is benchmarked on the CIFAR-10 dataset by assessing its performance across various machine learning models with the NeuroSim simulator. The proposed architecture offers an increase of up to 40 % in throughput (TOPS/W) compared with existing architectures. Using Monte-Carlo simulations with 1000 samples, the stability of the proposed 10T bit-cell is validated at worst-case PVT corners, up to 6σ variations.
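The bit-wise operations named in the abstract can be stated precisely with a small software model. This is only a functional sketch of NAND, NOR, and XNOR on two stored rows, with all three results produced from one call to mirror the "single cycle" claim; the function name and word width are assumptions, and nothing here models the 10T bit-cell circuit.

```python
def in_memory_bitwise(row_a, row_b, width=8):
    """Return NAND, NOR and XNOR of two stored words, masked to `width` bits."""
    mask = (1 << width) - 1
    return {
        "NAND": ~(row_a & row_b) & mask,
        "NOR": ~(row_a | row_b) & mask,
        "XNOR": ~(row_a ^ row_b) & mask,
    }

result = in_memory_bitwise(0b1100, 0b1010, width=4)
```

The mask is needed because Python integers are unbounded, so `~x` alone would yield a negative number rather than a fixed-width complement.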

Citations: 0
A low voltage input boost converter with novel switch driver enhancement technology for indoor solar energy harvesting
IF 1.9 | Zone 3, Engineering Technology | Q2 Engineering | Pub Date: 2024-05-28 | DOI: 10.1016/j.vlsi.2024.102214
Xiwen Zhu, Kaixuan Xu, Mingxue Li, Yufeng Zhang

In indoor environments, the output voltage of a small photovoltaic cell is usually too low to charge a battery or to be used directly. This paper therefore proposes a low-voltage-input boost converter with a novel switch driver enhancement technology for indoor solar energy harvesting. The boost converter uses a switched-capacitor charge pump architecture. Compared with conventional charge pumps, the driver enhancement technology improves the output current capability and the power conversion efficiency of the circuit. In addition, an adaptive dead-time circuit is designed to further optimize conversion efficiency at low input voltage. The integrated circuit (IC) of the boost converter was manufactured in a 180 nm BCD process and occupies an active chip area of 1.6 mm × 0.6 mm. Measurement results confirm that the converter boosts the input voltage by a factor of four, with a lowest start-up voltage of 0.12 V. The voltage conversion efficiency is 98 % and the highest power conversion efficiency is 76.7 % at Vin of 0.5 V. The design is suitable for indoor solar energy harvesting.
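The reported figures can be turned into a quick back-of-the-envelope check. The sketch assumes the common charge-pump definitions, voltage conversion efficiency = Vout / (gain × Vin) and power conversion efficiency = Pout / Pin, and assumes both quoted percentages apply at Vin = 0.5 V, which the abstract does not state unambiguously. The 10 µW load is a hypothetical value.

```python
def charge_pump_output(vin, gain=4, vce=0.98):
    """Output voltage from the ideal gain scaled by voltage conversion efficiency."""
    return gain * vin * vce

def required_input_power(p_out, pce=0.767):
    """Input power needed to deliver p_out at a given power conversion efficiency."""
    return p_out / pce

vout = charge_pump_output(0.5)       # about 1.96 V under the stated assumptions
p_in = required_input_power(10e-6)   # hypothetical 10 uW load
```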

Citations: 0
Hybrid Radix-16 Booth encoding and rounding-based approximate Karatsuba multiplier for fast Fourier transform computation in biomedical signal processing application
IF 1.9 | Zone 3, Engineering Technology | Q2 Engineering | Pub Date: 2024-05-28 | DOI: 10.1016/j.vlsi.2024.102215
Dinesh Kumar Jayaraman Rajanediran, Ganesh Babu C, Priyadharsini K, M. Ramkumar

Multiplication is an essential biomedical signal processing function implemented in Digital Signal Processing (DSP) cores. Approximate multiplication is used to improve the speed, area, and energy efficiency of DSP cores, and a low-power multiplier unit is one of the requirements a DSP processor must meet to keep up with increasing demands. To balance the design and error metrics of a multiplier, an efficient Hybrid Radix-16 Booth Encoding and rounding-based approximate Karatsuba Multiplier (RBEKM-16) is proposed. This research introduces an approximate Karatsuba multiplier based on rounding, utilizing rounding approximation to compute the least significant part of the product. Simple operators, such as adders and multiplexers, replace complex and costly conventional Floating-Point (FP) multipliers in this process. Radix-4 logarithms are incorporated to further minimize hardware complexity and calculate the most significant part of the product. Subsequently, an approximate 4-2 compressor is applied in the partial product reduction stage to generate the most significant bit result. In the experiments, the multiplier is evaluated in terms of energy efficiency, area utilization, and error rate using the Xilinx ISE 8.1i tool. The results indicate that the proposed multiplier improves energy efficiency, uses area more effectively, and performs well in biomedical signal processing applications. The proposed 16-bit multiplier achieves an area of 1068 μm², a delay of 3.01 ns, a power consumption of 0.021 mW, and a power delay product of 119 fJ.
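As background for the Karatsuba structure the multiplier builds on, here is a sketch of one level of exact Karatsuba decomposition for 16-bit unsigned operands. It is the standard textbook algorithm, not the RBEKM-16 design: the rounding approximation, Booth encoding, and approximate compressor described in the abstract are deliberately left out, and the function name is hypothetical.

```python
def karatsuba16(x, y):
    """Exact 16x16-bit product using three 8/9-bit multiplications
    instead of four (one level of Karatsuba)."""
    xh, xl = x >> 8, x & 0xFF          # split each operand into high and low bytes
    yh, yl = y >> 8, y & 0xFF
    high = xh * yh                     # most significant partial product
    low = xl * yl                      # least significant partial product
    mid = (xh + xl) * (yh + yl) - high - low   # equals xh*yl + xl*yh
    return (high << 16) + (mid << 8) + low

product = karatsuba16(0xABCD, 0x1234)
```

An approximate variant in the spirit of the abstract would replace the exact computation of the least significant part with a cheaper rounded estimate, trading a bounded error for less hardware.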

Citations: 0
Content-addressable memory using selective-charging and adaptive-discharging scheme for low-power hardware search engine
IF 1.9 | Zone 3, Engineering Technology | Q2 Engineering | Pub Date: 2024-05-27 | DOI: 10.1016/j.vlsi.2024.102213
Sheikh Wasmir Hussain, Telajala Venkata Mahendra, Sandeep Mishra, Anup Dandapat

The single-clock-cycle access of content-addressable memory (CAM) is well suited to high-speed parallel content search in data-intensive hardware search engines. Applications range from accelerating databases and routing networks to processing images, implementing machine learning, processing biomedical data, and compressing data. Nevertheless, the CAM macro consumes significant energy because most match-lines (MLs), which comprise the CAM words, switch during parallel access. Segmented ML schemes reduce power, yet the cell and ML delay and the extra sequential cycles affect search speed. A novel selective-charging and adaptive-discharging (SCAD) scheme in the form of a dynamic ML architecture is proposed to reduce CAM power consumption at no extra cycle cost. Additionally, a full-swing CAM cell forms the basis of storage and comparison-evaluation to lessen ML delay. Based on 45-nm technology under a 1-V supply, the proposed 64 × 32-bit and 256 × 144-bit SCAD-CAM arrays dissipate only 0.45–0.46 fJ/bit/search and achieve high speed. Compared to CAMs based on low-power ML schemes, viz., low-swing precharge, division and control, and master–slave, and to the conventional CAM as the baseline design, the SCAD-CAM reduces energy-delay by 13.49%–89.35%. The average-power reduction of 1.8×–2.4× establishes the SCAD-CAM as a promising memory architecture for emerging search-intensive applications involving large-scale data workloads.
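The search operation a CAM performs can be modelled in a few lines. This is a functional sketch only: in hardware every stored word is compared against the key in parallel on its match-line, whereas the loop here is sequential. The function name, the optional care mask, and the word width are assumptions and do not describe the SCAD scheme.

```python
def cam_search(stored_words, key, care_mask=None, width=8):
    """Return the addresses of all stored words matching `key`.
    Bits cleared in `care_mask` are ignored during comparison."""
    full = (1 << width) - 1
    mask = full if care_mask is None else care_mask & full
    # A word matches when it differs from the key in no cared-about bit.
    return [addr for addr, word in enumerate(stored_words)
            if ((word ^ key) & mask) == 0]

words = [0b1010, 0b1100, 0b1010]
exact_hits = cam_search(words, 0b1010, width=4)
masked_hits = cam_search(words, 0b1000, care_mask=0b1000, width=4)
```

In a typical search most words mismatch, so most match-lines discharge; that switching activity is the energy cost the abstract refers to.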

Citations: 0
SET-detection low complexity burst error correction codes for SRAM protection
IF 1.9 | Zone 3, Engineering Technology | Q2 Engineering | Pub Date: 2024-05-25 | DOI: 10.1016/j.vlsi.2024.102212
He Liu, Jiaqiang Li, Liyi Xiao, Tianqi Wang, Jie Li

As transistor feature sizes shrink, multiple bit upsets and single event transient (SET) effects become severe in circuits operating in radiation environments. In static random-access memories (SRAMs), both single event upsets (SEUs) and single event transients need to be addressed. Fault-tolerant error correction codes (ECCs), which can handle SEUs and SETs at the same time, are an option for SRAM protection. We design a series of low-complexity burst error correcting codes with a fault detection feature, which can deal with burst errors in memories and transient errors in the decoder. Low-complexity ECC simplifies the decoding circuits and reduces hardware overhead. Compared with existing schemes for dealing with SETs in decoders, the proposed scheme has a clear advantage in area overhead and can be an effective choice for SRAM protection in radiation environments.
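The abstract does not give the code construction, so the sketch below illustrates syndrome-based error correction with the classic Hamming(7,4) code instead. This is a substituted, simpler technique: it corrects a single bit error only, not bursts, and has no SET detection in the decoder. It is included only to show how a syndrome locates a flipped bit.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7,
    parity bits at positions 1, 2 and 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data bits, error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # syndrome is the 1-based error position
    if position:
        c[position - 1] ^= 1          # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], position

word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                          # inject a single-bit upset at position 5
recovered, where = hamming74_decode(word)
```

A burst error correcting code extends this idea so that the syndrome identifies a group of adjacent flipped bits, which matters for multiple bit upsets in neighbouring cells.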

Citations: 0
Analysis of a new three-dimensional jerk chaotic system with transient chaos and its adaptive backstepping synchronous control
IF 1.9 Zone 3 Engineering & Technology Q2 Engineering Pub Date: 2024-05-24 DOI: 10.1016/j.vlsi.2024.102210
Shaohui Yan, Jianjian Wang, Lin Li

A new three-dimensional jerk chaotic system with line equilibrium points is proposed. The system is studied in detail using the Lyapunov exponent graph, bifurcation diagram, phase diagram, and time-domain waveform diagram. These show that the system has rich dynamical behaviors, such as eight types of coexisting attractors, extreme multistability of four different attractor states, and offset boosting in two directions. The system also exhibits six types of transient chaos, which greatly increase its complexity. We study how the spectral entropy (SE) and C0 complexity vary as the system takes different initial values, and the initial conditions under which the system is in a synchronized state are chosen from the initial values with higher complexity. The correctness of the theoretical analysis and numerical simulation is verified by circuit simulation and hardware experiments. Finally, the new system achieves synchronization control with a purpose-designed adaptive backstepping controller, laying the foundation for its subsequent use in secure communications.

Citations: 0
AiTO: Simultaneous gate sizing and buffer insertion for timing optimization with GNNs and RL
IF 1.9 Zone 3 Engineering & Technology Q2 Engineering Pub Date: 2024-05-21 DOI: 10.1016/j.vlsi.2024.102211
Hongxi Wu, Zhipeng Huang, Xingquan Li, Wenxing Zhu

Gate sizing and buffer insertion for timing optimization are performed extensively in electronic design automation (EDA) flows. Both aim to adjust the upstream and downstream capacitances of gates and buffers to minimize delay. However, most existing work treats gate sizing and buffer insertion independently. This paper proposes a learning-based timing optimization framework, AiTO, that combines reinforcement learning (RL) with a graph neural network (GNN) to perform gate sizing and buffer insertion simultaneously. We model buffer insertion as a special case of gate sizing by determining possible buffer locations in advance, and we treat buffer insertion and gate sizing together as a single RL process. Experimental results on 10 real designs (28-nm and 110-nm) show that AiTO achieves better worst negative slack (WNS) optimization results than OpenROAD and can also improve the results of the commercial tool Innovus to some extent. Moreover, ablation studies demonstrate the benefits of performing gate sizing and buffer insertion simultaneously for timing optimization.

Citations: 0