2022 IEEE Custom Integrated Circuits Conference (CICC): Latest Papers, Page 6

DCT-RAM: A Driver-Free Process-In-Memory 8T SRAM Macro with Multi-Bit Charge-Domain Computation and Time-Domain Quantization

2022 IEEE Custom Integrated Circuits Conference (CICC)

Pub Date : 2022-04-01 DOI: 10.1109/CICC53496.2022.9772826

Zhiyu Chen, Qing Jin, Zhanghao Yu, Yanzhi Wang, Kaiyuan Yang

Process-In-Memory (PIM) is a promising way to alleviate the memory-wall bottleneck in memory-intensive applications such as CNNs. Recent demonstrations of SRAM-based PIM designs, particularly those computing in the charge domain [1]–[5], have greatly improved the linearity of analog multiply-and-add computations (MAC) and quantization, as well as their robustness to process variations, bringing their inference accuracy close to that of digital hardware in practical computer vision benchmarks such as CIFAR-10. However, several limitations remain for large-scale integration of PIM macros, especially the assumed availability of powerful external reference voltage drivers and the lack of scaling-friendly designs. First, high-bandwidth analog buffers driving large output loads are necessary to distribute the massive number of analog signals (e.g., DAC outputs) across the macro without sacrificing signal fidelity and computing speed. For example, [10] reports that its DAC drivers occupy 11.4% of the macro area and incur a 94-pJ energy overhead in 28 nm, accounting for 68.5% of the total energy in a macro supporting $5\mathrm{b}$ activations and $8\mathrm{b}$ weights. Second, SAR ADCs are popular for the common 5–9 bit resolution range. Conventional SAR ADCs require high-speed, power-hungry analog buffers to drive the capacitive DACs (CDACs) to the reference voltages with short settling time and high accuracy. Given the hundreds of ADCs in each macro, the design complexity and overheads incurred by these drivers are dominant. Our simulated reference driver consumes 2.9 pJ in 65 nm, which is comparable to the energy of an ADC (e.g., 3.56 $\text{pJ}$ in [12]). Third, it is challenging to fit any conventional $\geq 7\mathrm{b}$ SAR ADC into the narrow width of SRAM cells because of the bulky CDACs and layout matching requirements, which ultimately limits computing parallelism and energy amortization.
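The driver overheads quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text (94 pJ at 68.5% for the DAC drivers in [10], 2.9 pJ for the simulated reference driver, 3.56 pJ for the ADC in [12]). The function and variable names are illustrative, and the abstract does not say per which operation the energies are counted, so the results are ratios rather than absolute budgets.

```python
# Back-of-the-envelope check of the driver overheads quoted in the abstract.
# All inputs are figures stated in the text; nothing here is measured.

def implied_total_energy(part_pj: float, share: float) -> float:
    """Total energy implied when `part_pj` makes up the fraction `share` of it."""
    return part_pj / share

# DAC drivers in [10]: 94 pJ, reported as 68.5% of the macro energy.
total_pj = implied_total_energy(94.0, 0.685)
rest_pj = total_pj - 94.0  # what remains for the array, ADCs, and control

# Simulated SAR reference driver (2.9 pJ, 65 nm) vs. the ADC in [12] (3.56 pJ).
driver_to_adc = 2.9 / 3.56

print(f"implied macro energy: {total_pj:.1f} pJ")
print(f"energy left for everything else: {rest_pj:.1f} pJ")
print(f"reference driver / ADC energy ratio: {driver_to_adc:.2f}")
```

Read this way, the drivers in [10] cost roughly twice as much as the rest of the macro combined, and a conventional reference driver costs about four fifths of an ADC conversion, which is the motivation for a driver-free design.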

Citations: 5

A Single-Mode Dual-Path Buck-Boost Converter with Reduced Inductor Current Across All Duty Cases Achieving 95.58% Efficiency at 1A in Boost Operation

2022 IEEE Custom Integrated Circuits Conference (CICC)

Pub Date : 2022-04-01 DOI: 10.1109/CICC53496.2022.9772861

Donghee Cho, Hyungjoo Cho, Sein Oh, Yoontae Jung, S. Ha, Chul-Woong Kim, M. Je

In mobile devices, various functional blocks require different voltage levels, all of which must be generated from a single lithium-ion battery (Fig. 1); 3.3 V is one of the most demanded levels. Because the battery output voltage gradually drops from 4.2 V to 2.9 V as it discharges, a buck-boost converter is required, and it must address the following challenges. 1) Due to ever-increasing load-current $(\mathsf{I}_{\mathsf{LOAD}})$ demands in mobile devices, the conduction loss in the DC resistance (DCR) of the inductor overwhelms other losses, especially when a small-size inductor is used. 2) At large $\mathsf{I}_{\mathsf{LOAD}}$, the battery output voltage decreases due to its internal resistance, making boost operation more dominant. 3) Because display brightness and processing speed are controllable, burst currents are generated, resulting in unpredictable input voltage fluctuations.
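Challenges 1) and 2) compound each other, which a textbook estimate makes concrete. The sketch below assumes an ideal, lossless conventional converter that runs as a buck when the battery is above the output and as a boost when it is below. It models only the average inductor current and is not the dual-path topology of this paper. The 50 mΩ DCR is a hypothetical value chosen for illustration.

```python
# Average inductor current and DCR conduction loss in an idealized
# conventional buck-boost converter (ripple and all other losses ignored).

V_OUT = 3.3          # regulated output voltage from the abstract, in volts
DCR_OHM = 0.05       # hypothetical inductor DC resistance, in ohms

def inductor_current(v_in: float, v_out: float, i_load: float) -> float:
    """Average inductor current: I_LOAD in buck mode, I_LOAD*Vout/Vin in boost mode."""
    if v_in >= v_out:
        return i_load                 # buck: inductor is in series with the load
    return i_load * v_out / v_in      # boost: inductor carries the input current

def dcr_loss(v_in: float, v_out: float, i_load: float, dcr_ohm: float) -> float:
    """Conduction loss in the inductor DCR, in watts."""
    return inductor_current(v_in, v_out, i_load) ** 2 * dcr_ohm

for v_bat in (4.2, 3.3, 2.9):         # battery range quoted in the abstract
    i_l = inductor_current(v_bat, V_OUT, 1.0)
    p_mw = dcr_loss(v_bat, V_OUT, 1.0, DCR_OHM) * 1e3
    print(f"Vbat = {v_bat:.1f} V: I_L = {i_l:.3f} A, DCR loss = {p_mw:.1f} mW")
```

Because the loss grows with the square of the inductor current, the low-battery boost region is where reducing inductor current pays off most.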

Citations: 2

A Fully In-Package 4-Phase Fixed-Frequency DAB Hysteretic Controlled DC-DC Converter with Enhanced Efficiency, Load Regulation and Transient Response

2022 IEEE Custom Integrated Circuits Conference (CICC)

Pub Date : 2022-04-01 DOI: 10.1109/CICC53496.2022.9772798

Lei Zhao, Junyao Tang, Cheng Huang

Multiphase DC-DC converters have been widely used to deliver more power more efficiently, with smaller ripples and faster large-signal dynamic responses [1]–[5]. In terms of closed-loop voltage regulation, traditional linear PWM control has limited small-signal bandwidth, which is further compromised to ensure stability under different loading conditions and different PVT and LC variations. Non-linear control, such as hysteretic control, has neither small-signal bandwidth limitations nor stability concerns, and can therefore potentially achieve faster dynamic performance. Among the different topologies, current-mode hysteretic control has been adopted in 4-phase converters [2], [3]. To ensure proper operation at higher frequencies, these converters require either careful matching between the inductor current-sensing RC networks and the inductance and parasitic DC resistance (DCR) of the power inductors [2], or more complex RC sensing networks [3]. Also, the converters in [2]–[4] did not include current balancing, so mismatches in power transistors, control timing, and power inductors among phases could unbalance the phase currents and significantly compromise efficiency. To maintain optimum efficiency over a wide loading range, active-phase-count (APC) control has been introduced in [1], [2], [4]. In [4], APC is realized with a multi-bit ADC, which increases design complexity and power consumption. Double-adaptive-bound (DAB) hysteretic control in [6] has demonstrated fast transient responses; however, it only works with a single phase, and its operation is very sensitive to the delay of the comparator, the gate driver, and other circuits in the control path, as well as to the matching of the RC filters, especially at higher switching frequencies. In addition, because they lack a high-gain amplifier, hysteretic controlled switching converters also compromise output voltage DC accuracy, with a load regulation of 40 mV/1 A in [2].
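The "smaller ripples" claim for multiphase converters can be illustrated numerically. The sketch below sums idealized triangular inductor-current ripples of evenly interleaved buck phases, in normalized units (period 1, rising slope 1 - D, falling slope -D). It assumes perfectly matched phases, which is exactly what the mismatch discussion above says cannot be taken for granted, and it says nothing about the DAB hysteretic controller itself. All names are illustrative.

```python
# Ripple cancellation in an idealized interleaved multiphase buck converter.
# Units are normalized: switching period = 1, current in units of Vin*T/L.

def phase_ripple(t: float, duty: float) -> float:
    """Triangular inductor-current ripple of one phase at time t."""
    x = t % 1.0
    if x < duty:                                      # high side on: ramp up
        return (1.0 - duty) * x
    return (1.0 - duty) * duty - duty * (x - duty)    # high side off: ramp down

def summed_ripple_pp(n_phases: int, duty: float, samples: int = 1000) -> float:
    """Peak-to-peak ripple of the sum of n evenly interleaved phases."""
    totals = [
        sum(phase_ripple(i / samples + k / n_phases, duty) for k in range(n_phases))
        for i in range(samples)
    ]
    return max(totals) - min(totals)

for d in (0.25, 0.4):
    one = summed_ripple_pp(1, d)
    four = summed_ripple_pp(4, d)
    print(f"D = {d}: 1-phase ripple = {one:.4f}, 4-phase summed ripple = {four:.4f}")
```

With four phases the summed ripple vanishes at duty cycles that are multiples of 1/4 and stays well below the single-phase ripple elsewhere; any phase mismatch would reduce this cancellation.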

Citations: 1

A Lossless and Modeling Attack-Resistant Strong PUF with <4E-8 Bit Error Rate

2022 IEEE Custom Integrated Circuits Conference (CICC)

Pub Date : 2022-04-01 DOI: 10.1109/CICC53496.2022.9772852

Yan He, Qixuan Yu, Kaiyuan Yang

Strong physically unclonable functions (SPUFs) are promising solutions for low-cost authentication of IoT edge devices because they generate an exponential number of device-specific challenge-response pairs (CRPs). Early SPUF designs are vulnerable to machine learning (ML) modeling attacks due to the lack of nonlinearity in the challenge-to-response mapping [1]. Recent studies have shown that SPUFs can be made resilient against ML modeling by combining entropy sources with non-linear operations such as an Entropy LUT [2], an AES S-box [3], or an XOR network [4]. These designs achieved high resistance against known black-box ML modeling attacks with more than 0.1M training CRPs. A key challenge in these ML-resistant Strong PUF designs is ensuring the stability of the entropy sources (ES) under environmental variations, because a small number of unstable ES leads to a much larger portion of unstable CRPs. The unstable CRPs need to be discarded, which reduces the number of authentication attempts available without CRP reuse. They are also a potential weak point that can be exploited to facilitate ML modeling using reliability-based attacks [5]. [2] eliminates ES instability through hour-long accelerated aging at a high temperature, which incurs a high testing cost. [3] creates an accurate ES instability map by evaluating the ES at multiple temperature points and filters out the unstable CRPs; this requires an external access point to the ES for direct evaluation, which represents another potential attack point. [4] proposes a special lithography step to randomize the interconnect, providing a more stable ES than CMOS variations, but the extra unconventional fabrication steps are undesirable in mass production.
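The statement that a few unstable entropy sources produce a much larger share of unstable CRPs follows from simple probability. The sketch below assumes that each response combines k entropy sources, that instabilities are independent, and that a response counts as unstable whenever any source it uses is unstable. These are modeling assumptions for illustration, not the paper's analysis, and the 1% and 64 figures are hypothetical.

```python
# How a small fraction of unstable entropy sources (ES) inflates the
# fraction of unstable challenge-response pairs (CRPs).
# Assumption: a CRP is unstable if ANY of the ES it depends on is unstable,
# and ES instabilities are independent.

def unstable_crp_fraction(p_unstable_es: float, es_per_response: int) -> float:
    """Probability that at least one of the ES behind a response is unstable."""
    if not 0.0 <= p_unstable_es <= 1.0:
        raise ValueError("p_unstable_es must be a probability")
    return 1.0 - (1.0 - p_unstable_es) ** es_per_response

P_UNSTABLE = 0.01                    # hypothetical: 1% of ES are unstable
for k in (1, 8, 64):                 # hypothetical numbers of ES per response
    frac = unstable_crp_fraction(P_UNSTABLE, k)
    print(f"{k:3d} ES per response -> {frac:.1%} of CRPs unstable")
```

Under these assumptions, mixing more entropy sources into each response, which helps ML resistance, also multiplies the CRP discard rate, so ES stability has to be addressed at the source.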

Citations: 1

Design Techniques for High Linearity and Dynamic Range Digital to Analog Converters

2022 IEEE Custom Integrated Circuits Conference (CICC)

Pub Date : 2022-04-01 DOI: 10.1109/CICC53496.2022.9772804

A. Shabra, Yun-Shiang Shu, Shon-Hang Wen, Kuan-Dar Chen

This paper presents recent developments in the design of high-linearity, high-dynamic-range digital-to-analog converters (DACs). It covers techniques that enable a total harmonic distortion (THD) below -120 dB and a dynamic range (DR) above 130 dB. Mismatch errors in non-unary DACs can be addressed with mismatch error shaping (MES). Real-time dynamic element matching (DEM) and fixed-transition vector element selection logic (FT-VESL) can mitigate inter-symbol interference (ISI). Moreover, selection algorithms and divide-and-conquer algorithms simplify the hardware implementation. The paper also covers the mitigation of distortion caused by analog impairments such as nonlinearities of DAC elements and passives, and routing parasitics. Finally, techniques to suppress reference noise are covered.
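To get a feel for how demanding these targets are, the decibel figures can be converted to linear amplitude ratios. The sketch below uses the standard 20·log10 amplitude convention. The 2 V rms full-scale level is a hypothetical value chosen only to express the noise floor in volts.

```python
# Converting the THD and DR targets from the abstract into linear ratios.

def db_to_amplitude_ratio(db: float) -> float:
    """Amplitude (voltage) ratio corresponding to a value in decibels."""
    return 10.0 ** (db / 20.0)

thd_ratio = db_to_amplitude_ratio(-120.0)   # harmonics relative to the fundamental
dr_ratio = db_to_amplitude_ratio(130.0)     # full scale relative to the noise floor

FULL_SCALE_VRMS = 2.0                       # hypothetical full-scale output level
noise_floor_vrms = FULL_SCALE_VRMS / dr_ratio

print(f"THD of -120 dB: harmonics at {thd_ratio * 1e6:.1f} ppm of the fundamental")
print(f"DR of 130 dB: full scale is {dr_ratio:.3g} times the noise floor")
print(f"noise floor for {FULL_SCALE_VRMS} V rms full scale: {noise_floor_vrms * 1e6:.2f} uV rms")
```

Harmonics at the part-per-million level are far below what raw element matching provides, which is why the abstract leans on shaping and selection techniques rather than matching alone.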

Citations: 3

A SAR-Assisted DC-Coupled Chopper-Stabilized 20μs-Artifact-Recovery $\Delta\Sigma$ ADC for Simultaneous Neural Recording and Stimulation

2022 IEEE Custom Integrated Circuits Conference (CICC)

Pub Date : 2022-04-01 DOI: 10.1109/CICC53496.2022.9772782

Tania Moeinfard, Georg Zoidl, Hossein Kassiri

Driven by the high efficacy of electrical neuro-stimulation in the treatment of various neurological disorders, optimizing stimulation parameters for each specific patient has become a highly sought goal [1]. This optimization requires recording neuronal activity (10 μV to 1 mV) during and shortly after stimulation, when large (e.g., >100 mV) stimulation artifacts are present at all recording electrodes. Thus, the neural recording front-end needs a very large (>80 dB) dynamic range (DR), which cannot be achieved using conventional amplifiers.
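The 80 dB figure follows directly from the amplitudes quoted in the abstract. The short calculation below takes the ratio of a 100 mV artifact to the smallest 10 μV neural signal; the function name is illustrative.

```python
import math

# Dynamic range needed to capture a large artifact and a small neural
# signal at the same time, using the amplitudes quoted in the abstract.

def required_dynamic_range_db(largest_v: float, smallest_v: float) -> float:
    """Ratio of the largest to the smallest amplitude, in decibels."""
    return 20.0 * math.log10(largest_v / smallest_v)

ARTIFACT_V = 100e-3        # stimulation artifact, at least 100 mV
SMALLEST_NEURAL_V = 10e-6  # lower end of the neural signal range, 10 uV

dr_db = required_dynamic_range_db(ARTIFACT_V, SMALLEST_NEURAL_V)
print(f"required dynamic range: {dr_db:.1f} dB")
```

This is a lower bound: artifacts larger than 100 mV, or a noise floor that must sit below the 10 μV signal to resolve it, push the requirement above 80 dB.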

Citations: 0

A 20µs turn-on time, 24kHz resolution, 1.5-100MHz digitally programmable temperature-compensated clock generator with 7.5ppm/°C inaccuracy

2022 IEEE Custom Integrated Circuits Conference (CICC)

Pub Date : 2022-04-01 DOI: 10.1109/CICC53496.2022.9772819

Yongxin Li, Nilanjan Pal, Tianyu Wang, M. Ahmed, Ahmed Abdelrahman, Mohamed Badr Younis, Ruhao Xia, Kyu-Sang Park, P. Hanumolu

The demand for portable electronic devices with a small form factor and extended battery life is ever increasing. Timing circuits impose several critical impediments to meeting this demand. For example, low-power microcontroller units use multiple crystal oscillators (XOs) and several on-chip fractional-N phase-locked loops (PLLs) to generate the desired clocks, which significantly increases board space and power consumption. XOs and PLLs cannot be turned ON and OFF rapidly, so they also severely limit the ability to employ system-level power-reduction strategies such as power cycling. On-chip closed-loop frequency-locked loop (FLL) based oscillators are promising candidates to address some of these drawbacks [1]. While they can achieve excellent frequency accuracy, they occupy a large area, consume significant power, cannot be turned ON/OFF rapidly because of their very low bandwidth, and can only provide an output at one fixed frequency. Given these drawbacks, this paper presents a fast start-up, temperature-stable digital FLL-based oscillator and low-jitter open-loop fractional dividers that can provide highly programmable clock outputs. Fabricated in a 65 nm CMOS process, the prototype can generate clock outputs from about 1.5 MHz to 100 MHz with a frequency inaccuracy of 7.5 ppm/°C and a resolution of 24 kHz.
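The headline numbers can be related to each other with a short calculation. The sketch below counts how many 24 kHz steps span the quoted output range and converts the 7.5 ppm/°C figure into hertz of drift. It treats the inaccuracy as a worst-case linear temperature coefficient, which is an assumption, and the 100 °C temperature span is hypothetical.

```python
# Relating the programmable range, resolution, and temperature inaccuracy
# quoted in the abstract. Names and the temperature span are illustrative.

F_MIN_HZ = 1.5e6           # lowest output frequency (about 1.5 MHz)
F_MAX_HZ = 100e6           # highest output frequency
STEP_HZ = 24e3             # frequency resolution
TEMPCO_PPM_PER_C = 7.5     # frequency inaccuracy per degree Celsius

def tuning_steps(f_min_hz: float, f_max_hz: float, step_hz: float) -> float:
    """Number of resolution steps needed to cover the output range."""
    return (f_max_hz - f_min_hz) / step_hz

def drift_hz(f_hz: float, ppm_per_c: float, delta_t_c: float) -> float:
    """Frequency drift for a temperature change, assuming a linear coefficient."""
    return f_hz * ppm_per_c * 1e-6 * delta_t_c

steps = tuning_steps(F_MIN_HZ, F_MAX_HZ, STEP_HZ)
per_degree = drift_hz(F_MAX_HZ, TEMPCO_PPM_PER_C, 1.0)
over_span = drift_hz(F_MAX_HZ, TEMPCO_PPM_PER_C, 100.0)   # hypothetical 100 C span

print(f"steps across the range: {steps:.0f}")
print(f"drift at 100 MHz: {per_degree:.0f} Hz per degree C")
print(f"drift at 100 MHz over a 100 C span: {over_span / 1e3:.0f} kHz")
```

Under these assumptions, the drift per degree at the top of the range is a small fraction of one 24 kHz step, so the frequency setting and the temperature compensation operate on clearly separated scales.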

Citations: 4

A 0.8V/0.6V 2.2μW Time-Domain Analog Front-End with $540\,\text{mV}_{\text{pp}}$ Input Range, 81.6dB SNDR and $80\,\mathrm{M}\Omega$ Input Impedance

2022 IEEE Custom Integrated Circuits Conference (CICC)

Pub Date : 2022-04-01 DOI: 10.1109/CICC53496.2022.9772780

Liheng Liu, Tianxiang Qu, Pengjie Wang, Yao Zhang, Zhiliang Hong, Jiawei Xu

Next-generation autonomous sensor nodes are being developed toward ultra-low power consumption and on-node signal processing capability. The former facilitates battery-less, miniaturized sensors that rely on harvested energy, while the latter enables an intelligent System-on-Chip (SoC) to sense and process multimodal parameters locally on the sensor node. For the analog front-end (AFE), a straightforward way to achieve low power and digital compatibility is to reduce its supply voltage to the sub-volt range. However, supply scaling is unfriendly to conventional AFEs, which often require a large dynamic range (DR) and high linearity. Furthermore, practical requirements for low noise, high input impedance $(\mathrm{Z}_{\text{in}})$, and sensor-dependent bandwidth (BW) make it even harder to serve a variety of sensors. To realize low-voltage AFEs, time-domain (TD) direct digitization architectures [1]–[5] were proposed (Fig. 1). The $\mathrm{G}_{\mathrm{m}}-\mathrm{C}$ based delta-sigma modulator $(\Delta\Sigma \mathrm{M})$ with a built-in TD loop filter benefits from high input impedance and higher-order noise shaping [1], but the $\mathrm{G}_{\mathrm{m}}$ stage exhibits nonlinearity for large input signals. Alternatively, VCO-based AFEs provide better supply voltage scalability and inherent ${1}^{\text{st}}$-order noise shaping. The open-loop VCO-based AFE [2] benefits from a small chip area but suffers from a tradeoff between linearity and input range. While the closed-loop VCO-based AFE solves this issue [3]–[5], this topology often needs a highly linear feedback DAC that notably reduces the input impedance of the AFE unless impedance-boosting buffers are used [6]. In addition, closed-loop VCO-based AFEs need to be clocked continuously, resulting in power overhead.
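The SNDR and input range in the paper title can be translated into more familiar terms. The sketch below uses the standard sine-wave relation SNDR = 6.02·ENOB + 1.76 dB and assumes that the 81.6 dB SNDR is reached with a full 540 mVpp sinusoidal input. The listing does not state the measurement condition, so the noise-plus-distortion level is an estimate under that assumption.

```python
import math

# Translating the title's SNDR and input range into effective bits and an
# input-referred noise-plus-distortion level (sinusoidal input assumed).

def enob(sndr_db: float) -> float:
    """Effective number of bits from SNDR, for a full-scale sine wave."""
    return (sndr_db - 1.76) / 6.02

def noise_plus_distortion_rms(v_pp: float, sndr_db: float) -> float:
    """RMS noise plus distortion for a sine of peak-to-peak amplitude v_pp."""
    signal_rms = v_pp / (2.0 * math.sqrt(2.0))
    return signal_rms * 10.0 ** (-sndr_db / 20.0)

SNDR_DB = 81.6          # from the paper title
INPUT_RANGE_VPP = 0.54  # 540 mVpp, from the paper title

bits = enob(SNDR_DB)
nd_rms = noise_plus_distortion_rms(INPUT_RANGE_VPP, SNDR_DB)

print(f"ENOB: {bits:.2f} bits")
print(f"noise plus distortion: {nd_rms * 1e6:.1f} uV rms")
```

Holding more than 13 effective bits over an input range that is a large fraction of a sub-volt supply is what makes the linearity versus input range tradeoff described above so restrictive.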

Citations: 0

A Battery-Less Crystal-Less 49.8µW Neural-Recording Chip Featuring Two-Tone RF Power Harvesting

2022 IEEE Custom Integrated Circuits Conference (CICC)

Pub Date : 2022-04-01 DOI: 10.1109/CICC53496.2022.9772792

Ziyi Chang, Changgui Yang, Yunshan Zhang, Zhuhao Li, Tianyu Zheng, Yuxuan Luo, Shaomin Zhang, Kedi Xu, Gang Pan, Bo Zhao, Yong Chen

Implantable biomedical devices (IMDs) capable of recording electrophysiological signals effectively facilitate medical treatment, but they also face strict volume requirements [1]–[6]. An effective way to miniaturize IMDs is to eliminate bulky components such as the battery and the crystal. Wireless power transfer (WPT) helps to remove the battery [1]–[5], but a bulky crystal is still required to provide a precise clock that ensures the performance of the signal-acquisition and communication blocks (Fig. 1 left, top). To eliminate the crystal, prior work [1] uses an on-chip oscillator as the clock generator (Fig. 1 left, middle), but it suffers from off-chip tuning and from SNR degradation in the analog front-end (AFE), the ADC, and wireless transmission. Recently, recovering the clock from the power-harvesting tone has become a promising solution to further reduce the volume of battery-less systems (Fig. 1 left, bottom) [2]–[5]. However, this approach faces a difficult trade-off: a high power-harvesting frequency leads to power-hungry clock-recovery circuits [4], while a low frequency requires a large antenna [5].

Citations: 4

MPAM: Reliable, Low-Latency, Near-Threshold-Voltage Multi-Voltage/Frequency-Domain Network-on-Chip with Metastability Risk Prediction and Mitigation

2022 IEEE Custom Integrated Circuits Conference (CICC)

Pub Date : 2022-04-01 DOI: 10.1109/CICC53496.2022.9772849

Chuxiong Lin, Weifeng He, Yannan Sun, Lingmin Shao, Bo Zhang, Jun Yang, Mingoo Seok

Emerging applications such as drones and autonomous vehicles require systems-on-chip (SoCs) with high reliability; for example, the mean time between failures (MTBF) needs to exceed tens of thousands of hours [1]. Meanwhile, as these applications demand ever higher performance and energy efficiency, a multi-core architecture is often desirable. Here, each core operates in an independent voltage/frequency (V/F) domain, ideally ranging from near-threshold voltage (NTV) to super-threshold, and the cores communicate with one another via a network-on-chip (NoC) [2]. However, this makes it challenging to ensure that clock domain crossings are robust against metastability. Metastability becomes even more critical in NTV circuits because the metastability resolution time constant $T$ grows super-linearly as the supply voltage is scaled down [3]. Conventionally, an NoC uses multi-stage synchronizers (4 stages in [4]) to improve MTBF, but they increase latency and cannot completely eliminate metastability. Recently, [5] proposed a novel NTV flip-flop with a lower probability of entering metastability. Another recent work [6] proposed detecting a necessary condition for metastability and mitigating it by modulating the RX clock and requesting retransmission to guarantee data correctness. However, because it detects a necessary condition rather than actual metastability, it tends to request retransmission more often than needed, hurting latency, throughput, and energy efficiency.
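
The link between the resolution time constant, the number of synchronizer stages, and MTBF can be illustrated with the widely used textbook estimate MTBF = exp(t_r / τ) / (T_w · f_clk · f_data). This is a minimal sketch under stated assumptions, not the model used in the paper: `tau` stands in for the time constant written as $T$ in the abstract, and the window `T_WINDOW`, the clock and data rates, and the "one clock period of settling per extra stage" rule are all hypothetical values chosen for illustration.

```python
import math


def log10_mtbf_seconds(t_resolve, tau, t_window, f_clk, f_data):
    """log10 of MTBF = exp(t_resolve / tau) / (t_window * f_clk * f_data).

    Computed in the log domain so that large t_resolve / tau ratios do not overflow.
    """
    return (t_resolve / tau) / math.log(10) - math.log10(t_window * f_clk * f_data)


# Hypothetical operating point, not taken from the paper.
F_CLK = 100e6      # receiving clock frequency, Hz
F_DATA = 10e6      # rate of asynchronous data transitions, Hz
T_WINDOW = 20e-12  # metastability capture window, s

for tau in (50e-12, 500e-12):  # assumed small vs. large resolution time constant
    for stages in (2, 4):
        # Assumption: each flip-flop after the first adds about one clock period of settling time.
        t_resolve = (stages - 1) / F_CLK
        exponent = log10_mtbf_seconds(t_resolve, tau, T_WINDOW, F_CLK, F_DATA)
        print(f"tau={tau * 1e12:.0f} ps, {stages} stages: MTBF ~ 10^{exponent:.1f} s")
```

Because the time constant sits in the exponent, a larger value at low voltage reduces MTBF sharply. Adding stages restores the margin, but each stage costs roughly one more clock cycle of latency, which is the drawback the abstract attributes to multi-stage synchronizers.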

Citations: 1