Pub Date: 2022-04-01; DOI: 10.1109/CICC53496.2022.9772826
Zhiyu Chen, Qing Jin, Zhanghao Yu, Yanzhi Wang, Kaiyuan Yang
Process-In-Memory (PIM) is a promising solution for alleviating the memory-wall bottleneck in memory-intensive applications like CNNs. Recent demonstrations of SRAM-based PIM designs, particularly those computing in the charge domain [1]–[5], have greatly improved the linearity of analog multiply-and-add computations (MAC) and quantization, and their robustness to process variations, making their inference accuracy approach that of digital hardware in practical computer vision benchmarks such as CIFAR-10. However, several limitations to large-scale integration of PIM macros remain, especially the assumed availability of powerful external reference voltage drivers and the lack of scaling-friendly designs. First, high-bandwidth analog buffers driving large output loads are necessary to distribute the massive number of analog signals (e.g. DAC outputs) across the macro without sacrificing signal fidelity and computing speed. For example, [10] reports that its DAC drivers occupy 11.4% of the macro area and incur a 94-pJ energy overhead in 28 nm, accounting for 68.5% of the total energy in a macro supporting $5\mathrm{b}$ activations and $8\mathrm{b}$ weights. Second, SAR ADCs are popular for the common 5–9 bit resolution range. Conventional SAR ADCs require high-speed, power-hungry analog buffers to drive the capacitive DACs (CDACs) to reference voltages with short settling time and high accuracy. Given the hundreds of ADCs in each macro, the design complexity and overheads incurred by these drivers are dominant. Our simulated reference driver takes 2.9-pJ energy in 65 nm, which is comparable to an ADC (e.g. 3.56 $\text{pJ}$ in [12]). Third, it is challenging to fit any conventional $\geq 7\mathrm{b}$ SAR ADC into the narrow width of SRAM cells due to the bulky CDACs and layout matching requirements, ultimately limiting the computing parallelism and energy amortization.
Title: DCT-RAM: A Driver-Free Process-In-Memory 8T SRAM Macro with Multi-Bit Charge-Domain Computation and Time-Domain Quantization
Venue: 2022 IEEE Custom Integrated Circuits Conference (CICC)
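The energy figures quoted in the DCT-RAM abstract can be cross-checked with a few lines of arithmetic. The sketch below only rearranges the numbers given there (94 pJ, 68.5%, 2.9 pJ, 3.56 pJ); the variable names are illustrative, and the abstract does not state what unit of work the 94 pJ refers to, so the derived total carries the same ambiguity.

```python
# Figures quoted in the abstract for the design in [10] (28 nm).
DAC_DRIVER_ENERGY_PJ = 94.0   # energy overhead of the DAC drivers
DAC_DRIVER_SHARE = 0.685      # stated fraction of total macro energy

# Total macro energy implied by those two numbers, and what is left over
# for everything that is not a DAC driver.
total_energy_pj = DAC_DRIVER_ENERGY_PJ / DAC_DRIVER_SHARE
other_energy_pj = total_energy_pj - DAC_DRIVER_ENERGY_PJ

# Simulated SAR reference driver (65 nm) versus the ADC energy cited from [12].
REF_DRIVER_ENERGY_PJ = 2.9
ADC_ENERGY_PJ = 3.56
driver_to_adc_ratio = REF_DRIVER_ENERGY_PJ / ADC_ENERGY_PJ

print(f"implied total macro energy: {total_energy_pj:.1f} pJ")
print(f"energy outside DAC drivers: {other_energy_pj:.1f} pJ")
print(f"reference driver / ADC energy: {driver_to_adc_ratio:.2f}")
```

Under these figures the reference driver costs roughly four fifths of the ADC it serves, which is the sense in which the abstract calls the two comparable.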
Pub Date: 2022-04-01; DOI: 10.1109/CICC53496.2022.9772861
Donghee Cho, Hyungjoo Cho, Sein Oh, Yoontae Jung, S. Ha, Chul-Woong Kim, M. Je
In mobile devices, various functional blocks require different voltage levels, all of which must be generated from a single lithium-ion battery (Fig. 1); 3.3V is one of the most demanded levels. Since the battery output voltage falls gradually from 4.2 to 2.9V as it discharges, generating 3.3V requires a buck-boost converter that addresses the following challenges. 1) Due to ever-increasing load-current $(\mathsf{I}_{\mathsf{LOAD}})$ demands in the mobile device, the conduction loss in the DC resistance (DCR) of the inductor overwhelms other losses, especially when a small-size inductor is used. 2) At large $\mathsf{I}_{\mathsf{LOAD}}$, the battery output voltage decreases due to its internal resistance, making the boost operation more dominant. 3) Because display brightness and processing speed are adjusted dynamically, burst currents are generated, resulting in unpredictable input voltage fluctuations.
Title: A Single-Mode Dual-Path Buck-Boost Converter with Reduced Inductor Current Across All Duty Cases Achieving 95.58% Efficiency at 1A in Boost Operation
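To see why inductor DCR loss dominates in boost operation, the following sketch evaluates the textbook average inductor current of an ideal, lossless conventional buck-boost stage across the battery range quoted above. It is not the paper's dual-path topology, it ignores current ripple, and the DCR value is a hypothetical placeholder.

```python
def inductor_current(v_in: float, v_out: float, i_load: float) -> float:
    """Average inductor current of an ideal conventional buck-boost stage.

    Buck region (v_in >= v_out): the inductor carries the load current.
    Boost region (v_in < v_out): power balance gives i_L = i_load * v_out / v_in.
    """
    return i_load * max(1.0, v_out / v_in)


def dcr_loss_w(i_inductor: float, dcr_ohm: float) -> float:
    """Conduction loss in the inductor DC resistance, ignoring current ripple."""
    return i_inductor ** 2 * dcr_ohm


V_OUT = 3.3    # target rail from the abstract, volts
I_LOAD = 1.0   # load current, amperes
DCR_OHM = 0.1  # hypothetical inductor DCR, ohms (not from the source)

for v_in in (4.2, 3.3, 2.9):  # battery voltage range quoted in the abstract
    i_l = inductor_current(v_in, V_OUT, I_LOAD)
    loss = dcr_loss_w(i_l, DCR_OHM)
    print(f"Vin={v_in:.1f} V  I_L={i_l:.3f} A  DCR loss={loss * 1e3:.1f} mW")
```

Because the loss scales with the square of the inductor current, the roughly 14% higher inductor current at a 2.9 V input translates into about 29% more DCR loss than in buck operation, which is the motivation for reducing the inductor current in all duty cases.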
Pub Date: 2022-04-01; DOI: 10.1109/CICC53496.2022.9772798
Lei Zhao, Junyao Tang, Cheng Huang
Multiphase DC-DC converters have been widely used to deliver more power more efficiently with smaller ripples and faster large-signal dynamic responses [1]–[5]. In terms of closed-loop voltage regulation, traditional linear PWM control has limited small-signal bandwidth, which is further compromised to ensure stability at different loading conditions with different PVT and LC variations. Non-linear control, such as hysteretic control, has neither small-signal bandwidth limitations nor stability concerns, and can thus potentially achieve faster dynamic performance. Among different topologies, current-mode hysteretic control has been adopted in 4-phase converters [2], [3]. To ensure proper operation at higher frequency, they require careful matching between the inductor current-sensing RC networks and the inductance and parasitic DC resistance (DCR) of the power inductors [2], or more complex RC sensing networks [3]. Also, the converters in [2]–[4] did not include current balancing, so mismatches in power transistors, control timing, and power inductors among different phases can unbalance the phase currents and significantly compromise efficiency. To maintain optimum efficiency over a wide loading range, active-phase-count (APC) control has been introduced in [1], [2], [4]. In [4], APC is realized by a multi-bit ADC, which increases the design complexity and power consumption. Double-adaptive-bound (DAB) hysteretic control in [6] has demonstrated fast transient responses; however, it only works in single phase, and the operation is very sensitive to the delay of the comparator, the gate driver and other circuits in the control path, and to the matching of the RC filters, especially at higher switching frequencies. Besides, due to the lack of a high-gain amplifier, output voltage DC accuracy is also compromised in hysteretic-controlled switching converters, with a 40mV/1A load regulation in [2].
Title: A Fully In-Package 4-Phase Fixed-Frequency DAB Hysteretic Controlled DC-DC Converter with Enhanced Efficiency, Load Regulation and Transient Response
Pub Date: 2022-04-01; DOI: 10.1109/CICC53496.2022.9772852
Yan He, Qixuan Yu, Kaiyuan Yang
Strong physically unclonable functions (SPUFs) are promising solutions for low-cost authentication of IoT edge devices, by generating an exponential number of device-specific challenge-response pairs (CRPs). Early SPUF designs are vulnerable to machine learning (ML) modeling attacks due to the lack of nonlinearity in challenge-to-response mapping [1]. Recent studies have shown that SPUFs can be designed with resiliency against ML modeling by incorporating entropy sources with non-linear operations such as Entropy LUT [2], AES S-box [3], or XOR network [4]. They achieved high resistance against known black-box ML modeling attacks with more than 0.1M training CRPs. A key challenge in these ML-resistant Strong PUF designs is ensuring the stability of the entropy sources (ES) under environmental variations, because a small number of unstable ES will lead to a much larger portion of unstable CRPs. The unstable CRPs need to be discarded, which reduces the number of available authentication attempts without CRP reuse. They are also a potential weak point that can be exploited to facilitate ML modeling using reliability-based attacks [5]. [2] eliminates the ES instability by hour-long accelerated aging at a high temperature, which incurs a high testing cost. [3] creates an accurate ES instability map by evaluating ES under multiple temperature points and filtering out the unstable CRPs. An external access point to the ES is necessary for direct evaluation, representing another potential attack point. [4] proposes a special lithography step to randomize the interconnect, providing a more stable ES than CMOS variations. But the extra unconventional fabrication steps are undesirable in mass production.
Title: A Lossless and Modeling Attack-Resistant Strong PUF with <4E-8 Bit Error Rate
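The claim that a small number of unstable entropy sources leads to a much larger portion of unstable CRPs can be illustrated with a simple probability model. The sketch below assumes, purely for illustration, that each response depends on `k` independently chosen ES bits and is counted as unstable if any of them is unstable; this model, the function name, and the example values are assumptions and are not taken from the paper.

```python
def unstable_crp_fraction(p_unstable_es: float, k: int) -> float:
    """Fraction of CRPs touched by at least one unstable entropy source.

    Simplified model (an assumption, not the paper's analysis): each response
    mixes k independent ES bits, and one unstable bit makes the CRP unstable.
    """
    return 1.0 - (1.0 - p_unstable_es) ** k


# Hypothetical example values: 1% unstable ES bits, varying mixing depth.
for k in (1, 8, 64, 256):
    frac = unstable_crp_fraction(0.01, k)
    print(f"k={k:3d}: {frac:.1%} of CRPs involve an unstable ES")
```

Even in this crude model, the nonlinear mixing that provides ML resistance also amplifies instability, which is why discarding unstable CRPs becomes costly.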
Pub Date: 2022-04-01; DOI: 10.1109/CICC53496.2022.9772804
A. Shabra, Yun-Shiang Shu, Shon-Hang Wen, Kuan-Dar Chen
This paper presents recent developments in the design of high linearity and dynamic range digital to analog converters (DACs). It covers techniques that enable a THD < -120dB and DR > 130dB. Mismatch errors in non-unary DACs can be addressed with mismatch error shaping (MES). Real-time DEM and fixed-transition vector element selection logic (FT-VESL) can mitigate ISI. Moreover, selection algorithms and divide-and-conquer algorithms simplify the hardware implementation. The paper also covers mitigation of distortion caused by analog impairments such as nonlinearities of DAC elements and passives, and routing parasitics. Finally, techniques to suppress reference noise are covered.
Title: Design Techniques for High Linearity and Dynamic Range Digital to Analog Converters
Pub Date: 2022-04-01; DOI: 10.1109/CICC53496.2022.9772782
Tania Moeinfard, Georg Zoidl, Hossein Kassiri
Driven by the high efficacy of electrical neuro-stimulation in treatment of various neurological disorders, optimizing stimulation parameters for each specific patient has become a highly sought goal [1]. This optimization requires recording neuronal activity (10μV to 1mV) during and shortly after stimulation, when large (e.g., >100mV) stimulation artifacts are present at all recording electrodes. Thus, a very large (>80dB) dynamic range (DR) is needed for the neural recording front-end, which cannot be achieved using conventional amplifiers.
Title: A SAR-Assisted DC-Coupled Chopper-Stabilized 20μs-Artifact-Recovery $\Delta\Sigma$ ADC for Simultaneous Neural Recording and Stimulation
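The >80 dB dynamic-range requirement in the neural-recording abstract follows directly from the amplitudes it quotes. The sketch below applies the standard voltage-ratio definition of dynamic range to those numbers; the function name is illustrative.

```python
import math


def dynamic_range_db(v_max: float, v_min: float) -> float:
    """Dynamic range in dB for a ratio of two voltage amplitudes."""
    return 20.0 * math.log10(v_max / v_min)


SMALLEST_NEURAL_SIGNAL_V = 10e-6   # 10 uV, lower end of neuronal activity
LARGEST_NEURAL_SIGNAL_V = 1e-3     # 1 mV, upper end of neuronal activity
STIM_ARTIFACT_V = 100e-3           # 100 mV, lower bound on artifact amplitude

signal_only_db = dynamic_range_db(LARGEST_NEURAL_SIGNAL_V, SMALLEST_NEURAL_SIGNAL_V)
with_artifact_db = dynamic_range_db(STIM_ARTIFACT_V, SMALLEST_NEURAL_SIGNAL_V)

print(f"neural signal alone:      {signal_only_db:.0f} dB")
print(f"with stimulation artifact: {with_artifact_db:.0f} dB")
```

The artifact therefore adds about 40 dB on top of what the neural signal alone would need, and since the quoted 100 mV is a lower bound the requirement is strictly greater than 80 dB.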
Pub Date: 2022-04-01; DOI: 10.1109/CICC53496.2022.9772819
Yongxin Li, Nilanjan Pal, Tianyu Wang, M. Ahmed, Ahmed Abdelrahman, Mohamed Badr Younis, Ruhao Xia, Kyu-Sang Park, P. Hanumolu
The demand for portable electronic devices with a small form factor and extended battery life is ever increasing. Timing circuits impose several critical impediments in meeting this demand. For example, low-power microcontroller units use multiple crystal oscillators (XOs) and several on-chip fractional-N phase-locked loops (PLLs) to generate the desired clocks, which significantly increases board space and power consumption. XOs and PLLs cannot be turned ON and OFF rapidly, so they also severely limit the ability to employ system-level power-reduction strategies such as power cycling. On-chip closed-loop frequency-locked loop (FLL) based oscillators are promising candidates to address some of these drawbacks [1]. While they can achieve excellent frequency accuracy, they occupy a large area, consume significant power, cannot be turned ON/OFF rapidly due to their very low bandwidth, and can only provide an output at one fixed frequency. Given these drawbacks, this paper presents a fast start-up, temperature-stable digital FLL-based oscillator and low-jitter open-loop fractional dividers that can provide highly programmable clock outputs. Fabricated in a 65nm CMOS process, the prototype can generate clock outputs from about 1.5MHz to 100MHz with a frequency inaccuracy and resolution of 7.5ppm/°C and 24kHz, respectively.
Title: A 20µs turn-on time, 24kHz resolution, 1.5-100MHz digitally programmable temperature-compensated clock generator with 7.5ppm/°C inaccuracy
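The headline numbers of the clock generator can be put on a common scale with a short calculation. The sketch below treats 7.5 ppm/°C as a simple linear temperature coefficient and 24 kHz as a fixed frequency step; both readings are assumptions made for illustration, since the abstract does not say how the figures were measured.

```python
TEMP_COEFF_PPM_PER_DEGC = 7.5  # frequency inaccuracy quoted in the abstract
RESOLUTION_HZ = 24e3           # frequency resolution quoted in the abstract


def drift_hz_per_degc(f_out_hz: float) -> float:
    """Absolute drift per degree C, assuming a linear temperature coefficient."""
    return f_out_hz * TEMP_COEFF_PPM_PER_DEGC * 1e-6


def resolution_ppm(f_out_hz: float) -> float:
    """The 24 kHz step expressed relative to the output frequency, in ppm."""
    return RESOLUTION_HZ / f_out_hz * 1e6


for f_out in (1.5e6, 100e6):  # ends of the quoted output range
    print(
        f"f_out={f_out / 1e6:6.1f} MHz  "
        f"drift={drift_hz_per_degc(f_out):7.2f} Hz/degC  "
        f"step={resolution_ppm(f_out):8.0f} ppm"
    )
```

Under these assumptions the programming step, not temperature drift, sets the relative frequency granularity across the whole output range.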
The next generation of autonomous sensor nodes is being developed towards ultra-low power with on-node signal processing capability. The former facilitates battery-less and miniaturized sensors relying on harvested energy, while the latter enables an intelligent System-on-Chip (SoC) to sense and process multimodal parameters locally on the sensor nodes. As for the analog front-end (AFE), a straightforward solution for low power and digital compatibility is to reduce its supply voltage to the sub-volt range. However, supply scaling is less friendly to conventional AFEs, which often require large dynamic range (DR) and high linearity. Furthermore, practical considerations of low noise, high input impedance $(\mathrm{Z}_{\text{in}})$ and sensor-dependent bandwidth (BW) further exacerbate the challenges of complying with versatile sensors. To realize the low-voltage AFE, time-domain (TD) direct digitization architectures [1]–[5] were proposed (Fig. 1). The $\mathrm{G}_{\mathrm{m}}\text{-}\mathrm{C}$ based delta-sigma modulator $(\Delta\Sigma\mathrm{M})$ with a built-in TD loop filter benefits from high input impedance and higher order noise shaping [1], but the $\mathrm{G}_{\mathrm{m}}$ exhibits nonlinearity for a large input signal. Alternatively, the VCO-based AFEs provide better supply voltage scalability and inherent $1^{\text{st}}$-order noise shaping. The open-loop VCO-based AFE [2] benefits from a small chip area, but suffers from the tradeoff between linearity and input range. While the closed-loop VCO-based AFE solves this issue [3]–[5], this topology often needs a highly linear feedback DAC that notably reduces the input impedance of the AFE, unless impedance boosting buffers are used [6]. Besides, the closed-loop VCO-based AFEs need to be clocked continuously, resulting in power overhead.
Title: A 0.8V/0.6V 2.2μW Time-Domain Analog Front-End with $540\,\text{mV}_{\text{pp}}$ Input Range, 81.6dB SNDR and $80\,\mathrm{M}\Omega$ Input Impedance
Authors: Liheng Liu, Tianxiang Qu, Pengjie Wang, Yao Zhang, Zhiliang Hong, Jiawei Xu
Pub Date: 2022-04-01; DOI: 10.1109/CICC53496.2022.9772780
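The 81.6 dB SNDR quoted in the title of this entry can be converted to an effective number of bits with the standard full-scale sine-wave relation ENOB = (SNDR - 1.76) / 6.02. The sketch below is only this conversion; the ENOB value is derived here and is not reported in the abstract.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits for a full-scale sine input."""
    return (sndr_db - 1.76) / 6.02


SNDR_DB = 81.6  # from the entry title

enob = enob_from_sndr(SNDR_DB)
print(f"SNDR {SNDR_DB} dB corresponds to about {enob:.2f} effective bits")
```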
Pub Date: 2022-04-01; DOI: 10.1109/CICC53496.2022.9772850
Muya Chang, Xunzhao Yin, Z. Toroczkai, X. Hu, A. Raychowdhury
NP-hard combinatorial optimization problems (COPs) are known to be very complex and expensive to solve with traditional computers. COPs can be mapped onto many different kinds of architectures, such as fully digital [1]–[3], event-based [4], oscillator-based [5]–[6], optical laser [7], qubit, or purely analog approaches [8]–[9]. We propose a scalable, purely analog, clock-free continuous-time dynamical system to solve COPs in hardware.
Title: An Analog Clock-free Compute Fabric base on Continuous-Time Dynamical System for Solving Combinatorial Optimization Problems
Pub Date : 2022-04-01 DOI: 10.1109/CICC53496.2022.9772849
Chuxiong Lin, Weifeng He, Yannan Sun, Lingmin Shao, Bo Zhang, Jun Yang, Mingoo Seok
Emerging applications such as drones and autonomous vehicles require systems-on-chip (SoCs) with high reliability; for example, the mean time between failures (MTBF) needs to exceed tens of thousands of hours [1]. Meanwhile, as these applications demand ever higher performance and energy efficiency, a multi-core architecture is often desirable. Here, each core operates in an independent voltage/frequency (V/F) domain, ideally anywhere from near-threshold voltage (NTV) to super-threshold, while communicating with the other cores via a network-on-chip (NoC) [2]. However, this makes it challenging to ensure that clock domain crossings are robust against metastability. Metastability becomes even more critical in NTV circuits, since the metastability resolution time constant $T$ grows super-linearly as the voltage is scaled down [3]. Conventionally, an NoC uses multi-stage synchronizers (4 stages in [4]) to improve MTBF, but they increase latency and cannot completely eliminate metastability. Recently, [5] proposed a novel NTV flip-flop with a lower probability of entering metastability. Another recent work [6] proposed detecting a necessary condition for metastability and mitigating it by modulating the RX clock and requesting retransmission to guarantee data correctness. However, because it detects a necessary condition rather than actual metastability, it tends to request retransmission more often than needed, hurting latency, throughput, and energy efficiency.
{"title":"MPAM: Reliable, Low-Latency, Near-Threshold-Voltage Multi-Voltage/Frequency-Domain Network-on-Chip with Metastability Risk Prediction and Mitigation","doi":"10.1109/CICC53496.2022.9772849","journal":{"name":"2022 IEEE Custom Integrated Circuits Conference (CICC)"},"publicationDate":"2022-04-01"}
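The trade-off described in the MPAM abstract, where extra synchronizer stages raise MTBF but a larger resolution time constant at low voltage erodes it, can be made concrete with the commonly used first-order synchronizer estimate MTBF = exp(t_r / tau) / (T_w · f_clk · f_data). The sketch below is not taken from the paper. The function name `synchronizer_mtbf`, the parameter names, and every numeric value are hypothetical, and `tau_s` stands in for the time constant the abstract writes as $T$.

```python
import math


def synchronizer_mtbf(resolve_time_s, tau_s, window_s, f_clk_hz, f_data_hz):
    """First-order estimate: MTBF = exp(t_r / tau) / (T_w * f_clk * f_data).

    resolve_time_s: time allowed for a metastable node to settle
    tau_s:          metastability resolution time constant
    window_s:       metastability (aperture) window of the flip-flop
    """
    return math.exp(resolve_time_s / tau_s) / (window_s * f_clk_hz * f_data_hz)


# Hypothetical operating point, chosen only to show the trends.
F_CLK = 100e6            # receiver clock frequency, Hz
F_DATA = 10e6            # rate of asynchronous data transitions, Hz
PERIOD = 1.0 / F_CLK     # one receiver clock period, s
WINDOW = 100e-12         # assumed metastability window, s
TAU_NOMINAL = 0.5e-9     # assumed tau at nominal voltage, s
TAU_NTV = 2.0e-9         # assumed larger tau near threshold, s

# Each added stage gives roughly one more clock period of resolution time.
one_period = synchronizer_mtbf(PERIOD, TAU_NOMINAL, WINDOW, F_CLK, F_DATA)
two_periods = synchronizer_mtbf(2 * PERIOD, TAU_NOMINAL, WINDOW, F_CLK, F_DATA)
ntv_one_period = synchronizer_mtbf(PERIOD, TAU_NTV, WINDOW, F_CLK, F_DATA)

print(f"one period,  nominal tau: {one_period:.3e} s")
print(f"two periods, nominal tau: {two_periods:.3e} s")
print(f"one period,  NTV tau:     {ntv_one_period:.3e} s")
```

Because the resolution time sits in the exponent, each extra period multiplies MTBF by exp(T_clk / tau). The same exponent shrinks quickly as tau grows, so a synchronizer depth that is adequate at nominal voltage may fall short near threshold, and the added stages cost latency either way.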