Pub Date: 2021-02-13 DOI: 10.1109/ISSCC42613.2021.9365819
Qiaochu Zhang, Shiyu Su, C. Ho, M. Chen
Ring oscillator (RO)-based frequency synthesizers enable cost-efficient and scaling-friendly implementation, but also result in worse phase noise compared to LC-based alternatives. There has been increasing interest in injecting a low-noise reference clock signal into the RO to refresh the accumulated jitter, as seen in multiplying delay-locked loops (MDLLs) and injection-locked phase-locked loops (IL-PLLs). The main challenge of this architecture is that the injection signal derived from the reference clock is not perfectly aligned with the RO phase at the injection node, so spurious tones are generated. This phase alignment issue is exacerbated in fractional-N operation, as the injection time point must gradually drift away from the reference clock phase [1]. One way of achieving this required phase drift is by tuning digital-to-time converters (DTCs) according to a fractional frequency multiplication ratio; however, DTC offset and gain errors can introduce spurs. Several works have demonstrated DTC calibration, but are either limited to foreground calibration or incur an additional time-to-digital converter (TDC) [2], [3]. To further suppress injection-locking-induced spurs, we propose a background DTC calibration scheme that: 1) simultaneously estimates DTC gain and offset errors by occasional injection squelching; 2) performs two-point gain and offset error correction in the respective analog and digital domains; 3) uses TDC dithering [4] to enhance error estimation accuracy; and 4) removes dithering noise with an adaptive comb-filter-assisted cancellation loop. To prove the concept, a fractional-N digital MDLL using the DTC calibration scheme is implemented in 65nm CMOS and demonstrates 1.67ps RMS jitter and -60dBc fractional spur with 26dB spur suppression.
{"title":"29.4 A Fractional-N Digital MDLL with Background Two-Point DTC Calibration Achieving -60dBc Fractional Spur","authors":"Qiaochu Zhang, Shiyu Su, C. Ho, M. Chen","doi":"10.1109/ISSCC42613.2021.9365819","journal":{"name":"2021 IEEE International Solid-State Circuits Conference (ISSCC)"},"publicationDate":"2021-02-13"}
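The phase drift that the DTC has to track in fractional-N operation can be illustrated with a short numerical sketch. It is not the authors' circuit or calibration loop: the function names, the 1/4 fractional ratio, the 400ps RO period, and the error values are all hypothetical, and the model assumes an ideal RO whose edges fall on multiples of its period. Under those assumptions it shows that the required delay follows a sawtooth and that a DTC gain or offset error leaves a timing error that repeats periodically, which is the mechanism behind a fractional spur.

```python
from fractions import Fraction

def dtc_delays(alpha, t_out, n_cycles):
    """Ideal DTC delay per reference cycle for a fractional part alpha.

    Assumed convention: at reference cycle k the reference edge lags the
    most recent RO edge by frac(k * alpha) output periods, so the DTC adds
    the remainder of the period to land on the next RO edge.
    """
    delays = []
    for k in range(n_cycles):
        frac = (k * alpha) % 1                    # accumulated fractional phase
        delays.append(((1 - frac) % 1) * t_out)   # 0 when already aligned
    return delays

def apply_dtc_errors(delays, gain_error, offset):
    """Model a DTC whose actual delay is (1 + gain_error) * ideal + offset."""
    return [(1 + gain_error) * d + offset for d in delays]

alpha = Fraction(1, 4)    # hypothetical fractional part of the multiplication ratio
t_out = Fraction(400)     # hypothetical RO period in ps

ideal = dtc_delays(alpha, t_out, 8)
actual = apply_dtc_errors(ideal, gain_error=Fraction(1, 50), offset=Fraction(3))
errors = [a - i for a, i in zip(actual, ideal)]

print([int(d) for d in ideal])     # sawtooth of required delays in ps
print([int(e) for e in errors])    # residual timing error in ps, periodic in k
```

With these example values the error pattern repeats every four reference cycles. A two-point calibration in the sense of the abstract would aim to drive both the gain term and the offset term of this error toward zero.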
Pub Date: 2021-02-13 DOI: 10.1109/isscc42613.2021.9365794
{"title":"ISSCC 2021 Back Cover","doi":"10.1109/isscc42613.2021.9365794","journal":{"name":"2021 IEEE International Solid-State Circuits Conference (ISSCC)"},"publicationDate":"2021-02-13"}
Pub Date: 2021-02-13 DOI: 10.1109/ISSCC42613.2021.9365782
Yasser Moursy, T. Rosa, Lionel Jure, A. Quelen, S. Genevey, L. Pierrefeu, E. Collins, Joerg Winkler, Jonathan Park, G. Pillonnet, V. Huard, A. Bonzo, P. Flatresse
A near-threshold power supply aims to operate at the minimal energy point but suffers from high sensitivity to process, temperature and voltage variations. Adaptive voltage scaling (AVS) is a well-known strategy to adapt the power supply to die-to-die and temperature variations [1]. However, AVS needs dedicated power supplies with non-negligible overheads, e.g. extra die area and lower power converter efficiency, and comes with granularity limitations or complex fine-grain integration in the power mesh. SOI-based technologies offer unique features by biasing the wells below the transistors to tune the threshold voltage ($\mathrm{V}_{\mathrm{TH}}$). The well-known adaptive back-biasing (ABB) technique has already shown its capability to reduce power consumption and/or maintain operating frequency by compensating $\mathrm{V}_{\mathrm{TH}}$ variability according to process corners and temperature [1–5]. However, previously published ABB architectures provide a limited overview of how to integrate the ABB seamlessly in the digital design flow with industrial-grade qualification. We propose a reusable ABB-IP for any biased digital load, from 0.4 to 100mm², with low area and power overhead, e.g. 1.2% @ 2mm² and 0.4% @ 10mm², respectively. We properly quantify the gain in a mass-production context with a large statistical scope analysis across 316 measured dies from different split-wafer lots and from -40 to 125°C with a representative load (a Cortex M4F). Thanks to a 3V asymmetrical well amplitude swing, our ABB-IP brings up to 30% power reduction by decreasing the minimal power supply by 100mV, while maintaining the target operating frequency (50MHz) with a high yield. Distributed timing monitors (DTM) guarantee accurate timing monitoring of the biased digital load, while scalable well drivers adjust to the biased well area, enabling the ABB-IP genericity.
{"title":"A 0.021mm² PVT-Aware Digital-Flow-Compatible Adaptive Back-Biasing Regulator with Scalable Drivers Achieving 450% Frequency Boosting and 30% Power Reduction in 22nm FDSOI Technology","authors":"Yasser Moursy, T. Rosa, Lionel Jure, A. Quelen, S. Genevey, L. Pierrefeu, E. Collins, Joerg Winkler, Jonathan Park, G. Pillonnet, V. Huard, A. Bonzo, P. Flatresse","doi":"10.1109/ISSCC42613.2021.9365782","journal":{"name":"2021 IEEE International Solid-State Circuits Conference (ISSCC)"},"publicationDate":"2021-02-13"}
Pub Date: 2021-02-13 DOI: 10.1109/ISSCC42613.2021.9365840
Jihwan Kim, S. Kundu, A. Balankutty, Matthew Beach, Bong Chan Kim, Stephen T. Kim, Yutao Liu, Savyassachi Keshava Murthy, Priya Wali, Kai Yu, Hyung Seok Kim, Chuanchang Liu, Dongseok Shin, Ariel Cohen, Yongping Fan, F. O’Mahony
Wireline IOs have doubled per-lane data-rate every 3-4 years over the last two decades due to increasing demand in high-performance computing, networking/communications, and most recently from machine learning and AI. To address the need for higher throughput, this paper presents a 224Gb/s DAC-based PAM-4 TX with 8-tap FFE in 10nm CMOS technology. Doubling the data-rate from 112Gb/s while supporting the same PAM-4 modulation requires doubling the pad and internal net bandwidth and reducing the clocking jitter and circuit noise PSD by $2\times$ while maintaining swing, linearity, and reliability requirements. These are addressed by combining a low-noise on-chip LC-PLL, an inductive clock distribution network with jitter filtering, a two-stage 4:1 MUX with active peaking, and a group-delay-optimized output matching network for signal integrity.
{"title":"8.1 A 224Gb/s DAC-Based PAM-4 Transmitter with 8-Tap FFE in 10nm CMOS","authors":"Jihwan Kim, S. Kundu, A. Balankutty, Matthew Beach, Bong Chan Kim, Stephen T. Kim, Yutao Liu, Savyassachi Keshava Murthy, Priya Wali, Kai Yu, Hyung Seok Kim, Chuanchang Liu, Dongseok Shin, Ariel Cohen, Yongping Fan, F. O’Mahony","doi":"10.1109/ISSCC42613.2021.9365840","journal":{"name":"2021 IEEE International Solid-State Circuits Conference (ISSCC)"},"publicationDate":"2021-02-13"}
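The bandwidth and jitter pressure described in the transmitter abstract above follows from simple arithmetic: PAM-4 carries two bits per symbol, so doubling the bit rate at the same modulation halves the unit interval. The sketch below works this out; the helper names are illustrative and not taken from the paper.

```python
import math

def symbol_rate_baud(bit_rate_bps, levels):
    """Symbol rate for a PAM signal with the given number of amplitude levels."""
    bits_per_symbol = math.log2(levels)   # PAM-4 -> 2 bits per symbol
    return bit_rate_bps / bits_per_symbol

def unit_interval_ps(bit_rate_bps, levels):
    """Duration of one symbol (unit interval) in picoseconds."""
    return 1e12 / symbol_rate_baud(bit_rate_bps, levels)

for rate in (112e9, 224e9):
    baud = symbol_rate_baud(rate, 4)
    ui = unit_interval_ps(rate, 4)
    print(f"{rate / 1e9:.0f} Gb/s PAM-4: {baud / 1e9:.0f} GBaud, UI = {ui:.2f} ps")
```

At 224Gb/s the unit interval is roughly 8.9ps, half of the roughly 17.9ps available at 112Gb/s, so any fixed amount of jitter consumes twice the fraction of the symbol period.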
Pub Date: 2021-02-13 DOI: 10.1109/ISSCC42613.2021.9365931
Heyi Li, Z. Tan, Yuanxin Bao, Han Xiao, Hao Zhang, Kaixuan Du, Yihan Zhang, Le Ye, Ru Huang
Capacitive sensors are widely deployed in low-power IoT nodes, where power consumption is stringently limited by the batteries or energy harvesters. Energy-efficient interface circuits that convert sensing information into digital code are important for successful application of such sensors. Two humidity sensors based on a frequency-locking loop (FLL) [1] and a delta-sigma modulator (DSM) [2] achieve high resolution, but at the expense of high power consumption of 10.32μW and 15.6μW, respectively. The Zoom-based humidity sensor in [3] and capacitor-to-digital converters (CDC) in [3,4] exhibit a significantly improved dynamic range (DR). However, the DSM in the Zoom scheme typically entails large redundancy to cover the SAR conversion error due to noise or interference. Further, the OTA current budget is set to drive the maximum input capacitance, thus wasting power in the most common case, where the capacitance being driven is small.
{"title":"5.1 A 1.5μW 0.135pJ·%RH² CMOS Humidity Sensor Using Adaptive Range-Shift Zoom CDC and Power-Aware Floating Inverter Amplifier Array","authors":"Heyi Li, Z. Tan, Yuanxin Bao, Han Xiao, Hao Zhang, Kaixuan Du, Yihan Zhang, Le Ye, Ru Huang","doi":"10.1109/ISSCC42613.2021.9365931","journal":{"name":"2021 IEEE International Solid-State Circuits Conference (ISSCC)"},"publicationDate":"2021-02-13"}
Pub Date: 2021-02-13 DOI: 10.1109/ISSCC42613.2021.9365773
A. Matamura, N. Nishimura, Preston Birdsong, A. Bandyopadhyay, Adam Spirer, M. Markova, Shaolong Liu
True Wireless Stereo/True Wireless Active-Noise-Canceling (ANC) headphones require low-latency digital-input headphone drivers that consume the lowest possible power to maximize battery life while providing high-fidelity audio playback. Typical headphone drivers use Class-A/AB topologies, and to improve power efficiency, Class-G/H drivers with a ground-center operation are used at the expense of using external components to decouple the required extra supply rails [1–3]. Closed-loop Class-D speaker drivers have become popular [4–6], and filter-less configurations are common in the 1-to-3W output range [6]. Some of the challenges associated with a Class-D driver for headphone applications are to maintain high linearity and SNR for low-voltage supply rails while reducing quiescent power. This paper describes a digital input, 93% efficient, filter-less Class-D amplifier achieving 113dB SNR and -93dB THD+N while operating from a single 1.8V supply.
{"title":"31.1 An 82mW ΔΣ-Based Filter-Less Class-D Headphone Amplifier with -93dB THD+N, 113dB SNR and 93% Efficiency","authors":"A. Matamura, N. Nishimura, Preston Birdsong, A. Bandyopadhyay, Adam Spirer, M. Markova, Shaolong Liu","doi":"10.1109/ISSCC42613.2021.9365773","journal":{"name":"2021 IEEE International Solid-State Circuits Conference (ISSCC)"},"publicationDate":"2021-02-13"}
Pub Date: 2021-02-13 DOI: 10.1109/ISSCC42613.2021.9365987
Hossein Jalili, O. Momeni
High-resolution and fast imaging/sensing at THz requires highly directive steerable beams for scanning the object. A coherent array of coupled sources could improve the radiated power but requires slow mechanical scanning of the object [1]. Phased-array systems could use beam steering to scan the object at a higher speed, but in both coherent-array and phased-array systems, large array sizes with high power consumption are needed to generate a highly directive and narrow beam for high image resolution [2–4]. Although a Si lens can be used to increase directivity in a phased array, the steering capability is significantly diminished [5]. Therefore, arrays of non-coherent sources are used with a Si lens so that each source illuminates a different part of the object with high directivity [6]. The firing angle of each source is determined by the ratio of its displacement $(L_{dis})$ from the lens center to the lens radius $(R_{lens})$, as shown in Fig. 23.2.1 [1]. However, this type of source can only illuminate the object in discrete steps determined by beam spacing, which in turn is limited by the inevitable distance between adjacent sources on the chip. Being constrained to independent single pixels for illumination leads to loss of resolution and blind spots between the neighboring beams (Fig. 23.2.1). A larger lens can improve the resolution by reducing the beam spacing but at the cost of a smaller total scanning range.
{"title":"23.2 A 436-to-467GHz Lens-Integrated Reconfigurable Radiating Source with Continuous 2D Steering and Multi-Beam Operations in 65nm CMOS","authors":"Hossein Jalili, O. Momeni","doi":"10.1109/ISSCC42613.2021.9365987","journal":{"name":"2021 IEEE International Solid-State Circuits Conference (ISSCC)"},"publicationDate":"2021-02-13"}
Pub Date: 2021-02-13 DOI: 10.1109/ISSCC42613.2021.9365839
M. Lefebvre, Ludovic Moreau, R. Dekimpe, D. Bol
Mixed-signal vision chips are becoming increasingly popular for low-power embedded computer vision applications on smartphones, wearables and IoT nodes, as they meet stringent power and area constraints while maintaining a sufficient level of accuracy for low- to medium-level image processing tasks. On the one hand, in-sensor processing [1, 2] enables massively parallel operation but relies on pixel-level processing elements that degrade the pixel pitch and restrict the convolutional receptive field to neighboring pixels [1], precluding multi-scale operation. On the other hand, near-sensor processing [3–5] can operate at multiple scales by pixel downsampling [3] or binning [4] but entails significant power and area overhead as an analog memory is required to store pixel values awaiting processing. In addition, previous near-sensor processing SoCs are generally application-specific and thus suffer from limited versatility. In this paper, we present a 65nm QQVGA convolutional imager SoC codenamed SleepSpotter capable of versatile feature extraction and region-of-interest (RoI) detection based on in-sensor current-domain MAC operations. It operates at 6 different scales, features programmable filter size (F), stride (S), and ternary filter weights (1.5b). It reaches a minimum energy of 2.5pJ/pixel•frame•filter and a peak efficiency of 3.6TOPS/W, with 29% pixel area overhead for enabling the convolution and without the need for an analog memory.
{"title":"A 0.2-to-3.6TOPS/W Programmable Convolutional Imager SoC with In-Sensor Current-Domain Ternary-Weighted MAC Operations for Feature Extraction and Region-of-Interest Detection","authors":"M. Lefebvre, Ludovic Moreau, R. Dekimpe, D. Bol","doi":"10.1109/ISSCC42613.2021.9365839","journal":{"name":"2021 IEEE International Solid-State Circuits Conference (ISSCC)"},"publicationDate":"2021-02-13"}
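The programmable parameters named in the imager abstract above (filter size F, stride S, and ternary weights) can be made concrete with a purely digital reference model. This is only a behavioral sketch: the chip performs the MAC in the current domain inside the pixel array, which is not modeled here, and the function name, the 4x4 test image, and the example filter are all hypothetical.

```python
def ternary_conv2d(image, weights, stride):
    """Valid (no padding) 2D convolution with weights restricted to -1, 0, +1.

    image   : list of rows of pixel values
    weights : F x F list of rows, each weight in {-1, 0, +1}
    stride  : step S between successive filter positions
    """
    f = len(weights)
    if any(w not in (-1, 0, 1) for row in weights for w in row):
        raise ValueError("weights must be ternary (-1, 0, +1)")
    out_rows = (len(image) - f) // stride + 1
    out_cols = (len(image[0]) - f) // stride + 1
    out = []
    for r in range(out_rows):
        out_row = []
        for c in range(out_cols):
            acc = 0
            for i in range(f):
                for j in range(f):
                    # With ternary weights each MAC term is an add, a subtract, or a skip.
                    acc += weights[i][j] * image[r * stride + i][c * stride + j]
            out_row.append(acc)
        out.append(out_row)
    return out

image = [
    [1, 1, 5, 5],
    [1, 1, 5, 5],
    [2, 6, 6, 6],
    [2, 6, 6, 6],
]
vertical_edge = [[1, -1], [1, -1]]   # F = 2, responds to left/right intensity steps
feature_map = ternary_conv2d(image, vertical_edge, stride=2)
print(feature_map)
```

A three-valued weight carries log2(3), or about 1.58 bits, of information, which is consistent with the "1.5b" label used in the abstract.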
Pub Date: 2021-02-13 DOI: 10.1109/ISSCC42613.2021.9366023
Wei Shi, Jiaxin Liu, Abhishek Mukherjee, Xiangxing Yang, Xiyuan Tang, Linxiao Shen, Wenda Zhao, Nan Sun
A high-order CTDSM can provide high resolution with a small OSR, but its design suffers from a few challenges. First, it requires a large number of OTAs [1]. This increases the design complexity and power. In addition, each OTA contributes extra phase delay, whose reduction requires increasing the OTA BW, further increasing power. Second, it is harder to stabilize, especially considering PVT variations. For example, a slight change in the RC time constant can cause instability. One way to address these issues is to use a passive discrete-time (DT) noise-shaping (NS) SAR ADC as the quantizer [2], [3]. In [2], a 3rd-order DSM is built with only 1 OTA and a 2nd-order NS-SAR. Since it is set by device ratios, the NTF of an NS-SAR is PVT-robust. Hence, the 3rd-order DSM stability is equivalent to that of a 1st-order CTDSM, which is easy to ensure. Nevertheless, because its CT front-end provides only 1st-order shaping, it cannot provide sufficient suppression of noise coming from later stages, limiting its SNDR to 70dB. Reference [3] increases the CT front-end order to 2 by using a single-amplifier-biquad (SAB), but its NS-SAR is only 1st-order with a mild zero at 0.5, which limits its achievable resolution. Overall, both [2] and [3] achieve only 3rd-order shaping with a Schreier FoM limited to 171dB.
{"title":"10.4 A 3.7mW 12.5MHz 81dB-SNDR 4th-Order CTDSM with Single-OTA and 2nd-Order NS-SAR","authors":"Wei Shi, Jiaxin Liu, Abhishek Mukherjee, Xiangxing Yang, Xiyuan Tang, Linxiao Shen, Wenda Zhao, Nan Sun","doi":"10.1109/ISSCC42613.2021.9366023","journal":{"name":"2021 IEEE International Solid-State Circuits Conference (ISSCC)"},"publicationDate":"2021-02-13"}
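The Schreier figure of merit cited in the CTDSM abstract above is conventionally defined as SNDR plus ten times the base-10 logarithm of bandwidth over power. The sketch below applies that definition to the headline numbers in the entry's title, assuming that the 12.5MHz figure is the signal bandwidth and 3.7mW is the total power; the result is a back-of-the-envelope estimate under those assumptions, not a value quoted from the paper.

```python
import math

def schreier_fom_db(sndr_db, bandwidth_hz, power_w):
    """Schreier FoM in dB: SNDR + 10*log10(bandwidth / power)."""
    return sndr_db + 10 * math.log10(bandwidth_hz / power_w)

# Headline numbers from the title (interpretation of 12.5MHz as BW is an assumption).
fom = schreier_fom_db(sndr_db=81.0, bandwidth_hz=12.5e6, power_w=3.7e-3)
print(f"Schreier FoM = {fom:.1f} dB")
```

Under these assumptions the estimate comes out a little above 176dB, several dB higher than the 171dB limit the abstract attributes to the prior 3rd-order designs [2] and [3].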