Pub Date: 2021-02-13; DOI: 10.1109/ISSCC42613.2021.9366036
Haikun Jia, W. Deng, Pingda Guan, Zhihua Wang, B. Chi
The recent development of 5th-generation (5G) communication systems has set increasingly strict requirements on the spectral purity of millimeter-wave (mm-wave) local oscillators (LOs). Low phase noise is crucial to enable advanced modulation formats for high data rates. Much effort has been made to improve the phase noise performance of mm-wave LOs. A lower-frequency voltage-controlled oscillator (VCO) together with a frequency multiplier can lower the phase noise [1]; however, the high-order harmonic components in VCOs are usually very weak, which requires additional power-consuming mm-wave amplification stages to satisfy the LO swing requirement. For single mm-wave fundamental VCOs, the minimum achievable phase noise is bounded by the smallest realizable inductor that displays a high Q factor. To avoid the “small inductor” problem, N oscillators with relatively large inductors can be coupled together to improve the phase noise by $10\log_{10}(N)$ dB [2-5]. The authors of [2] presented a quad-core bipolar VCO working around 15GHz as shown in Fig. 20.3.1 (Left), where four one-turn inductors are star-connected with the active cores placed in the middle. Resistors (Rc) are added to avoid undesired multi-tone concurrent oscillations. However, the four one-turn inductors still suffer from the Q-factor drop when the inductance decreases, thus limiting the highest achievable oscillation frequency. Besides, $\mathrm{V}_{DD}$ at the inductor central taps and $\mathrm{V}_{SS}$ at the tail current source are far from each other, making the $\mathrm{V}_{DD}$-$\mathrm{V}_{SS}$ current return path long. This path has to be carefully modeled in simulations, especially in the mm-wave frequency range, where the return path inductance is comparable to the tank inductance. 
Instead of the star-connected topology, authors in [3] presented a circular-connected quad-core VCO working close to 30GHz, where the inductors are arranged in a circular topology as shown in Fig. 20.3.1 (Middle). The destructive coupling between the inner edges inside a small inductor is eliminated. Therefore, the minimal realizable inductance is further reduced while keeping a high Q factor. The central taps are connected by narrow metal traces to avoid latching and mode ambiguity. The VCO adopts a CMOS configuration, which limits the highest operating frequency. It would be difficult for this topology to be adopted in NMOS-only VCOs because the central taps have to be resistively isolated to suppress unwanted modes; therefore, they cannot be connected to the AC-ground power supply simultaneously as required by the NMOS-only configuration. Due to the lack of harmonic impedance control in the circular inductors, extra tail filtering transformers are added to improve the phase noise.
Title: A 60GHz 186.5dBc/Hz FoM Quad-Core Fundamental VCO Using Circular Triple-Coupled Transformer with No Mode Ambiguity in 65nm CMOS (2021 IEEE International Solid-State Circuits Conference, ISSCC)
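The $10\log_{10}(N)$ figure quoted in the abstract above can be tabulated with a minimal sketch. It assumes N identical, ideally coupled cores; the function name `coupled_phase_noise_gain_db` is hypothetical and not taken from the paper.

```python
import math


def coupled_phase_noise_gain_db(n_cores: int) -> float:
    """Ideal phase-noise improvement (in dB) from coupling n_cores identical oscillators.

    Assumes perfectly matched cores and ideal coupling, so the improvement
    is exactly 10*log10(N), as stated in the abstract.
    """
    if n_cores < 1:
        raise ValueError("n_cores must be >= 1")
    return 10.0 * math.log10(n_cores)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"N={n}: {coupled_phase_noise_gain_db(n):.2f} dB improvement")
```

Under these assumptions a quad-core VCO gains about 6 dB over a single core, and each further doubling of the core count adds roughly 3 dB.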
Pub Date: 2021-02-13; DOI: 10.1109/ISSCC42613.2021.9365950
Jiannan Huang, P. Mercier
Motion and stimulation artifacts encountered in wearable sensors present difficult dynamic range (DR) and linearity challenges: AFEs need to be able to resolve $\mu\mathrm{V}$-level signals in the presence of artifacts up to 100s of mV in amplitude while maintaining linearity without saturation, such that the signal of interest can be readily recovered during post-processing. Since it is not possible to build an amplifier with appreciable gain and linearity for $>100$ mV inputs under a $<1\,\mathrm{V}$ SoC-compatible supply, most high-DR AFEs instead incorporate an LNA into a $\Delta\Sigma$-based ADC-direct architecture [1]–[3]. However, as many emerging wearable devices desire single-chip integration in scaled CMOS for size and digital performance considerations, conventional $\Delta\Sigma$Ms, which rely on voltage-domain building blocks, suffer from reduced intrinsic gain and headroom. Instead, time-domain quantization through VCO-based AFEs benefits from scaled CMOS and offers intrinsic $1^{st}$-order noise shaping. However, the non-linear V-F conversion of conventional VCO-based AFEs makes achieving a large and linear DR difficult [1]. To address this, [3] adopts a differential pulse code modulation (DPCM) technique that enables the VCO to process only a small prediction error, $V_{ERR}$, by subtracting from $V_{IN}$ a digital predictor value fed through a DAC (Fig. 28.1.1 top). Maximal linearity would be achieved if the predictor were perfect, resulting in $V_{ERR} \approx 0$; however, this requires a high-performance and power-expensive DAC. Therefore, [3] truncates the predictor’s output, reducing the DAC requirements to 9b, but adding truncation error, $E_T$. If the gains of paths P1 and P2 are made equal, which is enforced in [3] via a gain error calibration (GEC) circuit, $E_T$ will ideally cancel at the output. 
However, it is not possible to achieve perfect $E_T$ cancellation, and any residual $E_T$ will degrade SQNR, limiting the extent to which truncation can be used to relax the DAC’s resolution. In addition, GEC itself introduces power overhead.
Title: A Distortion-Free VCO-Based Sensor-to-Digital Front-End Achieving 178.9dB FoM and 128dB SFDR with a Calibration-Free Differential Pulse-Code Modulation Technique (2021 IEEE International Solid-State Circuits Conference, ISSCC)
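The gain-matching argument in the abstract above can be illustrated with a toy numerical model. This is only a sketch under stated assumptions: the two paths are modeled as an analog residue path (gain `gain_p1`) and a digital predictor path (gain `gain_p2`), which is a simplification and not the actual circuit of [3]; the function `dpcm_output` and all values are hypothetical.

```python
def dpcm_output(v_in, predictor, bits_dropped, gain_p1=1.0, gain_p2=1.0):
    """Toy DPCM reconstruction with a truncated predictor (values in LSBs).

    Assumed model (not the circuit of [3]):
      p_trunc = predictor with its lowest `bits_dropped` bits removed
      e_t     = predictor - p_trunc        (truncation error)
      v_err   = v_in - p_trunc             (residue seen by the VCO)
      out     = gain_p1 * v_err + gain_p2 * p_trunc
    """
    step = 1 << bits_dropped
    p_trunc = (predictor // step) * step   # value a coarser DAC can represent
    e_t = predictor - p_trunc              # truncation error E_T
    v_err = v_in - p_trunc                 # small residue processed by the VCO
    out = gain_p1 * v_err + gain_p2 * p_trunc
    return out, v_err, e_t


if __name__ == "__main__":
    # Matched gains: the truncated predictor cancels and the input is recovered.
    print(dpcm_output(1000.0, 1003, 4))
    # 1% mismatch: a residual proportional to the truncated predictor remains.
    print(dpcm_output(1000.0, 1003, 4, gain_p2=1.01))
```

In this simplified model, matched gains return the input exactly, while a mismatch leaves a residual of `(gain_p2 - gain_p1) * p_trunc`, which is why the abstract ties truncation depth to the accuracy of the gain calibration.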
Pub Date: 2021-02-13; DOI: 10.1109/ISSCC42613.2021.9365856
Maged ElAnsary, Jianxiong Xu, J. S. Filho, Gairik Dutta, L. Long, Aly Shoukry, Camilo Tejeiro, Chenxi Tang, Enver G. Kilinc, Jaimin Joshi, P. Sabetian, Samantha Unger, J. Zariffa, P. Yoo, R. Genov
The peripheral nervous system (PNS) enables communication between the central nervous system and various organs, for example by conveying sensory information and relaying motor commands. Electrical stimulation of peripheral nerves has been shown to be effective in treating major intractable disorders ranging from autoimmune disorders to chronic pain. It acts on specific nerves and avoids the significant side effects of most drugs. Closed-loop PNS neurostimulators offer the additional benefits of personalization and optimality of the treatment. Such medical devices infer physiological function from measurable nerve action potentials and deliver custom-tailored electrical stimulation to elicit desired clinical outcomes.
Title: 28.8 Multi-Modal Peripheral Nerve Active Probe and Microstimulator with On-Chip Dual-Coil Power/Data Transmission and 64 2nd-Order Opamp-Less ΔΣ ADCs (2021 IEEE International Solid-State Circuits Conference, ISSCC)
Pub Date: 2021-02-13; DOI: 10.1109/ISSCC42613.2021.9365761
Hao Guo, Yong Chen, Pui-in Mak, R. Martins
Since 2001 [1], LC VCOs have been demonstrating significant improvements of figure-of-merit (FoM) and $1/f^{3}$ phase noise (PN) corner [2–5] by exploring common-mode (CM) resonance at twice the oscillation frequency ($2F_{OSC}$). In addition, for area reduction, the shaping of the impulse sensitivity function (ISF) has evolved from explicit with two coils [1] to implicit with one coil [2]. Yet, as depicted in Fig. 20.1.1, the latter suffers from large CM magnetic-flux cancellation, resulting in a much lower CM impedance $|Z_{CM}|$ that is $\sim 0.64$ of its differential-mode (DM) impedance $|Z_{DM}|$. The VCO in [3] achieves a high $\mathrm{FoM}_{@10\mathrm{MHz}}$ up to 191.4 dBc/Hz by boosting $|Z_{CM}|$ at $2F_{OSC}$ and $|Z_{DM}|$ at $3F_{OSC}$. Yet, to uphold optimal performance over the tuning range (TR), the VCO in [3] still requires manual harmonic tuning for aligning the $1^{st}$-to-$2^{nd}$ and $1^{st}$-to-$3^{rd}$ harmonic resonances. This denotes a narrowband effect. For the VCO in [4], which features a four-winding transformer with no harmonic tuning, there is a large variation of $\mathrm{FoM}_{@10\mathrm{MHz}}$ (190.7 to 196.5 dBc/Hz) and $1/f^{3}$ PN corner (60 to 600kHz) across the TR.
Title: A 5.0-to-6.36GHz Wideband-Harmonic-Shaping VCO Achieving 196.9dBc/Hz Peak FoM and 90-to-180kHz 1/f³ PN Corner Without Harmonic Tuning (2021 IEEE International Solid-State Circuits Conference, ISSCC)
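The FoM values quoted in the abstract above are presumably based on the commonly used oscillator figure of merit, which normalizes phase noise to oscillation frequency, offset frequency, and DC power. The sketch below assumes that standard definition; the abstract does not spell it out, and the input numbers are purely illustrative, not measurements from any of the cited VCOs.

```python
import math


def vco_fom_dbc_hz(pn_dbc_hz, f_osc_hz, f_offset_hz, p_dc_w):
    """Standard oscillator FoM (reported as a positive number in dBc/Hz).

    FoM = -PN(df) + 20*log10(f_osc / df) - 10*log10(P_DC / 1 mW)
    """
    return (
        -pn_dbc_hz
        + 20.0 * math.log10(f_osc_hz / f_offset_hz)
        - 10.0 * math.log10(p_dc_w / 1e-3)
    )


if __name__ == "__main__":
    # Hypothetical operating point: -140 dBc/Hz at 10 MHz offset, 5 GHz, 10 mW.
    fom = vco_fom_dbc_hz(-140.0, 5e9, 10e6, 10e-3)
    print(f"FoM@10MHz = {fom:.2f} dBc/Hz")
```

Because the offset frequency enters the formula, a FoM quoted at 10MHz offset reflects the thermal-noise region, while the $1/f^{3}$ corner is reported separately.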
Pub Date: 2021-02-13; DOI: 10.1109/ISSCC42613.2021.9365740
T. Hirata, Hironobu Murata, Hideaki Matsuda, Yojiro Tezuka, Shiro Tsunai
In recent developments, image sensors are no longer simply a means for collecting optical signals, but rather, are increasingly expected to serve as intelligent systems with surrounding configurations. Coded exposure (CE) is one of the methods applied in intelligent systems approaches, and various functions can be realized by the selection of the integration variable in the plenoptic function. High DR can be realized if the integration variable is time. A variety of means to achieve high DR have been proposed in the literature, for example, a method that provides a plurality of detection capacitors (LOFIC, [1]) or a method of preventing saturation by adding low-sensitivity pixels [2]. These methods often require an enlarged pixel size. Alternatively, high-speed readout like an array parallel stacked structure [3] is useful for integrating multiple frames to realize high DR. However, this leads to an increase in noise and needs faster readout to reduce motion artifacts. In order to mitigate the adverse effects, a method has been proposed in which a pixel array is divided into a plurality of blocks and the signal integration time of each block is individually controlled [4]. Another method was described in which CE was demonstrated by using pixel level control of the exposure time [5]. However, in these methods, it was necessary to arrange the readout path and control circuitry within the same plane because these are non-stacked sensors, so the pixel size was relatively large and high resolution was difficult to realize. To overcome the above problems, we report a sensor that can simultaneously achieve 4K×4K resolution and 1000fps high-speed readout. Using a stacked structure, we demonstrate coded exposure capability by individually controlling exposure time for each block of pixels.
Title: 7.8 A 1-inch 17Mpixel 1000fps Block-Controlled Coded-Exposure Back-Illuminated Stacked CMOS Image Sensor for Computational Imaging and Adaptive Dynamic Range Control (2021 IEEE International Solid-State Circuits Conference, ISSCC)
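The idea of extending dynamic range by controlling exposure time per block can be sketched as follows. This is a simplified illustration, not the sensor's actual signal chain: the linear pixel model, the `full_well` saturation level, and the function names are all assumptions made for the example.

```python
def capture(radiance, exposure, full_well=1000.0):
    """Linear pixel model: signal grows with exposure time and clips at full_well."""
    return min(radiance * exposure, full_well)


def reconstruct_blocks(radiance_blocks, exposure_blocks, full_well=1000.0):
    """Capture each block with its own exposure time, then normalize by that time.

    Both arguments are 2D lists with one entry per block (hypothetical layout).
    """
    estimate = []
    for rad_row, exp_row in zip(radiance_blocks, exposure_blocks):
        estimate.append(
            [capture(r, t, full_well) / t for r, t in zip(rad_row, exp_row)]
        )
    return estimate


if __name__ == "__main__":
    scene = [[50.0, 5000.0]]        # one dark block, one bright block
    uniform = [[1.0, 1.0]]          # same exposure everywhere
    per_block = [[1.0, 0.125]]      # shorter exposure for the bright block
    print(reconstruct_blocks(scene, uniform))    # bright block clips
    print(reconstruct_blocks(scene, per_block))  # both blocks recovered
```

With a uniform exposure the bright block saturates and its radiance is underestimated; shortening the exposure only for that block keeps it below saturation, so dividing by the exposure time recovers the scene value.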
Pub Date: 2021-02-13; DOI: 10.1109/ISSCC42613.2021.9366012
Minsoo Choi, Zhongkai Wang, Kyoungtae Lee, Kwanseo Park, Zhaokai Liu, Ayan Biswas, Jaeduk Han, E. Alon
The ever-expanding demand for ultra-high-speed interconnects has driven the development of wireline TXs operating at >100Gb/s per lane [1]–[4]. This paper presents a PAM-4 TX achieving 200Gb/s with improved output bandwidth and output swing by minimizing the driver capacitance with pull-up current sources, multiplexing with flexible clock timing control, and employing a fully reconfigurable 5-tap FFE architecture.
Title: 8 An Output-Bandwidth-Optimized 200Gb/s PAM-4 100Gb/s NRZ Transmitter with 5-Tap FFE in 28nm CMOS (2021 IEEE International Solid-State Circuits Conference, ISSCC)
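As a rough illustration of what a 5-tap feed-forward equalizer (FFE) does to a PAM-4 symbol stream, the sketch below applies a generic FIR filter with pre- and post-cursor taps. The tap weights, the position of the main cursor, and the Gray mapping are hypothetical choices for the example and are not taken from the paper.

```python
# Gray-coded PAM-4 mapping (a common convention, assumed here).
PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def bits_to_pam4(bits):
    """Map an even-length bit sequence to PAM-4 symbols, two bits per symbol."""
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def ffe(symbols, taps, main_index):
    """Apply an FIR feed-forward equalizer; taps[main_index] is the main cursor.

    Taps before main_index weight upcoming symbols (pre-cursor),
    taps after it weight earlier symbols (post-cursor).
    Symbols outside the sequence are treated as zero.
    """
    n = len(symbols)
    out = []
    for k in range(n):
        acc = 0.0
        for i, coeff in enumerate(taps):
            j = k - (i - main_index)
            if 0 <= j < n:
                acc += coeff * symbols[j]
        out.append(acc)
    return out


if __name__ == "__main__":
    tx_symbols = bits_to_pam4([1, 0, 0, 0, 1, 0, 0, 1])
    taps = [0.0, -0.1, 0.8, -0.1, 0.0]  # hypothetical weights, sum of |taps| = 1
    print(tx_symbols)
    print(ffe(tx_symbols, taps, main_index=2))
```

Keeping the sum of absolute tap weights at 1 models a driver with a fixed peak swing, where pre-emphasis trades low-frequency amplitude for sharper transitions. A reconfigurable FFE would let the main cursor and tap weights be reassigned.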
Pub Date: 2021-02-13; DOI: 10.1109/ISSCC42613.2021.9365744
Daniel Gruber, M. Clara, R. Sanchez-Perez, Yu-shan Wang, C. Duller, Gerald Rauter, Patrick Torta, K. Azadet
Future multi-band software-defined-radio base-stations for digital beamforming and massive MIMO applications depend heavily on the availability of highly linear and compact data converters with good power efficiency, while at the same time offering multi-GHz signal-bandwidth at sampling rates well in excess of 10GS/s. Wideband RF-sampling D/A-converters have traditionally been implemented in current-steering architectures, mostly with extensive calibration infrastructure [1] –[3]. The transistor stack required to achieve the necessary static and dynamic output impedance for the code-steered current sources leads to limited supply voltage scalability, while the capacitive self-loading by the current-source array makes true wideband matching at the RF-output inherently difficult. Capacitive digital-to-analog converters (C-DAC) have been widely used as RF DAC or switched-capacitor power amplifiers. Up to now digital transmitters have used C-DACs with inherent mixing functionality in polar or IQ systems for synthesis of high-power RF signals of moderate bandwidth of up to 160MHz [4] –[6]. This work uses a capacitive DAC as a direct RF-sampling DAC with moderate output power level for direct signal synthesis over a bandwidth from 0.5GHz up to at least 8GHz.
Title: 10.6 A 12b 16GS/s RF-Sampling Capacitive DAC for Multi-Band Soft-Radio Base-Station Applications with On-Chip Transmission-Line Matching Network in 16nm FinFET (2021 IEEE International Solid-State Circuits Conference, ISSCC)
Pub Date: 2021-02-13; DOI: 10.1109/ISSCC42613.2021.9365809
Jae-Woo Park, Doogon Kim, Sunghwa Ok, Jaebeom Park, Taeheui Kwon, Hyun-Seob Lee, Sungmook Lim, Sun-Young Jung, Hyeong-Jin Choi, Taikyu Kang, Gwan Park, Chulwoo Yang, Jeong-Gil Choi, Gwihan Ko, Jae-Hyeon Shin, Ingon Yang, Junghoon Nam, H. Sohn, Seok-in Hong, Yohan Jeong, Sung-Wook Choi, Changwoon Choi, Hyun-Soo Shin, Ju-Young Lim, Dongkyu Youn, Sanghyuk Nam, Juyeab Lee, M. Ahn, Hoseok Lee, Seungpil Lee, Jongmin Park, Kichang Gwon, Woopyo Jeong, Jungdal Choi, Jinkook Kim, K. Jin
With the explosive growth of data generated by various applications, one of the most important topics of the current era is increasing storage capacity. The evolution from 2D planar NAND to 3D NAND enables the development of high-density storage by increasing the number of stacked word-lines (WLs) in a smaller footprint. The industry has moved beyond 96-stacked-WL and achieved a 128-stacked 3D NAND. A 128-stacked 3b/cell 3D NAND with a density of 7.8Gb/mm² was reported recently, based on a peripheral circuit under cell array (PUC) structure [1]. Nevertheless, due to the constant demand for increased density, 3D NAND faces the following challenges [2,3]: (1) a reduced PUC area due to an increasing WL stack, (2) increased load due to a higher number of stacks and a reduced spacing between WLs, (3) rising WL-channel capacitance due to an increasing number of strings, and (4) variation in the RC delay between WLs due to the non-uniformity of plug critical dimension (CD). Not only do these problems limit the density improvement of 3D NAND, but they also increase the WL rise time, which degrades read and write performance. This paper proposes the following techniques to overcome these challenges: (1) a 12-stage page buffer (PB) with a one-to-one (1:1) PBUS (PB-to-cache connection bus), (2) a variable stage and frequency charge pump with a boosted local pump, (3) center X-decoder (XDEC) and half-plane activation, (4) an unselected string boosting scheme, and (5) adaptive WL overdrive (OVD). By applying these techniques, we achieved a density of 10.8Gb/mm² in a 176-stacked 3D NAND using 3b/cell.
Title: A 176-Stacked 512Gb 3b/Cell 3D-NAND Flash with 10.8Gb/mm² Density with a Peripheral Circuit Under Cell Array Architecture (2021 IEEE International Solid-State Circuits Conference, ISSCC)
Pub Date: 2021-02-13 DOI: 10.1109/ISSCC42613.2021.9365830
E. Garay, D. Munzer, Hua Wang
The mm-wave spectrum is opening a new opportunity for TRx systems to operate at high-Gb/s data-rates. However, this opportunity also imposes stringent requirements on power amplifiers (PAs) in terms of efficiency and linearity. To date, all PA designs focus on increasing the peak/power-back-off (PBO) PAE and output power (max $\mathrm{P}_{out}$) by either presenting multi-harmonic terminations or improving on existing topologies, such as stacked, outphasing, and Doherty PAs [1–3]. However, in highly scaled silicon processes with low supply voltages, these reported techniques see diminishing returns on PAE and $\mathrm{P}_{out}$, since the transistor knee voltage ($\mathrm{V}_{knee}$) becomes a significant portion of the supply voltage [5]. Moreover, the supply voltage is often reduced further in practical deployment to ensure device reliability. This is especially relevant for mm-wave array operations, where array element couplings result in substantial antenna impedance mismatches and undesired large PA voltage swings [6]. Although the reported techniques have improved overall PA efficiency at mm-wave, they are fundamentally incapable of surpassing the theoretical PA core efficiency at the same conduction angle (e.g., a Class-B common-source (CS) PA) without resorting to device switching or harmonic shaping.
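The knee-voltage argument above can be made concrete with the standard first-order textbook model of a reduced-conduction-angle PA. The sketch below is not from the paper: it uses the classical ideal drain-efficiency expression as a function of conduction angle and assumes the usable voltage swing shrinks from $\mathrm{V}_{DD}$ to $\mathrm{V}_{DD}-\mathrm{V}_{knee}$, so that efficiency scales by $(1-\mathrm{V}_{knee}/\mathrm{V}_{DD})$. The supply and knee voltages are hypothetical values chosen only for illustration.

```python
import math


def ideal_drain_efficiency(conduction_angle_rad: float) -> float:
    """Ideal drain efficiency for a given conduction angle (textbook model).

    2*pi corresponds to Class-A (50%), pi to Class-B (pi/4, about 78.5%).
    """
    theta = conduction_angle_rad
    numerator = theta - math.sin(theta)
    denominator = 4.0 * (math.sin(theta / 2) - (theta / 2) * math.cos(theta / 2))
    return numerator / denominator


def efficiency_with_knee(conduction_angle_rad: float, vdd: float, vknee: float) -> float:
    """First-order knee-voltage penalty: usable swing is (VDD - Vknee)."""
    return ideal_drain_efficiency(conduction_angle_rad) * (1.0 - vknee / vdd)


# Hypothetical supply and knee voltages, for illustration only.
VDD = 1.0
VKNEE = 0.2

class_b_ideal = ideal_drain_efficiency(math.pi)
class_b_knee = efficiency_with_knee(math.pi, VDD, VKNEE)

print(f"ideal Class-B drain efficiency: {class_b_ideal:.1%}")
print(f"with Vknee/VDD = {VKNEE / VDD:.0%}: {class_b_knee:.1%}")
```

In this simplified model, a knee voltage equal to 20% of the supply removes 20% of the ideal efficiency, and the penalty grows as the supply is lowered further for reliability.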
{"title":"26.3 A mm-Wave Power Amplifier for 5G Communication Using a Dual-Drive Topology Exhibiting a Maximum PAE of 50% and Maximum DE of 60% at 30GHz","authors":"E. Garay, D. Munzer, Hua Wang","doi":"10.1109/ISSCC42613.2021.9365830","journal":"2021 IEEE International Solid-State Circuits Conference (ISSCC)","publicationDate":"2021-02-13"}
Pub Date: 2021-02-13 DOI: 10.1109/ISSCC42613.2021.9365984
Jian-Wei Su, Yen-Chi Chou, Ruhui Liu, Ta-Wei Liu, Pei-Jung Lu, P. Wu, Yen-Lin Chung, Li-Yang Hung, Jin-Sheng Ren, Tianlong Pan, Sih-Han Li, Shih-Chieh Chang, S. Sheu, W. Lo, Chih-I Wu, Xin Si, C. Lo, Ren-Shuo Liu, C. Hsieh, K. Tang, Meng-Fan Chang
Recent SRAM-based computation-in-memory (CIM) macros enable mid-to-high precision multiply-and-accumulate (MAC) operations with improved energy efficiency using ultra-small/small-capacity (0.4-8KB) memory devices. Advanced CIM-based edge-AI chips, however, favor multiple mid/large-capacity SRAM-CIM macros with high input (IN) and weight (W) precision, both to reduce the frequency of data reloads from external DRAM and to avoid the need for additional SRAM buffers or ultra-large on-chip weight buffers. Enlarging memory capacity and throughput in turn increases the delay parasitics on word-lines (WLs) and bit-lines (BLs) and the number of parallel computing elements, resulting in longer compute latency (tAC), lower energy efficiency (EF), degraded signal margin, and larger fluctuations in power consumption across data patterns (see Fig. 16.3.1). Recent SRAM-CIM macros tend not to use in-lab SRAM cells with a logic-based layout, in favor of foundry-provided compact-layout 8T [2], [3], [5] or 6T cells with local-computing cells (LCCs) [4], [6], to reduce the cell-array area and facilitate manufacturing. This paper presents an SRAM-CIM structure using (1) a segmented-BL charge-sharing (SBCS) scheme for MAC operations, with low energy consumption and a consistently high signal margin across MAC values (MACV); (2) a new LCC, called a source-injection local-multiplication cell (SILMC), to support the SBCS scheme with a consistent signal margin against transistor process variation; and (3) a prioritized-hybrid-ADC (Ph-ADC) to achieve a small area and power overhead for analog readout. A 28nm 384kb SRAM-CIM macro was fabricated using a foundry compact-6T cell, with support for MAC operations with 16 accumulations of 8b inputs and 8b weights and near-full-precision output (20b). This macro achieves a 7.2ns tAC and a 22.75TOPS/W EF for 8b-MAC operations, with an FoM (IN-precision × W-precision × output-ratio × output-channel × EF/tAC) 6× higher than prior work.
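The FoM expression quoted above can be evaluated directly. The following is a minimal sketch under stated assumptions, not the authors' computation: the abstract does not define output-ratio or give the output-channel count, so the code assumes output-ratio means output bits divided by the full-precision bit width for unsigned operands, and it uses a placeholder of one output channel. The function and constant names are illustrative.

```python
def full_precision_bits(in_bits: int, w_bits: int, accumulations: int) -> int:
    """Bits needed for the largest possible MAC sum.

    Assumption: unsigned inputs and weights.
    """
    max_sum = accumulations * (2**in_bits - 1) * (2**w_bits - 1)
    return max_sum.bit_length()


def cim_fom(in_bits: int, w_bits: int, output_ratio: float,
            output_channels: int, ef_tops_per_w: float, tac_ns: float) -> float:
    """FoM = IN-precision x W-precision x output-ratio x output-channel x EF / tAC."""
    return in_bits * w_bits * output_ratio * output_channels * ef_tops_per_w / tac_ns


# Values from the abstract.
IN_BITS, W_BITS, ACCUMULATIONS = 8, 8, 16
OUTPUT_BITS = 20
EF_TOPS_PER_W = 22.75
TAC_NS = 7.2

# Placeholder: the abstract does not state the output-channel count.
OUTPUT_CHANNELS = 1

full_bits = full_precision_bits(IN_BITS, W_BITS, ACCUMULATIONS)
output_ratio = OUTPUT_BITS / full_bits  # assumed definition of output-ratio
fom_per_channel = cim_fom(IN_BITS, W_BITS, output_ratio,
                          OUTPUT_CHANNELS, EF_TOPS_PER_W, TAC_NS)

print(f"full-precision width: {full_bits} bits")
print(f"output ratio: {output_ratio:.2f}")
print(f"FoM per output channel: {fom_per_channel:.2f}")
```

With unsigned 8b operands and 16 accumulations, the largest sum fits in 20 bits, which matches the 20b output width stated in the abstract. The FoM value scales linearly with the actual output-channel count.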
{"title":"16.3 A 28nm 384kb 6T-SRAM Computation-in-Memory Macro with 8b Precision for AI Edge Chips","authors":"Jian-Wei Su, Yen-Chi Chou, Ruhui Liu, Ta-Wei Liu, Pei-Jung Lu, P. Wu, Yen-Lin Chung, Li-Yang Hung, Jin-Sheng Ren, Tianlong Pan, Sih-Han Li, Shih-Chieh Chang, S. Sheu, W. Lo, Chih-I Wu, Xin Si, C. Lo, Ren-Shuo Liu, C. Hsieh, K. Tang, Meng-Fan Chang","doi":"10.1109/ISSCC42613.2021.9365984","journal":"2021 IEEE International Solid-State Circuits Conference (ISSCC)","publicationDate":"2021-02-13"}