Pub Date: 2019-06-09 DOI: 10.23919/VLSIT.2019.8776579
Changbeom Woo, Myeongwon Lee, Shinkeun Kim, Jaeyeol Park, Gil-Bok Choi, M. Seo, K. Noh, Myounggon Kang, Hyungcheol Shin
Right after programming, electrons stored in shallow nitride trap levels can be released in less than a few seconds. By setting the delay between the program and read phases to as little as 10 μs, we found that several mechanisms are mixed when stored electrons are emitted during short-term retention of 3-D NAND Flash. For the first time, we have confirmed that charge loss consists of three mechanisms and have separated each of them. In particular, the vertical redistribution of electrons in the charge trap layer, observed only during the short term, was analyzed for the first time. Short-term retention data measured at various temperatures (25-115°C) and at several program verify levels (PV3, PV5, PV7) in solid (S/P) and checker-board (C/P) patterns were analyzed using our model. Finally, the activation energy ($E_{\text{a}}$) of each mechanism was extracted using the Arrhenius law, and the magnitudes of $E_{\text{a}}$ were compared.
{"title":"Modeling of Charge Loss Mechanisms during the Short Term Retention Operation in 3-D NAND Flash Memories","authors":"Changbeom Woo, Myeongwon Lee, Shinkeun Kim, Jaeyeol Park, Gil-Bok Choi, M. Seo, K. Noh, Myounggon Kang, Hyungcheol Shin","doi":"10.23919/VLSIT.2019.8776579","DOIUrl":"https://doi.org/10.23919/VLSIT.2019.8776579","url":null,"abstract":"Right after program, stored electrons in the shallow nitride trap level can be released less than a few seconds. By setting the delay between program and reading phase to as small as 10μs, we found that several mechanisms are mixed when stored electrons are emitted during short term retention of 3-D NAND Flash. For the first time, we have confirmed that the charge loss mechanisms consist of three mechanisms and have separated each mechanism. In particular, the vertical redistribution of electrons in the charge trap layer, observed only during short term, was analyzed for the first time. Short term retention data measured at various temperatures (25-115°C) and at several program verify levels (PV3, PV5, PV7) in solid (S/P) and checker-board patterns (C/P) were analyzed using our model. Finally, the activation energy (Ea) of each mechanism was extracted by the Arrhenius law and the magnitudes of $E_{text{a}}$ were compared.","PeriodicalId":6752,"journal":{"name":"2019 Symposium on VLSI Technology","volume":"3 1","pages":"T214-T215"},"PeriodicalIF":0.0,"publicationDate":"2019-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82500303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
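The Arrhenius extraction mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each mechanism is summarized by a single rate constant per temperature that follows rate = A·exp(−Ea / (k_B·T)); the abstract does not give the authors' actual model or fitting procedure, so the function name `fit_activation_energy` and all numbers below are hypothetical and the data are synthetic, not measured.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def fit_activation_energy(temps_c, rates):
    """Least-squares fit of ln(rate) = ln(A) - Ea / (k_B * T).

    temps_c: temperatures in degrees Celsius
    rates:   rate constants at those temperatures (arbitrary units)
    Returns (Ea in eV, prefactor A).
    """
    # x = 1 / (k_B * T) so that the slope of ln(rate) versus x is -Ea.
    xs = [1.0 / (K_B_EV * (t + 273.15)) for t in temps_c]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return -slope, math.exp(intercept)


# Synthetic example only: rates generated from an assumed Ea and prefactor,
# sampled inside the 25-115 degC window quoted in the abstract.
temps_c = [25, 55, 85, 115]
true_ea, true_a = 0.30, 1.0e3  # hypothetical values, not from the paper
rates = [true_a * math.exp(-true_ea / (K_B_EV * (t + 273.15))) for t in temps_c]

ea, a = fit_activation_energy(temps_c, rates)
print(f"Ea = {ea:.3f} eV, A = {a:.3g}")
```

Running the same fit separately on the rate constants of each separated mechanism would give one Ea per mechanism, which is the comparison the abstract describes.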
Pub Date: 2019-06-09 DOI: 10.23919/VLSIT.2019.8776525
S. Dutta, W. Chakraborty, J. Gomez, K. Ni, S. Joshi, S. Datta
We present a system implementing extremely energy-efficient inference on multi-channel biomedical-sensor data. We leverage ferroelectric FETs (FeFETs) to perform classification directly on analog sensor signals. We demonstrate: (i) voltage-controlled multi-domain ferroelectric polarization switching to obtain 8 distinct transconductance ($g_{\text{m}}$) states in a 28nm HKMG FeFET technology [1], (ii) a 30x tunable range in $g_{\text{m}}$ over the bandwidth of interest, (iii) successful implementation of artifact removal, feature extraction and classification for seizure detection from the CHB-MIT EEG dataset with 98.46% accuracy and a false alarm rate of $< 0.375/\text{hr}$ for two patients, (iv) ultra-low energy of 47 fJ/MAC with 1,000x improvement in area compared to alternative mixed-signal MAC.
{"title":"Energy-Efficient Edge Inference on Multi-Channel Streaming Data in 28nm HKMG FeFET Technology","authors":"S. Dutta, W. Chakraborty, J. Gomez, K. Ni, S. Joshi, S. Datta","doi":"10.23919/VLSIT.2019.8776525","DOIUrl":"https://doi.org/10.23919/VLSIT.2019.8776525","url":null,"abstract":"We present a system implementing extremely energy-efficient inference on multi-channel biomedical-sensor data. We leverage Ferroelectric FET (FeFET) to perform classification directly on analog sensor signals. We demonstrate: (i) voltage-controlled multi-domain ferroelectric polarization switching to obtain 8 distinct transconductance $(text{g}_{text{m}})$ states in a 28nm HKMG FeFET technology [1], (ii) 30x tunable range in $text{g}_{text{m}}$ over the bandwidth of interest, (iii) successful implementation of artifact removal, feature extraction and classification for seizure detection from CHB-MIT EEG dataset with 98.46% accuracy and $< 0.375/text{hr}$. false alarm rate for two patients, (iv) ultra-low energy of 47 fJ/MAC with 1,000x improvement in area compared to alternative mixed-signal MAC.","PeriodicalId":6752,"journal":{"name":"2019 Symposium on VLSI Technology","volume":"12 1","pages":"T38-T39"},"PeriodicalIF":0.0,"publicationDate":"2019-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88362483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-06-01 DOI: 10.23919/VLSIT.2019.8776562
T. Meunier, L. Hutin, B. Bertrand, Y. Thonnart, G. Pillonnet, G. Billiot, H. Jacquinot, M. Cassé, S. Barraud, Y.-J. Kim, V. Mazzocchi, A. Amisse, H. Bohuslavskyi, L. Bourdet, A. Crippa, X. Jehl, R. Maurand, Y. Niquet, M. Sanquer, B. Venitucci, B. Jadot, E. Chanrion, P. Mortemousque, C. Spence, M. Urdampilleta, S. de Franceschi, M. Vinet
Quantum computing (QC) is expected to extend the high performance computing roadmap [1]–[2], provided that a large number of error-free quantum operations, typically over a billion, can be run. This is out of reach in actual physical systems because of quantum decoherence. As a consequence, quantum error correction techniques, which rely on redundant encoding, have been introduced to correct the errors [3]–[5]. In state-of-the-art codes, with error thresholds or fidelities around $10^{-2}$ in Si spin qubits, it is expected that logical qubits will be made out of a few thousand or more physical qubits [6], bringing the number of physical qubits required to perform relevant quantum calculations to at least a million.
{"title":"Towards scalable quantum computing based on silicon spin","authors":"T. Meunier, L. Hutin, B. Bertrand, Y. Thonnart, G. Pillonnet, G. Billiot, H. Jacquinot, M. Cassé, S. Barraud, Y.-J. Kim, V. Mazzocchi, A. Amisse, H. Bohuslavskyi, L. Bourdet, A. Crippa, X. Jehl, R. Maurand, Y. Niquet, M. Sanquer, B. Venitucci, B. Jadot, E. Chanrion, P. Mortemousque, C. Spence, M. Urdampilleta, S. de Franceschi, M. Vinet","doi":"10.23919/VLSIT.2019.8776562","DOIUrl":"https://doi.org/10.23919/VLSIT.2019.8776562","url":null,"abstract":"Quantum computing (QC) is expected to extend the high performance computing roadmap [1]–[2] at the condition to be able to run a large number of errorless quantum operations, typically. over a billion. It is out of reach in actual physical systems because of the quantum decoherence. As a consequence, quantum error correction techniques, which utilize the idea of redundant encoding, have been introduced to cure for the errors [3]–[5]. In state-of-the-art codes, with error thresholds or fidelities around 10−2 in Si spin qubits, it is expected that logical qubits will be made out of a few thousands or more of physical qubits [6], bringing the number of required physical qubits to perform relevant quantum calculations to at least a million.","PeriodicalId":6752,"journal":{"name":"2019 Symposium on VLSI Technology","volume":"89 1","pages":"T30-T31"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78703202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-06-01 DOI: 10.23919/VLSIT.2019.8776527
Changhyuck Sung, Jeonghwan Song, Donguk Lee, Seokjae Lim, Myounghun Kwak, H. Hwang
We demonstrate an ultra-thin ALD-processed dual-oxide (Al2O3/TiO2) hybrid device with memory and selector characteristics by engineering the stability of the metal filament in the Al2O3 and TiO2 layers. The optimized hybrid memory device shows outstanding performance, such as low off current ($< 1\,\text{nA}$), low reset current ($< 1\,\text{nA}$), and high on/off ratio ($> 10^{4}$). By inserting a Ti buffer layer, which has a low electrode potential, we observed excellent uniformity and retention. Finally, outstanding read/write margins and ultra-low power consumption are confirmed through array simulations of the proposed hybrid memory device.
{"title":"Ultra-thin <10nm) Dual-oxide (Al2O3/TiO2) Hybrid Device (Memory/Selector) with Extremely Low Ioff <1nA) and Ireset <1nA) for 3D Storage Class Memory","authors":"Changhyuck Sung, Jeonghwan Song, Donguk Lee, Seokjae Lim, Myounghun Kwak, H. Hwang","doi":"10.23919/VLSIT.2019.8776527","DOIUrl":"https://doi.org/10.23919/VLSIT.2019.8776527","url":null,"abstract":"We demonstrate ultra-thin ALD-processed dual-oxide (Al<inf>2</inf>O<inf>3</inf>/TiO<inf>2</inf>) hybrid device with memory and selector characteristics by engineering the stability of metal filament in Al<inf>2</inf>O<inf>3</inf> and TiO<inf>2</inf> layer. The optimized hybrid memory device shows outstanding performances such as low off current <tex>$(< 1text{nA})$</tex>, low reset current <tex>$(< 1text{nA})$</tex>, and high on/off ratio <tex>$(> 10^{4})$</tex>. Inserting a Ti buffer layer which has a low electrode potential value, we observed excellent uniformity and retention property. Finally, an outstanding read/write margins and ultra-low power consumption are confirmed through array simulations of the proposed hybrid memory device.","PeriodicalId":6752,"journal":{"name":"2019 Symposium on VLSI Technology","volume":"108 1","pages":"T62-T63"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81412667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-06-01 DOI: 10.23919/VLSIT.2019.8776560
K. Tang, Wei-Chen Wei, Zuo-Wei Yeh, Tzu-Hsiang Hsu, Yen-Cheng Chiu, Cheng-Xin Xue, Yu-Chun Kuo, Tai-Hsing We, M. Ho, C. Lo, Ren-Shuo Liu, C. Hsieh, Meng-Fan Chang
In the quest to execute emerging deep learning algorithms on edge devices, developing low-power and low-latency deep learning accelerators (DLAs) has become a top priority. To achieve this goal, data processing techniques in sensor and memory that exploit the array structure have drawn much attention. Processing-in-sensor (PIS) solutions could reduce data transfer; computing-in-memory (CIM) macros could reduce memory access and intermediate data movement. We propose a new architecture that integrates PIS and CIM to realize a low-power DLA. The advantages of using these techniques and the challenges from a system point of view are discussed.
{"title":"Considerations of Integrating Computing-In-Memory and Processing-In-Sensor into Convolutional Neural Network Accelerators for Low-Power Edge Devices","authors":"K. Tang, Wei-Chen Wei, Zuo-Wei Yeh, Tzu-Hsiang Hsu, Yen-Cheng Chiu, Cheng-Xin Xue, Yu-Chun Kuo, Tai-Hsing We, M. Ho, C. Lo, Ren-Shuo Liu, C. Hsieh, Meng-Fan Chang","doi":"10.23919/VLSIT.2019.8776560","DOIUrl":"https://doi.org/10.23919/VLSIT.2019.8776560","url":null,"abstract":"In quest to execute emerging deep learning algorithms at edge devices, developing low-power and low-latency deep learning accelerators (DLAs) have become top priority. To achieve this goal, data processing techniques in sensor and memory utilizing the array structure have drawn much attention. Processing-in-sensor (PIS) solutions could reduce data transfer; computing-in-memory (CIM) macros could reduce memory access and intermediate data movement. We propose a new architecture to integrate PIS and CIM to realize low-power DLA. The advantages of using these techniques and the challenges from system point-of-view are discussed.","PeriodicalId":6752,"journal":{"name":"2019 Symposium on VLSI Technology","volume":"33 1","pages":"T166-T167"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76094639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-06-01 DOI: 10.23919/VLSIT.2019.8776545
Longyang Lin, Saurabh Jain, M. Alioto
This paper presents a power management unit (PMU) that drives a microcontroller and controls a power knob enabling adaptation to the sensed power availability over an ultra-wide range, well beyond voltage scaling. Conventional battery-powered operation is augmented with pure harvesting. Wide power adaptation is enabled by comparator delay self-biasing and a zero-current switching scheme shared among all power modes, with single-cycle convergence.
{"title":"Integrated Power Management and Microcontroller for Ultra-Wide Power Adaptation down to nW","authors":"Longyang Lin, Saurabh Jain, M. Alioto","doi":"10.23919/VLSIT.2019.8776545","DOIUrl":"https://doi.org/10.23919/VLSIT.2019.8776545","url":null,"abstract":"This paper presents a power management unit (PMU) driving a microcontroller, and controlling a power knob that enables adaptation to the sensed power availability over an ultra-wide range, well beyond voltage scaling. Conventional battery-powered operation is augmented with pure harvesting. Wide power adaptation is enabled by comparator delay self-biasing and zero-current switching scheme shared among all power modes with single-cycle convergence.","PeriodicalId":6752,"journal":{"name":"2019 Symposium on VLSI Technology","volume":"12 1","pages":"C178-C179"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88299191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-06-01 DOI: 10.23919/VLSIT.2019.8776540
Yen-Cheng Chiu, Han-Wen Hu, Li-Ya Lai, Tsung-Yuan Huang, Hui-Yao Kao, K. Chang, M. Ho, Chung-Cheng Chou, Y. Chih, T. Chang, Meng-Fan Chang
This work proposes (1) an auto-forming (AF) scheme to shorten the macro forming time ($T_{\text{FM-M}}$) and testing costs; (2) an auto-RESET (ARST) scheme to shorten page-RESET time ($T_{\text{W-PAGE-RST}}$) for expanding the applications of hidden-RESET operation in standby mode; and (3) an auto-SET (ASET) scheme to shorten page-write time ($T_{\text{W-PAGE}}$) combined with the hidden-RESET scheme. A fabricated 40nm 2Mb ReRAM macro achieved 85+% reduction in $T_{\text{FM-M}}$ and 99+% reduction in $T_{\text{W-PAGE}}$ for a page. For the first time, AF, ARST, and ASET schemes are demonstrated in silicon for ReRAM. Keywords: ReRAM, forming, page-write
{"title":"A 40nm 2Mb ReRAM Macro with 85% Reduction in FORMING Time and 99% Reduction in Page-Write Time Using Auto-FORMING and Auto-Write Schemes","authors":"Yen-Cheng Chiu, Han-Wen Hu, Li-Ya Lai, Tsung-Yuan Huang, Hui-Yao Kao, K. Chang, M. Ho, Chung-Cheng Chou, Y. Chih, T. Chang, Meng-Fan Chang","doi":"10.23919/VLSIT.2019.8776540","DOIUrl":"https://doi.org/10.23919/VLSIT.2019.8776540","url":null,"abstract":"This work proposes (1) an auto-forming (AF) scheme to shorten the macro forming time $(text{T}_{text{FM}-text{M}})$ and testing costs; (2) an auto-RESET (ARST) scheme to shorten page-RESET time $(text{T}_{text{W}-text{PAGE}-text{RST}})$ for expanding the applications of hidden-RESET operation in standby mode, and (3) an auto-SET (ASET) scheme to shorten page-write time $(text{T}_{text{W}-text{PAGE}})$ combined with hidden-RESET scheme. A fabricated 40nm 2Mb ReRAM macro achieved 85+% reduction in TFM-M, and $99+%$ reduction in $text{T}_{text{W}}-text{PAGE}$ for a page. For the first time, AF, ARST, and ASET schemes are demonstrated in silicon for ReRAM. Keywords: ReRAM, forming, page-write","PeriodicalId":6752,"journal":{"name":"2019 Symposium on VLSI Technology","volume":"58 1","pages":"T232-T233"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89165665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-06-01 DOI: 10.23919/VLSIT.2019.8776486
C.C. Hu, M.F. Chen, W. Chiou, Doug C. H. Yu
The electrical characterization of System on Integrated Chips (SoIC™), an innovative 3D heterogeneous integration technology manufactured in front-end of line with known-good dies, is reported. Chiplet integration of devices, including foundry leading-edge 7nm FinFET technology, with SoIC™ illustrates its advantages in high bandwidth density and high power efficiency compared with 2.5D and conventional 3D-IC with micro-bump/TSV.
{"title":"3D Multi-chip Integration with System on Integrated Chips (SoIC™)","authors":"C.C. Hu, M.F. Chen, W. Chiou, Doug C. H. Yu","doi":"10.23919/VLSIT.2019.8776486","DOIUrl":"https://doi.org/10.23919/VLSIT.2019.8776486","url":null,"abstract":"The electrical characterization of System on Integrated Chips (SoIC™), an innovative 3D heterogeneous integration technology manufactured in front-end of line with known-good-die is reported. Chiplets integration of devices including foundry leading edge 7nm FinFET technology with SoIC™ illustrates its advantages in high bandwidth density and high power efficiency, as compared with 2.5D and conventional 3D-IC with micro-bump/TSV.","PeriodicalId":6752,"journal":{"name":"2019 Symposium on VLSI Technology","volume":"86 1","pages":"T20-T21"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75109116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-06-01 DOI: 10.23919/VLSIT.2019.8776565
C. Matsui, S. Fukuyama, Atsuna Hayakawa, K. Takeuchi
This paper proposes Variability-Aware Approximate Computing (V-AC) for TaOx ReRAM storage at data centers. For the first time, this paper shows that application-induced variability degrades performance. To solve this problem, V-AC utilizes the error resilience of machine learning (ML) applications and reduces the bit-error rate (BER) of typical cells by removing extra data copies and enlarging the BER difference among cells. By combining device measurement and system emulation, this paper realizes system, circuit and device codesign (SCDCD). V-AC is a key enabling technology to push the limits of performance, power, chip size and scaling of ReRAM for ML. Performance, energy and cell area of ReRAM storage improve by 7.0 times, 90% and 8.5%, respectively.
{"title":"Application-Induced Cell Reliability Variability-Aware Approximate Computing in TaOx-based ReRAM Data Center Storage for Machine Learning","authors":"C. Matsui, S. Fukuyama, Atsuna Hayakawa, K. Takeuchi","doi":"10.23919/VLSIT.2019.8776565","DOIUrl":"https://doi.org/10.23919/VLSIT.2019.8776565","url":null,"abstract":"This paper proposes Variability-Aware Approximate Computing (V-AC) for TaOx ReRAM storage at data centers. For the first time, this paper shows that application-induced variability degrades the performance. To solve this problem, V-AC utilizes error resilience of machine learning (ML) application and reduces bit-error rate (BER) of typical cells by removing extra data copy and enlarging BER difference among cells. By combining device measurement and system emulations, this paper realizes system, circuit and device codesign (SCDCD). V-AC is key enabling technology to push the limits of performance, power, chip size and scaling of ReRAM for ML. Performance, energy and cell area of ReRAM storage improves by 7.0 times, 90% and 8.5%, respectively.","PeriodicalId":6752,"journal":{"name":"2019 Symposium on VLSI Technology","volume":"29 1","pages":"T234-T235"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74480599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}