Pub Date : 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963072
Jialong Xue, T. Zou, Hao Xu, T. Han, Mi Tian, Weiqiang Zhu, Zhijian Li, Na Yan
This paper presents the design of a 6-18 GHz low-noise amplifier (LNA) that uses a noise-canceling technique to achieve large bandwidth and low noise figure (NF) simultaneously. The LNA is composed of three stages. The first stage adopts a resistive shunt-feedback cascode topology, which is convenient for wideband input impedance matching. The second and third stages are designed for noise canceling and gain compensation, respectively. An inductive peaking technique is employed to broaden the bandwidth. Implemented in 130-nm CMOS PD-SOI technology, the proposed LNA achieves a maximum gain of 15.44 dB and a minimum NF of 2.42 dB, with flatness of ±1.44 dB and 0.109 dB/GHz, respectively, across 6-18 GHz, which corresponds to a fractional bandwidth of 100%.
{"title":"A 6-18GHz Low-Noise Amplifier Using Noise Canceling Technique in 130-nm CMOS PD-SOI","authors":"Jialong Xue, T. Zou, Hao Xu, T. Han, Mi Tian, Weiqiang Zhu, Zhijian Li, Na Yan","doi":"10.1109/ICTA56932.2022.9963072","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963072","url":null,"abstract":"This paper presents the design of a 6-18GHz low-noise amplifier (LNA) utilizing noise canceling technique to achieve large bandwidth and low noise figure (NF) simultaneously. The LNA is composed of three stages, resistive shunt feedback cascode topology is adopted for the first one, which is convenient for wideband input impedance matching. Besides, the second and third stage are designed for noise canceling and gain compensation respectively. Inductive peaking technique is employed to broaden the bandwidth. Implemented in 130-nm CMOS PD-SOI technology, the proposed LNA achieves maximum 15.44dB gain and minimum 2.42dB NF with flatness of ±1.44dB and 0.109dB/GHz respectively across 6-18GHz, whose fractional bandwidth is as large as 100%.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130200679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
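The 100% fractional bandwidth quoted for the 6-18 GHz LNA above follows from the usual definition: absolute bandwidth divided by the arithmetic center frequency. A minimal Python check is sketched below; the helper name `fractional_bandwidth` is illustrative and does not come from the paper.

```python
def fractional_bandwidth(f_low_ghz: float, f_high_ghz: float) -> float:
    """Return (f_high - f_low) / f_center, with f_center the arithmetic mean."""
    center = (f_low_ghz + f_high_ghz) / 2.0
    return (f_high_ghz - f_low_ghz) / center


# Band edges taken from the abstract: 6 GHz and 18 GHz.
fbw = fractional_bandwidth(6.0, 18.0)
print(f"Fractional bandwidth: {fbw:.0%}")
```

With a 12 GHz span centered at 12 GHz, the ratio is exactly 1, which matches the 100% figure in the abstract.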
Pub Date : 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963035
Weiliang Chen, Zhaoshi Li, Leibo Liu, Shaojun Wei
GPGPUs use multi-dimensional memory subsystems to provide the bandwidth needed by their multi-dimensional parallelism. However, an unfavorable address mapping distributes memory requests unevenly across the memory resources, degrading performance and power efficiency. The optimal mapping is both application- and hardware-dependent. This paper proposes a software-hardware co-design that dynamically reconfigures the address mapping according to the trace of the target application. First, a circuit that samples the entropy of address bits is proposed to capture the optimal address mapping. Second, a dynamic reconfiguration mechanism is designed to apply the optimal address mapping. Simulation results show up to 45% performance improvement over fixed address mappings.
{"title":"Dynamically Reconfigurable Memory Address Mapping for General-Purpose Graphics Processing Unit","authors":"Weiliang Chen, Zhaoshi Li, Leibo Liu, Shaojun Wei","doi":"10.1109/ICTA56932.2022.9963035","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963035","url":null,"abstract":"GPGPUs utilize multi-dimensional memory subsystems to provide the bandwidth needed by their multi-dimensional parallelism. However, an unfavorable address mapping leads to imbalanced memory request distribution across the memory resources, causing degraded performance and poor power efficiency. The optimal mapping is both application- and hardware-dependent. This paper provides a software-hardware co-design to dynamically reconfigure the address mapping according to the trace of the targeted application. First, a circuit to sample the entropy of address bits is proposed to capture the optimal address mapping. Second, a dynamic reconfiguration mechanism is designed to apply the optimal address mapping. Simulation results show up to 45% performance improvement over fixed address mappings.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125667255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
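The entropy-sampling idea in the address-mapping abstract above can be illustrated in software. The sketch below is a simplification under stated assumptions, not the authors' circuit: it computes the Shannon entropy of each address bit over a trace and selects the highest-entropy bits as candidates for channel or bank selection. The function names, the 12-bit address width, and the strided example trace are all hypothetical.

```python
import math
from typing import List


def bit_entropy(addresses: List[int], num_bits: int) -> List[float]:
    """Shannon entropy (in bits) of each address bit position over a trace."""
    n = len(addresses)
    entropies = []
    for bit in range(num_bits):
        ones = sum((a >> bit) & 1 for a in addresses)
        p = ones / n
        if p == 0.0 or p == 1.0:
            entropies.append(0.0)  # a constant bit carries no information
        else:
            entropies.append(-p * math.log2(p) - (1 - p) * math.log2(1 - p))
    return entropies


def pick_select_bits(addresses: List[int], num_bits: int, count: int) -> List[int]:
    """Pick the `count` highest-entropy bit positions (ties go to lower bits)."""
    ent = bit_entropy(addresses, num_bits)
    ranked = sorted(range(num_bits), key=lambda b: ent[b], reverse=True)
    return sorted(ranked[:count])


# Hypothetical trace: 16 accesses with a 64-byte stride, so only bits 6..9 toggle.
trace = [i * 64 for i in range(16)]
print(pick_select_bits(trace, num_bits=12, count=2))
```

Bits that toggle evenly across the trace spread requests evenly when used as selection bits, whereas a fixed mapping that selects constant bits would send every request to the same resource.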
Pub Date : 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963057
Qiushi Kang, Ge Li, F. Niu, Chenxi Wang
Cu/SiO2 hybrid bonding is a potent tool for mitigating data-movement issues within the von Neumann architecture because it shortens the distance between the processor and the memory unit. To protect stacked-chip performance, realizing hybrid bonding at low temperatures (<260 °C) is paramount. The essence of low-temperature hybrid bonding lies in constructing desirable chemical structures on the Cu and SiO2 surfaces. This paper therefore presents two feasible surface-activation strategies that achieve selective or non-selective hydrophilization of the Cu/SiO2 surface. Regardless of the activation strategy, a Cu-Cu interface with sufficient grain growth and a seamless amorphous SiO2-SiO2 interface structure were obtained at 200 °C. Moreover, non-selective hydrophilization of the Cu/SiO2 surface based on Ar/O2→NH4OH activation produced an interfacial-layer-free SiO2-SiO2 interface, which can provide more reliable mechanical support for next-generation data-centric applications.
{"title":"Cooperative surface-activation strategy for low-temperature Cu/SiO2 hybrid bonding","authors":"Qiushi Kang, Ge Li, F. Niu, Chenxi Wang","doi":"10.1109/ICTA56932.2022.9963057","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963057","url":null,"abstract":"Cu/SiO2 hybrid bonding is a potent tool to effectively mitigate data-movement issues within von Neumann architecture due to the shortening of the distance between the processor and the memory unit. To protect stacked chip performance, the realization of hybrid bonding at low temperatures (<260°C) is paramount. The essence of low-temperature hybrid bonding lies in the construction of desirable chemical structures on Cu and SiO2 surfaces. Therefore, this paper presents two types of feasible surface-activation strategies to achieve selective/non-selective hydrophilization of the Cu/SiO2 surface. Regardless of activation strategy, the Cu-Cu interface with sufficient grain growth and seamless amorphous SiO2-SiO2 interface structure were obtained at 200 °C. Moreover, the non-selective hydrophilization of Cu/SiO2 surface based on Ar/O2→NH4OH activation realized interfacial layer-free SiO2-SiO2 interface, which can provide more reliable mechanical support for next-generation data-centric applications.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128237797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963103
Lei Huang, Huan-Zhu Wang, Qingzhi Wu, Shuman Mao, Yuehang Xu
Trapping effects (TE) have a significant influence on device performance, including pulsed I-V characteristics, scattering parameters, and linearity. Because their influence on GaAs high electron mobility transistors (HEMTs) is slight, TE are always neglected in compact models such as EE-HEMT. In this paper, we present a physics-based quasi-physical zone division (QPZD) large-signal model in which TE are characterized using a simplified Shockley-Read-Hall (SRH) model that captures the dynamic process of electron capture and emission. The results show that taking TE into consideration yields a more accurate model, which characterizes the pulsed I-V and radio-frequency (RF) performance with fewer errors, especially the linearity of GaAs HEMTs under two-tone excitation with high input dynamic range.
{"title":"Characterization and Modeling of Trapping Effects in GaAs Enhanced HEMT under High Input Dynamic Range","authors":"Lei Huang, Huan-Zhu Wang, Qingzhi Wu, Shuman Mao, Yuehang Xu","doi":"10.1109/ICTA56932.2022.9963103","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963103","url":null,"abstract":"Trapping effects (TE) have significant influence on device performances, including Pulse-IV, scattering parameters and linearity. Due to its slight influence on GaAs high electron mobility transistors (HEMTs), the TE are always neglected in compact models like EE-HEMT. In this paper, we present a physical-based quasi-physical zone division (QPZD) large-signal model and the TE is characterized by using simplified Shockley-Read-Hall (SRH) model, which can characterize the dynamic process of electron capture and emission. The results show that a more accurate model is obtained with TE taken into consideration, which can characterize the Pulse-IV and radio frequency (RF) performance with less errors, especially the linearity of GaAs HEMTs under two-tone excitation with high input dynamic range.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132122379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963137
Yu’ang Wu, Mingyang Zhou, Hao Cai
Temperature changes in the device directly affect memory performance, especially access latency and energy consumption. By relying on a temperature monitor, temperature-adaptive magnetic random access memory (MRAM) eliminates the impact of temperature on storage performance. However, because of the characteristics of the magnetic tunnel junction (MTJ) device and the operating mode of the MRAM array, a wider operating temperature range makes the monitor harder to design. In this work, we propose a novel temperature monitor that is based on the MRAM array and uses segmented detection to monitor temperatures from -55 °C to 125 °C. Simulation results show that the monitor can detect the temperature with an accuracy of 5 °C within 1.2 μs.
{"title":"A Novel Segmented Temperature Monitor for Adaptive MRAM","authors":"Yu’ang Wu, Mingyang Zhou, Hao Cai","doi":"10.1109/ICTA56932.2022.9963137","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963137","url":null,"abstract":"The influence of temperature change on device directly affects the memory performance, especially in access latency and energy consumption. Based on temperature monitor, temperature adaptive magnetic random access memory (MRAM) eliminates the impact of temperature on storage performance. However, limited by the characteristics of the magnetic tunnel junction (MTJ) device and the operating mode of the MRAM array, wider operating temperature brings challenges to the design of monitor. In this work, based on MRAM array, using the method of segmented detection, we propose a novel temperature monitor for monitoring temperature under -55~125°C. Simulation results show that the temperature monitor can detect the temperature with an accuracy of 5°C within 1.2μs.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115877728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963033
Huafeng Ye, Huipeng Deng, Jian Wang, Mingyu Wang, Zhiyi Yu
3D convolutional neural networks (3D CNNs) perform better in some scenarios, such as video understanding and 3D medical image diagnosis. As the dimension and size of the convolution kernel increase, the computational complexity and implementation difficulty of a CNN rise sharply. The Winograd transformation can significantly reduce the number of multiplications in convolution operations, but large convolution filters introduce numerical instability. In this article, we present a novel method, the 3D nested Winograd algorithm, to address this problem. Compared with the state-of-the-art OLA-Winograd algorithm, the proposed algorithm reduces the number of multiplications by 1.72× to 5.83× for 5 × 5 × 5 to 9 × 9 × 9 convolutions. Finally, we demonstrate the efficiency of 3D-NWA on an FPGA platform (Xilinx VCU118) and achieve a DSP efficiency up to 4.67× that of state-of-the-art accelerators.
{"title":"3D-NWA: A Nested-Winograd Accelerator for 3D CNNs","authors":"Huafeng Ye, Huipeng Deng, Jian Wang, Mingyu Wang, Zhiyi Yu","doi":"10.1109/ICTA56932.2022.9963033","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963033","url":null,"abstract":"3D Convolutional neural networks (3D CNNs) perform better in some scenarios, such as video understanding and 3D medical image diagnosis. With the increase in the dimension and size of the convolution kernel, CNN's computational complexity and implementation difficulty increase severely. Winograd transformation can significantly reduce the number of multiplications in convolution operations. However, large convolution filters will bring numerical instability. In this article, we presented a novel method called 3D nested Winograd algorithm to address the problem. Compared with the state-of-art OLA-Winograd algorithm, the proposed algorithm reduces the multiplications by 1.72 to 5.83× for computing 5 × 5 × 5 to 9 × 9 × 9 convolutions. Finally, we demonstrate the efficiency of 3D-NWA on the FPGA platform (Xilinx VCU118) and achieve highest DSP efficiency up to 4.67× compared with the state-of-art accelerators.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117131038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
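To make concrete how a Winograd transformation trades multiplications for additions, the sketch below implements the standard 1D F(2,3) minimal filtering algorithm, which produces two outputs of a 3-tap filter with 4 multiplications instead of 6. This is the textbook building block only, shown as an assumption-free baseline; it is not the 3D nested algorithm proposed in the paper, and the function names are illustrative.

```python
from typing import List, Sequence


def direct_conv_2x3(d: Sequence[float], g: Sequence[float]) -> List[float]:
    """Two outputs of a 3-tap sliding dot product: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d: Sequence[float], g: Sequence[float]) -> List[float]:
    """Same two outputs via Winograd F(2,3): 4 element-wise multiplications."""
    # Filter transform (can be precomputed once per filter).
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2.0
    u2 = (g[0] - g[1] + g[2]) / 2.0
    u3 = g[2]
    # Input transform uses additions and subtractions only.
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]
    # The four multiplications.
    m1, m2, m3, m4 = v0 * u0, v1 * u1, v2 * u2, v3 * u3
    # Output transform.
    return [m1 + m2 + m3, m2 - m3 - m4]


d = [1.0, 2.0, 3.0, 4.0]
g = [1.0, 0.0, -1.0]
print(direct_conv_2x3(d, g), winograd_f23(d, g))
```

The transform constants grow with the tile and filter size, which is the source of the numerical instability for large filters that the abstract mentions.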
Input buffers are widely used in analog-to-digital converters (ADCs) to isolate the input signal from the internal sample-and-hold network and the package. In this work, we propose a wideband, high-linearity input buffer based on a cascade complementary source follower (CCSF) structure. It consists of a two-stage PMOS source follower (SF) and an NMOS SF with improved linearity. Designed in 65-nm CMOS with a 2.5-V supply, the differential input buffer achieves a Nyquist SFDR of 78.3 dB at a 4 GS/s sampling rate and consumes 21.14 mW in post-layout simulation.
{"title":"A Wideband High-linearity Input Buffer Based on Cascade Complementary Source Follower","authors":"Tian Feng, Dengquan Li, Jiale Ding, Shubin Liu, Yi Shen, Zhangming Zhu","doi":"10.1109/ICTA56932.2022.9962966","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9962966","url":null,"abstract":"The input buffer is widely used in the analog-to-digital converter (ADC) to isolate input signal from the internal sample-and-hold network and package. In this work, we propose a wide-band and high-linearity input buffer which is based on cascade complementary source follower (CCSF) structure. It is consisted of two-stage PMOS source follower (SF) and NMOS SF with improved linearity. Designed in 65-nm CMOS under 2.5-V supply, the post-layout simulation result shows that the differential input buffer achieves a Nyquist SFDR of 78.3 dB at 4 GS/s sampling rate and consumes 21.14 mW.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"240 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124296105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, oxide fused and anti-fused behavior is observed in a simple metal-oxide-metal device: Pt/HfO2/NiOx/Ni. The anti-fused state and the fused state are achieved by applying a program voltage to the device with or without current compliance, respectively. The resistance window between the two states reaches about 10^9, which effectively reduces the possibility of incorrect programming. The device also shows excellent retention characteristics and has a simple structure that is friendly to integration, making it well suited to high-reliability one-time-programmable memory.
{"title":"A High-Density Large-Ratio Fuse Based Oxide Devices for One-time-programmable Memory Applications","authors":"Xuecheng Cui, Dong Liu, Jifang Cao, Xiao Yu, Bing Chen","doi":"10.1109/ICTA56932.2022.9962988","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9962988","url":null,"abstract":"In this paper, the oxide fused and anti-fused behavior has been observed in a simple metal-oxide-metal device: Pt/HfO2/NiOx/Ni. The anti-fused state and fused state can be achieved by applying program voltage on the devices with or without current compliance, respectively. And the resistance window of the two states reaches about 109, which can effectively reduce the possibility of incorrect programming. It also showed excellent retention characteristics and a simple structure friendly for integration. It can be well used in the field of high reliability of one-time programmable memory.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124495987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963053
Huabing Liao, Haikun Jia, Xiangrong Huang, Bao Shi, W. Deng, B. Chi, Zhihua Wang
This paper presents a CMOS broadband millimeter-wave power amplifier (PA) based on a sandwiched transformer (ST) output matching network. The proposed ST has a three-layer structure that provides a larger coupling coefficient (k) than the traditional two-layer stacked structure. The layout of the transistors is optimized to improve the PA's performance. Fabricated in a 65-nm CMOS process, the PA achieves a 15 dBm OP1dB and 36.5% peak power-added efficiency (PAE). The 3-dB bandwidth of the PA extends from 22.8 GHz to 32.8 GHz.
{"title":"A 22.8 GHz to 32.8 GHz Compact Power Amplifier with a 15 dBm Output P1dB and 36.5% Peak PAE in 65-nm CMOS","authors":"Huabing Liao, Haikun Jia, Xiangrong Huang, Bao Shi, W. Deng, B. Chi, Zhihua Wang","doi":"10.1109/ICTA56932.2022.9963053","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963053","url":null,"abstract":"A CMOS broadband millimeter-wave power amplifier (PA) based on a Sandwiched Transformer (ST) output matching network is presented in this paper. The ST output matching network with a three-layer structure providing a larger coupling coefficient (k) than the traditional two-layer stack structure is proposed for PA's output matching network. The layout of the transistors is optimized to improve the PA's performance. Fabricated in 65-nm CMOS process, the PA has achieved 15 dBm OP1dBand 36.5% peak power added efficiency (PAE). The 3-dB bandwidth of the PA is from 22.8 GHz to 32.8 GHz.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115114036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-10-28 DOI: 10.1109/ICTA56932.2022.9962978
Kanglin Xiao, Xiaoxin Cui, Xin Qiao, Xin'an Wang, Yuan Wang
In this work, we present a reconfigurable SRAM computing-in-memory (CIM) macro that supports ping-pong operation and pipeline operation for multi-mode multiply-and-accumulate (MAC) operations. The macro can be reconfigured to execute AND or XNOR, giving it the flexibility to cover binary neural networks (BNNs), ternary neural networks (TNNs), and multi-bit operation through serial 1-bit AND operations. The main contributions are: (1) a reconfigurable scheme for mapping the inputs and weights of the 8T1C bit-cell, supporting three MAC operations; (2) an architecture integrating ping-pong operation and a two-level CIM pipeline. Simulated in a standard 28-nm process, the proposed design shows good computing linearity and variation characteristics. The average energy efficiencies of the 1b-AND, BNN, and TNN MAC operations are 1533.7, 1522.9, and 1713.2 TOPS/W, respectively.
{"title":"A Reconfigurable SRAM Computing-in-Memory Macro Supporting Ping-Pong Operation and CIM pipeline for Multi-mode MAC operations","authors":"Kanglin Xiao, Xiaoxin Cui, Xin Qiao, Xin'an Wang, Yuan Wang","doi":"10.1109/ICTA56932.2022.9962978","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9962978","url":null,"abstract":"In this work, we present a reconfigurable SRAM computing-in-memory (CIM) macro supporting ping-pong operation and pipeline operation for multi-mode multiply-and-accumulate (MAC) operations. The macro can be reconfigured to execute AND or XNOR, offering great flexibilities to cover binary neural network (BNN), ternary neural network (TNN), and multi-bit operation through serially 1-bit AND operations. The main contributions include: (1) A reconfigurable scheme to map inputs and weight of 8T1C bit-cell, supporting three MAC operations; (2) An architecture integrated ping-pong operation and two-level CIM pipeline. Simulated in a standard 28-nm process, the proposed design shows good computing linearity and variations. The average energy efficiency of 1b-AND, BNN, and TNN MAC operation are 1533.7, 1522.9, and 1713.2 TOPS/W, respectively.","PeriodicalId":325602,"journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116268928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
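The two logic modes named in the CIM abstract above, XNOR for BNNs and serial 1-bit AND for multi-bit operation, correspond to well-known arithmetic identities that can be checked in software. The sketch below is a purely functional model under stated assumptions (bit 1 encodes +1 and bit 0 encodes -1 in the BNN case; inputs are unsigned and weights are 1-bit in the bit-serial case). It does not model the 8T1C bit-cell, the ping-pong scheme, or the pipeline, and all function names are hypothetical.

```python
from typing import Sequence


def bnn_mac_xnor(x_bits: Sequence[int], w_bits: Sequence[int]) -> int:
    """Dot product of two {-1, +1} vectors stored as bits (1 -> +1, 0 -> -1).

    XNOR is 1 where the bits match, so dot = 2 * popcount(XNOR) - N.
    """
    n = len(x_bits)
    matches = sum(1 for x, w in zip(x_bits, w_bits) if x == w)
    return 2 * matches - n


def multibit_mac_serial_and(x_vals: Sequence[int], w_bits: Sequence[int], n_bits: int) -> int:
    """Unsigned multi-bit inputs times 1-bit weights, one input bit-plane per step."""
    total = 0
    for b in range(n_bits):
        # 1-bit AND between bit-plane b of the inputs and the weights.
        partial = sum(((x >> b) & 1) & w for x, w in zip(x_vals, w_bits))
        total += partial << b  # weight the partial sum by 2**b
    return total


print(bnn_mac_xnor([1, 0, 1, 1], [1, 1, 0, 1]))
print(multibit_mac_serial_and([3, 5, 2], [1, 0, 1], n_bits=3))
```

Reusing the same accumulate path for both modes is what lets a single macro switch between BNN-style and multi-bit workloads by changing only the per-cell logic function.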