Adders and multipliers based on memristive Material Implication (IMPLY) logic are widely used as primary building blocks of the Arithmetic Logic Unit (ALU). Existing IMPLY-based multipliers cannot protect their input operands. To address this, this paper presents a novel non-destructive memristive IMPLY-based semi-parallel multiplier for Computing-in-Memory (CIM) systems, which assigns function-specific memristors for data protection and introduces additional switches for higher parallelism. Simulation results show that, for 4-bit multiplication, the proposed multiplier is 30% faster than the conventional semi-parallel design and uses 9.1% fewer memristors than the state-of-the-art semi-serial design, while protecting the input weights from destruction, as required for CNN weight reuse.
{"title":"An IMPLY-based Memristive Multiplier for Computing-in-Memory Systems with Weight-Stationary CNN Acceleration","authors":"Wenhui Liang, Jiarui Xu, Yuansheng Zhao, Zixuan Shen, Guoyi Yu, Yuhui He, Chao Wang","doi":"10.1109/ICTA56932.2022.9962994","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9962994","journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)"},"publicationDate":"2022-10-28"}
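As background for the entry above, the IMPLY primitive can be sketched in a few lines of Python. This is a minimal logical model, not the paper's circuit: the function names are hypothetical, and the point about overwriting is general background on memristive IMPLY (the result is stored in the second memristor), offered as one plausible reason operands can be lost rather than as the paper's stated mechanism.

```python
def imply(p: int, q: int) -> int:
    """Material implication on bits: p IMPLY q = (NOT p) OR q.

    In a memristive realization the result is written into q's memristor,
    so the previous value of q is overwritten.
    """
    return (1 - p) | q


def nand_via_imply(a: int, b: int) -> int:
    """NAND built from one cleared work memristor and two IMPLY steps.

    The inputs a and b are only read here; the work memristor s is the
    one that gets overwritten (illustrative assumption, not the paper's
    memristor assignment).
    """
    s = 0              # FALSE operation: reset the work memristor to 0
    s = imply(a, s)    # s = NOT a
    s = imply(b, s)    # s = (NOT b) OR (NOT a) = NAND(a, b)
    return s


# Full truth table of the IMPLY primitive as (p, q, result) tuples.
truth_table = [(p, q, imply(p, q)) for p in (0, 1) for q in (0, 1)]
```

Because IMPLY together with the FALSE (reset) operation is functionally complete, adders and multipliers can be composed from sequences of such steps; keeping operands intact then comes down to which memristor each step is allowed to overwrite.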
Pub Date: 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963056
Hailong Wu, Jindong Li, Xiang Chen
Field Programmable Gate Arrays (FPGAs) offer low power consumption, high performance, and flexibility. Research on FPGA-based neural network acceleration is emerging, but most of it is based on foreign FPGA devices. To improve the situation for domestic FPGAs, a novel convolutional neural network (CNN) accelerator for a domestic FPGA equipped with a lightweight RISC-V soft core is proposed. The peak performance of the proposed accelerator reaches 153.6 GOP/s while occupying only 14K LUTs (Look-Up Tables), 32 DRMs (Dedicated RAM Modules), and 208 APMs (Arithmetic Process Modules). The proposed accelerator has enough computing power for most Edge-AI applications and embedded systems, providing a possible AI inference acceleration solution for domestic FPGAs.
{"title":"Implementation of CNN Heterogeneous Scheme Based on Domestic FPGA with RISC-V Soft Core CPU","authors":"Hailong Wu, Jindong Li, Xiang Chen","doi":"10.1109/ICTA56932.2022.9963056","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963056","journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)"},"publicationDate":"2022-10-28"}
Pub Date: 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963068
Xin Wang, Yanqing Wu
Channels made of two-dimensional (2D) semiconducting materials enable the ultimate scaling of transistors and will help sustain Moore's Law scaling for decades. In this paper, we report p-type WSe2 transistors with monolayer (~0.85 nm) channels grown by molten-salt-assisted chemical vapor deposition. The transfer-free back-gate devices, fabricated on a 100 nm SiO2/Si substrate, exhibit the highest on-current at Vds = -1 V among monolayer p-WSe2 transistors and a high on/off ratio of up to 10^8.
{"title":"CVD Monolayer tungsten-based PMOS Transistor with high performance at Vds = -1 V","authors":"Xin Wang, Yanqing Wu","doi":"10.1109/ICTA56932.2022.9963068","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963068","journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)"},"publicationDate":"2022-10-28"}
Pub Date: 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963088
Haoqing Xu, Weizhuo Gan, Lei Cao, H. Yin, Zhenhua Wu
In this paper, we demonstrate the prediction of important figures of merit (FoMs), including threshold voltage (Vth), subthreshold swing (SS), on-state current (Ion), and off-state current (Ioff), of vertically stacked lateral nanosheet field-effect transistors (NSFETs) using 1) an artificial neural network generated by a genetic algorithm (GA) and 2) a conventional multi-layer neural network (NN). Our work shows that the trained GA-based NN predicts the FoMs with an average coefficient of determination of 0.992, better than the 0.987 of the trained multi-layer neural network. Additionally, the GA-based NN reduces calculation time by 80% compared with the multi-layer NN under the same computing power, indicating that an automated machine learning approach could reduce the computational cost of TCAD simulation.
{"title":"Prediction of Key Metrics of Stacked Nanosheet nFETs using Genetic Algorithm-based Neural Networks","authors":"Haoqing Xu, Weizhuo Gan, Lei Cao, H. Yin, Zhenhua Wu","doi":"10.1109/ICTA56932.2022.9963088","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963088","journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)"},"publicationDate":"2022-10-28"}
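The accuracy metric quoted in the entry above, the coefficient of determination averaged over several figures of merit, can be computed as in the following sketch. The function names and the dictionary layout are assumptions made for illustration; the abstract does not say how the authors computed or averaged the scores.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_r_squared(targets, predictions):
    """Average R^2 across figures of merit.

    targets and predictions map a figure-of-merit name (for example
    "Vth" or "SS") to a list of values; the layout is hypothetical.
    """
    scores = [r_squared(targets[name], predictions[name]) for name in targets]
    return sum(scores) / len(scores)
```

An R^2 of 1 means the predictions match the reference values exactly, while 0 means the model does no better than always predicting the mean.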
Pub Date: 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963101
Yunqiang Yang, Ming Zhong, Qianli Ma, Ziyi Lin, Leliang Li, Guike Li, Liyuan Liu, Jian Liu, N. Wu, Haikun Jia, Xinghui Liu, Nan Qi
This paper presents a 56Gb/s de-serializer with PAM-4 CDR for chiplet optical-I/O in 28nm CMOS. There are two channels in this chip. Each channel consists of a high-performance analog front end (AFE) and a half-rate clock and data recovery (CDR) circuit based on a digital phase interpolator and digital loop filter. To provide 28-GHz clock signals to both channels, a clock distribution circuit is integrated. Experimental results show that the proposed de-serializer recovers a 56Gb/s PAM-4 input signal with channel loss, achieving an output swing of 1.01-Vppd and 760ps RMS jitter.
{"title":"A 56Gb/s De-serializer with PAM-4 CDR for Chiplet Optical-I/O","authors":"Yunqiang Yang, Ming Zhong, Qianli Ma, Ziyi Lin, Leliang Li, Guike Li, Liyuan Liu, Jian Liu, N. Wu, Haikun Jia, Xinghui Liu, Nan Qi","doi":"10.1109/ICTA56932.2022.9963101","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963101","journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)"},"publicationDate":"2022-10-28"}
Pub Date: 2022-10-28 DOI: 10.1109/ICTA56932.2022.9962969
Liangfan Chen, Lu Zhao, Zihao Chen
A dual-band tunable monopole antenna is designed for 5G communication applications. The devised tuner consists of an RF switch and RF capacitors of 0.3 pF, 0.5 pF, 1 pF, 2 pF and 5 pF, which enable the monopole antenna to operate in different frequency bands. The proposed antenna is fabricated and measured. The measured -10 dB input impedance bandwidths of the proposed antenna are 1.32 GHz to 1.95 GHz and 1.98 GHz to 5.02 GHz, which fully cover the 5G frequency spectrum in China.
{"title":"A Tunable Monopole Antenna for 5G Communication Applications","authors":"Liangfan Chen, Lu Zhao, Zihao Chen","doi":"10.1109/ICTA56932.2022.9962969","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9962969","journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)"},"publicationDate":"2022-10-28"}
Pub Date: 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963003
Heng Liu, Dongxu Li, Xian Tang
This paper presents an output-capacitorless low-dropout regulator (OCL-LDO) that uses a capacitive bulk-driven feed-forward (CBDFF) technique and an adaptive-biasing error amplifier with gm-boosting to enhance the power supply rejection (PSR) and the transient response. The proposed OCL-LDO has been implemented in a 22nm CMOS technology. It consumes a quiescent current of 49 µA from a power supply of 1.05-1.25 V and has a dropout voltage of 200 mV. The OCL-LDO achieves -84 dB PSR at low frequency and -69 dB PSR at 1 MHz at a load current of 20 mA. It achieves a line regulation of 0.18 mV/V, a load regulation of 0.77 µV/mA, and a settling time of 135 ns.
{"title":"A High PSR and Fast Transient Response Output-Capacitorless LDO using Gm-Boosting and Capacitive Bulk-Driven Feed-Forward Technique in 22nm CMOS","authors":"Heng Liu, Dongxu Li, Xian Tang","doi":"10.1109/ICTA56932.2022.9963003","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963003","journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)"},"publicationDate":"2022-10-28"}
We present a study on suppressing error bits caused by lateral charge migration (LCM) in charge-trap (CT) 3D NAND flash memory. For the first time, a Baking-and-Pre-read (BPR) method is proposed that combines long-time charge diffusion by baking with short-time stabilization by pre-read. By characterizing 96-layer triple-level-cell (TLC) 3D NAND chips with a raw NAND chip tester, the storage stabilities, including data retention (DR) and read disturb (RD), are studied. It is found that DR/RD error bits can be reduced by more than 70%, which could be explained by the strong suppression of LCM-related threshold voltage (Vth) down-shifts.
{"title":"Large Suppression to Lateral Charge Migration (LCM) Related Error Bits in Charge-Trap TLC 3D NAND Flash","authors":"Kenie Xie, Pena Guo, Fei Chen, Binglu Chen, Xiaotong Fang, Jixuan Wu, Xuepeng Zhan, Jiezhi Chen","doi":"10.1109/ICTA56932.2022.9962997","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9962997","journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)"},"publicationDate":"2022-10-28"}
This paper presents a high-performance digital equalizer with four-level pulse amplitude modulation (PAM-4) for 64Gb/s backplane I/Os. The digital equalizer consists of a tap-configurable feed-forward equalizer (FFE) and a partially unrolled decision-feedback equalizer (DFE). The first two post-cursors are covered by the DFE and the FFE then follows, which can largely reduce the influence of noise and crosstalk. The configurable FFE taps enable better adaptation to different kinds of channels. To optimize the internal algorithm, look-up tables (LUTs) are used in both the FFE and the DFE. The DFE is unrolled for timing closure using a new architecture introduced in this paper. Fabricated in 28nm CMOS, the digital equalizer operates at 64Gb/s with an energy consumption of only 5pJ/bit at 1V.
{"title":"A 64Gb/s PAM-4 Digital Equalizer With Tap-Configurable FFE and Partially Unrolled DFE in 28nm CMOS","authors":"Xinjie Feng, Yong-Nan Chen, Youzhi Gu, Jiangfeng Wu","doi":"10.1109/ICTA56932.2022.9963099","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963099","journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)"},"publicationDate":"2022-10-28"}
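The roles of the FFE and the two-tap DFE in the entry above can be illustrated with a behavioral sketch. Everything here is an assumption for illustration: the channel taps, the FFE-then-DFE ordering, the identity FFE, and the function names are hypothetical, and the code models neither the paper's LUT-based implementation nor its unrolled architecture.

```python
PAM4_LEVELS = (-3.0, -1.0, 1.0, 3.0)


def slice_pam4(value):
    """Decide the nearest PAM-4 level for one sample."""
    return min(PAM4_LEVELS, key=lambda level: abs(level - value))


def fir(samples, taps):
    """FIR filter: out[n] = sum_k taps[k] * samples[n - k].

    Used both as the toy channel model and as the feed-forward equalizer.
    """
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * samples[n - k]
        out.append(acc)
    return out


def dfe(samples, feedback_taps):
    """Decision-feedback equalizer: subtract ISI estimated from past decisions."""
    decisions = []
    for n, sample in enumerate(samples):
        corrected = sample
        for k, tap in enumerate(feedback_taps, start=1):
            if n - k >= 0:
                corrected -= tap * decisions[n - k]
        decisions.append(slice_pam4(corrected))
    return decisions


# Hypothetical channel: main cursor plus two post-cursors of ISI.
channel_taps = [1.0, 0.4, 0.2]
symbols = [3, -1, 1, -3, -3, 1, 3, -1]
received = fir(symbols, channel_taps)

# Slicing the raw received samples makes errors because of the ISI.
raw_decisions = [slice_pam4(v) for v in received]

# Identity FFE as a placeholder; real taps would be adapted to the channel.
ffe_taps = [1.0]
decisions = dfe(fir(received, ffe_taps), feedback_taps=channel_taps[1:])
```

In this noiseless toy case the two feedback taps cancel both post-cursors exactly, so `decisions` reproduces `symbols` while `raw_decisions` does not. The feedback loop in `dfe` is also the part that is hard to close timing on at high data rates, which is the usual motivation for loop unrolling.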
Pub Date: 2022-10-28 DOI: 10.1109/ICTA56932.2022.9963022
Yukang Huang, Dong Jiang, Yongkui Yang, Enyi Yao
Combinatorial optimization problems are ubiquitous in daily life and are typically solved inefficiently on modern von Neumann computers. Targeting various combinatorial optimization problems, this paper presents a 10K-bit area-efficient architecture for a domain-specific accelerator based on the fully-connected Ising model, implemented on an FPGA platform. The proposed system is based on a simulated annealing algorithm with a spin preselection scheme, which prevents the system from being trapped in local minima, increases convergence efficiency, and is easier and more efficient to implement in hardware. Using the max-cut problem as the experimental benchmark, the proposed hardware architecture achieves a 50,000× speedup over the software simulation result.
{"title":"A Fully-Connected and Area-Efficient Ising Model Annealing Accelerator for Combinatorial Optimization Problems","authors":"Yukang Huang, Dong Jiang, Yongkui Yang, Enyi Yao","doi":"10.1109/ICTA56932.2022.9963022","DOIUrl":"https://doi.org/10.1109/ICTA56932.2022.9963022","journal":{"name":"2022 IEEE International Conference on Integrated Circuits, Technologies and Applications (ICTA)"},"publicationDate":"2022-10-28"}
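For readers unfamiliar with the approach in the entry above, the following sketch shows plain single-spin-flip simulated annealing on a fully-connected Ising formulation of max-cut. It is a textbook software baseline under stated assumptions: the cooling schedule, parameter values, and function names are hypothetical, and the paper's spin preselection scheme is not described in the abstract and is not implemented here.

```python
import math
import random


def cut_value(weights, spins):
    """Total weight of edges whose endpoints carry opposite spins."""
    n = len(spins)
    return sum(
        weights[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if spins[i] != spins[j]
    )


def anneal_max_cut(weights, steps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Minimize H = sum_{i<j} w_ij * s_i * s_j, which maximizes the cut.

    weights is a symmetric matrix with a zero diagonal (fully connected).
    """
    rng = random.Random(seed)
    n = len(weights)
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    best_spins, best_cut = spins[:], cut_value(weights, spins)
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (assumed schedule).
        temperature = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i = rng.randrange(n)
        local_field = sum(weights[i][j] * spins[j] for j in range(n) if j != i)
        delta_e = -2 * spins[i] * local_field  # energy change if spin i flips
        # Metropolis rule: always accept downhill, sometimes accept uphill.
        if delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature):
            spins[i] = -spins[i]
            current = cut_value(weights, spins)
            if current > best_cut:
                best_cut, best_spins = current, spins[:]
    return best_spins, best_cut


# Hypothetical 4-node ring with unit weights; its maximum cut is 4.
ring = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
ring_spins, ring_cut = anneal_max_cut(ring)
```

The occasional acceptance of uphill moves at high temperature is what lets the search escape local minima. Each step needs only one local-field sum, the kind of operation a hardware accelerator can evaluate in parallel across spins.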