Pub Date: 2024-11-18. DOI: 10.1109/TVLSI.2024.3495558
Francesco Gagliardi;Danilo Scintu;Massimo Piotto;Paolo Bruschi;Michele Dei
Driven by the ongoing challenge of designing high-accuracy digital-to-analog converters (DACs) within a relatively small area, optimal combination algorithms (OCAs) have recently gained attention among the many possible calibration techniques for DACs. OCAs show appealing properties compared with traditional approaches such as dynamic element matching (DEM). At start-up or upon request, the mismatches affecting the DAC elements are measured on-chip, allowing the selection logic to rearrange the DAC unit elements. The newly found arrangement is then used during normal operation, achieving superior linearity. Several alternative OCAs have been proposed to date; however, designers wishing to implement OCA-calibrated DACs face unclear tradeoffs and insufficient design guidelines. In this work, we provide a detailed comparison of existing OCAs based on statistical behavioral simulations. Building on this comparison, we investigate the relationships between OCA performance and circuit-level design aspects. Specifically, the effectiveness of OCAs in improving static linearity is linked to the number of DAC bits and to the accuracy of the auxiliary comparator required by every OCA. Unforeseen trends emerge, and new design considerations are suggested for high-accuracy DAC designs enabled by OCA-based calibration.
Title: Static-Linearity Enhancement Techniques for Digital-to-Analog Converters Exploiting Optimal Arrangements of Unit Elements. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 32, no. 12, pp. 2243-2256. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10756519
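The calibration flow described in the abstract above (measure unit-element mismatches, then rearrange the order in which elements are selected) can be illustrated with a small behavioral sketch. This is not one of the OCAs compared in the paper: the greedy ordering rule, the function names `inl_peak` and `greedy_arrangement`, the 64-element size, and the 1% mismatch sigma are all assumptions made for illustration. The sketch also assumes each element's error is known exactly, whereas the abstract notes that real OCAs depend on an auxiliary comparator of finite accuracy.

```python
import random


def inl_peak(errors):
    """Peak |INL| of a thermometer-coded DAC, in units of one nominal element.

    `errors` lists the relative error of each unit element in the order the
    selection logic switches them on. Subtracting the mean error applies an
    endpoint fit, so gain error does not count as nonlinearity.
    """
    mean = sum(errors) / len(errors)
    acc = 0.0
    peak = 0.0
    for e in errors:
        acc += e - mean          # accumulated deviation after this element
        peak = max(peak, abs(acc))
    return peak


def greedy_arrangement(errors):
    """Reorder elements so the running error stays close to zero.

    Hypothetical heuristic: at every step, pick the remaining element that
    minimizes the magnitude of the accumulated (mean-corrected) error.
    """
    mean = sum(errors) / len(errors)
    remaining = list(errors)
    ordered = []
    acc = 0.0
    while remaining:
        best = min(remaining, key=lambda e: abs(acc + e - mean))
        remaining.remove(best)
        ordered.append(best)
        acc += best - mean
    return ordered


if __name__ == "__main__":
    rng = random.Random(0)       # fixed seed so the run is repeatable
    elements = [rng.gauss(0.0, 0.01) for _ in range(64)]  # assumed 1% sigma
    print("peak INL, as-fabricated order:", inl_peak(elements))
    print("peak INL, greedy arrangement: ", inl_peak(greedy_arrangement(elements)))
```

A statistical study of the kind the abstract mentions would repeat such a run over many random mismatch draws and over different DAC resolutions; the sketch shows a single draw only.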
Pub Date: 2024-11-14. DOI: 10.1109/TVLSI.2024.3480997
Jia-Li Duan;Chi Zhang;Li-Hui Wang;Lei Shen
Fully homomorphic encryption (FHE) enables computation on encrypted data and is a crucial foundation for privacy-preserving computing. However, its high computational overhead restricts widespread application: even after algorithm and software optimization, processing speed remains low. This article proposes the first practical system-level multicore Brakerski-Gentry-Vaikuntanathan (BGV) hardware acceleration scheme based on a field-programmable gate array (FPGA). Based on an analysis of the system-level acceleration bottleneck, a hierarchical storage structure is introduced to reduce data movement. A novel 4-2 mixed-radix number theoretic transform (NTT) algorithm is proposed, allowing flexible switching between radix-4 and radix-2, with the ability to reuse twiddle factors. In addition, a reconfigurable processing element (PE) is proposed that supports all homomorphic operations of BGV. The design is evaluated on a Xilinx Virtex-7 series FPGA, achieving an NTT/inverse NTT (INTT) throughput up to $14\times$ higher than previous designs. Compared with the Simple Encrypted Arithmetic Library (SEAL), the full-system performance of homomorphic encryption (ENC), decryption (DEC), and homomorphic multiplication improves by $13.9\times$, $7.07\times$, and $16.6\times$, respectively.
Title: SMBHA: A System-Level Multicore BGV Hardware Accelerator Based on FPGA. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 33, no. 2, pp. 546-557.
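To make the idea of switching between radix-4 and radix-2 NTT stages concrete, the following is a minimal software sketch of a textbook mixed-radix decimation-in-time NTT. It is not the paper's 4-2 algorithm or its hardware datapath, and it does not model twiddle-factor reuse. The toy modulus `q = 17`, the root `w = 2` (which has multiplicative order 8 modulo 17), and the function names `ntt`, `intt`, and `naive_ntt` are assumptions chosen for illustration; the input length is assumed to be a power of two.

```python
def naive_ntt(a, w, q):
    """Reference O(n^2) transform: X[k] = sum_i a[i] * w^(i*k) mod q."""
    n = len(a)
    return [sum(a[i] * pow(w, i * k, q) for i in range(n)) % q for k in range(n)]


def ntt(a, w, q):
    """Mixed-radix NTT; `w` must be a primitive len(a)-th root of unity mod q.

    A radix-4 split is used whenever the length is divisible by 4, and a
    radix-2 split otherwise (this only happens for length 2).
    """
    n = len(a)
    if n == 1:
        return list(a)
    if n % 4 == 0:
        m = n // 4
        w4 = pow(w, 4, q)                 # root of unity for the sub-transforms
        subs = [ntt(a[r::4], w4, q) for r in range(4)]
        i4 = pow(w, m, q)                 # primitive 4th root of unity
        out = [0] * n
        for k in range(m):
            # Twiddle each sub-transform output, then apply a 4-point butterfly.
            t = [subs[r][k] * pow(w, r * k, q) % q for r in range(4)]
            for j in range(4):
                out[k + j * m] = sum(t[r] * pow(i4, r * j, q) for r in range(4)) % q
        return out
    m = n // 2
    w2 = w * w % q
    even = ntt(a[0::2], w2, q)
    odd = ntt(a[1::2], w2, q)
    out = [0] * n
    for k in range(m):
        t = odd[k] * pow(w, k, q) % q     # radix-2 butterfly
        out[k] = (even[k] + t) % q
        out[k + m] = (even[k] - t) % q
    return out


def intt(A, w, q):
    """Inverse transform: run the NTT with w^-1, then scale by n^-1 mod q.

    Uses pow(x, -1, q) for modular inverses, which needs Python 3.8 or later.
    """
    n_inv = pow(len(A), -1, q)
    return [x * n_inv % q for x in ntt(A, pow(w, -1, q), q)]


if __name__ == "__main__":
    q, w = 17, 2                          # toy parameters, not BGV-sized
    a = [1, 2, 3, 4, 5, 6, 7, 8]
    A = ntt(a, w, q)
    print("forward:", A)
    print("round trip ok:", intt(A, w, q) == a)
```

For a length-8 input, the sketch performs one radix-4 stage on top of radix-2 sub-transforms, which is the simplest case where both radices appear in one transform.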
Pub Date: 2024-11-14. DOI: 10.1109/TVLSI.2024.3489231
Jie Ding;Fuming Liu;Kuan Deng;Zihan Zheng;Jingnan Zheng;Yongzhen Chen;Jiangfeng Wu
This article introduces a successive approximation register (SAR) analog-to-digital converter (ADC) that uses a foreground capacitor-mismatch self-calibration method. The proposed floating operation places the uncalibrated high-bit capacitors in a floating state, preventing the sub-ADC from saturating because of comparator static offset during calibration. To address the random mismatch of the LSB capacitors and improve calibration accuracy, the design employs round-robin grouping of eight sets of LSB capacitors. In addition, a precharged bootstrapped switch is proposed to achieve high sampling linearity with low power consumption and area overhead. An anti-interference, custom-designed 0.5-fF capacitor structure is proposed to address binary-weighted capacitor mismatch in the capacitive DAC (CDAC). The circuit implementation of the comparator used by the ADC is also discussed. The prototype was fabricated in a 180-nm CMOS process with a 1.8-V supply and achieved spurious-free dynamic ranges of 108.9 and 92.38 dB at an input frequency of 1 kHz while operating at sampling rates of 100 kS/s and 1 MS/s, respectively. The prototype consumes 6.745 mW and occupies 0.91 $\text{mm}^{2}$