A Reconfigurable Floating-Point Compute-in-Memory With Analog Exponent Preprocesses
Authors: Pengyu He; Yuanzhe Zhao; Heng Xie; Yang Wang; Shouyi Yin; Li Li; Yan Zhu; Rui P. Martins; Chi-Hang Chan; Minglei Zhang
IEEE Solid-State Circuits Letters, vol. 7, pp. 271-274. Published 2024-09-18. DOI: 10.1109/LSSC.2024.3463208
https://ieeexplore.ieee.org/document/10683795/
This letter presents a reconfigurable floating-point compute-in-memory (FP-CIM) macro that preprocesses the exponent in the analog domain, improving the energy efficiency of floating-point (FP) inference on edge devices. The macro supports FP8 inference and can be configured for BF16 precision through segmented computation. In addition, a time-domain analog-to-digital converter serves the analog compute-in-memory (CIM) macro and improves energy efficiency by sharing the counter and quantizing in a coarse-fine structure. Fabricated in a 28-nm CMOS process, the FP-CIM macro achieves an energy efficiency of 314.6 TFLOPS/W and an area efficiency of 12.13 TFLOPS/mm² in FP8 mode.
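The abstract mentions coarse-fine quantization without describing the circuit. As a generic illustration only, and not the letter's actual converter, the Python sketch below shows the general idea of a two-step quantizer: a coarse step picks a wide bin, and a fine step resolves the residue inside that bin. The function name `coarse_fine_quantize`, the 3-bit/3-bit split, and the normalized input range are all assumptions made for this example.

```python
def coarse_fine_quantize(x, full_scale=1.0, coarse_bits=3, fine_bits=3):
    """Two-step quantization of x in [0, full_scale).

    Hypothetical helper for illustration; bit widths are assumed,
    not taken from the letter. Returns (coarse, fine, code).
    """
    if not 0.0 <= x < full_scale:
        raise ValueError("x must lie in [0, full_scale)")

    coarse_levels = 1 << coarse_bits
    fine_levels = 1 << fine_bits

    # Coarse step: locate the wide bin that contains x.
    coarse_step = full_scale / coarse_levels
    coarse = min(int(x // coarse_step), coarse_levels - 1)

    # Fine step: quantize only the residue left inside that bin.
    residue = x - coarse * coarse_step
    fine_step = coarse_step / fine_levels
    fine = min(int(residue // fine_step), fine_levels - 1)

    # Concatenate the two partial results into one output code.
    code = (coarse << fine_bits) | fine
    return coarse, fine, code


if __name__ == "__main__":
    print(coarse_fine_quantize(0.4))
```

With these assumed widths, the two 3-bit steps together resolve 64 levels, while each step on its own only has to distinguish 8.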