A 3.8-μW 10-Keyword Noise-Robust Keyword Spotting Processor Using Symmetric Compressed Ternary-Weight Neural Networks
Bo Liu; Na Xie; Renyuan Zhang; Haichuan Yang; Ziyu Wang; Deliang Fan; Zhen Wang; Weiqiang Liu; Hao Cai
IEEE Open Journal of the Solid-State Circuits Society, vol. 3, pp. 185-196, published Sep. 6, 2023. DOI: 10.1109/OJSSCS.2023.3312354. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10242041
A ternary-weight neural network (TWN)-inspired keyword spotting (KWS) processor is proposed to support complicated and variable application scenarios. To achieve high-precision recognition of ten keywords over a wide range of background noise, from 5-dB SNR to clean conditions, a convolutional neural network consisting of four convolutional layers and four fully connected layers is trained with a modified sparsity-controllable, truncated-Gaussian-approximation-based ternary-weight scheme. End-to-end optimization combines three techniques: 1) a stage-by-stage bit-width selection algorithm that reduces the hardware overhead of the FFT; 2) lossy TWN compression with symmetric kernel training (SKT) and a dedicated internal data-reuse computation flow; and 3) an error-intercompensating approximate addition tree that cuts computation overhead with marginal accuracy loss. Fabricated in an industrial 22-nm CMOS process, the processor recognizes up to ten keywords in real time under 11 background noise types, with accuracies of 90.6% under clean conditions and 85.4% at 5-dB SNR. It consumes an average power of 3.8 μW at 250 kHz, and its normalized energy efficiency is 2.79× higher than the state of the art.
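The sparsity-controllable ternary-weight training mentioned in the abstract reduces each weight to one of {-W_n, 0, +W_p}, with a threshold that sets how many weights collapse to zero. The snippet below is a minimal, hypothetical Python sketch of that quantization step only: the function name `ternarize` and the use of an empirical quantile of |w| in place of the paper's truncated-Gaussian model are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def ternarize(weights: np.ndarray, target_sparsity: float = 0.5):
    """Ternarize a weight tensor with a sparsity-controllable threshold.

    Sketch only: the threshold `delta` is chosen so that roughly
    `target_sparsity` of the weights fall into the zero bin. The paper
    derives its threshold from a truncated-Gaussian model of the weight
    distribution; the empirical quantile here is a stand-in.
    """
    delta = np.quantile(np.abs(weights), target_sparsity)  # |w| <= delta -> 0

    ternary = np.zeros_like(weights)
    pos, neg = weights > delta, weights < -delta

    # Per-sign scaling factors (mean magnitude of the surviving weights),
    # as in standard ternary-weight networks; untouched entries stay zero.
    if pos.any():
        ternary[pos] = weights[pos].mean()
    if neg.any():
        ternary[neg] = weights[neg].mean()
    return ternary, delta

# Example: ternarize a 3x3x16x16 convolution kernel with ~60% zeros.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(3, 3, 16, 16))
w_t, delta = ternarize(w, target_sparsity=0.6)
print(f"delta = {delta:.4f}, sparsity = {(w_t == 0).mean():.2f}")
```

Controlling the zero fraction this way presumably matters for the hardware side: zero weights let the accelerator skip multiply-accumulate operations and memory fetches, which is consistent with the paper's microwatt-level power target.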