Pub Date : 2024-08-01 DOI: 10.1109/TBCAS.2024.3436837
Qi Cheng, Xiaofang Hu, He Xiao, Yue Zhou, Shukai Duan
In recent years, the combination of the Attention mechanism and deep learning has found a wide range of applications in medical imaging. However, because Attention involves complex computation, existing hardware architectures suffer from high resource consumption or low accuracy, and deploying Attention efficiently on DNN accelerators remains a challenge. This paper proposes an online-programmable Attention hardware architecture based on a compute-in-memory (CIM) macro, which reduces the complexity of Attention in hardware and improves integration density, energy efficiency, and calculation accuracy. First, the Attention computation is decomposed into multiple cascaded combinatorial matrix operations to reduce the complexity of its hardware implementation; second, to reduce the influence of non-ideal hardware characteristics, an online-programmable CIM architecture is designed that improves calculation accuracy by dynamically adjusting the weights; and lastly, SPICE simulation verifies that the proposed Attention hardware architecture can be applied to the inference of deep neural networks. Based on a 100 nm CMOS process, compared with traditional Attention hardware architectures, the integration density and energy efficiency are increased by at least 91.38 times, and latency and computing efficiency are improved by about 12.5 times.
Title: High-Performance Method and Architecture for Attention Computation in DNN Inference. Journal: IEEE Transactions on Biomedical Circuits and Systems. Publication type: Journal Article.
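The idea of expressing Attention as a cascade of matrix operations can be illustrated with the standard scaled dot-product formulation. The sketch below is only that textbook formulation in plain Python, not the decomposition used in the paper, which the abstract does not spell out. The helper names (`matmul`, `transpose`, `softmax_rows`, `attention`) and the toy inputs are assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attention(x, w_q, w_k, w_v):
    """Scaled dot-product attention written as a chain of matrix products."""
    q = matmul(x, w_q)                    # stage 1: projections
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    raw = matmul(q, transpose(k))         # stage 2: similarity scores
    scores = [[s / math.sqrt(d_k) for s in row] for row in raw]
    weights = softmax_rows(scores)        # stage 3: row-wise normalisation
    return matmul(weights, v), weights    # stage 4: weighted sum of values


# Toy input: three tokens with two features each, identity projections (assumed).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = attention(x, identity, identity, identity)
```

Every stage except the softmax is a plain matrix product, which is the kind of operation a CIM array evaluates in place. How the softmax and scaling are handled in hardware is not described in the abstract.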
This work proposes a classification system for arrhythmias, aiming to enhance the efficiency of the diagnostic process for cardiologists. The proposed algorithm includes a naive preprocessing procedure for electrocardiography (ECG) data applicable to various ECG databases. Additionally, this work proposes an ultralightweight model for arrhythmia classification based on a convolutional neural network and incorporating R-peak interval features to represent long-term rhythm information, thereby improving the model's classification performance. The proposed model is trained and tested on the MIT-BIH and NCKU-CBIC databases in accordance with the classification standards of the Association for the Advancement of Medical Instrumentation (AAMI), achieving high accuracies of 98.32% and 97.1%. This work applies the arrhythmia classification algorithm to a web-based system, thus providing a graphical interface. The cloud-based execution of automated artificial intelligence (AI) classification allows cardiologists and patients to view ECG wave conditions instantly, thereby remarkably enhancing the quality of medical examination. This work also designs a customized integrated circuit for the hardware implementation of an AI accelerator. The accelerator uses a parallelized processing element array architecture to perform convolution and fully connected layer operations. It introduces hybrid stationary techniques that combine input-stationary and weight-stationary modes to increase data reuse drastically and reduce hardware execution cycles and power consumption, ultimately achieving high-performance computing. The accelerator is implemented as a chip in the TSMC 180 nm CMOS process. It exhibits a power consumption of 122 µW, a classification latency of 6.8 ms, and an energy efficiency of 0.83 µJ/classification.
Title: AI Accelerator With Ultralightweight Time-Period CNN-Based Model for Arrhythmia Classification
Pub Date : 2024-07-30 DOI: 10.1109/TBCAS.2024.3435718
Shuenn-Yuh Lee;Ming-Yueh Ku;Wei-Cheng Tseng;Ju-Yi Chen
IEEE Transactions on Biomedical Circuits and Systems, vol. 19, no. 1, pp. 16-27. Publication type: Journal Article.
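The three figures reported for the arrhythmia accelerator are mutually consistent if energy per classification is taken as average power multiplied by classification latency. The short check below assumes that relationship; the abstract does not say how the energy figure was measured, and the function name is hypothetical.

```python
def energy_per_classification_uj(power_uw, latency_ms):
    """Energy in microjoules, assuming energy = average power x latency.

    microwatts x milliseconds gives nanojoules, so divide by 1000 for microjoules.
    """
    return power_uw * latency_ms / 1000.0


# Values quoted in the abstract: 122 uW and 6.8 ms.
energy_uj = energy_per_classification_uj(122.0, 6.8)
print(f"{energy_uj:.2f} uJ per classification")
```

Under this assumption the product is 0.8296 µJ, which rounds to the 0.83 µJ/classification quoted in the abstract.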
Pub Date : 2024-07-17 DOI: 10.1109/TBCAS.2024.3430038
Maryam Habibollahi;Dai Jiang;Henry Thomas Lancashire;Andreas Demosthenous
Interfaces with peripheral nerves have been widely developed to enable bioelectronic control of neural activity. Peripheral nerve neuromodulation shows great potential in addressing motor dysfunctions, neurological disorders, and psychiatric conditions. The integration of high-density neural electrodes with stimulation and recording circuits poses a challenge in the design of neural interfaces. Recent advances in active electrode strategies have achieved improved reliability and performance by implementing in-situ