{"title":"内存处理加速卷积神经网络","authors":"Van-Khoa Pham","doi":"10.1109/ICSSE58758.2023.10227155","DOIUrl":null,"url":null,"abstract":"In artificial neural network applications, convolutional neural networks (CNNs), compared to conventional fully connected networks, significantly reduce the number of trained synaptic weights by stacking many convolution layers sequentially. In addition, CNNs outperform a fully-connected approach in terms of accuracy. However, these advantages only come for a fee because sharing trained weights results in many computation-intensive operations. With practical applications using resource-constraint hardware to process large-scale input images, these layers consume much more computing time as well as power because of utilizing massive complexity hardware and a large memory footprint. To deal with the challenge, an alternative approach using the in-DRAM processing concept is proposed in this study to avoid the multiplier operation. The design was tested with the GTSRB dataset to verify the recognition performance of the trained neural network. In comparison to the conventional combination of main memory with processing chips on Von-Neumann computer architectures, the simulation results indicate that the proposed circuit can achieve a competitive performance and significantly reduce the number of computation cycles as well.","PeriodicalId":280745,"journal":{"name":"2023 International Conference on System Science and Engineering (ICSSE)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"in-Memory Processing to Accelerate Convolutional Neural Networks\",\"authors\":\"Van-Khoa Pham\",\"doi\":\"10.1109/ICSSE58758.2023.10227155\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In artificial neural network applications, convolutional neural networks (CNNs), compared to conventional fully connected networks, significantly reduce the number of trained synaptic weights by stacking many convolution layers sequentially. In addition, CNNs outperform a fully-connected approach in terms of accuracy. However, these advantages only come for a fee because sharing trained weights results in many computation-intensive operations. With practical applications using resource-constraint hardware to process large-scale input images, these layers consume much more computing time as well as power because of utilizing massive complexity hardware and a large memory footprint. To deal with the challenge, an alternative approach using the in-DRAM processing concept is proposed in this study to avoid the multiplier operation. The design was tested with the GTSRB dataset to verify the recognition performance of the trained neural network. 
In comparison to the conventional combination of main memory with processing chips on Von-Neumann computer architectures, the simulation results indicate that the proposed circuit can achieve a competitive performance and significantly reduce the number of computation cycles as well.\",\"PeriodicalId\":280745,\"journal\":{\"name\":\"2023 International Conference on System Science and Engineering (ICSSE)\",\"volume\":\"12 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-07-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 International Conference on System Science and Engineering (ICSSE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSSE58758.2023.10227155\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 International Conference on System Science and Engineering (ICSSE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSSE58758.2023.10227155","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
in-Memory Processing to Accelerate Convolutional Neural Networks
In artificial neural network applications, convolutional neural networks (CNNs) significantly reduce the number of trained synaptic weights compared to conventional fully connected networks by stacking many convolution layers sequentially, and they also achieve higher accuracy than a fully connected approach. However, these advantages come at a cost: weight sharing turns inference into many computation-intensive operations. In practical applications that process large-scale input images on resource-constrained hardware, the convolution layers consume considerable computing time and power because they rely on complex arithmetic hardware and a large memory footprint. To address this challenge, this study proposes an alternative approach based on the in-DRAM processing concept that avoids multiplier operations. The design was tested on the GTSRB dataset to verify the recognition performance of the trained neural network. Compared with the conventional combination of main memory and processing chips in the Von Neumann computer architecture, the simulation results indicate that the proposed circuit achieves competitive recognition performance while significantly reducing the number of computation cycles.
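The abstract does not describe the proposed circuit in detail, but a common way in-memory designs avoid dedicated multipliers is to decompose each multiply into bit-serial shift-and-add steps that map onto simple row-wise operations. The sketch below is only an illustration of that general idea under this assumption; the function names (`shift_add_multiply`, `conv2d_no_multiplier`) and the bit width are hypothetical and are not taken from the paper.

```python
# Illustrative sketch (not the paper's circuit): convolution computed without an
# explicit multiplier, using bit-serial shift-and-add, a typical strategy in
# in-memory / in-DRAM processing schemes.
import numpy as np

def shift_add_multiply(activation: int, weight: int, bits: int = 8) -> int:
    """Multiply two unsigned integers using only shifts and adds."""
    result = 0
    for b in range(bits):
        if (weight >> b) & 1:          # test bit b of the weight
            result += activation << b  # accumulate the shifted activation
    return result

def conv2d_no_multiplier(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """2-D valid cross-correlation whose inner product avoids '*'."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=np.int64)
    for i in range(oh):
        for j in range(ow):
            acc = 0
            for u in range(kh):
                for v in range(kw):
                    acc += shift_add_multiply(int(image[i + u, j + v]),
                                              int(kernel[u, v]))
            out[i, j] = acc
    return out

# Quick check against a multiply-based NumPy reference.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8))
ker = rng.integers(0, 16, size=(3, 3))
ref = np.array([[np.sum(img[i:i + 3, j:j + 3] * ker) for j in range(6)]
                for i in range(6)])
assert np.array_equal(conv2d_no_multiplier(img, ker), ref)
```

In a real in-DRAM implementation, the shift-and-add steps would be carried out by bulk bitwise operations inside the memory arrays rather than by a host loop; the Python loop here only mirrors the arithmetic, not the hardware mapping.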