Rei: A Reconfigurable Interconnection Unit for Array-Based CNN Accelerators
Authors: Paria Darbani; Hakem Beitollahi; Pejman Lotfi-Kamran
Journal: IEEE Transactions on Emerging Topics in Computing, vol. 11, no. 4, pp. 895-906
Publication date: 2023-06-30
DOI: 10.1109/TETC.2023.3290138
URL: https://ieeexplore.ieee.org/document/10171166/
Convolutional Neural Networks (CNNs) are used in many real-world applications due to their high accuracy. The rapid growth of modern applications based on learning algorithms has increased the importance of implementing CNNs efficiently. The array-type architecture is a well-known platform for efficient CNN implementation because it takes advantage of parallel computation and data reuse. However, accelerators have restricted hardware resources, whereas CNNs impose considerable communication and computation load. Furthermore, since accelerators execute a CNN layer by layer, the differing shapes and sizes of layers lead to suboptimal resource utilization, which prevents the accelerator from reaching maximum performance. The increasing scale and complexity of deep learning applications exacerbate this problem. Therefore, the performance of CNN models depends on the hardware's ability to adapt to the different shapes of different layers to increase resource utilization. This work proposes a reconfigurable accelerator that can efficiently execute a wide range of CNNs. The proposed flexible and low-cost reconfigurable interconnect units allow the array to execute CNNs faster than fixed-size implementations (by 45.9% for ResNet-18 compared to the baseline). The proposed architecture also reduces the on-chip memory access rate by 36.5% without compromising accuracy.
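The utilization problem described in the abstract can be illustrated with a toy model. The sketch below is not the Rei design or the paper's evaluation method. It only assumes that a layer's work forms a 2D grid that is tiled, pass by pass, onto a rectangular array of processing elements (PEs). The function names, the 12 x 14 array size, and the layer shapes are all hypothetical and chosen for illustration.

```python
import math


def tiled_utilization(work_rows, work_cols, array_rows, array_cols):
    """Fraction of PE slots doing useful work when a work_rows x work_cols
    grid is tiled onto an array_rows x array_cols PE array.

    Assumption (illustrative only): every pass occupies the whole array,
    so partially filled passes leave PEs idle.
    """
    passes_r = math.ceil(work_rows / array_rows)
    passes_c = math.ceil(work_cols / array_cols)
    used = work_rows * work_cols
    available = passes_r * passes_c * array_rows * array_cols
    return used / available


def best_shape(work_rows, work_cols, candidate_shapes):
    """Pick the logical array shape (same PE count, different aspect ratio)
    that gives the highest utilization for this layer."""
    return max(
        candidate_shapes,
        key=lambda shape: tiled_utilization(work_rows, work_cols, *shape),
    )


if __name__ == "__main__":
    # Hypothetical logical shapes, each with 168 PEs.
    shapes = [(12, 14), (6, 28), (3, 56)]
    # Hypothetical per-layer work grids (rows x cols).
    layers = {"layer_a": (3, 56), "layer_b": (7, 112), "layer_c": (12, 14)}

    for name, (rows, cols) in layers.items():
        fixed = tiled_utilization(rows, cols, 12, 14)
        shape = best_shape(rows, cols, shapes)
        adapted = tiled_utilization(rows, cols, *shape)
        print(f"{name}: fixed 12x14 = {fixed:.2f}, best {shape} = {adapted:.2f}")
```

In this toy model, a 3 x 56 work grid fills only a quarter of a fixed 12 x 14 array, while a 3 x 56 logical arrangement of the same 168 PEs is fully used. This is the kind of layer-shape mismatch that a fixed-size array cannot avoid.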
About the journal:
IEEE Transactions on Emerging Topics in Computing publishes papers on emerging aspects of computer science, computing technology, and computing applications not currently covered by other IEEE Computer Society Transactions. Some examples of emerging topics in computing include: IT for Green, Synthetic and organic computing structures and systems, Advanced analytics, Social/occupational computing, Location-based/client computer systems, Morphic computer design, Electronic game systems, and Health-care IT.