Kenneth L. Rice, T. Taha, K. Iftekharuddin, Keith Anderson, Teddy Salan
Published in: The 2011 International Joint Conference on Neural Networks, 2011-10-03
DOI: 10.1109/IJCNN.2011.6033575 (https://doi.org/10.1109/IJCNN.2011.6033575)
Citations: 1
Abstract
GPGPU acceleration of Cellular Simultaneous Recurrent Networks adapted for maze traversals
A major current initiative in the research community is to find new ways of processing data that capture the efficiency of the human brain, which has increased interest in, and development of, bio-inspired computing approaches in both hardware and software. One such approach is the Cellular Simultaneous Recurrent Network (CSRN). CSRNs have been shown to be very useful for state-transition problems such as maze traversal. Although they are also powerful for image processing, their computational demands grow with the size of the input problem. In this work, we revisit the maze traversal problem to gain an understanding of the general processing of CSRNs. To overcome the high computational cost, we create several novel accelerated CSRN implementations on a 2.67 GHz Intel Xeon X5550 processor coupled with an NVIDIA Tesla C2050 general-purpose graphics processing unit (GPGPU). We also explore the use of decoupled extended Kalman filters in the CSRN training phase and find a significant reduction in runtime with negligible change in accuracy. Compared with optimized C implementations, we achieve average speedups of 21.73 times for the training phase and 3.55 times for the testing phase. The main bottleneck in training performance was a matrix inversion, so we apply several methods to reduce its impact.
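The abstract does not say which methods were used to reduce the cost of the matrix inversion. As an illustration only, the sketch below shows one common technique, which may or may not match the authors' approach: in an extended Kalman filter weight update, the correction is K e with K = P Hᵀ (H P Hᵀ + R)⁻¹, and the explicit inverse can be replaced by a Cholesky-based linear solve. The notation (P, H, R, e) follows the standard Kalman filter convention, and all function names here are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_spd(a, b):
    """Solve a * x = b for vector b without ever forming the inverse of a."""
    n = len(a)
    low = cholesky(a)
    # Forward substitution: low * y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    # Back substitution: low^T * x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def ekf_weight_step(w, P, H, R, err):
    """Return w + P H^T (H P H^T + R)^{-1} err, using a solve instead of an inverse.

    Assumed shapes (illustrative): w has n weights, P is n x n, H is the
    m x n Jacobian of the outputs with respect to the weights, R is m x m,
    and err holds the m output errors. The covariance update is omitted.
    """
    PHt = matmul(P, transpose(H))                      # n x m
    S = matmul(H, PHt)                                 # m x m
    S = [[S[i][j] + R[i][j] for j in range(len(R))] for i in range(len(R))]
    y = solve_spd(S, err)                              # y = S^{-1} err
    return [wi + sum(p * yj for p, yj in zip(row, y)) for wi, row in zip(w, PHt)]
```

Solving against the factorized matrix costs roughly a third of an explicit inversion for a single right-hand side and is usually better conditioned numerically. On a GPU the same idea would typically be delegated to a tuned linear algebra library rather than hand-written loops.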