Pub Date: 2012-10-18 DOI: 10.1109/CNNA.2012.6331419
M. Tukel, R. Yeniceri, M. Yalçin
In this work, active wave simulation on a Cellular Nonlinear Network was computed for path planning on the GPU of an NVIDIA GTX275 video card. On the software side, QtOpenCL, a wrapper library for OpenCL, was used to make the code portable across systems with different GPUs. The results are promising compared with those achieved on both CPU and FPGA. We have previously implemented different hardware and software solutions to the path planning problem for 2-D media in real time. They were almost at the limit of real-time requirements because of bottlenecks such as low communication bandwidth and low network resolution. In this work, by utilizing GPUs, we performed 60000 iterations per second for the simulation of a 128×128 node network, whereas software on an Intel Core 2 Duo P8700 processor achieved at most 35 iterations per second. We also achieved 36 iterations per second for 3-D active wave simulation of a 256 × 256 × 256 network on the GPU.
{"title":"Nonlinear spatio-temporal wave computing for real-time applications on GPU","authors":"M. Tukel, R. Yeniceri, M. Yalçin","doi":"10.1109/CNNA.2012.6331419","DOIUrl":"https://doi.org/10.1109/CNNA.2012.6331419","journal":{"name":"2012 13th International Workshop on Cellular Nanoscale Networks and their Applications"},"publicationDate":"2012-10-18"}
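To put the throughput figures quoted in the abstract of this record into perspective, the short sketch below converts them into a GPU/CPU speedup and into node updates per second. It uses only the numbers stated in the abstract; the helper name `cell_updates_per_second` is introduced here for illustration.

```python
# Figures quoted in the abstract.
GPU_2D_ITER_PER_S = 60_000   # 128 x 128 network on the GTX275
CPU_2D_ITER_PER_S = 35       # same network, Core 2 Duo P8700 (stated as "at most")
GPU_3D_ITER_PER_S = 36       # 256 x 256 x 256 network on the GPU


def cell_updates_per_second(shape, iterations_per_second):
    """Node updates per second for a network with the given grid shape."""
    cells = 1
    for extent in shape:
        cells *= extent
    return cells * iterations_per_second


speedup_2d = GPU_2D_ITER_PER_S / CPU_2D_ITER_PER_S
updates_2d = cell_updates_per_second((128, 128), GPU_2D_ITER_PER_S)
updates_3d = cell_updates_per_second((256, 256, 256), GPU_3D_ITER_PER_S)

print(f"2-D GPU/CPU speedup: about {speedup_2d:.0f}x")
print(f"2-D node updates per second: {updates_2d:,}")
print(f"3-D node updates per second: {updates_3d:,}")
```

Because the CPU figure is an upper bound, the computed ratio (roughly 1714) is a lower bound on the 2-D speedup. The 2-D and 3-D runs land at a similar order of magnitude of node updates per second (about 9.8e8 and 6.0e8).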
Pub Date: 2012-10-18 DOI: 10.1109/CNNA.2012.6331411
B. Molnár, Z. Toroczkai, M. Ercsey-Ravasz
We present a deterministic continuous-time recurrent neural network similar to CNN models, which can solve Boolean satisfiability (k-SAT) problems without getting trapped in non-solution fixed points. The model can be implemented by analog circuits, in which case the algorithm would take a single operation: the template (connection weights) is set by the k-SAT instance and starting from any initial condition the system converges to a solution. We prove that there is a one-to-one correspondence between the stable fixed points of the model and the k-SAT solutions and present numerical evidence that limit cycles may also be avoided by appropriately choosing the parameters of the model. As this study opens potentially novel technical avenues to tackle hard optimization problems, we also discuss some of the arising questions that need to be investigated in future studies.
{"title":"Continuous-time neural networks without local traps for solving Boolean satisfiability","authors":"B. Molnár, Z. Toroczkai, M. Ercsey-Ravasz","doi":"10.1109/CNNA.2012.6331411","DOIUrl":"https://doi.org/10.1109/CNNA.2012.6331411","journal":{"name":"2012 13th International Workshop on Cellular Nanoscale Networks and their Applications"},"publicationDate":"2012-10-18"}
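The abstract of this record says that the connection weights are set by the k-SAT instance and that solutions correspond to stable fixed points, but it gives no equations. The sketch below is therefore an assumption, not the model of the paper: it encodes a formula as a clause matrix with entries in {-1, 0, +1} and uses a product-form clause function, of the kind found in continuous-time SAT formulations, that vanishes exactly when a clause is satisfied at a Boolean corner. The example formula and all function names are hypothetical.

```python
from itertools import product


def clause_matrix(clauses, n_vars):
    """Encode a CNF formula as rows c[m][i] in {-1, 0, +1}.

    Each clause is a list of non-zero ints (DIMACS style): +i means variable i
    appears as-is, -i means it appears negated.
    """
    matrix = []
    for clause in clauses:
        row = [0] * n_vars
        for literal in clause:
            row[abs(literal) - 1] = 1 if literal > 0 else -1
        matrix.append(row)
    return matrix


def clause_violation(row, s):
    """Assumed clause function K_m(s) = 2^-k * prod_i (1 - c_mi * s_i).

    For s in {-1, +1}^N it is 0 if the clause is satisfied and 1 if every
    literal in the clause is false.
    """
    k = sum(1 for c in row if c != 0)
    value = 1.0
    for c, s_i in zip(row, s):
        if c != 0:
            value *= 1.0 - c * s_i
    return value / (2 ** k)


def energy(matrix, s):
    """Sum of squared clause violations; zero only at satisfying assignments."""
    return sum(clause_violation(row, s) ** 2 for row in matrix)


# Hypothetical 3-SAT instance over three variables.
clauses = [[1, 2, -3], [-1, 2, 3], [1, -2, 3], [-1, -2, -3]]
C = clause_matrix(clauses, 3)

# Enumerate Boolean corners (+1 = true, -1 = false) with zero energy.
solutions = [s for s in product((-1, 1), repeat=3) if energy(C, s) == 0.0]
print(solutions)
```

A continuous-time solver would evolve real-valued `s` so that this kind of energy decreases; the enumeration here only illustrates how the instance fixes the weights and which corners count as solutions.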
Pub Date: 2012-10-18 DOI: 10.1109/CNNA.2012.6331451
E. László, P. Szolgay, Z. Nagy
The CNN (Cellular Neural Network) is a powerful image processing architecture whose hardware implementation is extremely fast. When such a hardware device is not available during development, an efficient simulator implementation can substitute for it. Commercially available graphics cards with high computing capabilities make such a simulator feasible. The aim of this work is to present a GPU-based implementation of a CNN simulator using nVidia's Fermi architecture. Different implementation approaches are considered and compared with a multi-core, multi-threaded CPU implementation and some earlier GPU implementations. A detailed analysis of the introduced GPU implementation is presented.
{"title":"Analysis of a GPU based CNN implementation","authors":"E. László, P. Szolgay, Z. Nagy","doi":"10.1109/CNNA.2012.6331451","DOIUrl":"https://doi.org/10.1109/CNNA.2012.6331451","journal":{"name":"2012 13th International Workshop on Cellular Nanoscale Networks and their Applications"},"publicationDate":"2012-10-18"}
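As background for what a CNN simulator computes, here is a minimal CPU sketch of one forward-Euler step of the standard cellular neural network state equation dx/dt = -x + A*y + B*u + z with 3×3 templates. It is not the implementation analysed in the paper: the zero boundary condition, the step size and the example template are assumptions chosen for illustration. Each cell's update depends only on its 3×3 neighbourhood from the previous step, which is what makes a one-thread-per-cell GPU mapping natural.

```python
def output(x):
    """Standard piecewise-linear CNN output: y = 0.5 * (|x + 1| - |x - 1|)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def cnn_step(x, u, A, B, z, dt):
    """One forward-Euler step of dx/dt = -x + A*y + B*u + z on a 2-D grid.

    A and B are 3x3 templates. Cells outside the grid contribute zero
    (an assumed boundary condition).
    """
    rows, cols = len(x), len(x[0])
    y = [[output(v) for v in row] for row in x]
    new_x = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = -x[r][c] + z
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += A[dr + 1][dc + 1] * y[rr][cc]
                        acc += B[dr + 1][dc + 1] * u[rr][cc]
            new_x[r][c] = x[r][c] + dt * acc
    return new_x


# Hypothetical templates: self-feedback of 2, no neighbour or input coupling.
A = [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
B = [[0.0] * 3 for _ in range(3)]
x = [[0.5, -0.5], [0.1, -0.1]]
u = [[0.0, 0.0], [0.0, 0.0]]

for _ in range(400):
    x = cnn_step(x, u, A, B, z=0.0, dt=0.05)

print([[round(v, 3) for v in row] for row in x])
```

With this template every cell is bistable: positive initial states settle near +2 and negative ones near -2, so the output saturates at +1 or -1.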
Pub Date: 2012-10-18 DOI: 10.1109/CNNA.2012.6331457
P. Foldesy, Á. Zarándy
This paper describes a 90 nm CMOS sub-THz detector array ASIC. The sub-THz detector array is an integrated system composed of silicon field-effect plasma wave sensors, various integrated antennas, pre-amplifiers, ADCs, and a digital-domain lock-in amplifier detector. The peak responsivity is found to be 185 kV/W at 365 GHz and 52 kV/W at 470 GHz, and at the detectivity maximum the NEP is ~20 pW/√Hz.
{"title":"Integrated CMOS sub-THz imager array","authors":"P. Foldesy, Á. Zarándy","doi":"10.1109/CNNA.2012.6331457","DOIUrl":"https://doi.org/10.1109/CNNA.2012.6331457","journal":{"name":"2012 13th International Workshop on Cellular Nanoscale Networks and their Applications"},"publicationDate":"2012-10-18"}
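The abstract of this record mentions a digital-domain lock-in amplifier detector without describing it. As a generic illustration of that technique (not the ASIC's actual signal chain), the sketch below recovers the amplitude of a modulated component by multiplying the sampled signal with in-phase and quadrature references and averaging. The sample rate, modulation frequency and test signal are hypothetical.

```python
import math


def lock_in_amplitude(samples, ref_freq_hz, sample_rate_hz):
    """Estimate the amplitude of the component of `samples` at ref_freq_hz.

    Multiplies by cosine and sine references and averages, which is the core
    operation of a digital lock-in detector.
    """
    n = len(samples)
    i_sum = 0.0
    q_sum = 0.0
    for k, s in enumerate(samples):
        phase = 2.0 * math.pi * ref_freq_hz * k / sample_rate_hz
        i_sum += s * math.cos(phase)
        q_sum += s * math.sin(phase)
    return 2.0 * math.hypot(i_sum / n, q_sum / n)


# Hypothetical test signal: a 1 kHz component of amplitude 0.25 with an
# arbitrary phase, buried under a DC offset and a larger 3 kHz interferer.
fs = 48_000.0
n_samples = 4800  # exactly 100 periods of the 1 kHz reference
signal = [
    0.7
    + 0.25 * math.sin(2.0 * math.pi * 1000.0 * k / fs + 0.4)
    + 0.5 * math.sin(2.0 * math.pi * 3000.0 * k / fs)
    for k in range(n_samples)
]

amplitude = lock_in_amplitude(signal, 1000.0, fs)
print(round(amplitude, 6))
```

Averaging over a whole number of reference periods makes the offset and the interferer cancel, so the estimate depends only on the component at the reference frequency and not on its phase.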
Pub Date: 2012-10-18 DOI: 10.1109/CNNA.2012.6331436
M. G. Ahmed, Kyoungrok Cho, Tae-Won Cho
In 2008, the fourth passive element, the memristor, was implemented as a device having both passivity and nonvolatile properties, opening the way to new possibilities in the design and fabrication of innovative memory, arithmetic and logic architectures. The nanoscale features and ionic transport mechanism inherent in the memristor device introduce new challenges in modeling, characterization and, in particular, the related circuit simulation needs with system constructs. In this paper, we therefore analyze the memristor device at a fundamental level to characterize the memristance, paying particular attention to the hidden memcapacitance effect. Our proposed macro-model takes into account some non-ideal effects, such as tunneling current and the hidden memcapacitor formed across non-conducting materials. The model provides insight for building a device as either a memristive or a memcapacitive system. The simulation results have been compared with data published by HP and show good agreement.
{"title":"Memristance and memcapacitance modeling of thin film devices showing memristive behavior","authors":"M. G. Ahmed, Kyoungrok Cho, Tae-Won Cho","doi":"10.1109/CNNA.2012.6331436","DOIUrl":"https://doi.org/10.1109/CNNA.2012.6331436","journal":{"name":"2012 13th International Workshop on Cellular Nanoscale Networks and their Applications"},"publicationDate":"2012-10-18"}
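For orientation, the sketch below simulates the idealised linear ion-drift memristor model associated with the 2008 HP device, which is the usual baseline that refined models extend. It is not the macro-model proposed in this paper: it contains no tunneling current and no memcapacitance. The parameter values are illustrative assumptions.

```python
import math

R_ON = 100.0       # ohms, resistance when fully doped (illustrative)
R_OFF = 16_000.0   # ohms, resistance when undoped (illustrative)
D = 10e-9          # m, film thickness (illustrative)
MU_V = 1e-14       # m^2 / (V s), dopant mobility (illustrative)


def memristance(x):
    """Doped and undoped regions in series; x = w / D lies in [0, 1]."""
    return R_ON * x + R_OFF * (1.0 - x)


def simulate(v_amp, freq_hz, x0=0.1, steps_per_cycle=2000, cycles=1):
    """Forward-Euler integration of dx/dt = MU_V * R_ON / D^2 * i(t)."""
    dt = 1.0 / (freq_hz * steps_per_cycle)
    x = x0
    trace = []
    for n in range(steps_per_cycle * cycles):
        v = v_amp * math.sin(2.0 * math.pi * freq_hz * n * dt)
        i = v / memristance(x)
        x += dt * MU_V * R_ON / (D * D) * i
        x = min(1.0, max(0.0, x))  # keep the boundary inside the film
        trace.append((v, i, x))
    return trace


trace = simulate(v_amp=1.0, freq_hz=1.0)
peak_x = max(x for _, _, x in trace)
final_x = trace[-1][2]
print(f"peak state {peak_x:.3f}, state after one cycle {final_x:.3f}")
```

Plotting current against voltage from `trace` gives the pinched hysteresis loop typical of memristive devices: the current is zero whenever the voltage is zero, and the state returns close to its starting value after a full sinusoidal cycle.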
Pub Date: 2012-10-18 DOI: 10.1109/CNNA.2012.6331475
Á. Zarándy, T. Zsedrovits, Zoltán Nagy, A. Kiss, P. Szolgay, T. Roska
An embedded sensor-processor system is being developed for on-board UAV (Unmanned Aerial Vehicle) safety applications. The role of the device is to detect intruder airplanes that are on or close to a collision course. Given the weight, power, size, and cost requirements, only the visual approach leads to a feasible solution. In our design, 5 cameras collect visual data from a large field of view. The image flows are processed by 3 different virtual cellular processor arrays, which are implemented in an FPGA.
{"title":"Cellular processor array based UAV safety system","authors":"Á. Zarándy, T. Zsedrovits, Zoltán Nagy, A. Kiss, P. Szolgay, T. Roska","doi":"10.1109/CNNA.2012.6331475","DOIUrl":"https://doi.org/10.1109/CNNA.2012.6331475","journal":{"name":"2012 13th International Workshop on Cellular Nanoscale Networks and their Applications"},"publicationDate":"2012-10-18"}
Pub Date: 2012-10-18 DOI: 10.1109/CNNA.2012.6331448
T. Fulop, Á. Zarándy
The retina-inspired approaching-object detection algorithm, based on the recently identified Pvlab-5 ganglion cell, is a computationally inexpensive, segmentation-free method. The original method can detect only dark looming objects against a bright background. This paper presents a modified algorithm that can detect any looming or receding object against a dark or bright background. Moreover, we show a post-processing evaluation method that can measure the lateral motion direction using the spatio-temporal activities of the ganglion cells without introducing any expensive computation.
{"title":"Bio-inspired looming direction detection method","authors":"T. Fulop, Á. Zarándy","doi":"10.1109/CNNA.2012.6331448","DOIUrl":"https://doi.org/10.1109/CNNA.2012.6331448","journal":{"name":"2012 13th International Workshop on Cellular Nanoscale Networks and their Applications"},"publicationDate":"2012-10-18"}
Pub Date: 2012-10-18 DOI: 10.1109/CNNA.2012.6331474
G. Csaba, M. Pufall, D. Nikonov, G. Bourianoff, A. Horváth, T. Roska, W. Porod
We present physics-based models for both individual and coupled spin torque nano oscillators (STNOs). Such STNOs may serve as building blocks for CNN-like dynamic computing architectures. We discuss a hierarchy of models, extending from micromagnetic models, which include the detailed geometry and physics, to compact models, which are based on parameters extracted from the underlying physical description. These simulations also include coupling between individual STNOs, both via spin waves and via electrical interconnects. Using this modeling approach, we demonstrate frequency entrainment and phase synchronization between STNOs in the array, which enable computing functions.
{"title":"Spin torque oscillator models for applications in associative memories","authors":"G. Csaba, M. Pufall, D. Nikonov, G. Bourianoff, A. Horváth, T. Roska, W. Porod","doi":"10.1109/CNNA.2012.6331474","DOIUrl":"https://doi.org/10.1109/CNNA.2012.6331474","journal":{"name":"2012 13th International Workshop on Cellular Nanoscale Networks and their Applications"},"publicationDate":"2012-10-18"}
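Frequency entrainment and phase synchronization, as mentioned in the abstract of this record, can be illustrated with a far more abstract description than the micromagnetic or compact models of the paper. The sketch below assumes a Kuramoto-type phase model of two sinusoidally coupled oscillators; the frequencies and coupling strengths are hypothetical and carry no physical units from the paper.

```python
import math


def final_phase_difference(omega1, omega2, coupling, steps=40_000, dt=0.001):
    """Integrate two coupled phase oscillators with forward Euler.

    d(theta1)/dt = omega1 + coupling * sin(theta2 - theta1)
    d(theta2)/dt = omega2 + coupling * sin(theta1 - theta2)
    Returns theta1 - theta2 at the end of the run.
    """
    theta1, theta2 = 0.0, 0.0
    for _ in range(steps):
        d1 = omega1 + coupling * math.sin(theta2 - theta1)
        d2 = omega2 + coupling * math.sin(theta1 - theta2)
        theta1 += dt * d1
        theta2 += dt * d2
    return theta1 - theta2


# Strong coupling: the detuning (1.0) is below 2 * coupling, so the pair locks.
locked = final_phase_difference(omega1=11.0, omega2=10.0, coupling=1.0)

# Weak coupling: the detuning exceeds 2 * coupling, so the phases keep drifting.
drifting = final_phase_difference(omega1=11.0, omega2=10.0, coupling=0.2)

print(f"locked phase difference: {locked:.4f} rad")
print(f"drifting phase difference: {drifting:.1f} rad")
```

In this model the phase difference obeys d(phi)/dt = (omega1 - omega2) - 2 * coupling * sin(phi), so locking occurs when the detuning is at most twice the coupling, and the locked phase difference is asin(detuning / (2 * coupling)).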
Pub Date: 2012-10-18 DOI: 10.1109/CNNA.2012.6331456
N. Neufeld, X. Vilasís-Cardona
High energy physics particle detectors are large and complex devices with very demanding requirements in terms of signal-to-noise ratios, processing times and data throughput. The first stages of data acquisition are hardware based, while the last ones rely mostly on software. Among the solutions to the problems posed by these requirements are multi-core processors and possibly GPUs. We review the points at which these techniques could be of use and the proposals made so far.
{"title":"Many-core processors and GPU opportunities in particle detectors","authors":"N. Neufeld, X. Vilasís-Cardona","doi":"10.1109/CNNA.2012.6331456","DOIUrl":"https://doi.org/10.1109/CNNA.2012.6331456","journal":{"name":"2012 13th International Workshop on Cellular Nanoscale Networks and their Applications"},"publicationDate":"2012-10-18"}
Pub Date: 2012-08-01 DOI: 10.1109/CNNA.2012.6331449
A. Paasio, H. Ansio
Vision chips are natural candidates to be among the first application areas able to exploit the emerging 3D integration possibilities. Some 2D vision chip architectures contain pixel-level AD and/or DA converters that are used for various purposes. This article covers the challenges and needs when targeting a megapixel architecture within a 1 cm² chip area. Through-Silicon Vias (TSVs) on the one hand enable 3D integration, but on the other hand pose strict challenges for the design. TSVs occupy a certain area, and in an area-restricted design the number of TSVs should be minimized. The associated Keep-Out Zone (KOZ) of each TSV should also be taken into account.
{"title":"On challenges for implementing pixelwise DA converter in 3D","authors":"A. Paasio, H. Ansio","doi":"10.1109/CNNA.2012.6331449","DOIUrl":"https://doi.org/10.1109/CNNA.2012.6331449","journal":{"name":"2012 13th International Workshop on Cellular Nanoscale Networks and their Applications"},"publicationDate":"2012-08-01"}
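The area constraint described in the abstract of this record can be made concrete with a little arithmetic. The sketch below derives the pixel pitch implied by a megapixel array on 1 cm², assuming the whole chip area is available to the array, and then estimates what fraction of a pixel a single TSV with its keep-out zone would occupy. The TSV diameter and KOZ width are hypothetical values chosen only for illustration; the abstract does not state them.

```python
import math

CHIP_AREA_UM2 = 1e8     # 1 cm^2 expressed in square micrometres
PIXELS = 1_000_000      # megapixel array

# Assumption: the entire chip area is divided evenly among the pixels.
pixel_area_um2 = CHIP_AREA_UM2 / PIXELS
pixel_pitch_um = math.sqrt(pixel_area_um2)


def tsv_footprint_um2(diameter_um, koz_um):
    """Square footprint of one TSV including its keep-out zone on every side."""
    side = diameter_um + 2.0 * koz_um
    return side * side


# Hypothetical TSV geometry (not taken from the paper).
tsv_area_um2 = tsv_footprint_um2(diameter_um=2.0, koz_um=1.0)
fraction = tsv_area_um2 / pixel_area_um2

print(f"pixel pitch: {pixel_pitch_um:.1f} um, pixel area: {pixel_area_um2:.0f} um^2")
print(f"one TSV with KOZ uses {fraction:.0%} of a pixel")
```

Under these assumptions each pixel has only 100 µm² available, so even one small TSV per pixel takes a noticeable share of the budget, which is consistent with the abstract's point that the number of TSVs should be minimized.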