Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.66
Mark Hamilton, W. Marnane
This paper outlines an FPGA implementation of an elliptic curve processor that utilises the GLV method. The GLV method has been shown to speed up computationally expensive point multiplication operations. We also present an implementation of a Hiasat multiplier, which can be used with special moduli to further speed up point multiplication. The Hiasat multiplier takes advantage of fast reduction techniques that apply to Mersenne primes. The results are then compared with standard multiplication algorithms.
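The fast reduction mentioned above can be illustrated in software. The following is a minimal sketch, not the paper's Hiasat multiplier or its hardware datapath: it only shows the general property that, for a modulus p = 2^k - 1, reduction needs shifts and additions rather than division. The function names `mersenne_reduce` and `modmul_mersenne` are hypothetical.

```python
def mersenne_reduce(x: int, k: int) -> int:
    """Reduce a non-negative integer x modulo p = 2**k - 1 without division."""
    p = (1 << k) - 1
    # Since 2**k is congruent to 1 modulo p, the bits above position k
    # can be folded back onto the low k bits with a single addition.
    while x >> k:
        x = (x & p) + (x >> k)
    # After folding, x lies in [0, p]; p itself is congruent to 0.
    return 0 if x == p else x


def modmul_mersenne(a: int, b: int, k: int) -> int:
    """Multiply a and b, then reduce the product modulo 2**k - 1."""
    return mersenne_reduce(a * b, k)


if __name__ == "__main__":
    k = 127  # 2**127 - 1 is a Mersenne prime
    a = 0x1234567890ABCDEF1234567890ABCDEF
    b = 0x0FEDCBA0987654321FEDCBA098765432
    print(hex(modmul_mersenne(a, b, k)))
```

In hardware the fold corresponds to an adder with end-around carry, which is why such moduli are attractive for field multipliers.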
{"title":"FPGA Implementation of an Elliptic Curve Processor Using the GLV Method","authors":"Mark Hamilton, W. Marnane","doi":"10.1109/ReConFig.2009.66","DOIUrl":"https://doi.org/10.1109/ReConFig.2009.66","url":null,"abstract":"This paper outlines a FPGA implementation of an elliptic curve processor that utilises the GLV method. The GLV method has been shown to be able to speed up computationally expensive point multiplication operations. We also present an implementation of a Hiasat multiplier which can be used with special moduli to further speed up point multiplications. The Hiasat multiplier takes advantage of fast reduction techniques that can be applied to Mersenne primes. The results are then compared with standard multiplication algorithms.","PeriodicalId":325631,"journal":{"name":"2009 International Conference on Reconfigurable Computing and FPGAs","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129117605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
E. Cabal-Yépez, R. Osornio-Ríos, R. Romero-Troncoso, J. R. Razo-Hernandez, R. Lopez-Garcia
Online monitoring of rotary machines, such as induction motors, can effectively diagnose electrical and mechanical faults. Most recurrent faults in rotary machines originate in their components: bearings, stator, rotor and others. Different methodologies based on current and vibration monitoring, using FFT and wavelet analysis, have been proposed for preventive monitoring of induction motors. The result is countless techniques for diagnosing specific faults, which creates the need for a generalized technique that allows multiple-fault detection. This work presents a novel methodology, and its FPGA implementation, for online detection of multiple faults. It analyzes the current and vibration signals of an induction motor by combining FFT and wavelet processing during the startup transient and the steady state, and precisely identifies faults such as misalignment, unbalance, outer-race bearing defects and broken bars. The results obtained with the proposed methodology show its effectiveness in providing a precise diagnosis of the induction motor condition.
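To make the idea of combining the two transforms concrete, here is a minimal sketch that extracts one spectral feature (the dominant DFT bin) and one wavelet feature (the energy of the first-level Haar detail coefficients) from the same signal. It is an illustration only: the signal is synthetic, the choice of the Haar wavelet and the function names are assumptions, and the paper's actual processing chain and fault classification are not reproduced.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of the first half of a naive DFT (steady-state view)."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))) / n
        for k in range(n // 2)
    ]


def haar_step(signal):
    """One level of the Haar wavelet transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / s for i in pairs]
    return approx, detail


def features(signal):
    """Return (dominant non-DC frequency bin, Haar detail energy)."""
    mags = dft_magnitudes(signal)
    peak_bin = max(range(1, len(mags)), key=mags.__getitem__)
    _, detail = haar_step(signal)
    return peak_bin, sum(d * d for d in detail)


if __name__ == "__main__":
    n = 64
    clean = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
    spiked = list(clean)
    spiked[20] += 2.0  # hypothetical transient, e.g. a mechanical impact
    print(features(clean))
    print(features(spiked))
```

The spectral feature stays the same for both signals, while the detail energy rises for the one with the transient, which is the kind of complementary information that motivates fusing the two analyses.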
{"title":"FPGA-Based Online Induction Motor Multiple-Fault Detection with Fused FFT and Wavelet Analysis","authors":"E. Cabal-Yépez, R. Osornio-Ríos, R. Romero-Troncoso, J. R. Razo-Hernandez, R. Lopez-Garcia","doi":"10.1109/RECONFIG.2009.9","DOIUrl":"https://doi.org/10.1109/RECONFIG.2009.9","url":null,"abstract":"Online monitoring of rotary machines, like induction motors, can effectively diagnosis electrical and mechanical faults. The origin of most recurrent faults in rotary machines is in the components: bearings, stator, rotor and others. Different methodologies based on current and vibration monitoring have been proposed using FFT and wavelet analysis for preventive monitoring of induction motors resulting in countless techniques for diagnosing specific faults, arising the necessity for a generalized technique that allows multiple fault detection. This work presents a novel methodology and its FPGA implementation for multiple fault online detection analyzing the current and vibration signals of an induction motor combining FFT and wavelet processing during its startup transient and steady state, precisely performing the identification of different faults like misalignment, unbalance, outer-race bearing defects and broken bars. The results obtained using the proposed methodology show its effectiveness providing a precise diagnosis of the induction motor condition.","PeriodicalId":325631,"journal":{"name":"2009 International Conference on Reconfigurable Computing and FPGAs","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128240954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.50
S. Bhasin, J. Danger, Florent Flament, T. Graba, S. Guilley, Y. Mathieu, Maxime Nassar, L. Sauvage, Nidhal Selmane
The main challenge when implementing cryptographic algorithms in hardware is to protect them against attacks that directly target the device. Two strategies are customarily employed by malevolent adversaries: observation and differential perturbation attacks, also called SCA and DFA in the abundant scientific literature on this topic. Numerous research efforts have been carried out to defeat either SCA or DFA. However, few publications deal with concomitant protection against both threats. The current consensus is to devise algorithmic countermeasures to DFA and subsequently to synthesize the DFA-protected design with a DPA-resistant CAD flow. In this article, we argue that this approach is the best in terms of neither performance nor relevance. Notably, the contribution of this paper is to demonstrate that the strongest SCA countermeasure known so far, namely dual-rail with precharge logic styles that do not evaluate early, surprisingly happens to be almost natively immune to most DFAs. Therefore, unexpected two-in-one solutions against SCA and DFA do exist and deserve closer attention, because they combine simplicity with efficiency. In particular, we illustrate a logic style, called WDDL without early evaluation (WDDL w/o EE), and a design flow that realizes in practice one possible combined DPA and DFA countermeasure especially suited for reconfigurable hardware.
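One way to picture why dual-rail logic can expose faults is a bit-level model of plain WDDL gates. The sketch below is an illustrative assumption, not the WDDL w/o EE style or the design flow of the paper: each signal is a (true, false) rail pair, (0, 0) is the precharge value, and only (0, 1) and (1, 0) are valid data. A fault that forces a pair into (0, 0) or (1, 1) during evaluation produces a value that is recognizably invalid instead of a wrong but plausible bit. All names are hypothetical.

```python
PRECHARGE = (0, 0)  # both rails low between evaluations


def encode(bit):
    """Map a logical bit onto a (true_rail, false_rail) pair."""
    return (bit, 1 - bit)


def is_valid(pair):
    """Only complementary pairs carry data."""
    return pair in ((0, 1), (1, 0))


def wddl_and(a, b):
    """WDDL AND: AND on the true rails, OR on the false rails."""
    (a_t, a_f), (b_t, b_f) = a, b
    return (a_t & b_t, a_f | b_f)


def wddl_or(a, b):
    """WDDL OR: OR on the true rails, AND on the false rails."""
    (a_t, a_f), (b_t, b_f) = a, b
    return (a_t | b_t, a_f & b_f)


if __name__ == "__main__":
    one = encode(1)
    # Normal evaluation yields a valid complementary pair.
    print(wddl_and(one, encode(0)), wddl_or(one, encode(0)))
    # Precharge propagates through the gate.
    print(wddl_and(PRECHARGE, PRECHARGE))
    # A hypothetical fault forcing both rails of one input low or high
    # shows up at the output as an invalid pair.
    print(wddl_and((0, 0), one), wddl_and((1, 1), one))
```

In this simplified model the invalid pair survives the gate only for some input combinations; for example `wddl_and((0, 0), encode(0))` still returns a valid `(0, 1)`.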
{"title":"Combined SCA and DFA Countermeasures Integrable in a FPGA Design Flow","authors":"S. Bhasin, J. Danger, Florent Flament, T. Graba, S. Guilley, Y. Mathieu, Maxime Nassar, L. Sauvage, Nidhal Selmane","doi":"10.1109/ReConFig.2009.50","DOIUrl":"https://doi.org/10.1109/ReConFig.2009.50","url":null,"abstract":"The main challenge when implementing cryptographic algorithms in hardware is to protect them against attacks that target directly the device. Two strategies are customarily employed by malevolent adversaries: observation and differential perturbation attacks, also called SCA and DFA in the abundant scientific literature on this topic. Numerous research efforts have been carried out to defeat respectively SCA or DFA. However, few publications deal with concomitant protection against both threats. The current consensus is to devise algorithmic countermeasures to DFA and subsequently to synthesize the DFA-protected design thanks to a DPA-resistant CAD flow. In this article, we put to the fore that this approach is the best neither in terms of performance nor of relevance. Notably, the contribution of this paper is to demonstrate that the strongest SCA countermeasure known so far, namely the dual-rail with precharge logic styles that do not evaluate early, happen surprisingly to be almost natively immune to most DFAs. Therefore, unexpected two-in-one solutions against SCA and DFA indeed exist and deserve a closer attention, because they ally simplicity with efficiency. In particular, we illustrate a logic style, called WDDL without early evaluation (WDDL w/o EE), and a design flow that realizes in practice one possible combined DPA and DFA counter-measure especially suited for reconfigurable hardware.","PeriodicalId":325631,"journal":{"name":"2009 International Conference on Reconfigurable Computing and FPGAs","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128133627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.47
H. Pham, S. Pillement, D. Demigny
Parallel computing is an important trend in embedded systems. One possible response to increasing requirements in computational power is to distribute tasks over several processors and let these processors operate in parallel. Soft-core processors and FPGAs require low Non-Recurring Engineering costs to develop such multi-processor systems. Furthermore, certain FPGAs allow dynamic partial run-time reconfiguration, but their high sensitivity to electronic defects can cause the system to malfunction. This paper presents a fault-tolerant multi-processor system-on-chip based on the dynamic reconfiguration of the entire platform. It also presents a modification of the standard methodology for run-time self-reconfiguration, which facilitates complex modular design.
{"title":"A Fault-Tolerant Layer for Dynamically Reconfigurable Multi-processor System-on-Chip","authors":"H. Pham, S. Pillement, D. Demigny","doi":"10.1109/ReConFig.2009.47","DOIUrl":"https://doi.org/10.1109/ReConFig.2009.47","url":null,"abstract":"Parallel computing is an important trend of embedded system. One possible response to increasing requirements in computational power is to distribute tasks over various processors and let these processors operate in parallel. Soft-core processors and FPGAs require low Non-Recurring Engineering costs to develop such multi-processors systems. Furthermore, certain FPGAs allow dynamic partial run-time reconfiguration, but their high sensitivity to electronic defects can cause the system disfunction. This paper presents a fault-tolerant multi-processor system-on-chip based on the dynamic reconfiguration of the entire platform. Also, a modification of the standard methodology of the runtime self-reconfiguration, who facilitates the complex modular concept design, is presented in this paper.","PeriodicalId":325631,"journal":{"name":"2009 International Conference on Reconfigurable Computing and FPGAs","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116738340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.23
M. Nüssle, Benjamin Geib, H. Fröning, U. Brüning
An FPGA-based prototype of custom high-performance network hardware has been implemented, integrating both a switch and a network interface in one FPGA. The network interfaces to the host processor over HyperTransport. About 85% of the slices of a Virtex IV FX100 FPGA are occupied and 10 individual clock domains are used. Six of the MGT blocks of the device implement high-speed links to other nodes. Together with the integrated switch, it is thus possible to build topologies with a node degree of up to 6, e.g. a 3D torus or a 6D hypercube. The target clock rate is 156 MHz with the links running at 6.24 Gbit/s, and 200 MHz for the HyperTransport core. This goal was reached with a 32-bit wide data path in the network switch and link blocks. The integrated switch reaches an aggregate bandwidth of more than 45 Gbit/s. The resulting interconnection network features a very low latency, measured between nodes and including switching, of close to 1 µs.
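The node-degree claim and the link arithmetic can be checked with a short sketch. The neighbor functions below are generic topology helpers, not part of the described hardware, and the torus size of 4 per dimension is an arbitrary choice. The 8b/10b line coding in `payload_rate` is an assumption that the abstract does not state; it is included only because it makes the 6.24 Gbit/s line rate agree with a 32-bit data path at 156 MHz.

```python
def torus_neighbors(node, dims):
    """Neighbors of a node in a torus: one step up and down along each axis."""
    out = set()
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            n = list(node)
            n[axis] = (n[axis] + step) % size  # wrap-around link
            out.add(tuple(n))
    out.discard(node)
    return out


def hypercube_neighbors(node, dimensions):
    """Neighbors of a node in a hypercube: flip one address bit at a time."""
    return {node ^ (1 << bit) for bit in range(dimensions)}


def payload_rate(line_rate_bps):
    """Payload bits per second, ASSUMING 8b/10b coding (8 data bits per 10 line bits)."""
    return line_rate_bps * 8 // 10


if __name__ == "__main__":
    print(len(torus_neighbors((0, 0, 0), (4, 4, 4))))  # six links per node
    print(len(hypercube_neighbors(0, 6)))              # six links per node
    print(payload_rate(6_240_000_000), 156_000_000 * 32)
```

Both topologies use exactly the six links per node that the six MGT blocks provide.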
{"title":"An FPGA-Based Custom High Performance Interconnection Network","authors":"M. Nüssle, Benjamin Geib, H. Fröning, U. Brüning","doi":"10.1109/ReConFig.2009.23","DOIUrl":"https://doi.org/10.1109/ReConFig.2009.23","url":null,"abstract":"An FPGA-based prototype of a custom high-performance network hardware has been implemented, integrating both a switch and a network interface in one FPGA. The network interfaces to the host processor over HyperTransport. About 85% of the slices of a Virtex IV FX100 FPGA are occupied and 10 individual clock domains are used. Six of the MGT-blocks of the device implement high-speed links to other nodes. Together with the integrated switch it is thus possible to build topologies with a node degree of up to 6, i.e. a 3D-torus or a 6D Hypercube. The target clock rate is 156 MHz with the links running at 6.24 Gbit/s and 200 MHz for the HyperTransport Core. This goal was reached with a 32-bit wide data path in the network-switch and link blocks. The integrated switch reaches an aggregate bandwidth of more than 45 Gbit/s. The resulting interconnection network features a very low latency – between nodes and including switching - close to 1 µs.","PeriodicalId":325631,"journal":{"name":"2009 International Conference on Reconfigurable Computing and FPGAs","volume":"08 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116806735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2009-12-09 DOI: 10.1109/RECONFIG.2009.26
Taciano A. Rodolfo, Ney Laert Vilar Calazans, F. Moraes
Although the use of floating point hardware in FPGAs has long been considered unfeasible or relegated to expensive devices and platforms, this is no longer the case. This paper describes fully-fledged implementations of single-precision floating point units for an implementation of the MIPS processor architecture. These coprocessors take as little as 6% of a medium-sized FPGA, while the processor CPU may take only 2% of the same device. The design space exploration process described here weighs area and performance metrics and considers variations in the choice of synthesis tool, floating point unit generation method and architectural issues such as clocking schemes. The experiments conducted show reductions of up to 22 times in clock cycle count for typical floating point application modules, compared to software-emulated floating point processing.
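To give a feel for why software emulation costs many cycles, the sketch below multiplies two IEEE 754 single-precision values using integer operations only. It is a simplified illustration, not the emulation library or the coprocessor evaluated in the paper: it assumes normal, finite, non-zero operands, truncates instead of rounding, and does not handle zeros, subnormals, infinities, NaNs or exponent overflow. The function names are hypothetical.

```python
import struct


def decode_single(x):
    """Split a value into IEEE 754 single-precision (sign, exponent, fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def soft_multiply(a, b):
    """Integer-only multiply for normal, finite, non-zero inputs (truncating)."""
    sa, ea, fa = decode_single(a)
    sb, eb, fb = decode_single(b)
    sign = sa ^ sb
    exponent = ea + eb - 127                     # remove one copy of the bias
    mant = ((1 << 23) | fa) * ((1 << 23) | fb)   # 24-bit x 24-bit product
    if mant >> 47:                               # product in [2, 4): renormalize
        mant >>= 24
        exponent += 1
    else:                                        # product in [1, 2)
        mant >>= 23
    bits = (sign << 31) | (exponent << 23) | (mant & 0x7FFFFF)
    return struct.unpack(">f", struct.pack(">I", bits))[0]


if __name__ == "__main__":
    print(soft_multiply(1.5, 2.5))
    print(soft_multiply(-3.0, 0.5))
```

Each step here (unpacking, exponent arithmetic, wide multiply, renormalization, repacking) becomes several integer instructions on a processor without a floating point unit, whereas a hardware unit performs them in a dedicated datapath.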
{"title":"Floating Point Hardware for Embedded Processors in FPGAs: Design Space Exploration for Performance and Area","authors":"Taciano A. Rodolfo, Ney Laert Vilar Calazans, F. Moraes","doi":"10.1109/RECONFIG.2009.26","DOIUrl":"https://doi.org/10.1109/RECONFIG.2009.26","url":null,"abstract":"Although the use of floating point hardware in FPGAs has long been considered unfeasible or relegated to use only in expensive devices and platforms, this is no longer the case. This paper describes fully-fledged implementations of single-precision floating point units for a MIPS processor architecture implementation. These coprocessors take as little room as 6% of a medium-sized FPGA, while the processor CPU may take only 2% of the same device. The space exploration process described here values the area and performance metrics and considers variations on the choice of synthesis tool, floating point unit generation method and architectural issues like clocking schemes. The conducted experiments show reductions of up to 22 times in clock cycles count for typical floating point application modules, compared to the use of software-emulated floating point processing.","PeriodicalId":325631,"journal":{"name":"2009 International Conference on Reconfigurable Computing and FPGAs","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128687135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.78
S. Geninatti, J. Benítez, M. Calviño, Nicolás Guil Mata, Juan Gómez-Luna
Many applications of image analysis need a similarity measure that highlights the presence of shapes or objects. The Generalized Hough Transform (GHT) is a popular image processing technique that is able to locate an object in an image. In its original formulation, this algorithm has a regular computation pattern, but suffers from high computational and memory requirements. More efficient GHT implementations have been developed that save computation and memory at the expense of introducing irregularities in the computation, which makes the design of a specific hardware solution more difficult. This work proposes the use of Field-Programmable Gate Arrays (FPGAs) for the implementation of an efficient version of the GHT. The implementation of the GHT has been divided into several functional blocks. This allows us to take advantage of the progressive reduction of the data flow across the algorithm stages, in order to optimize the use of FPGA resources and clock cycles. We have tested our design by applying the GHT to the similarity calculation of frames in a video sequence.
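The voting principle behind the GHT can be sketched in a few lines. This is a deliberately reduced version, assuming translation only and a single R-table bin: the classical formulation indexes the table by edge gradient direction and can also search over rotation and scale, and the efficient variant implemented in the paper is not reproduced here. The function names and the test shape are hypothetical.

```python
from collections import Counter


def build_r_table(template_points, reference):
    """Store, for every template edge point, the offset to the reference point."""
    rx, ry = reference
    return [(rx - x, ry - y) for x, y in template_points]


def vote(edge_points, r_table):
    """Each image edge point votes for every reference position it could imply."""
    accumulator = Counter()
    for x, y in edge_points:
        for dx, dy in r_table:
            accumulator[(x + dx, y + dy)] += 1
    return accumulator


def locate(edge_points, r_table):
    """Return (position, votes) of the accumulator maximum."""
    return vote(edge_points, r_table).most_common(1)[0]


if __name__ == "__main__":
    template = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]  # small L shape
    r_table = build_r_table(template, reference=(0, 0))
    # The same shape shifted by (5, 7), plus one unrelated edge point.
    image_edges = [(x + 5, y + 7) for x, y in template] + [(20, 3)]
    print(locate(image_edges, r_table))
```

The nested loop shows the cost the abstract refers to: the number of accumulator updates is the product of the edge point count and the table size, and the accumulator itself grows with the search space.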
{"title":"FPGA Implementation of the Generalized Hough Transform","authors":"S. Geninatti, J. Benítez, M. Calviño, Nicolás Guil Mata, Juan Gómez-Luna","doi":"10.1109/ReConFig.2009.78","DOIUrl":"https://doi.org/10.1109/ReConFig.2009.78","url":null,"abstract":"Many applications of image analysis need a similarity measure that highlights the presence of shapes or objects. The Generalized Hough Transform (GHT) is a popular image processing technique that is able to locate an object in an image. In its original formulation, this algorithm has a regular computation pattern, but suffers from high computational and memory requirements. More efficient GHT implementations have been carried out leading to computation and memory saving at the expenses of introducing irregularities in the computation, which make more difficult the design of a specific hardware solution. This work proposes the use of Field-Programmable Gate Arrays (FPGAs) for the implementation of an efficient version of the GHT. The development of the GHT has been divided into several functional blocks. This permits us to take advantage of a progressive reduction of the data flow and the algorithm stages, in order to optimize the use of the FPGA resources and clock cycles. We have tested our design by applying the GHT to the similarity calculation of frames in a video sequence.","PeriodicalId":325631,"journal":{"name":"2009 International Conference on Reconfigurable Computing and FPGAs","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130664122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.58
L. Sauvage, Maxime Nassar, S. Guilley, Florent Flament, J. Danger, Y. Mathieu
Designing a side-channel analysis countermeasure on an FPGA using unmasked dual-rail with precharge logic appears to be a great challenge. Indeed, the robustness of such a solution relies on careful differential placement and routing, whereas neither the FPGA layout nor FPGA EDA tools are developed for such purposes. However, assessing the security level that can be achieved with them is an important issue, as it directly determines whether commercial FPGAs are suitable for this kind of protection instead of proprietary custom FPGAs. In this article, we show experimentally that differential placement and routing of an FPGA implementation can be done with a granularity fine enough to improve the security gain. However, the gain is lower than for ASICs. We expect that an in-depth analysis of the power consumption of routing resources could help bridge the gap.
{"title":"DPL on Stratix II FPGA: What to Expect?","authors":"L. Sauvage, Maxime Nassar, S. Guilley, Florent Flament, J. Danger, Y. Mathieu","doi":"10.1109/ReConFig.2009.58","DOIUrl":"https://doi.org/10.1109/ReConFig.2009.58","url":null,"abstract":"FPGA design of side channel analysis countermeasure using unmasked dual-rail with precharge logic appears to be a great challenge. Indeed, the robustness of such a solution relies on careful differential placement and routing, whereas both FPGA layout and FPGA EDA tools are not developed for such purposes. However, assessing the security level which can be achieved with them is an important issue, as it is directly related to the suitability to use commercial FPGA instead of proprietary custom FPGA for this kind of protection. In this article, we experimentally prove that differential placement and routing of an FPGA implementation can be done with a granularity fine enough to improve the security gain. However, the gain is lower than for ASICs. We expect that an in-depth analysis of routing resources power consumption could help bridge the gap.","PeriodicalId":325631,"journal":{"name":"2009 International Conference on Reconfigurable Computing and FPGAs","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122198367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.13
Lei Wang, Lei Chen, Zhiping Wen, Huabo Sun, Shuo Wang
This paper surveys current manufacturers of radiation-hardened FPGAs and SEU hardening methods for the configurable SRAM (CSRAM) used in FPGAs. A novel high-density single-event upset (SEU) hardened CSRAM for the BQV 300 FPGA is proposed, and a mixed-mode radiation-hardening verification method is used to simulate it. The proposed SEU-hardened CSRAM is SEU immune up to 22.49 MeV·cm²/mg at an ion incidence angle of 0°. Its area is only 12% larger than that of a traditional 6-T SRAM, whereas the area of a DICE cell is 69% larger than that of the proposed CSRAM. Using the proposed CSRAM makes it possible to fabricate the BQV 300 FPGA. Its SEU LETth is much higher than that of the CSRAM in Xilinx's FPGAs.
{"title":"A Novel High-Density Single-Event Upset Hardened Configurable SRAM Applied to FPGA","authors":"Lei Wang, Lei Chen, Zhiping Wen, Huabo Sun, Shuo Wang","doi":"10.1109/ReConFig.2009.13","DOIUrl":"https://doi.org/10.1109/ReConFig.2009.13","url":null,"abstract":"This paper has investigated present radiation hardened FPGA manufacturers and SEU hardened method of configurable SRAM (CSRAM) applied to FPGA. A novel high-density single-event upset hardened CSRAM applied to BQV 300 FPGA is proposed, and this paper uses the mix-mode radiation hardened verification method to simulate the SEU hardened CSRAM. The proposed SEU-hardened CSRAM applied to FPGAs is SEU immune up to 22.49 MeVŸcm2/mg, under the angle for incident ion of 0°. But the area of proposed CSRAM only increases 12% than traditional 6-T SRAM, and the area of DICE will increase 69% than proposed CSRAM. Using the proposed CSRAM makes BQV 300 FPGA able to be fabricated. The SEU LETth is much higher than SEU LETth of CSRAM for Xilinx’s FPGA.","PeriodicalId":325631,"journal":{"name":"2009 International Conference on Reconfigurable Computing and FPGAs","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121654382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2009-12-09 DOI: 10.1109/ReConFig.2009.20
P. Huerta, J. Castillo, C. Pedraza, Javier Cano, J. Martínez
Advances in FPGA technologies allow the design of highly complex systems using on-chip FPGA resources and intellectual property (IP) cores. Furthermore, it is possible to build multiprocessor systems using hard-core or soft-core processors, increasing the range of applications that can be implemented on an FPGA. In this paper we propose a symmetric multiprocessor architecture using the MicroBlaze soft-core processor, together with the operating system support needed to run multithreaded applications. Four systems with different shared memory configurations have been implemented on FPGA and tested with parallel applications to show their performance.
{"title":"Symmetric Multiprocessor Systems on FPGA","authors":"P. Huerta, J. Castillo, C. Pedraza, Javier Cano, J. Martínez","doi":"10.1109/ReConFig.2009.20","DOIUrl":"https://doi.org/10.1109/ReConFig.2009.20","url":null,"abstract":"Advances in FPGA technologies allow designing highly complex systems using on-chip FPGA resources and intellectual property (IP) cores. Furthermore, it is possible to build multiprocessor systems using hard-core or soft-core processors, increasing the range of applications that can be implemented on an FPGA. In this paper we propose a symmetric multiprocessor architecture using the Microblace soft-core processor, and the operating system support needed for running multithreaded applications. Four systems with different shared memory configurations have been implemented on FPGA and tested with parallel applications to show its performance.","PeriodicalId":325631,"journal":{"name":"2009 International Conference on Reconfigurable Computing and FPGAs","volume":"115 51","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120827703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}