Pub Date: 2011-04-13. DOI: 10.1109/SPL.2011.5782619
Jorge Cogo, Javier G. García, P. A. Roncagliolo, C. Muravchik
In this work we present the design of an FPGA-based platform for acquiring and storing signals for SDR applications. The system comprises an embedded RISC processor, an A/D converter, RAM chips and a DMA controller core. The DMA controller was designed from scratch to meet the high data rate and large data volume requirements.
High speed acquisition and storage platform for SDR applications development. 2011 VII Southern Conference on Programmable Logic (SPL), pp. 19-24.
Pub Date: 2011-04-13. DOI: 10.1109/SPL.2011.5782620
Roman Arenas, J. Finochietto, Ramiro R. Lopez, Ulises Morales
Optical Transport Networks (OTN) have emerged as a key enabler for increasing the capacity of current telecommunication infrastructure. ITU-T Recommendation G.709 describes these networks by defining a flexible frame structure capable of carrying different client data signals. Recently, G.709 framer devices have received much attention from the telecommunication industry, as next-generation 10/40/100G transport equipment demands their integration. This paper proposes a simple framer design based on independent modules that implement G.709 layer-specific processes. This architecture enables different integration schemes in which some modules are implemented as ASICs and others in FPGAs. The framer verification is also discussed in the context of testing its functionality in a network scenario. Finally, the implementation of a prototype on an FPGA board is described.
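To make the frame structure mentioned in the abstract concrete, the following is a minimal software sketch of frame alignment, not the authors' framer. It assumes the standard G.709 OTUk layout of 4 rows by 4080 byte columns, with columns 1-16 carrying overhead, columns 17-3824 the payload area and columns 3825-4080 the FEC, and a frame alignment signal of three 0xF6 bytes followed by three 0x28 bytes. The function names and the `confirm` parameter are hypothetical, and scrambling of the remaining frame bytes is ignored.

```python
# Sketch only: locate OTUk frame boundaries in an unscrambled byte stream.
FAS = bytes([0xF6, 0xF6, 0xF6, 0x28, 0x28, 0x28])  # OA1 OA1 OA1 OA2 OA2 OA2
ROWS, COLS = 4, 4080
FRAME_BYTES = ROWS * COLS  # 16320 bytes per frame


def find_frame_start(stream: bytes, confirm: int = 2) -> int:
    """Return the offset of the first FAS that repeats `confirm` more times
    at frame-length spacing, or -1 if no such offset exists."""
    pos = stream.find(FAS)
    while pos != -1:
        if all(
            stream[pos + k * FRAME_BYTES: pos + k * FRAME_BYTES + len(FAS)] == FAS
            for k in range(1, confirm + 1)
        ):
            return pos
        pos = stream.find(FAS, pos + 1)
    return -1


def split_row(row: bytes):
    """Split one 4080-byte row into (overhead, payload area, FEC)."""
    return row[:16], row[16:3824], row[3824:]
```

Requiring the pattern to repeat at frame spacing guards against a payload that happens to contain the same six bytes; a hardware framer would typically implement the same idea as a small state machine.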
Framer design, verification and prototyping for G.709 optical transport networks. 2011 VII Southern Conference on Programmable Logic (SPL), pp. 25-30.
Pub Date: 2011-04-13. DOI: 10.1109/SPL.2011.5782638
A. Bonatto, A. Soares, A. Susin
Embedded consumer electronics such as video processing systems require large storage capacity and high-bandwidth memory access. These systems are also built from heterogeneous processing units, each designed to perform a dedicated task in order to maximize processing power. A single off-chip memory is shared between the processing units to reduce power and save costs. External memory access is the system bottleneck when decoding high-definition video sequences in real time. This paper presents the design and validation of a multichannel DDR2 SDRAM controller for an H.264/AVC video decoder. A four-level memory hierarchy was designed to manage the decoded video at macroblock granularity with low latency. The proposed controller is able to manage memory access when decoding 1080p H.264 video sequences. The architecture was validated and prototyped using a Xilinx Virtex-5 FPGA board.
Multichannel SDRAM controller design for H.264/AVC video decoder. 2011 VII Southern Conference on Programmable Logic (SPL), pp. 137-142.
Pub Date: 2011-04-13. DOI: 10.1109/SPL.2011.5782650
M. Morales-Sandoval, C. Feregrino-Uribe, R. Cumplido, I. Algredo-Badillo
Elliptic Curve Cryptography (ECC) is a kind of cryptography that provides information security services using shorter keys than other known public-key crypto-algorithms without decreasing the security level. This makes ECC a good choice for implementing security services in constrained devices, such as mobile devices. However, the diversity of ECC implementation parameters recommended by international standards has led to interoperability problems among ECC implementations. This work presents the design and implementation results of a novel FPGA coprocessor for ECC that can be reconfigured at run time to support different implementation parameters and, hence, different security levels. Although there are several related works in the literature, to our knowledge this is the first ECC coprocessor that uses a partial reconfiguration methodology to deal with interoperability problems in ECC. A suitable application of the proposed reconfigurable coprocessor is the security protocol IPSec, where the domain parameters for ECC-based cryptographic schemes, such as digital signature or encryption, have to be negotiated and agreed upon by the communication partners at run time.
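As an illustration of what "different implementation parameters" can mean for binary-field ECC, the sketch below shows multiplication in GF(2^m) with the field size and reduction polynomial passed in as parameters. It is a plain software shift-and-add multiplier, not the coprocessor described above; the function name and the `FIELDS` table are hypothetical, and the listed fields are only examples (a toy GF(2^4), the GF(2^8) field used by AES, and the 163-bit NIST binary field polynomial), not necessarily those supported by the authors.

```python
def gf2m_mul(a: int, b: int, poly: int, m: int) -> int:
    """Multiply a and b in GF(2^m).

    Elements are integers whose bits are polynomial coefficients.
    `poly` is the reduction polynomial including its x^m term.
    Both operands must already be reduced (less than 2**m).
    """
    result = 0
    for _ in range(m):
        if b & 1:            # add (XOR) the current multiple of a
            result ^= a
        b >>= 1
        a <<= 1              # multiply a by x
        if (a >> m) & 1:     # degree reached m: reduce modulo poly
            a ^= poly
    return result


# Example field parameters (illustrative assumptions): name -> (m, polynomial)
FIELDS = {
    "toy-4": (4, 0b10011),                                   # x^4 + x + 1
    "aes-8": (8, 0x11B),                                     # x^8 + x^4 + x^3 + x + 1
    "nist-163": (163, (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1),
}
```

For example, `gf2m_mul(0x57, 0x83, *reversed(FIELDS["aes-8"]))` reproduces the multiplication example from the AES specification. Switching fields only changes `m` and `poly`, which is the software analogue of swapping the field arithmetic at run time.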
A reconfigurable GF(2^M) elliptic curve cryptographic coprocessor. 2011 VII Southern Conference on Programmable Logic (SPL), pp. 209-214.
Pub Date: 2011-04-13. DOI: 10.1109/SPL.2011.5782639
N. Lawal, B. Thornberg, M. O’nils
In this paper, we present an approach that uses information about the FPGA architecture to optimize the allocation of embedded memory in real-time video processing systems. A cost function defined in terms of the required memory sizes and the available block-RAM and distributed-RAM resources is used to guide the allocation decision. This work is a high-level exploration that generates VHDL RTL modules and synthesis constraint files to specify memory allocation. Results show that the proposed approach achieves an appreciable reduction in block RAM usage over a previous logic-to-memory mapping approach, at a negligible increase in logic usage.
Architecture driven memory allocation for FPGA based real-time video processing systems. 2011 VII Southern Conference on Programmable Logic (SPL), pp. 143-148.
Pub Date: 2011-04-13. DOI: 10.1109/SPL.2011.5782618
E. Lizarraga, V. Sauchelli, Gabriel N. Maggio
N-Continuous OFDM systems have been proposed to achieve a significant reduction in out-of-band emitted power compared to conventional OFDM. However, system complexity increases and some resource-demanding operations are necessary. This work therefore considers the FPGA implementation of the transmitter and also provides a novel analysis of the influence of the IFFT length on the representation of the continuity condition. Spectral measurements are performed on the model to evaluate its performance.
N-continuous OFDM signal analysis of FPGA-based transmissions. 2011 VII Southern Conference on Programmable Logic (SPL), pp. 13-18.
Pub Date: 2011-04-13. DOI: 10.1109/SPL.2011.5782626
J. Y. Mori, Camilo Sánchez-Ferreira, D. Muñoz, C. Llanos, P. Berger
Both the market and the academic community currently demand image and video processing applications with real-time constraints. Seeking a design alternative that allows the rapid development of real-time image processing systems, this paper proposes a unified hardware architecture for several spatial-domain image filtering algorithms, such as windowing-based operations, implemented on FPGAs (Field Programmable Gate Arrays). To achieve this, six different filters have been implemented in parallel and decomposed into simple hardware structures, allowing the algorithms to exploit their parallelism through a simple systolic architecture. In this system all implemented algorithms run in parallel, and the user selects which output is shown on a display. Both the image processing and the synthesis results demonstrate the feasibility of using FPGAs to implement the proposed filtering algorithms in a fully parallel approach.
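The windowing-based operation that the abstract refers to can be sketched in software as a 3x3 window sliding over the image, with several filters computed from the same input and one output selected. This is an illustrative sketch only: the two kernels, the border handling and all names are assumptions, since the abstract does not list the six filters or the window size used.

```python
def filter3x3(image, kernel, divisor=1):
    """Apply a 3x3 kernel to every interior pixel of a 2D list of ints.

    Border pixels are copied unchanged (an assumption made for brevity).
    The kernel is applied without flipping, i.e. as a correlation, which
    equals convolution for symmetric kernels.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = acc // divisor
    return out


# Hypothetical filter bank: name -> (kernel, divisor)
FILTERS = {
    "mean": ([[1, 1, 1], [1, 1, 1], [1, 1, 1]], 9),
    "sobel_x": ([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], 1),
}


def run_all(image):
    """Compute every filter on the same input, mirroring the idea that all
    filters run side by side and the user picks one output."""
    return {name: filter3x3(image, k, d) for name, (k, d) in FILTERS.items()}
```

In hardware the inner two loops would be unrolled into nine multipliers fed by line buffers, so each filter produces one pixel per clock; the software version only shows the arithmetic.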
An unified approach for convolution-based image filtering on reconfigurable systems. 2011 VII Southern Conference on Programmable Logic (SPL), pp. 63-68.
Pub Date: 2011-04-13. DOI: 10.1109/SPL.2011.5782637
M. Corrêa, M. T. Schoenknecht, L. Agostini
This paper presents a hardware design for H.264/AVC Quarter-Pixel Motion Estimation Refinement, to be used in a complete Fractional Motion Estimation architecture. The architecture was optimized for high throughput through a balanced pipeline and the exploitation of parallelism. The design was described in VHDL and synthesized for an Altera Stratix III FPGA device. It achieves an operating frequency of 245 MHz, processing up to 39 QHDTV frames (3840×2048 pixels) per second. The architecture can also reach real-time operation at other resolutions, such as HD 1080p (1920×1080 pixels), at lower operating frequencies. The final results are very competitive when compared to related works.
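The throughput figures quoted in the abstract imply a per-macroblock cycle budget, which the short calculation below works out. It uses only the numbers stated above (245 MHz, 39 frames per second at 3840×2048) plus two assumptions that are not from the paper: 16×16 macroblocks as the unit of work, and a 30 frames-per-second target for 1080p. The result is an estimate derived from the headline numbers, not a figure reported by the authors.

```python
import math

CLOCK_HZ = 245_000_000   # operating frequency from the abstract
QHDTV_FPS = 39           # frame rate from the abstract
MB_SIZE = 16             # assumption: 16x16 macroblocks


def macroblocks_per_frame(width: int, height: int) -> int:
    """Number of macroblocks, rounding each dimension up to a multiple of 16."""
    return math.ceil(width / MB_SIZE) * math.ceil(height / MB_SIZE)


# 3840x2048 gives 240 * 128 = 30720 macroblocks per frame.
qhdtv_mbs = macroblocks_per_frame(3840, 2048)

# Implied budget: clock cycles available per macroblock (about 204.5).
cycles_per_mb = CLOCK_HZ / (qhdtv_mbs * QHDTV_FPS)


def required_clock_hz(width: int, height: int, fps: float, cycles: float) -> float:
    """Clock needed for a given resolution and frame rate, assuming the
    cycle cost per macroblock stays constant."""
    return macroblocks_per_frame(width, height) * fps * cycles


# Assumed 30 fps target for 1080p: roughly 50 MHz under these assumptions.
hd_clock_hz = required_clock_hz(1920, 1080, 30, cycles_per_mb)
```

Under these assumptions 1080p needs about a fifth of the maximum clock, which is consistent with the abstract's remark that lower operating frequencies suffice for smaller resolutions.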
A H.264/AVC Quarter-Pixel Motion Estimation Refinement architecture targeting high resolution videos. 2011 VII Southern Conference on Programmable Logic (SPL), pp. 131-136.
Pub Date: 2011-04-13. DOI: 10.1109/SPL.2011.5782656
J. Martínez, Jaime Vitola, Adriana Sanabria, C. Pedraza
One of the main challenges that Music Information Retrieval (MIR) faces is performance. This paper presents an algorithm based on fingerprinting techniques, implemented on a low-cost embedded reconfigurable platform. The algorithm is faster still when implemented in parallel on a GPU platform. The hit rate of the implementations is practically 100%, and the response is twice as fast as that of a top-class PC, which means MIR times of up to 65 audio tracks in real time.
Fast parallel audio fingerprinting implementation in reconfigurable hardware and GPUs. 2011 VII Southern Conference on Programmable Logic (SPL), pp. 245-250.
Pub Date: 2011-04-13. DOI: 10.1109/SPL.2011.5782659
J. Arias-Garcia, R. Pezzuol Jacobi, C. Llanos, M. Ayala-Rincón
This work presents an architecture for computing matrix inversions on a reconfigurable FPGA using single-precision floating-point representation. Its main unit is the processing component for Gauss-Jordan elimination. This component is built from smaller arithmetic units, organized to maintain the accuracy of the results without internally normalizing and de-normalizing the floating-point data. The implementation of the operations and of the whole unit takes advantage of the resources available in the Virtex-5 FPGA. The implementation improves on the performance and resource consumption of more elaborate architectures, whose implementations are too complex for low-cost applications. It is benchmarked against solutions previously implemented in FPGAs and in software such as Matlab.
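For readers unfamiliar with the algorithm the processing component implements, the following is a minimal software sketch of matrix inversion by Gauss-Jordan elimination. It is a reference model only: it runs in Python's double precision rather than the single precision used in the paper, the partial pivoting and the `eps` singularity threshold are assumptions added for numerical safety, and nothing here reflects the authors' internal datapath.

```python
def gauss_jordan_inverse(matrix, eps=1e-12):
    """Invert a square matrix (list of lists) by Gauss-Jordan elimination.

    The matrix is augmented with the identity, [A | I], and reduced until
    the left half is the identity; the right half is then the inverse.
    """
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Partial pivoting (assumption): pick the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < eps:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]

        # Scale the pivot row so the pivot element becomes 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]

        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [rv - f * cv for rv, cv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]
```

The row-scaling and row-elimination steps are independent across columns of the augmented matrix, which is what makes the method attractive for a parallel hardware datapath.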
A suitable FPGA implementation of floating-point matrix inversion based on Gauss-Jordan elimination. 2011 VII Southern Conference on Programmable Logic (SPL), pp. 263-268.