Pub Date : 1997-04-16DOI: 10.1109/FPGA.1997.624625
U. Tangen, L. Schulte, J. McCaskill
Previous work (J.S. McCaskill et al., 1996; 1997) has shown the power of massively parallel configurable hardware (NGEN) in conjunction with dataflow architectures for the simulation of evolving populations. NGEN is a flexible computer hardware for rapid custom circuit simulation of fine grained physical processes via a massively parallel architecture, e.g. 144 hardware configurable field programmable gate arrays (FPGAs, XC4008, Xilinx). NGEN is optimized to implement dataflow architectures and systolic algorithms for large problems and is confectioned with high speed distributed SRAM, 144*8*256 kBit, 15ns access time, on the chip to chip interconnect. Microconfigurable FPGAs allow a further step to close the gap between micro electronics and biology on the information processing area. A design for a massively parallel microconfigurable computer (POLYP) is presented. It is designed to allow online evolution in hardware with significant locally controllable memory resources. It is also designed for high throughput dataflow applications with large problem size. Additionally, an evolvable interface between high rate measurement devices is provided to allow adaptive processing coupled with real time experimental environments. The computer represents the next logical step towards evolvable hardware interacting with biology beyond the massively parallel computer NGEN.
以前的工作(J.S. McCaskill et al., 1996;1997年)展示了大规模并行可配置硬件(NGEN)与数据流架构相结合的能力,可以模拟不断进化的种群。NGEN是一种灵活的计算机硬件,可通过大规模并行架构快速定制细粒度物理过程的电路仿真,例如144个硬件可配置现场可编程门阵列(fpga, XC4008, Xilinx)。NGEN经过优化,可实现大型问题的数据流架构和收缩算法,并在片与片互连上配置高速分布式SRAM, 144*8*256 kBit, 15ns访问时间。微可配置fpga进一步缩小了微电子学和生物学在信息处理领域的差距。提出了一种大规模并行微组态计算机(POLYP)的设计方案。它被设计为允许在具有大量本地可控内存资源的硬件中进行在线演进。它还设计用于具有大问题规模的高吞吐量数据流应用程序。此外,提供了高速率测量设备之间的可进化接口,以允许与实时实验环境相结合的自适应处理。计算机代表了在大规模并行计算机NGEN之外,迈向可进化硬件与生物交互的下一个合乎逻辑的步骤。
{"title":"A parallel hardware evolvable computer POLYP","authors":"U. Tangen, L. Schulte, J. McCaskill","doi":"10.1109/FPGA.1997.624625","DOIUrl":"https://doi.org/10.1109/FPGA.1997.624625","url":null,"abstract":"Previous work (J.S. McCaskill et al., 1996; 1997) has shown the power of massively parallel configurable hardware (NGEN) in conjunction with dataflow architectures for the simulation of evolving populations. NGEN is a flexible computer hardware for rapid custom circuit simulation of fine grained physical processes via a massively parallel architecture, e.g. 144 hardware configurable field programmable gate arrays (FPGAs, XC4008, Xilinx). NGEN is optimized to implement dataflow architectures and systolic algorithms for large problems and is confectioned with high speed distributed SRAM, 144*8*256 kBit, 15ns access time, on the chip to chip interconnect. Microconfigurable FPGAs allow a further step to close the gap between micro electronics and biology on the information processing area. A design for a massively parallel microconfigurable computer (POLYP) is presented. It is designed to allow online evolution in hardware with significant locally controllable memory resources. It is also designed for high throughput dataflow applications with large problem size. Additionally, an evolvable interface between high rate measurement devices is provided to allow adaptive processing coupled with real time experimental environments. The computer represents the next logical step towards evolvable hardware interacting with biology beyond the massively parallel computer NGEN.","PeriodicalId":303064,"journal":{"name":"Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130057781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-04-16DOI: 10.1109/FPGA.1997.624610
C. Ebeling, Darren C. Cronquist, Paul Franklin, Jason Secosky, Stefan G. Berg
The goal of the RaPiD (Reconfigurable Pipelined Datapath) architecture is to provide high performance configurable computing for a range of computationally-intensive applications that demand special-purpose hardware. This is accomplished by mapping the computation into a deep pipeline using a configurable array of coarse-grained computational units. A key feature of RaPiD is the combination of static and dynamic control. While the underlying computational pipelines are configured statically, a limited amount of dynamic control is provided which greatly increases the range and capability of applications that can be mapped to RaPiD. This paper illustrates this mapping and configuration for several important applications including a FIR filter, 2-D DCT, motion estimation, and parametric curve generation; it also shows how static and dynamic control are used to perform complex computations.
{"title":"Mapping applications to the RaPiD configurable architecture","authors":"C. Ebeling, Darren C. Cronquist, Paul Franklin, Jason Secosky, Stefan G. Berg","doi":"10.1109/FPGA.1997.624610","DOIUrl":"https://doi.org/10.1109/FPGA.1997.624610","url":null,"abstract":"The goal of the RaPiD (Reconfigurable Pipelined Datapath) architecture is to provide high performance configurable computing for a range of computationally-intensive applications that demand special-purpose hardware. This is accomplished by mapping the computation into a deep pipeline using a configurable array of coarse-grained computational units. A key feature of RaPiD is the combination of static and dynamic control. While the underlying computational pipelines are configured statically, a limited amount of dynamic control is provided which greatly increases the range and capability of applications that can be mapped to RaPiD. This paper illustrates this mapping and configuration for several important applications including a FIR filter, 2-D DCT, motion estimation, and parametric curve generation; it also shows how static and dynamic control are used to perform complex computations.","PeriodicalId":303064,"journal":{"name":"Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186)","volume":"218 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133713141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-04-16DOI: 10.1109/FPGA.1997.624627
HyunJeong Yun, Aaron Smith, H. Silverman
Armstrong III is a 20 node multi-computer that is currently operational. In addition to a RISC processor, each node contains reconfigurable resources implemented with FPGAs. The in-circuit reprogramability of static RAM based FPGAs allows the computational capabilities of a node to be dynamically matched to the computational requirements of an application. Most reconfigurable computers in existence today rely solely on a large number of FPGAs to perform computations. In contrast, the paper demonstrates the utility of a small number of FPGAs coupled to a RISC processor with a simple interconnect. The article describes a substantive example application that performs HMM training for speech recognition with the reconfigurable platform.
{"title":"Speech recognition HMM training on reconfigurable parallel processor","authors":"HyunJeong Yun, Aaron Smith, H. Silverman","doi":"10.1109/FPGA.1997.624627","DOIUrl":"https://doi.org/10.1109/FPGA.1997.624627","url":null,"abstract":"Armstrong III is a 20 node multi-computer that is currently operational. In addition to a RISC processor, each node contains reconfigurable resources implemented with FPGAs. The in-circuit reprogramability of static RAM based FPGAs allows the computational capabilities of a node to be dynamically matched to the computational requirements of an application. Most reconfigurable computers in existence today rely solely on a large number of FPGAs to perform computations. In contrast, the paper demonstrates the utility of a small number of FPGAs coupled to a RISC processor with a simple interconnect. The article describes a substantive example application that performs HMM training for speech recognition with the reconfigurable platform.","PeriodicalId":303064,"journal":{"name":"Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127689329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-04-16DOI: 10.1109/FPGA.1997.624604
H. Schmit
This paper examines the implementation of pipelined applications using run-time reconfiguration. Throughput and latency of pipelined applications can be significantly improved when reconfiguration is performed at the level of individual pipeline stages, as opposed to configuration of the entire FPGA. If reconfiguration and execution can be performed simultaneously, the performance of a pipelined application approaches its theoretical maximum. This paper proposes a new FPGA configuration mechanism, called striping, that supports pipeline stage reconfiguration and simultaneous configuration and execution. Additionally, the use of the pipeline stage as the atomic unit of reconfiguration introduces a design abstraction that enables the development families of upwardly-compatible FPGAs and virtual hardware design.
{"title":"Incremental reconfiguration for pipelined applications","authors":"H. Schmit","doi":"10.1109/FPGA.1997.624604","DOIUrl":"https://doi.org/10.1109/FPGA.1997.624604","url":null,"abstract":"This paper examines the implementation of pipelined applications using run-time reconfiguration. Throughput and latency of pipelined applications can be significantly improved when reconfiguration is performed at the level of individual pipeline stages, as opposed to configuration of the entire FPGA. If reconfiguration and execution can be performed simultaneously, the performance of a pipelined application approaches its theoretical maximum. This paper proposes a new FPGA configuration mechanism, called striping, that supports pipeline stage reconfiguration and simultaneous configuration and execution. Additionally, the use of the pipeline stage as the atomic unit of reconfiguration introduces a design abstraction that enables the development families of upwardly-compatible FPGAs and virtual hardware design.","PeriodicalId":303064,"journal":{"name":"Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125346948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-04-16DOI: 10.1109/FPGA.1997.624609
Ray Bittner, P. Athanas
The wormhole run-time reconfiguration (RTR) computing paradigm is a method for creating high performance computational pipelines. The scalability, distributed control and data flow features of the paradigm allow it to fit neatly into the configurable computing machine (CCM) domain. To date, the field has been dominated by large bit-oriented devices whose flexibility can lead to lowered silicon utilization efficiencies. In an effort to raise this efficiency, the Colt CCM has been created based on the wormhole RTR paradigm. This paper outlines methods of implementation and performance for several common operations using these concepts. They serve as indicators of the diversity of algorithms that can be instantiated through the high-speed run-time reconfiguration that these devices make possible. Particular attention is paid to floating point multiplication. Also discussed is the topic of data dependent computation which would seem to be counter intuitive to the wormhole RTR paradigm. The paper concludes with a summary of performance of the three computations.
{"title":"Computing kernels implemented with a wormhole RTR CCM","authors":"Ray Bittner, P. Athanas","doi":"10.1109/FPGA.1997.624609","DOIUrl":"https://doi.org/10.1109/FPGA.1997.624609","url":null,"abstract":"The wormhole run-time reconfiguration (RTR) computing paradigm is a method for creating high performance computational pipelines. The scalability, distributed control and data flow features of the paradigm allow it to fit neatly into the configurable computing machine (CCM) domain. To date, the field has been dominated by large bit-oriented devices whose flexibility can lead to lowered silicon utilization efficiencies. In an effort to raise this efficiency, the Colt CCM has been created based on the wormhole RTR paradigm. This paper outlines methods of implementation and performance for several common operations using these concepts. They serve as indicators of the diversity of algorithms that can be instantiated through the high-speed run-time reconfiguration that these devices make possible. Particular attention is paid to floating point multiplication. Also discussed is the topic of data dependent computation which would seem to be counter intuitive to the wormhole RTR paradigm. The paper concludes with a summary of performance of the three computations.","PeriodicalId":303064,"journal":{"name":"Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121980377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-04-16DOI: 10.1109/FPGA.1997.624605
W. Luk, N. Shirazi, P. Cheung
This paper describes a framework and tools for automating the production of designs which can be partially reconfigured at run time. The tools include: a partial evaluator, which produces configuration files for a given design, where the number of configurations can be minimised by a process, known as compile-time sequencing; an incremental configuration calculator, which takes the output of the partial evaluator and generates an initial configuration file and incremental configuration files that partially update preceding configurations; and a tool which further optimises designs for FPGAs supporting simultaneous configuration of multiple cells. While many of our techniques are independent of the design language and device used, our tools currently target Xilinx 6200 devices. Simultaneous configuration, for example, can be used to reduce the time for reconfiguring an adder to a subtractor from time linear with respect to its size to constant time at best and logarithmic time at worst.
{"title":"Compilation tools for run-time reconfigurable designs","authors":"W. Luk, N. Shirazi, P. Cheung","doi":"10.1109/FPGA.1997.624605","DOIUrl":"https://doi.org/10.1109/FPGA.1997.624605","url":null,"abstract":"This paper describes a framework and tools for automating the production of designs which can be partially reconfigured at run time. The tools include: a partial evaluator, which produces configuration files for a given design, where the number of configurations can be minimised by a process, known as compile-time sequencing; an incremental configuration calculator, which takes the output of the partial evaluator and generates an initial configuration file and incremental configuration files that partially update preceding configurations; and a tool which further optimises designs for FPGAs supporting simultaneous configuration of multiple cells. While many of our techniques are independent of the design language and device used, our tools currently target Xilinx 6200 devices. Simultaneous configuration, for example, can be used to reduce the time for reconfiguring an adder to a subtractor from time linear with respect to its size to constant time at best and logarithmic time at worst.","PeriodicalId":303064,"journal":{"name":"Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125218037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-04-16DOI: 10.1109/FPGA.1997.624612
L. Moll, M. Shand
We describe the use of a reconfigurable board to obtain information on the performance that can be expected on particular systems. Our goal is to use the reconfigurability, of the board's interface to test a system and discover not only the maximum bandwidth and best latency attainable, but also the way to reliably achieve these figures. The board we present uses the now widespread PCI bus. PCI is sufficiently complex, and its implementations sufficiently varied, that it is impossible to guess the performance that can be obtained by a specific board on a specific computer with the only technical characteristics of the two in hand. We observe astonishing performance differences between almost identical systems and comparable figures between small PCs and big servers. Our performance tests can be an end in themselves, however, they also serve to demonstrate the value of a reconfigurable bus interface. With the same board, we can test and choose a system, make informed architectural decisions on the hardware/software interface, and then finely tune the bus interface to get maximum and predictable figures in the running application.
{"title":"Systems performance measurement on PCI Pamette","authors":"L. Moll, M. Shand","doi":"10.1109/FPGA.1997.624612","DOIUrl":"https://doi.org/10.1109/FPGA.1997.624612","url":null,"abstract":"We describe the use of a reconfigurable board to obtain information on the performance that can be expected on particular systems. Our goal is to use the reconfigurability, of the board's interface to test a system and discover not only the maximum bandwidth and best latency attainable, but also the way to reliably achieve these figures. The board we present uses the now widespread PCI bus. PCI is sufficiently complex, and its implementations sufficiently varied, that it is impossible to guess the performance that can be obtained by a specific board on a specific computer with the only technical characteristics of the two in hand. We observe astonishing performance differences between almost identical systems and comparable figures between small PCs and big servers. Our performance tests can be an end in themselves, however, they also serve to demonstrate the value of a reconfigurable bus interface. With the same board, we can test and choose a system, make informed architectural decisions on the hardware/software interface, and then finely tune the bus interface to get maximum and predictable figures in the running application.","PeriodicalId":303064,"journal":{"name":"Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131416283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-04-16DOI: 10.1109/FPGA.1997.624616
M. Gokhale, D. Gomersall
The authors present an integrated tool set to generate highly optimized hardware computation blocks from a C language subset. By starting with a C language description of the algorithm, they address the problem of making FPGA processors accessible to programmers as opposed to hardware designers. Their work is specifically targeted to fine grained FPGAs such as the National Semiconductor CLAy/sup TM/ FPGA family. Such FPGAs exhibit extremely high performance on regular data path circuits, which are more prevalent in computationally oriented hardware applications. Dense packing of data path functional elements makes it possible to fit the computation on one or a small number of chips, and the use of local routing resources makes it possible to clock the chip at a high rate. By developing a lower level tool suite that exploits the regular, geometric nature of fine grained FPGAs, and mapping the compiler output to this tool suite, they greatly improve performance over traditional high level synthesis to fine grained FPGAs.
{"title":"High level compilation for fine grained FPGAs","authors":"M. Gokhale, D. Gomersall","doi":"10.1109/FPGA.1997.624616","DOIUrl":"https://doi.org/10.1109/FPGA.1997.624616","url":null,"abstract":"The authors present an integrated tool set to generate highly optimized hardware computation blocks from a C language subset. By starting with a C language description of the algorithm, they address the problem of making FPGA processors accessible to programmers as opposed to hardware designers. Their work is specifically targeted to fine grained FPGAs such as the National Semiconductor CLAy/sup TM/ FPGA family. Such FPGAs exhibit extremely high performance on regular data path circuits, which are more prevalent in computationally oriented hardware applications. Dense packing of data path functional elements makes it possible to fit the computation on one or a small number of chips, and the use of local routing resources makes it possible to clock the chip at a high rate. By developing a lower level tool suite that exploits the regular, geometric nature of fine grained FPGAs, and mapping the compiler output to this tool suite, they greatly improve performance over traditional high level synthesis to fine grained FPGAs.","PeriodicalId":303064,"journal":{"name":"Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186)","volume":"312 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132519267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-04-16DOI: 10.1109/FPGA.1997.624624
T. Callahan, J. Wawrzynek
Widespread acceptance of FPGA-based reconfigurable coprocessors will be expedited if compilation time for FPGA configurations can be reduced to be comparable to software compilation. This research achieves this goal, generating complete datapath layouts in fractions of a second rather than hours. Our algorithm, adapted from instruction selection in compilers, packs multiple operations into single rows of CLBs when possible, while preserving a regular bit-slice layout. Furthermore, placement and thus routing delays are considered simultaneously with packing, so that the total delay, not just the CLB delay, is optimized.
{"title":"Datapath-oriented FPGA mapping and placement for configurable computing","authors":"T. Callahan, J. Wawrzynek","doi":"10.1109/FPGA.1997.624624","DOIUrl":"https://doi.org/10.1109/FPGA.1997.624624","url":null,"abstract":"Widespread acceptance of FPGA-based reconfigurable coprocessors will be expedited if compilation time for FPGA configurations can be reduced to be comparable to software compilation. This research achieves this goal, generating complete datapath layouts in fractions of a second rather than hours. Our algorithm, adapted from instruction selection in compilers, packs multiple operations into single rows of CLBs when possible, while preserving a regular bit-slice layout. Furthermore, placement and thus routing delays are considered simultaneously with packing, so that the total delay, not just the CLB delay, is optimized.","PeriodicalId":303064,"journal":{"name":"Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186)","volume":"149 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122978428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-04-16DOI: 10.1109/FPGA.1997.624601
S. Trimberger, D. Carberry, Anders Johnson, Jennifer Wong
This paper describes the architecture of a time-multiplexed FPGA. Eight configurations of the FPGA are stored in on-chip memory. This inactive on-chip memory is distributed around the chip, and accessible so that the entire configuration of the FPGA can be changed in a single cycle of the memory. The entire configuration of the FPGA can be loaded from this on-chip memory in 30 ns. Inactive memory is accessible as block RAM for applications. The FPGA is based on the Xilinx XC4000E FPGA, and includes extensions for dealing with state saving and forwarding and for increased routing demand due to time-multiplexing the hardware.
{"title":"A time-multiplexed FPGA","authors":"S. Trimberger, D. Carberry, Anders Johnson, Jennifer Wong","doi":"10.1109/FPGA.1997.624601","DOIUrl":"https://doi.org/10.1109/FPGA.1997.624601","url":null,"abstract":"This paper describes the architecture of a time-multiplexed FPGA. Eight configurations of the FPGA are stored in on-chip memory. This inactive on-chip memory is distributed around the chip, and accessible so that the entire configuration of the FPGA can be changed in a single cycle of the memory. The entire configuration of the FPGA can be loaded from this on-chip memory in 30 ns. Inactive memory is accessible as block RAM for applications. The FPGA is based on the Xilinx XC4000E FPGA, and includes extensions for dealing with state saving and forwarding and for increased routing demand due to time-multiplexing the hardware.","PeriodicalId":303064,"journal":{"name":"Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123749368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}