2014 International Conference on Field-Programmable Technology (FPT): Latest Papers

High performance relevance vector machine on HMPSoC
Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082812
Yongfu He, Shaojun Wang, Yu Peng, Y. Pang, Ning Ma, Jingyue Pang
Relevance Vector Machine (RVM), with its ability to express uncertainty, has found broad application in Prognostics and Health Management (PHM). However, the computationally intensive nature of RVM greatly limits its use. This paper presents a software/hardware co-design approach based on HMPSoC technology that efficiently exploits the sequential and parallel nature of RVM. A multi-channel, pipelined hardware architecture is proposed to accelerate kernel formulation and the calculation of intermediate values. The hardware, wrapped with an AXI-Stream interface, is integrated into the HMPSoC as an acceleration engine. We implement the design on an on-board PHM prototype platform with a Xilinx Zynq XC7Z020 AP SoC. Experimental results show speedups of 5.3× and 46.8× over RVM running on a PC with a Xeon 5620 processor and on an ARM Cortex-A9 processor, respectively. Energy consumption is reduced by 153.0× and 37.3×, respectively.
Pages: 334-337
Citations: 8
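The "kernel formulation" step named in the abstract above can be illustrated with a minimal sketch. The abstract does not say which kernel is used, so the Gaussian (RBF) kernel, the `gamma` value, and the function names below are assumptions for illustration only. Every entry of the matrix is independent of the others, which is the kind of regular, parallel work a multi-channel pipeline could plausibly target.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def design_matrix(samples, gamma=0.5):
    """Build the N x (N + 1) design matrix: a bias column plus one kernel column per sample."""
    return [[1.0] + [rbf_kernel(x, z, gamma) for z in samples] for x in samples]

# Three hypothetical 2-D training samples.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
phi = design_matrix(samples)
```

The cost grows quadratically with the number of samples, which is consistent with the abstract's remark that RVM is computationally intensive.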
Improving the reliability of RO PUF using frequency offset
Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082813
Bin Tang, Yaping Lin, Jiliang Zhang
A physical unclonable function (PUF) is a promising hardware security primitive that can be applied to various security-related areas. The ring oscillator (RO) PUF is one of the most popular PUFs; it generates a volatile key by comparing the frequencies of ROs. Previous RO PUFs incur unacceptable hardware overhead to improve reliability against environmental factors. In this paper, we propose a frequency offset algorithm (FOA) to enhance reliability and lower the hardware overhead. The key idea is to make the frequency difference larger than a given threshold by offsetting the frequencies of RO pairs. Experimental results show that the proposed FOA method has better reliability and lower hardware overhead than the temperature-aware cooperative (TAC) approach. In particular, the proposed method achieves 100% utilization of ROs.
Pages: 338-341
Citations: 9
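The key idea stated in the abstract above (widen the frequency gap of each RO pair past a threshold by applying an offset) can be sketched as follows. This is not the paper's FOA: the enrollment scheme, the signed per-pair offset, and all names and frequency values are hypothetical, and the helper data here is not designed to avoid leaking the bit.

```python
def enroll(pairs, threshold):
    """For each RO pair, record its bit and an offset that widens the gap to >= threshold."""
    helper = []
    for f_a, f_b in pairs:
        bit = 1 if f_a > f_b else 0
        shortfall = max(0.0, threshold - abs(f_a - f_b))
        # Signed offset added to f_a: push it further in the direction it already leans.
        offset = shortfall if bit else -shortfall
        helper.append((bit, offset))
    return helper

def regenerate(pairs, helper):
    """Re-derive the bits from fresh (noisy) measurements using the stored offsets."""
    return [1 if f_a + offset > f_b else 0
            for (f_a, f_b), (_, offset) in zip(pairs, helper)]

# Hypothetical frequencies in MHz: enrollment values, then a noisy re-measurement.
enrolled = [(200.0, 199.5), (198.0, 201.0), (200.2, 200.3)]
noisy = [(199.6, 200.0), (198.4, 200.5), (200.6, 200.1)]

helper = enroll(enrolled, threshold=2.0)
reference_bits = [bit for bit, _ in helper]
with_offset = regenerate(noisy, helper)
without_offset = [1 if f_a > f_b else 0 for f_a, f_b in noisy]
```

In this example the first and third pairs flip under noise when compared directly, while the offset comparison reproduces the enrolled bits, so no pair has to be discarded.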
No zero padded sparse matrix-vector multiplication on FPGAs
Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082800
Jiasen Huang, Junyan Ren, Wenbo Yin, Lingli Wang
Sparse Matrix-Vector Multiplication (SpMxV) algorithms suffer heavy performance penalties due to irregular memory accesses. In this paper, we introduce a novel compressed element storage (CES) format, in which the additional data structures for indexing are abandoned, and each location associated with a non-zero element of the matrix is instead indicated by the name of a variable multiplied by the corresponding element of the vector. To ensure the fastest possible access, and parallel access without data hazards, on-chip registers are used exclusively, in place of BRAM or off-chip DRAM/SRAM, to hold all the SpMxV data. On-chip DSP resources are fully utilized so that the maximum number of multipliers work concurrently.
Pages: 290-291
Citations: 1
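One way to read the CES description above is that the sparsity pattern is baked into named variables, so no index arrays are needed at run time. The sketch below generates one straight-line expression per matrix row under that reading. The naming scheme (`a_i_j`, `x_j`, `y_i`) and the generator itself are assumptions, not the paper's format.

```python
def emit_spmv_expressions(matrix):
    """Return one straight-line expression per row, covering non-zero entries only."""
    lines = []
    for i, row in enumerate(matrix):
        # Each non-zero element becomes a named variable times the matching vector element.
        terms = [f"a_{i}_{j} * x_{j}" for j, value in enumerate(row) if value != 0]
        lines.append(f"y_{i} = " + (" + ".join(terms) if terms else "0"))
    return lines

# Hypothetical 3 x 3 sparse matrix.
matrix = [
    [4, 0, 0],
    [0, 0, 7],
    [1, 0, 2],
]
exprs = emit_spmv_expressions(matrix)
for line in exprs:
    print(line)
```

Each product in the generated expressions could map to its own multiplier, with no zero padding and no index lookups.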
A pure-CMOS nonvolatile multi-context configuration memory for dynamically reconfigurable FPGAs
Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082778
K. Tatsumura, Masato Oda, S. Yasuda
Multi-context configuration memory stores multiple sets of configuration data and changes the entire configuration of an FPGA quickly, enabling dynamic reconfiguration architectures to improve hardware utilization. The memory area for one set of configuration data should be much smaller than the computational resource it controls. In this paper, we propose a pure-CMOS, nonvolatile, small-footprint multi-context configuration memory. The multi-context memory includes multiple 2Tr nonvolatile memory elements, which are programmed by channel hot-electron injection, and allows context switching in a single clock cycle. A primitive dynamically reconfigurable device having a lookup table and minimum interconnect, backed by a 16-bit, 8-context configuration memory, was fabricated in a 0.18 µm CMOS process and its functionality was demonstrated. The 2Tr nonvolatile memory element is more than 4 times denser than 6Tr SRAM, enabling greater logic density. The pure-CMOS and nonvolatile features would enhance the attractiveness of the technology in many applications.
Pages: 215-222
Citations: 5
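The behaviour of a multi-context configuration memory, as described in the abstract above, can be modelled in a few lines. This sketch assumes all 16 configuration bits of a context form the truth table of a 4-input lookup table. The abstract also mentions interconnect, so that split is a simplification, and the class and method names are hypothetical.

```python
class MultiContextLUT:
    """Behavioural model: one 4-input LUT whose 16-bit truth table is chosen by a context index."""

    def __init__(self, contexts):
        if any(not 0 <= table < (1 << 16) for table in contexts):
            raise ValueError("each context must be a 16-bit truth table")
        self.contexts = list(contexts)
        self.active = 0

    def switch(self, index):
        """Select another stored configuration (a single-cycle operation in the described device)."""
        if not 0 <= index < len(self.contexts):
            raise IndexError("no such context")
        self.active = index

    def evaluate(self, a, b, c, d):
        """Look up the output bit addressed by the four inputs in the active context."""
        address = a | (b << 1) | (c << 2) | (d << 3)
        return (self.contexts[self.active] >> address) & 1

# Eight contexts: 4-input AND, OR, XOR (parity), and five unused (all-zero) tables.
lut = MultiContextLUT([0x8000, 0xFFFE, 0x6996] + [0] * 5)
and_result = lut.evaluate(1, 0, 0, 0)   # AND context is active
lut.switch(1)
or_result = lut.evaluate(1, 0, 0, 0)    # same inputs, OR context
lut.switch(2)
xor_result = lut.evaluate(1, 1, 1, 0)   # parity of three ones
```

The same logic element computes three different functions without reloading any configuration data, which is how context switching raises hardware utilization.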
FPGA implementation of Blokus Duo player using hardware/software co-design
Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082825
A. Kojima
Blokus Duo is an abstract strategy game for two players. In this paper, we describe our FPGA implementation of a Blokus Duo player for the ICFPT2014 design contest, a revised version of our previous design for the ICFPT2013 design contest. Our design consists of a hardware logic part and a software part running on a soft IP processor. The hardware logic part calculates the evaluation value of the board state, which is a heavy task for software. The software part implements recursive alpha-beta pruning and iterative deepening, which are complex to implement as hardware logic. The current version of our implementation on a Xilinx Artix-7 runs at 142 MHz. The hardware logic part evaluates about 90,000 nodes per second at the beginning of the game.
Pages: 378-381
Citations: 0
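The software side described in the abstract above combines alpha-beta pruning with iterative deepening. A minimal sketch of that combination on a generic game tree follows. It is not the contest entry's code: the toy tree, the scores, and the `children` and `evaluate` callbacks are hypothetical, and `evaluate` stands in for the board evaluation that the paper offloads to hardware.

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax value of `node` searched to `depth` plies, with alpha-beta cut-offs."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        best = -math.inf
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False, children, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing parent will never allow this line
        return best
    best = math.inf
    for child in kids:
        best = min(best, alphabeta(child, depth - 1, alpha, beta, True, children, evaluate))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing parent already has a better option
    return best

def iterative_deepening(root, max_depth, children, evaluate):
    """Search depths 1..max_depth and return (move, value) from the deepest completed pass."""
    best_move, best_value = None, -math.inf
    for depth in range(1, max_depth + 1):
        best_move, best_value = None, -math.inf
        for child in children(root):
            value = alphabeta(child, depth - 1, -math.inf, math.inf, False, children, evaluate)
            if value > best_value:
                best_move, best_value = child, value
    return best_move, best_value

# Toy tree: the shallow evaluation prefers "b", the deeper search prefers "a".
TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
SCORES = {"a": 0, "b": 5, "a1": 3, "a2": 5, "b1": 2, "b2": 9}
children = lambda node: TREE.get(node, [])
evaluate = lambda node: SCORES[node]

shallow = iterative_deepening("root", 1, children, evaluate)
deep = iterative_deepening("root", 2, children, evaluate)
```

A practical player would typically stop on a time budget rather than a fixed depth and reuse the previous pass to order moves. Both are omitted here for brevity.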
RotoRouter: Router support for endpoint-authorized decentralized traffic filtering to prevent DoS attacks
Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082774
Albert Kwon, Kaiyu Zhang, P. L. Lim, Yuchen Pan, Jonathan M. Smith, A. DeHon
RotoRouter addresses Denial-of-Service (DoS) attacks on networks with a novel protocol and router implementation. Sets of RotoRouters cooperate to detect and filter out invalid network traffic before it reaches network endpoints; a new router-enforceable connection protocol queries destination endpoints to authorize traffic flows and uses per-packet digital signatures to distinguish allowed from disallowed connections. A RotoRouter prototype was implemented on a four-port 1000BASE-T NetFPGA-10G platform and supports 1024 simultaneous active connections using 74 BRAMs (less than one quarter of the available NetFPGA-10G BRAMs). It sustains a throughput of 800 Mbps per port for 1500 B packets with less than 0.3 µs latency, even during a DoS attack. With additional logic and memory resources, the required validation and switching operations scale to port speeds in excess of 10 Gbps and links with more than 10,000 active flows.
Pages: 183-190
Citations: 2
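The filtering idea in the abstract above (only endpoint-authorized flows pass, and each packet carries a check value) can be sketched as a small flow table. The abstract does not specify the signature scheme, so HMAC-SHA256 is substituted here purely as a stand-in. The class, its methods, and the flow identifiers are hypothetical; only the limit of 1024 connections comes from the abstract.

```python
import hashlib
import hmac

MAX_FLOWS = 1024  # simultaneous active connections supported by the described prototype

def sign(key, payload):
    """Per-packet tag; HMAC-SHA256 is an assumed stand-in for the paper's signatures."""
    return hmac.new(key, payload, hashlib.sha256).digest()

class FlowFilter:
    """Toy model of a router-side table of flows the destination has authorized."""

    def __init__(self):
        self.flows = {}  # flow id -> per-flow key

    def authorize(self, flow_id, key):
        """Record a flow approved by its destination endpoint; refuse when the table is full."""
        if flow_id not in self.flows and len(self.flows) >= MAX_FLOWS:
            return False
        self.flows[flow_id] = key
        return True

    def accept(self, flow_id, payload, tag):
        """Forward a packet only if its flow is known and its tag verifies."""
        key = self.flows.get(flow_id)
        if key is None:
            return False
        return hmac.compare_digest(sign(key, payload), tag)

router = FlowFilter()
flow = ("10.0.0.1", "10.0.0.2")          # hypothetical (source, destination) pair
key = b"per-flow secret"
router.authorize(flow, key)

valid = router.accept(flow, b"hello", sign(key, b"hello"))
forged = router.accept(flow, b"hello", sign(b"wrong key", b"hello"))
unknown = router.accept(("10.0.0.9", "10.0.0.2"), b"hello", sign(key, b"hello"))
```

Packets from unauthorized or forged flows are dropped at the router, so the endpoint never has to process them.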
Architectural synthesis of computational pipelines with decoupled memory access
Pub Date : 2014-12-01 DOI: 10.1109/FPT.2014.7082758
Shaoyi Cheng, J. Wawrzynek
As high-level synthesis (HLS) moves towards mainstream adoption among FPGA designers, it has proven to be an effective method for rapid hardware generation. However, when offloading compute-intensive software kernels to FPGA accelerators, current HLS tools do not always take full advantage of the hardware platforms. In this paper, we present an automatic flow to refactor and restructure processor-centric software implementations, making them better suited for FPGA platforms. The methodology generates pipelines that decouple memory operations and data access from computation. The resulting pipelines have much better throughput due to their efficient use of memory bandwidth and improved tolerance of data access latency. The methodology complements existing work in high-level synthesis, easing the creation of heterogeneous systems with high-performance accelerators and general-purpose processors. With this approach, for a set of non-regular algorithm kernels written in C, a performance improvement of 3.3× to 9.1× is observed over direct C-to-hardware mapping using a state-of-the-art HLS tool.
Pages: 83-90
Citations: 9
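The decoupling described in the abstract above separates the part of a kernel that issues memory accesses from the part that computes, with a FIFO between them. The sketch below shows that split for a small indirect-access kernel. It is a software illustration only, not the paper's tool flow: the kernel, the function names, and the data are hypothetical, and the two stages run one after the other here rather than concurrently.

```python
from collections import deque

def access_stage(data, indices, fifo):
    """Memory side: resolve the irregular addresses and push the loaded values into the FIFO."""
    for i in indices:
        fifo.append(data[i])

def compute_stage(fifo, count):
    """Compute side: consume operands from the FIFO; it never issues an address itself."""
    total = 0
    for _ in range(count):
        total += fifo.popleft() ** 2
    return total

# Original coupled loop, for comparison: sum(data[i] ** 2 for i in indices)
data = [3, 1, 4, 1, 5, 9]
indices = [5, 0, 2]          # irregular, data-dependent access pattern

fifo = deque()
access_stage(data, indices, fifo)
result = compute_stage(fifo, len(indices))
```

In hardware the two stages would run at the same time, so the access stage can run ahead and hide memory latency while the compute stage stays busy with operands already in the FIFO.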
Industrial session
Pub Date : 2014-01-01 DOI: 10.1109/fpt.2014.7082827
Sukjin Kim, Jason Wong, P. Kane, Dylan Wang, Xiaolong Xie
Xilinx has developed even more advanced FPGAs, 2nd-generation SoCs, and 3D ICs to stay a generation ahead and deliver an extra node's worth of performance, power, and integration. The UltraScale architecture was developed to scale from 20nm planar through 16nm FinFET (FF) technologies and beyond, and from monolithic through 3D ICs. This talk studies cases of Xilinx FPGAs in cutting-edge applications, as well as the advantages of the UltraScale architecture, 2nd-generation SoCs, and design tools.
IoT and Wearable Applications Enabled by Bluetooth Low Energy (BLE) Solutions (Patrick Kane, Cypress). Abstract: The Internet of Things is happening right now. The newest standard is Bluetooth Low Energy, or BLE. This may or may not be the long-term answer to IoT communication, but it is certainly in the race to become the leading IoT communication standard.
Pages: 1-3
Citations: 1
Time sharing of Runtime Coarse-Grain Reconfigurable Architectures processing elements in multi-process systems
Pub Date : 2014-01-01 DOI: 10.1109/FPT.2014.7082757
Benjamin Carrión Schäfer
Pages: 76-82
Citations: 1
Why Put FPGAs in your CPU socket?
Pub Date : 2013-12-01 DOI: 10.1109/FPT.2013.6718320
P. Chow
Summary form only given. Ever since FPGAs were invented, there has been great interest in using them as computing devices, and with the logic densities of today's devices, many interesting functions have been shown to have significant performance and energy benefits when implemented in FPGAs. However, when an application requires the combination of a high-performance CPU and an FPGA accelerator, the effectiveness of the FPGA is highly determined by the latency and bandwidth between the CPU, the CPU memory system and the FPGA and its memory system. Putting FPGAs into the CPU socket is one way to address this issue. This talk will present the history, the advantages and disadvantages, the challenges, architectures, programming models and applications of "insocket" accelerator systems.
Pages: 3
Citations: 3