
2015 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS): Latest Papers

LearnLoc: A framework for smart indoor localization with embedded mobile devices
S. Pasricha, Viney Ugave, Charles W. Anderson, Qi Han
There has been growing interest in location-based services and indoor localization in recent years. While several smartphone-based indoor localization techniques have been proposed, these techniques have many shortcomings related to accuracy and consistency. These prior efforts also ignore energy consumption analysis, which is a crucial quality metric in resource-constrained smartphones. In this work, we propose novel techniques based on machine learning algorithms and smart sensor management for real-time indoor localization using smartphones. We implement our proposed techniques as well as state-of-the-art techniques on real smartphones and evaluate their tracking effectiveness and energy overheads across several diverse real-world indoor environments. Our best technique improves upon prior work, achieving indoor localization accuracy between 1 and 3 meters.
DOI: https://doi.org/10.5555/2830840.2830845 (published 2015-10-04)
Citations: 34
seBoost: Selective boosting for heterogeneous manycores
Santiago Pagani, M. Shafique, Heba Khdr, Jian-Jia Chen, J. Henkel
Boosting techniques have been widely adopted in commercial multicore and manycore systems, mainly because they provide a means to satisfy surges in performance requirements, for one or more cores, at run-time. Current boosting techniques select the boosting levels (for boosted cores) and the throttle-down levels (for non-boosted cores) either arbitrarily or through step-wise control approaches. These methods might result in unnecessary performance losses for the non-boosted cores, in short boosting intervals, in failing to satisfy the required performance surges, or in unnecessarily high power and energy consumption. This paper presents an efficient and lightweight run-time boosting technique based on transient temperature estimation, called seBoost. Our technique guarantees meeting the surges in performance requirements at run-time, thus maximizing the boosting time with a minimum loss of performance for the non-boosted cores.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331373 (published 2015-10-04)
Citations: 18
R2Cache: Reliability-aware reconfigurable last-level cache architecture for multi-cores
F. Kriebel, Arun K. Subramaniyan, Semeen Rehman, Segnon Jean Bruno Ahandagbe, M. Shafique, J. Henkel
On-chip last-level caches in multicore systems are among the components most vulnerable to soft errors. However, vulnerability to soft errors depends heavily on the parameters and configuration of the last-level cache, especially when executing different applications. Therefore, in a reconfigurable cache architecture, the cache parameters can be adapted at run-time to improve its reliability against soft errors. In this paper we propose a novel reliability-aware reconfigurable last-level cache architecture (R2Cache) for multicore systems. It provides reliability-wise efficient cache configurations (i.e., cache parameter selection and cache partitioning) for different concurrently executing applications under user-provided tolerable performance overheads. To enable run-time adaptations, we also introduce a lightweight online vulnerability predictor that exploits the knowledge of performance metrics, such as the number of L2 misses, to accurately estimate the cache vulnerability to soft errors. Based on the predicted vulnerabilities of different concurrently executing applications in the current execution epoch, our run-time reliability manager reconfigures the cache such that, for the next execution epoch, the total vulnerability for all concurrently executing applications is minimized. In scenarios where single-bit error correction for cache lines may be afforded, vulnerability-aware reconfigurations can be leveraged to increase the reliability of the last-level cache against multi-bit errors. Compared to the state of the art, the proposed architecture provides 24% vulnerability savings when averaged across numerous experiments, while reducing the vulnerability by more than 60% for selected applications and application phases.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331362 (published 2015-10-04)
Citations: 8
Fast parallel application and multiprocessor design space exploration from sequential code
V. Schwambach, Sébastien Cleyet-Merle, Alain Issard, S. Mancini
When designing an application-specific multiprocessor, two key questions arise: (i) how to size the multiprocessor platform to meet application requirements with the lowest area and power consumption; and (ii) how to parallelize the target application in order to maximize the utilization of the platform. In this paper, we present a methodology for early joint parallel application and multiprocessor design space exploration from sequential application traces and parallelization scenarios. We describe its implementation in Parana, a fast trace-driven simulator, targeting OpenMP applications on STMicroelectronics' STxP70 Application-Specific Multiprocessor. Results for a NAS Parallel Benchmark and two computer vision applications show an error margin of less than 10% compared to the reference cycle-approximate simulator, with lower modeling effort and one order of magnitude faster execution time.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331379 (published 2015-10-04)
Citations: 0
An online wear state monitoring methodology for off-the-shelf embedded processors
Srinath Arunachalam, Thidapat Chantem, R. Dick, X. Hu
The continued scaling of transistors has led to an exponential increase in on-chip power density, which has resulted in rising temperatures. In turn, the increase in temperature directly increases the rate of wear of a processor. Negative-bias temperature instability (NBTI) is one of the most dominant integrated circuit (IC) failure mechanisms [13, 5] and depends strongly on temperature. NBTI manifests as increased circuit delays, which can lead to timing failures and processor crashes. The ability to monitor the wear progression of a processor due to NBTI is valuable when designing real-time embedded systems. While NBTI can be detected using wear state sensors, not all chips are equipped with these sensors because detecting wear due to NBTI requires modifications to the chip design and incurs area and power overhead. NBTI sensor data may also not be exposed to users in software. In addition, wear sensors cannot account for variations in wear that arise from differences between the wear sensor devices and the other functional devices and their operating conditions. In this paper, we propose a lightweight, online methodology to monitor the wear process due to NBTI for off-the-shelf embedded processors. Our proposed method requires neither data on the threshold voltage and critical paths nor additional hardware. Our methodology can also be extended to predict the wear progression due to some other dominant IC failure mechanisms. Experiments on embedded processors provide insights into NBTI wear progression over time. This knowledge can be used to design real-time embedded systems that explicitly consider runtime wear progression to increase predictability and maintain lifetime reliability requirements.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331374 (published 2015-10-04)
Citations: 5
An approximate compressor for wearable biomedical healthcare monitoring systems
Farzad Samie, L. Bauer, J. Henkel
Technology advancements as well as the Internet-of-Things paradigm enable the design of wearable personal healthcare monitoring systems. Ultra-low-power design is a challenging area for these battery-operated wearable devices, where the energy supply is limited and hardware resources are scarce. Some biomedical applications tolerate small errors in the values of the biosignal or a small degradation in quality, which can be exploited to reduce the energy requirements. This paper presents an approximate compression technique for biosignals in a wearable healthcare monitoring system. It takes advantage of error tolerance in biosignals and finds the shortest code to compress the data while keeping the error in an acceptable range. Our approximate compressor does not demand any hardware modification and thus can be used in existing wearable devices. The proposed approach for reducing the size of the Huffman table can save 1 Mbit of storage, on average. It also makes our approximate compressor suitable for runtime adaptation, i.e., creating a new Huffman table based on updated values. Compared to the state of the art, our experimental results show up to a 60% reduction in the size of the data to be transmitted via radio. As wireless communication contributes significantly to the total energy consumption of wearable devices, this improvement can increase the battery lifetime of our healthcare monitoring prototype from 7 days to 10 days.
DOI: https://doi.org/10.5555/2830840.2830855 (published 2015-10-04)
Citations: 28
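The abstract of the approximate compressor entry describes picking a shorter code for a sample as long as the error stays within an acceptable range. A minimal Python sketch of that general idea follows. It is not the authors' algorithm: the function names (`huffman_code_lengths`, `approximate`, `encoded_bits`), the greedy per-sample remapping rule, the integer tolerance, and the toy samples are all assumptions made for illustration.

```python
import heapq
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {symbol: Huffman code length in bits} for a frequency table."""
    if len(freqs) == 1:
        # A lone symbol still needs one bit per occurrence.
        return {next(iter(freqs)): 1}
    # Heap entries: (weight, unique tie-breaker, symbols in this subtree).
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, s1 = heapq.heappop(heap)
        f2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to these symbols
        heapq.heappush(heap, (f1 + f2, tie, s1 + s2))
        tie += 1
    return lengths


def approximate(samples, tolerance):
    """Hypothetical rule: map each sample to the observed value within
    +/- tolerance that has the shortest exact-Huffman code."""
    lengths = huffman_code_lengths(Counter(samples))

    def best(v):
        candidates = [c for c in lengths if abs(c - v) <= tolerance]
        # Prefer shorter codes; break ties by smaller error.
        return min(candidates, key=lambda c: (lengths[c], abs(c - v)))

    return [best(v) for v in samples]


def encoded_bits(samples):
    """Total payload bits when samples are coded with their own Huffman table."""
    freqs = Counter(samples)
    lengths = huffman_code_lengths(freqs)
    return sum(lengths[s] * n for s, n in freqs.items())


if __name__ == "__main__":
    # Toy biosignal-like samples (made up for this sketch).
    samples = [10, 10, 10, 11, 10, 12, 10, 9, 10, 10, 13, 10]
    approx = approximate(samples, tolerance=1)
    print("exact bits: ", encoded_bits(samples))
    print("approx bits:", encoded_bits(approx))
```

Because each sample is itself always a candidate, the remapped stream never costs more bits than the exact one under this rule, and rebuilding the table on the remapped values can only shrink it further. A real design would also have to account for the cost of storing or transmitting the table itself, which this sketch ignores.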
Hardware synthesis from a recursive functional language
Kuangya Zhai, Richard Townsend, L. Lairmore, Martha A. Kim, S. Edwards
Abstraction in hardware description languages stalled at the register-transfer level decades ago, yet few alternatives have had much success, in part because they provide only modest gains in expressivity. We propose to make a much larger jump: a compiler that synthesizes hardware from behavioral functional specifications. Our compiler translates general Haskell programs into a restricted intermediate representation before applying a series of semantics-preserving transformations, concluding with a simple syntax-directed translation to SystemVerilog. Here, we present the overall framework for this compiler, focusing on the intermediate representations involved and our method for translating general recursive functions into equivalent hardware. We conclude with experimental results that depict the performance and resource usage of the circuitry generated with our compiler.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331371 (published 2015-10-04)
Citations: 18
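The hardware synthesis entry mentions translating general recursive functions into equivalent hardware. Hardware has no implicit call stack, so one common way to think about such a translation is to make the stack explicit and drive the computation with a simple loop, which resembles a state machine plus a stack memory. The Python sketch below illustrates only that general idea; it is not the compiler's actual transformation, and the function names and the `"call"`/`"add"` tags are hypothetical.

```python
def fib_recursive(n):
    """Reference definition using ordinary recursion."""
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_explicit_stack(n):
    """Same function with the call stack made explicit.

    `stack` holds pending work items (what a hardware stack memory might
    store) and `results` holds values produced so far. Each loop iteration
    plays the role of one state-machine step.
    """
    stack = [("call", n)]
    results = []
    while stack:
        tag, value = stack.pop()
        if tag == "call":
            if value < 2:
                results.append(value)  # base case: produce a result
            else:
                # Schedule both recursive calls, then the continuation
                # that combines their results.
                stack.append(("add", None))
                stack.append(("call", value - 2))
                stack.append(("call", value - 1))
        else:  # "add": combine the two most recent results
            b = results.pop()
            a = results.pop()
            results.append(a + b)
    return results.pop()


if __name__ == "__main__":
    for n in range(10):
        assert fib_explicit_stack(n) == fib_recursive(n)
    print(fib_explicit_stack(10))
```

The loop body only ever inspects the top of the stack and performs a bounded amount of work per step, which is the property that makes this form easier to map onto registers, a memory, and control logic than the recursive original.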
The shift to multicores in real-time and safety-critical systems
Selma Saidi, R. Ernst, S. Uhrig, Henrik Theiling, B. Dinechin
In real-time and safety-critical systems, the move towards multicores is becoming unavoidable in order to keep pace with the increasing demand for processing power and to meet the trend towards high integration while maintaining reasonable power consumption. However, standard multicore systems are mainly designed to increase average performance, whereas embedded systems have additional requirements with respect to safety, reliability and real-time behavior. Therefore, the shift to multicores raises several challenges the embedded community has to face. These challenges involve the design of certifiable multicore platforms, the management of shared resources and the development/integration of parallel software. These issues are encountered at different steps of system development, from modeling and design to software implementation and hardware deployment. Therefore, both multi-core/semiconductor manufacturers and the real-time community have to bridge the gap in order to meet the challenges imposed by multicores. The goal of this paper is to trigger such a discussion as an attempt to bridge the gap between the two worlds and to raise awareness about the hurdles and challenges that need to be tackled.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331385 (published 2015-10-04)
Citations: 71
Computer security by hardware-intrinsic authentication
Caio Hoffman, M. Côrtes, Diego F. Aranha, G. Araújo
The widespread embedding of electronic devices into everyday objects, and their integration in the so-called Internet of Things (IoT), has raised a number of challenges for the design of System-on-Chip (SoC) devices. Very low manufacturing costs, stringent security, and ultra-low-power operation constraints have considerably raised SoC design requirements. Rather than incremental approaches that try to re-use current cryptographic mechanisms, the new generation of IoT devices will require novel solutions that deeply integrate their hardware-intrinsic features with program execution. This paper proposes a low-cost PUF-based authentication architecture aiming to secure code execution in IoT SoCs. The solution is deeply embedded into the processor micro-architecture, so as to minimize re-design costs and performance penalties. This new architecture model not only deals with the most common threats against code and data authenticity and integrity, but also provides an approach to extract from the processor's caches a stable and unpredictable key that is used in the code and data authentication process.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331377 (published 2015-10-04)
Citations: 9
Energy efficient FFT implementation through stage skipping and merging
Namita Sharma, P. Panda, F. Catthoor
Fast Fourier Transform (FFT) implementation is characterized by a large number of memory access operations. For FFTs with a significant number of zeros at the input, commonly found in broadcasting standards, we propose energy optimizations leading to reduced memory accesses. We also present an energy estimate based technique for selecting an energy-efficient Register File size, for implementing FFT in both Single Instruction Multiple Data (SIMD) and non-SIMD architectures. Experimental results for different configurations show a variation of 18.5% to 58.5% in energy consumption across the best and worst choices of RF size in the considered range. The proposed implementation is up to 92% more energy efficient than both the non-optimized and pruned radix-2 FFT implementations.
DOI: https://doi.org/10.1109/CODESISSS.2015.7331378 (published 2015-10-04)
Citations: 0