
2014 NASA/ESA Conference on Adaptive Hardware and Systems (AHS): Latest Papers

Online fault detection for Networks-on-Chip interconnect
Pub Date : 2014-07-14 DOI: 10.1109/AHS.2014.6880155
Junxiu Liu, J. Harkin, Yuhua Li, L. Maguire
A key requirement for modern Networks-on-Chip (NoC) is the ability to detect and diagnose faults and failures. A novel approach is proposed which addresses the challenge of fault detection using an online mechanism. The approach minimises online intrusion by employing dynamic rates of testing to maximise NoC throughput while still ensuring sufficient testing. This is achieved using a novel Monitor Module based on the back-off algorithm. The paper presents results showing minimal intrusion on the NoC for a range of test conditions and also highlights the low area/power overheads for scalability.
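The abstract does not say which back-off policy the Monitor Module uses. As a rough illustration only, the sketch below assumes a generic binary exponential back-off: the wait between link tests doubles after each fault-free test and resets when a fault is seen. The names `next_test_interval` and `schedule`, and the `base` and `cap` cycle counts, are hypothetical and not taken from the paper.

```python
def next_test_interval(current: int, fault_found: bool,
                       base: int = 16, cap: int = 1024) -> int:
    """Return the number of cycles to wait before the next link test.

    Assumed policy (not from the paper): binary exponential back-off.
    """
    if fault_found:
        return base                   # suspect link: return to the fastest test rate
    return min(current * 2, cap)      # healthy link: test less often, up to a ceiling


def schedule(outcomes, base: int = 16, cap: int = 1024):
    """Return the interval chosen after each test outcome (True = fault found)."""
    interval = base
    intervals = []
    for fault_found in outcomes:
        interval = next_test_interval(interval, fault_found, base, cap)
        intervals.append(interval)
    return intervals
```

With this policy a healthy link is tested progressively less often, which leaves more cycles for normal traffic, while a detected fault immediately restores the highest test rate.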
Citations: 9
Dynamically adaptive and reliable approximate computing using light-weight error analysis
Pub Date : 2014-07-14 DOI: 10.1109/AHS.2014.6880184
B. Grigorian, Glenn D. Reinman
Prior art in approximate computing has extensively studied computational resilience to imprecision. However, existing approaches often rely on static techniques, which potentially compromise coverage and reliability. Our approach, on the other hand, decouples error analysis of the approximate accelerator from quality analysis of the overall application. We use high-level, application-specific metrics, or Light-Weight Checks (LWCs), to gain coverage by exploiting imprecision tolerance at the application level. Unlike metrics that compare approximate solutions to exact ones, LWCs can be leveraged dynamically for error analysis and recovery. The resulting methodology adapts to output quality at runtime, providing guarantees on worst-case application-level error. To ensure platform agnosticism, these light-weight metrics are integrated directly into the application, enabling compatibility with any approximate acceleration technique. Our results present a case study of dynamic error control for inverse kinematics. Using software-based neural acceleration with LWC support, we demonstrate improvements in coverage, reliability, and overall performance.
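The abstract does not give the concrete check used in the inverse kinematics case study. One plausible form, sketched here under that assumption, is to run the cheap forward kinematics on the approximate joint angles and compare the resulting position with the target, which needs no exact solution. The two-link planar arm, the link lengths, the tolerance and all function names are hypothetical; `approx_ik` stands in for whatever approximate accelerator is used.

```python
import math

L1, L2 = 1.0, 1.0  # link lengths of a planar two-link arm (assumed values)


def forward(theta1, theta2):
    """Forward kinematics: joint angles -> end-effector position."""
    x = L1 * math.cos(theta1) + L2 * math.cos(theta1 + theta2)
    y = L1 * math.sin(theta1) + L2 * math.sin(theta1 + theta2)
    return x, y


def exact_ik(x, y):
    """Closed-form inverse kinematics (one of the two elbow solutions)."""
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2 * L1 * L2)
    c2 = max(-1.0, min(1.0, c2))  # guard against rounding just outside [-1, 1]
    theta2 = math.acos(c2)
    theta1 = math.atan2(y, x) - math.atan2(L2 * math.sin(theta2),
                                           L1 + L2 * math.cos(theta2))
    return theta1, theta2


def light_weight_check(target, angles, tol):
    """Application-level check: does the arm actually reach the target?"""
    fx, fy = forward(*angles)
    return math.hypot(fx - target[0], fy - target[1]) <= tol


def solve_with_lwc(target, approx_ik, tol=0.05):
    """Use the approximate result when it passes the check, else recover exactly."""
    angles = approx_ik(*target)
    if light_weight_check(target, angles, tol):
        return angles, "approximate"
    return exact_ik(*target), "exact"
```

Because the check bounds the end-effector error of every accepted result by `tol`, the worst-case application-level error is controlled at run-time regardless of how the approximation was produced.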
Citations: 30
Power-aware multi-objective evolvable hardware system on an FPGA
Pub Date : 2014-07-14 DOI: 10.1109/AHS.2014.6880159
B. Lopez, J. Valverde, E. D. L. Torre, T. Riesgo
Dynamic and Partial Reconfiguration (DPR) allows a system to modify certain parts of itself at run-time. This feature gives rise to the capability of evolution: changing parts of the configuration according to the online evaluation of performance or other parameters. The evolution is achieved through a bio-inspired model in which the features of the system are identified as genes. The objective of the evolution need not be a single one; in this work, power consumption is taken into consideration together with performance, measured as the quality of filtering of a noisy image. Pareto optimality is applied to the evolutionary process in order to find a representative set of optimal solutions with respect to performance and power consumption. The main contributions of this paper are: implementing an evolvable system on a low-power Spartan-6 FPGA included in a Wireless Sensor Network node and, by making a real measure of power consumption available at run-time, achieving multi-objective evolution that yields different optimal configurations, among which the selected one depends on the relative “weights” of performance and power consumption.
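As a small illustration of Pareto optimality over two objectives, the sketch below filters candidate configurations down to the non-dominated set and then picks one by a weighted sum. It assumes both objectives are minimised (filtering error and power); the tuple layout, the numbers in the usage note and the function names are hypothetical, and the paper's own selection procedure is not described in the abstract.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    All objectives are minimised.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def pick(front, w_error, w_power):
    """Choose one configuration from the front by a weighted sum of objectives."""
    return min(front, key=lambda p: w_error * p[0] + w_power * p[1])
```

For candidates given as `(error, power)` pairs such as `[(0.10, 50), (0.20, 30), (0.15, 60), (0.30, 20), (0.20, 40)]`, the front keeps the three mutually non-dominated pairs, and changing the weights moves the choice between the low-error and the low-power end of that front.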
Citations: 11
Method to self-repairing reconfiguration strategy selection of embryonic cellular array on reliability analysis
Pub Date : 2014-07-14 DOI: 10.1109/AHS.2014.6880181
Zhai Zhang, You-ren Wang
Bio-inspired engineering systems have the characteristics of self-diagnosis and self-repair. Embryonics, a new field-programmable multi-cellular array hardware inspired by the ontogeny of organisms, achieves fault tolerance through cell redundancy and corresponding reconfiguration mechanisms. Row/column elimination and cell elimination are the two most commonly adopted self-repairing reconfiguration strategies. Determining which one to use in a design from the perspective of reliability evaluation is the novel contribution to Embryonics research. In this paper, two models, an ideal reliability model and a practical reliability model, are proposed to analyse the reliability influence of the self-repairing reconfiguration strategies. The former is abstracted purely from the architecture and repair principles of the embryonic cellular array, without considering implementation details in the cell, while the practical model also accounts for the auxiliary hardware: the extra registers and the routing channels in the switch box. The selection principle for the self-repairing reconfiguration strategy is summarised from the reliability analysis. Following the approach, designers can identify the more reliable strategy according to chip size and design goal before circuit implementation.
Citations: 19
Fault diagnosis for MEMS INS using unscented Kalman filter enhanced by Gaussian process adaptation
Pub Date : 2014-07-14 DOI: 10.1109/AHS.2014.6880167
I. Vitanov, N. Aouf
Miniature unmanned aerial vehicles (UAVs) such as quadrotors are increasingly in demand due to their small size and cost. The base navigation solution for such systems is typically a micro electro mechanical system (MEMS) based strap-down inertial navigation system (INS). To allow safe operation, navigation instrument failures need to be robustly handled through effective fault diagnosis. A popular approach to fault diagnosis in non-linear systems is the extended Kalman filter (EKF), which may, however, prove sub-optimal in the presence of greater non-linearity. In this paper, we instead adopt an unscented Kalman filter (UKF), which relies on a more accurate stochastic approximation - the unscented transform - rather than a Taylor series expansion. A downside to MEMS inertial navigation is an attendant time-dependent drift, which can distort estimation quality. Hence, MEMS INS sensors characteristically result in large biases in the navigation solution. To mitigate this problem we employ Gaussian Processes to approximate a time-dependent offset which can be utilised during on-line operation in an adaptive fashion, as a compensatory mechanism. We apply the enhanced GP-UKF by means of a bank of dedicated observers within an analytical redundancy framework. The results are competitive with the EKF and represent arguably the first application of an enhanced GP-UKF filter in the context of fault detection and isolation.
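To make the contrast between the unscented transform and a Taylor-series linearisation concrete, here is a minimal scalar sketch of the unscented transform. It uses the original three-sigma-point formulation with a single `kappa` parameter; the function name, the scalar restriction and the default `kappa=2.0` are assumptions for illustration and say nothing about how the paper configures its filter.

```python
import math


def unscented_transform(mean, var, f, kappa=2.0):
    """Propagate a scalar mean and variance through a non-linear function f.

    Scalar (n = 1) unscented transform with three sigma points.
    """
    n = 1
    spread = math.sqrt((n + kappa) * var)
    sigma_points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]

    ys = [f(x) for x in sigma_points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var
```

For `f(x) = x * x` with mean 1.0 and variance 0.25, the sketch returns a mean of 1.25, which equals the true value mean² + variance, whereas evaluating `f` only at the mean (the first-order EKF-style estimate) gives 1.0.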
Citations: 9
Increasing multiprocessor lifetime by Youngest-First Round-Robin core gating patterns
Pub Date : 2014-07-14 DOI: 10.1109/AHS.2014.6880182
A. Simevski, R. Kraemer, M. Krstic
Long-mission multiprocessor systems in which direct human intervention is impossible, like satellites in space, require special attention to their lifetime reliability. Relying on the well-established power reduction techniques which are frequently used in multiprocessors (power and clock gating, as well as dynamic voltage and frequency scaling), we devise the Youngest-First Round-Robin (YFRR) core gating pattern to be used for reduction of aging effects, i.e., lifetime extension of the system. The YFRR technique uses the information supplied by on-chip aging monitors placed in each multiprocessor core in order to determine their relative age and construct the gating pattern. Furthermore, we introduce a simple analytical method based on the Weibull distribution in order to evaluate and estimate the lifetime reliability of multiprocessors that use core gating patterns. The analyses show an improvement of up to 32% when using the YFRR compared to a simple Round-Robin.
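The abstract names the ingredients (per-core age readings, a gating pattern, a Weibull-based reliability estimate) but not the exact rules. The sketch below is therefore only a guess at the general idea: in each epoch the cores with the lowest age reading stay active and the rest are gated, and reliability is taken from the standard two-parameter Weibull survival function. The function names, the unit ageing step per epoch and the tie-breaking by core index are hypothetical.

```python
import math


def weibull_reliability(t, scale, shape):
    """Two-parameter Weibull survival function R(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def youngest_first(ages, active_count):
    """Indices of the cores to keep active: those with the lowest age reading.

    Ties are broken by core index (assumed rule).
    """
    order = sorted(range(len(ages)), key=lambda i: ages[i])
    return sorted(order[:active_count])


def simulate(ages, active_count, epochs):
    """Age only the active cores by one unit per epoch and return the final ages."""
    ages = list(ages)
    for _ in range(epochs):
        for i in youngest_first(ages, active_count):
            ages[i] += 1
    return ages
```

Under these assumptions the selection steers work away from the most aged cores, so an initially uneven age profile evens out over time instead of the oldest core continuing to wear at the same rate as the others.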
Citations: 7
Decentralized run-time recovery mechanism for transient and permanent hardware faults for space-borne FPGA-based computing systems
Pub Date : 2014-07-14 DOI: 10.1109/AHS.2014.6880157
V. Dumitriu, L. Kirischian, V. Kirischian
One of the most important problems for mission-critical space-borne computing systems employing FPGA devices is fault tolerance to transient and permanent hardware faults. In many cases, the ability for run-time self-recovery from such faults is a vital feature. This paper presents a method and mechanism for run-time recovery of FPGA-based System-on-Chip (SoC) based on Collaborative Macro-Function Units (CMFUs). Each CMFU consists of a macro-function-specific data-path, a control unit and circuits providing self-integration, self-synchronization and self-recovery functions for the CMFU, without centralized control. The proposed mechanism allows run-time scrubbing or relocation of faulty components of the SoC, providing much higher flexibility and reliability of the system. This mechanism was implemented and tested on a Xilinx Kintex-7 FPGA platform. It was determined that the proposed approach can provide seamless run-time recovery for pipelined SoCs, while being transparent to the application.
Citations: 6
Improved fault-tolerance through dynamic modular redundancy (DMR) on the RISA FPGA platform
Pub Date : 2014-07-14 DOI: 10.1109/AHS.2014.6880156
M. Trefzer, A. Tyrrell
Autonomously fault-tolerant systems have received renewed interest for the design of dependable computing systems, driven by the increasing requirements of a variety of critical applications including deep space probes, satellites, reactor control systems, and Internet-of-Things applications including health and environment monitoring. Autonomous fault-tolerant systems are based on hardware capable of self-monitoring and self-repair. In this context, this paper investigates the use of fine-grained, partial dynamic reconfiguration on FPGA for achieving a higher degree of fault-tolerance with lower permanent overhead than TMR, its potential use for long-term system maintenance and its capability of detecting faults quickly. The case study shown in this paper focuses mainly on accelerating fault-detection through optimising a fault-monitoring strategy using an evolutionary algorithm (EA).
Citations: 5
A modular FPGA-based implementation of the Unscented Kalman Filter
Pub Date : 2014-07-14 DOI: 10.1109/AHS.2014.6880168
Jeremy Soh, Xiaofeng Wu
Nanosatellites, while lowering the cost and increasing the ease of space access, suffer from reduced performance compared to larger satellites, particularly when it comes to attitude determination and control. Field Programmable Gate Arrays (FPGAs) have been used in the past to make up for the shortfall in capability but tend to have more complicated development processes than general-purpose microprocessors. To simplify development as well as promote portability and reusability between satellite missions, a hardware/software co-design of the Unscented Kalman Filter (UKF) implemented on an FPGA device is presented. The design is implemented on a Zynq-7000 XC7Z020 to establish proof-of-concept and verified using simulated data. The design achieved a 1.5× speed-up over a purely software implementation, and the resource usage and power consumption are both low enough to be integrated into a full SoC.
Citations: 7
New voter design enabling hot redundancy for asynchronous network nodes
Pub Date : 2014-07-14 DOI: 10.1109/AHS.2014.6880154
Felix Siegle, T. Vladimirova, J. Ilstad, Omar Emam
In this paper, a novel voter design is presented which allows the voting of asynchronous network streams in flow-controlled networks. The voter synchronises incoming data streams automatically and is able to handle failure modes that typically occur in streaming applications. The voter degrades to a comparator if one of the redundant channels has failed and reintegrates the channels once they are functional again. While the voter is mainly intended to be connected to a routing switch of the network, it also comprises a broadcast mechanism that enables a stand-alone operation. The design has been successfully implemented in hardware and evaluated by means of fault injection experiments.
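The degradation from voter to comparator described above can be sketched in a few lines. This is a simplified model that assumes the redundant streams are already aligned word by word; it ignores the stream synchronisation, flow control and broadcast mechanism that the actual design handles. The function name `vote`, the `healthy` flags and the `(word, error)` return convention are hypothetical.

```python
def vote(words, healthy):
    """Vote on one data word from each redundant channel.

    words   -- one word per channel
    healthy -- matching list of flags; False marks a channel considered failed
    Returns (output_word_or_None, error_flag).
    """
    live = [w for w, ok in zip(words, healthy) if ok]

    if len(live) >= 3:
        # Voter mode: a strict majority masks a single faulty channel.
        for w in live:
            if live.count(w) * 2 > len(live):
                return w, False
        return None, True

    if len(live) == 2:
        # Comparator mode: a mismatch can be detected but not corrected.
        if live[0] == live[1]:
            return live[0], False
        return None, True

    if len(live) == 1:
        return live[0], False  # single channel: passed through unchecked

    return None, True          # no working channel left
```

Setting a channel's flag back to `True` once it works again returns the function to full majority voting, which loosely mirrors the reintegration behaviour the abstract describes.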
Citations: 5