Resistive random-access memory (RRAM) has been used to realize a range of hardware components, notably dot-product engines that accelerate neural network execution, and several RRAM dot-product engine (RDPE) designs have been proposed over the last decade. Although RRAM is a promising technology, its devices suffer from reliability issues that affect performance. In particular, stuck-at faults (SAFs) in RRAM devices significantly degrade the accuracy of neural networks running on an RDPE, so the effect of these faults on applications needs to be assessed. This paper therefore introduces a novel reliability concept, Execution-Guided Reliability (EGR), for evaluating the impact on executable software of unreliable RRAM behavior caused by SAFs, the most notable such phenomenon. The EGR model is formulated mathematically using the reliability block diagram method and then applied to evaluate RDPE reliability. We integrate EGR into a framework with multiple RDPEs, perform several experiments, and report numerical results. We then analyze the correlation between the EGR model and the computation error magnitudes of RDPEs to explain the behavior of SAFs in RRAM devices. The results show that the correlation ranging from to indicates, with high confidence, a decreasing monotonic trend between EGR values and error magnitudes.
