
IEEE Micro: Latest Articles

Hardware-Software co-design for real-time latency-accuracy navigation in tinyML applications
Zone 3, Computer Science; Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2023-11-01; DOI: 10.1109/mm.2023.3317243
Payman Behnam, Jianming Tong, Alind Khare, Yangyu Chen, Yue Pan, Pranav Gadikar, Abhimanyu Bambhaniya, Tushar Krishna, Alexey Tumanov
tinyML applications increasingly operate in dynamically changing deployment scenarios, which requires optimizing for both accuracy and latency. Existing methods mainly target a single point in the accuracy/latency tradeoff space, which is insufficient because no single static point can be optimal under variable conditions. We draw on a recently proposed weight-shared SuperNet mechanism to serve a stream of queries that activates different SubNets within a SuperNet. This creates an opportunity to exploit the inherent temporal locality of different queries that use the same SuperNet. We propose a hardware-software co-design called SUSHI that introduces a novel SubGraph Stationary optimization. SUSHI consists of a novel FPGA implementation and a software scheduler that controls which SubNets to serve and which SubGraph to cache in real time. SUSHI yields up to a 32% improvement in latency, a 0.98% increase in served accuracy, and up to 78.7% savings in off-chip energy across several neural network architectures.
Citations: 0
A 10.7-µJ/frame 88% Accuracy CIFAR-10 Single-chip Neuromorphic FPGA Processor Featuring Various Nonlinear Functions of Dendrites in Human Cerebrum
Zone 3, Computer Science; Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2023-11-01; DOI: 10.1109/mm.2023.3315676
Atsutake Kosuge, Yao-Chung Hsu, Rei Sumikawa, Mototsugu Hamada, Tadahiro Kuroda, Tomoe Ishikawa
A neuromorphic architecture is suitable for low-power tiny-ML processors. However, the large number of synapses used in recent deep neural networks requires a multi-chip implementation, resulting in large power consumption due to chip-to-chip interfaces. Here, we present a 10.7-µJ/frame single-chip neuromorphic FPGA processor. To reduce the required hardware resources, we have developed two techniques. The first is a dendrite-inspired nonlinear neural network (dNNN) that mimics various nonlinear functions of dendrite spines in the human cerebrum. The second is a line scan-based architecture that reduces the total amount of hardware resources. The 14-layer convolutional neural network, which achieves an 88% accuracy with the CIFAR-10 dataset, was implemented on a single FPGA board. Compared to a state-of-the-art spiking CNN-based neuromorphic FPGA processor, the energy efficiency of the proposed architecture is improved by a factor of 94.4 while achieving a 6% better classification accuracy.
Citations: 0
Fifty Years of the International Symposium on Computer Architecture: A Data-Driven Retrospective
Zone 3, Computer Science; Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2023-11-01; DOI: 10.1109/mm.2023.3324465
Matthew D. Sinclair, Parthasarathy Ranganathan, Gaurang Upasani, Adrian Sampson, David Patterson, Rutwik Jain, Nidhi Parthasarathy, Shaan Shah
2023 marked the fiftieth year of the International Symposium on Computer Architecture (ISCA). As one of the oldest and preeminent computer architecture conferences, ISCA represents a microcosm of the broader community; correspondingly, a 50-year retrospective offers us a great way to track the impact and evolution of the field. Analyzing the content and impact of all the papers published at ISCA so far, we show how computer architecture research has been at the forefront of advances that have driven the broader computing ecosystem. Decadal trends show a dynamic and rapidly evolving field, with diverse contributions. Examining how the most highly cited papers achieve their popularity reveals interesting trends in technology adoption curves and the path to impact. Our data also highlight a growing and thriving community, with interesting insights on diversity and scale. We conclude with a summary of the celebratory panel held at ISCA, with observations on the exciting future ahead.
Citations: 0
Computing in Science & Engineering
Zone 3, Computer Science; Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2023-11-01; DOI: 10.1109/mm.2023.3324749
Citations: 0
Analysis of Historical Patenting Behavior and Patent Characteristics of Computer Architecture Companies—Part VII: Relationship Between Prosecution Time and Claims
Zone 3, Computer Science; Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2023-11-01; DOI: 10.1109/mm.2023.3320318
Joshua J. Yi
A previous article in this series showed that the correlation between the prosecution time and the number of claims was relatively low. This article further analyzes that correlation by examining the effect that patent class has.
Citations: 0
IEEE Computer Society Volunteer Service Awards
Zone 3, Computer Science; Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2023-11-01; DOI: 10.1109/mm.2023.3327852
Citations: 0
Making Machine Learning More Energy Efficient by Bringing it Closer to the Sensor
Zone 3, Computer Science; Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2023-11-01; DOI: 10.1109/mm.2023.3316348
Marius Brehler, Lucas Camphausen, Benjamin Heidebroek, Dennis Krön, Henri Gründer, Simon Camphausen
Processing data close to the sensor on a low-cost, low-power embedded device has the potential to unlock new areas for machine learning (ML). Whether such ML applications can be deployed depends on the energy efficiency of the solution. One way to lower energy consumption is to bring the application as close as possible to the sensor. We demonstrate the concept of transforming an ML application running near the sensor into a hybrid near-sensor/in-sensor application. This approach aims to reduce the overall energy consumption, and we showcase it using a motion classification example, which can be considered a simpler sub-problem of activity recognition. The reduction in energy consumption is achieved by combining a convolutional neural network with a decision tree. Both applications are compared in terms of accuracy and energy consumption, illustrating the benefits of the hybrid approach.
Citations: 0
IEEE Computer Society Career Center
Zone 3, Computer Science; Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2023-11-01; DOI: 10.1109/mm.2023.3322209
Citations: 0
Exploring Memory-Oriented Design Optimization of Edge-AI Hardware for Extended Reality Applications
Zone 3, Computer Science; Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2023-11-01; DOI: 10.1109/mm.2023.3321249
Vivek Parmar, Syed Shakib Sarwar, Ziyun Li, Hsien-Hsin S. Lee, Barbara De Salvo, Manan Suri
Low-power Edge-AI capabilities are essential for on-device extended reality (XR) applications to support the vision of the Metaverse. In this work, we investigate two representative XR workloads for hardware design space exploration: (i) hand detection and (ii) eye segmentation. For both applications, we train deep neural networks and analyze the impact of quantization and hardware-specific bottlenecks. Through simulations, we evaluate a CPU and two systolic inference accelerator implementations. Next, we compare these hardware solutions with advanced technology nodes. We evaluate the impact of integrating state-of-the-art emerging non-volatile memory (NVM) technology (STT/SOT/VGSOT MRAM) into the XR-AI inference pipeline. We found that significant energy benefits (≥24%) can be achieved for hand detection (IPS=10) and eye segmentation (IPS=0.1) by introducing NVM in the memory hierarchy for designs at the 7 nm node while meeting the minimum IPS (inferences per second). Moreover, we can realize a substantial reduction in area (≥30%) owing to the small form factor of MRAM.
Citations: 0
Special Issue on TinyML
Zone 3, Computer Science; Q2, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Pub Date: 2023-11-01; DOI: 10.1109/mm.2023.3322048
Vijay Janapa Reddi, Boris Murmann
This IEEE Micro special issue on tiny machine learning (TinyML) explores cutting-edge research on optimizing machine learning models for highly resource-constrained devices like microcontrollers and embedded systems. The articles cover techniques across the full TinyML stack, including efficient neural network design, on-device learning, model compression, hardware–software co-design, and specialized applications. These selected works showcase techniques to enable increasingly sophisticated intelligence on low-power, memory-constrained edge devices. They provide valuable insights to overcome challenges in deploying performant yet compact TinyML solutions that can perceive, reason, and interact intelligently, even at the very edge.
Citations: 0