Vivek Parmar, Syed Shakib Sarwar, Ziyun Li, Hsien-Hsin S. Lee, Barbara De Salvo, Manan Suri
{"title":"面向扩展现实应用的边缘ai硬件面向内存设计优化研究","authors":"Vivek Parmar, Syed Shakib Sarwar, Ziyun Li, Hsien-Hsin S. Lee, Barbara De Salvo, Manan Suri","doi":"10.1109/mm.2023.3321249","DOIUrl":null,"url":null,"abstract":"Low-Power Edge-AI capabilities are essential for on-device extended reality (XR) applications to support the vision of Metaverse. In this work, we investigate two representative XR workloads: (i) Hand detection and (ii) Eye segmentation, for hardware design space exploration. For both applications, we train deep neural networks and analyze the impact of quantization and hardware-specific bottlenecks. Through simulations, we evaluate a CPU and two systolic inference accelerator implementations. Next, we compare these hardware solutions with advanced technology nodes. The impact of integrating state-of-the-art emerging non-volatile memory (NVM) technology (STT/SOT/VGSOT MRAM) into the XR-AI inference pipeline is evaluated. We found that significant energy benefits (≥24%) can be achieved for hand detection (IPS=10) and eye segmentation (IPS=0.1) by introducing NVM in the memory hierarchy for designs at 7nm node while meeting minimum IPS (inference per second). Moreover, we can realize substantial reduction in area (≥30%) owing to the small form factor of MRAM.","PeriodicalId":13100,"journal":{"name":"IEEE Micro","volume":"51 12","pages":"0"},"PeriodicalIF":2.8000,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Exploring Memory-Oriented Design Optimization of Edge-AI Hardware for Extended Reality Applications\",\"authors\":\"Vivek Parmar, Syed Shakib Sarwar, Ziyun Li, Hsien-Hsin S. Lee, Barbara De Salvo, Manan Suri\",\"doi\":\"10.1109/mm.2023.3321249\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Low-Power Edge-AI capabilities are essential for on-device extended reality (XR) applications to support the vision of Metaverse. In this work, we investigate two representative XR workloads: (i) Hand detection and (ii) Eye segmentation, for hardware design space exploration. For both applications, we train deep neural networks and analyze the impact of quantization and hardware-specific bottlenecks. Through simulations, we evaluate a CPU and two systolic inference accelerator implementations. Next, we compare these hardware solutions with advanced technology nodes. The impact of integrating state-of-the-art emerging non-volatile memory (NVM) technology (STT/SOT/VGSOT MRAM) into the XR-AI inference pipeline is evaluated. We found that significant energy benefits (≥24%) can be achieved for hand detection (IPS=10) and eye segmentation (IPS=0.1) by introducing NVM in the memory hierarchy for designs at 7nm node while meeting minimum IPS (inference per second). Moreover, we can realize substantial reduction in area (≥30%) owing to the small form factor of MRAM.\",\"PeriodicalId\":13100,\"journal\":{\"name\":\"IEEE Micro\",\"volume\":\"51 12\",\"pages\":\"0\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2023-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Micro\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/mm.2023.3321249\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Micro","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/mm.2023.3321249","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Exploring Memory-Oriented Design Optimization of Edge-AI Hardware for Extended Reality Applications
Low-Power Edge-AI capabilities are essential for on-device extended reality (XR) applications to support the vision of Metaverse. In this work, we investigate two representative XR workloads: (i) Hand detection and (ii) Eye segmentation, for hardware design space exploration. For both applications, we train deep neural networks and analyze the impact of quantization and hardware-specific bottlenecks. Through simulations, we evaluate a CPU and two systolic inference accelerator implementations. Next, we compare these hardware solutions with advanced technology nodes. The impact of integrating state-of-the-art emerging non-volatile memory (NVM) technology (STT/SOT/VGSOT MRAM) into the XR-AI inference pipeline is evaluated. We found that significant energy benefits (≥24%) can be achieved for hand detection (IPS=10) and eye segmentation (IPS=0.1) by introducing NVM in the memory hierarchy for designs at 7nm node while meeting minimum IPS (inference per second). Moreover, we can realize substantial reduction in area (≥30%) owing to the small form factor of MRAM.
期刊介绍:
IEEE Micro addresses users and designers of microprocessors and microprocessor systems, including managers, engineers, consultants, educators, and students involved with computers and peripherals, components and subassemblies, communications, instrumentation and control equipment, and guidance systems. Contributions should relate to the design, performance, or application of microprocessors and microcomputers. Tutorials, review papers, and discussions are also welcome. Sample topic areas include architecture, communications, data acquisition, control, hardware and software design/implementation, algorithms (including program listings), digital signal processing, microprocessor support hardware, operating systems, computer aided design, languages, application software, and development systems.