SAC: An Ultra-Efficient Spin-based Architecture for Compressed DNNs
Yunping Zhao, Sheng Ma, Hengzhu Liu, Libo Huang, Yi Dai
ACM Transactions on Architecture and Code Optimization, November 14, 2023. DOI: 10.1145/3632957
Deep Neural Networks (DNNs) have achieved great progress in academia and industry, but they have become computationally and memory intensive as network depth increases. Previous designs seek breakthroughs at the software and hardware levels to mitigate these challenges. At the software level, neural network compression techniques effectively reduce network scale and energy consumption; however, conventional compression algorithms are complex and energy intensive. At the hardware level, improvements in semiconductor processes have effectively reduced power and energy consumption, but it is difficult for the traditional von Neumann architecture to reduce power consumption further, due to the memory wall and the end of Moore's law. To overcome these challenges, spintronic-device-based DNN machines have emerged for their non-volatility, ultra-low power, and high energy efficiency. However, no spin-based design has achieved innovation at both the software and hardware levels. Specifically, there has been no systematic study of spin-based DNN architectures for deploying compressed networks. In our study, we present an ultra-efficient Spin-based Architecture for Compressed DNNs (SAC) to substantially reduce power and energy consumption. Specifically, we propose a One-Step Compression algorithm (OSC) that reduces computational complexity with minimal accuracy loss. We also propose a spin-based architecture that realizes better performance for the compressed network. Furthermore, we introduce a novel computation flow that enables the reuse of activations and weights. Experimental results show that our study reduces the computational complexity of the compression algorithm from \(\mathcal{O}(Tk^3)\) to \(\mathcal{O}(k^2 \log k)\) and achieves a 14× to 40× compression ratio. Furthermore, our design attains a 2× improvement in power efficiency and a 5× improvement in computational efficiency compared to Eyeriss. Our models are available at an anonymous link: https://bit.ly/39cdtTa.
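To make the stated complexity contrast concrete, below is a minimal Python sketch of the two cost regimes. It assumes an iterative, retraining-style loop as the \(\mathcal{O}(Tk^3)\) baseline and a per-row FFT pruning pass as one way an \(\mathcal{O}(k^2 \log k)\) one-step transform could be structured; the function names and the 4× frequency-pruning choice are hypothetical illustrations of the asymptotics, not the paper's actual OSC algorithm.

```python
import numpy as np

def iterative_compress(W, T=10):
    # Hypothetical retraining-style baseline: each of the T rounds is
    # dominated by a k x k matrix product, so the total cost is O(T k^3).
    k = W.shape[0]
    for _ in range(T):
        W = W @ (np.eye(k) - 0.01 * np.sign(W))  # placeholder refinement step
    return W

def one_step_compress(W, keep_ratio=0.25):
    # Hypothetical one-step transform: one length-k FFT per row of a
    # k x k matrix costs k * O(k log k) = O(k^2 log k) in total.
    spectrum = np.fft.fft(W, axis=1)
    k = spectrum.shape[1]
    keep = max(1, int(k * keep_ratio))
    # Zero every frequency component except the `keep` largest per row.
    order = np.argsort(-np.abs(spectrum), axis=1)  # descending magnitude
    np.put_along_axis(spectrum, order[:, keep:], 0.0, axis=1)
    return np.real(np.fft.ifft(spectrum, axis=1))

if __name__ == "__main__":
    W = np.random.randn(64, 64)
    W_c = one_step_compress(W)
    kept = np.linalg.norm(W_c) / np.linalg.norm(W)
    print(f"retained energy after 4x frequency pruning: {kept:.1%}")
```

In this framing, the savings come from replacing T rounds of k×k matrix products (each \(\mathcal{O}(k^3)\)) with a single pass of k length-k FFTs (each \(\mathcal{O}(k \log k)\)); the paper's reported 14× to 40× compression ratios come from its own OSC method, not from this sketch.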
Journal Introduction:
ACM Transactions on Architecture and Code Optimization (TACO) focuses on hardware, software, and system research spanning the fields of computer architecture and code optimization. Articles that appear in TACO will either present new techniques and concepts or report on experiences and experiments with actual systems. Insights useful to architects, hardware or software developers, designers, builders, and users will be emphasized.