Authors: Saksham Soni, Dileep Kurian, A. V, A. Sreenath
Venue: 2019 IEEE Conference on Modeling of Systems Circuits and Devices (MOS-AK India)
Publication date: 2019-02-01
DOI: 10.1109/MOS-AK.2019.8902425 (https://doi.org/10.1109/MOS-AK.2019.8902425)
Reconfigurable Math Accelerator for ultra-low power sensing workloads on IoT edge devices
Fast-evolving algorithms in domains like machine learning/AI demand some level of programmability to remain market relevant. Current approaches to programmability, such as DSP cores and FPGAs, are not energy efficient and hence not suitable for power-constrained IoT edge devices. This paper looks at an alternative approach to programmability: a coarse-grained reconfigurable accelerator built as a library of mathematical functions implemented on a chassis. The architecture is implemented in Intel 14 nm CMOS technology, occupies an area of 0.015 mm², and consumes less than 100 µJ on typical workloads. Sensor fusion algorithms such as Kalman and Madgwick filters are mapped onto this IP as a case study to verify the solution. The results show a 100x improvement in power and performance compared with software implementations of these algorithms on generic DSP cores.
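To give a sense of why a sensor fusion workload can be served by a library of mathematical functions, the sketch below writes a scalar Kalman filter as a short sequence of add, subtract, multiply, and divide operations. It is an illustrative assumption only, not the paper's implementation or its mapping onto the accelerator: the names `kalman_step` and `run_filter`, the constant-state model, and the noise parameters are all hypothetical, and a practical sensor fusion filter would use the matrix form.

```python
# Illustrative only: a scalar (1-D) Kalman filter written as a sequence of
# primitive math operations. Function and parameter names are hypothetical
# and do not come from the paper.

def kalman_step(x, p, z, q, r):
    """One predict/update cycle for a constant-state model.

    x: prior state estimate, p: prior estimate variance,
    z: new measurement, q: process noise variance,
    r: measurement noise variance.
    """
    # Predict: the state model is the identity, so only the variance grows.
    p_pred = p + q
    # Update: compute the Kalman gain, then correct state and variance.
    k = p_pred / (p_pred + r)
    x_new = x + k * (z - x)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new


def run_filter(measurements, x0=0.0, p0=1.0, q=1e-4, r=0.1):
    """Filter a list of noisy scalar readings and return the estimates."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        x, p = kalman_step(x, p, z, q, r)
        estimates.append(x)
    return estimates
```

Each step uses only a handful of arithmetic primitives, which is the kind of operation mix that a fixed library of math functions could plausibly cover; how the paper actually partitions the Kalman and Madgwick filters is not stated in the abstract.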