Sethu Jose, J. Sampson, N. Vijaykrishnan, M. Kandemir
With the growing popularity of the Internet of Things (IoT), emerging applications demand that edge nodes provide higher computational capabilities and long operation times while requiring minimal maintenance. Ambient energy harvesting is a promising alternative to batteries, but only if the hardware and software are optimized for the intermittent nature of the power source. At the same time, many compute tasks in IoT workloads involve executing decomposable kernels that may have application-dependent accuracy requirements. In this work, we introduce a hardware-software co-optimization framework for such kernels that aims to achieve maximum forward progress while running on energy-harvesting Non-Volatile Processors (NVPs). Using this framework, we develop an FFT accelerator and a convolution accelerator that compute up to 3.2x faster while consuming 5.4x less energy than a baseline energy-harvesting system. With our accuracy-aware scheduling strategy, the approximate computing enabled by this framework delivers on average a 6.2x energy reduction and a 3.2x speedup while sacrificing at most 6.9% accuracy.
{"title":"A Scheduling Framework for Decomposable Kernels on Energy Harvesting IoT Edge Nodes","authors":"Sethu Jose, J. Sampson, N. Vijaykrishnan, M. Kandemir","doi":"10.1145/3526241.3530350","DOIUrl":"https://doi.org/10.1145/3526241.3530350","url":null,"abstract":"With the growing popularity of the Internet of Things (IoTs), emerging applications demand that edge nodes provide higher computational capabilities and long operation times while requiring minimal maintenance. Ambient energy harvesting is a promising alternative to batteries, but only if the hardware and software are optimized for the intermittent nature of the power source. At the same time, many compute tasks in IoT workloads involve executing decomposable kernels that may have application-dependent accuracy requirements. In this work, we introduce a hardware-software co-optimization framework for such kernels that aim to achieve maximum forward progress while running on energy harvesting Non-Volatile Processors (NVP). Using this framework, we develop an FFT and a convolution accelerator that computes up to 3.2x faster, while consuming 5.4x less energy, compared to a baseline energy-harvesting system. With our accuracy-aware scheduling strategy, the approximate computing enabled by this framework delivers on average 6.2x energy reduction and 3.2x speedup by sacrificing minimal accuracy of up to 6.9%.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130506607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Brain-inspired Hyperdimensional (HD) computing is a new machine learning approach that leverages simple and highly parallelizable operations. Unfortunately, none of the HD computing algorithms published to date has been able to accurately classify more complex image datasets, such as CIFAR100. In this work, we propose HDnn-PIM, which implements both feature extraction and HD-based classification for complex images by using processing-in-memory. We compare HDnn-PIM with HD-only and CNN implementations on various image datasets. HDnn-PIM achieves 52.4% higher accuracy than pure HD computing. It also gains a 1.2% accuracy improvement over state-of-the-art CNNs, with a 3.63x smaller memory footprint and 1.53x fewer MAC operations. Furthermore, HDnn-PIM is 3.6x-223x faster than an RTX 3090 GPU and 3.7x more energy efficient than the state-of-the-art FloatPIM.
{"title":"HDnn-PIM: Efficient in Memory Design of Hyperdimensional Computing with Feature Extraction","authors":"Arpan Dutta, Saransh Gupta, Behnam Khaleghi, Rishikanth Chandrasekaran, Weihong Xu, T. Simunic","doi":"10.1145/3526241.3530331","DOIUrl":"https://doi.org/10.1145/3526241.3530331","url":null,"abstract":"Brain-inspired Hyperdimensional (HD) computing is a new machine learning approach that leverages simple and highly parallelizable operations. Unfortunately, none of the published HD computing algorithms to date have been able to accurately classify more complex image datasets, such as CIFAR100. In this work, we propose HDnn-PIM, that implements both feature extraction and HD-based classification for complex images by using processing-in-memory. We compare HDnn-PIM with HD-only and CNN implementations for various image datasets. HDnn-PIM achieves 52.4% higher accuracy as compared to pure HD computing. It also gains 1.2% accuracy improvement over state-of-the-art CNNs, but with 3.63x smaller memory footprint and 1.53x less MAC operations. Furthermore, HDnn-PIM is 3.6x-223x faster than RTX 3090 GPU, and 3.7x more energy efficient than state-of-the-art FloatPIM.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125424162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stochastic computing is a paradigm in which logical operations are performed on randomly generated bit streams. Complex arithmetic operations can be performed by simple logic circuits, with a much smaller area footprint than their conventional binary counterparts. However, the random or pseudorandom sources required to generate the bit streams are costly in terms of area and offset the gains. Also, due to randomness, the computation is not precise, which limits the applicability of the paradigm. Most importantly, achieving reasonable accuracy requires high latency. Recently, deterministic approaches to stochastic computing have been proposed. They demonstrated that randomness is not a requirement: by structuring the computation deterministically, the result is exact and the latency is greatly reduced. However, although this is an improvement over conventional stochastic techniques, the latency increases quadratically with each level of logic. Beyond a few levels of logic, it becomes unmanageable. In this paper, we present a method for approximating the results of these deterministic methods, with latency that increases only linearly with each level. The improvement comes at the cost of additional logic, but we demonstrate that the increase in area scales with √n, where n is the equivalent number of binary bits of precision. The new approach is general, efficient, composable, and applicable to all arithmetic operations performed with stochastic logic.
{"title":"A Scalable, Deterministic Approach to Stochastic Computing","authors":"Y. Kiran, Marc D. Riedel","doi":"10.1145/3526241.3530344","DOIUrl":"https://doi.org/10.1145/3526241.3530344","url":null,"abstract":"Stochastic computing is a paradigm in which logical operations are performed on randomly generated bit streams. Complex arithmetic operations can be performed by simple logic circuits, with a much smaller area footprint than conventional binary counterparts. However, the random or pseudorandom sources required to generate the bit streams are costly in terms of area and offset the gains. Also, due to randomness, the computation is not precise, which limits the applicability of the paradigm. Most importantly, to achieve reasonable accuracy, high latency is necessitated. Recently, deterministic approaches to stochastic computing have been proposed. They demonstrated that randomness is not a requirement. By structuring the computation deterministically, the result is exact and the latency is greatly reduced. However, despite being an improvement over conventional stochastic techniques, the latency increases quadratically with each level of logic. Beyond a few levels of logic, it becomes unmanageable. In this paper, we present a method for approximating the results of their deterministic method, with latency that only increases linearly with each level. The improvement comes at the cost of additional logic, but we demonstrate that the increase in area scales with √n, where n is the equivalent number of binary bits of precision. The new approach is general, efficient, composable, and applicable to all arithmetic operations performed with stochastic logic.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"514 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123201919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
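To make the background in the stochastic computing abstract concrete, the following is a minimal Python sketch of multiplication with a single AND gate, once on random bit streams and once on deterministic unary streams. It illustrates only the prior techniques the abstract refers to, not the linear-latency method the paper proposes. The function names are hypothetical, and the pairing scheme (one stream repeated while each bit of the other is held, often called clock division) is assumed here as one representative deterministic arrangement.

```python
import random


def random_stream(p, length, rng):
    """Random bit stream whose fraction of 1s approximates p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def unary_stream(k, n):
    """Deterministic unary stream of length n encoding the value k/n."""
    return [1] * k + [0] * (n - k)


def value(stream):
    """The value a stream encodes: the fraction of bits that are 1."""
    return sum(stream) / len(stream)


def stochastic_multiply(a, b, length, seed=0):
    """Approximate a*b by ANDing two independent random streams."""
    rng = random.Random(seed)
    x = random_stream(a, length, rng)
    y = random_stream(b, length, rng)
    return value([p & q for p, q in zip(x, y)])


def deterministic_multiply(ka, kb, n):
    """Exact (ka/n)*(kb/n): every bit of x meets every bit of y once.

    x repeats every n cycles while each bit of y is held for n cycles,
    so the output stream is n*n bits long.
    """
    x = unary_stream(ka, n)
    y = unary_stream(kb, n)
    out = [x[t % n] & y[t // n] for t in range(n * n)]
    return value(out), len(out)


if __name__ == "__main__":
    print(stochastic_multiply(0.75, 0.5, 10_000))  # close to 0.375, not exact
    print(deterministic_multiply(3, 2, 4))         # exact product, 16-bit output
```

In this sketch the deterministic product of two 4-bit streams is exact but occupies 16 bits; feeding that output into a further multiplication would square the length again, which is the quadratic growth per level of logic that the abstract identifies as the bottleneck.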
{"title":"Session details: Session 7A: Special Session - 3: Machine Learning-Aided Computer-Aided Design","authors":"Sai Manoj Pudukotai Dinakarrao","doi":"10.1145/3542694","DOIUrl":"https://doi.org/10.1145/3542694","url":null,"abstract":"","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125136345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 3A: VLSI Design + VLSI Circuits and Power Aware Design 1","authors":"S. Mohanty","doi":"10.1145/3542686","DOIUrl":"https://doi.org/10.1145/3542686","url":null,"abstract":"","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121750945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Kumar, Anjum Riaz, Yamuna Prasad, Satyadev Ahlawat
The IEEE 1687 standard, which is commonly used for efficient access to on-chip instruments, could be exploited by an intruder and thus needs to be secured. One technique for mitigating the vulnerability of a 1687 network is a secure access protocol based on licensed access software, a Chip ID, and a locking SIB. Licensed access software is generally used to gain control of the embedded instruments and use them as required. In this paper, we mount a successful attack on this secure access protocol scheme using various machine learning algorithms. We demonstrate that machine learning algorithms can breach the secure communication between the access software and the board and thereby gain access to the sensitive instruments. Furthermore, Random Forest significantly outperforms the other models in breaking the security.
{"title":"On Attacking Locking SIB based IJTAG Architecture","authors":"G. Kumar, Anjum Riaz, Yamuna Prasad, Satyadev Ahlawat","doi":"10.1145/3526241.3530370","DOIUrl":"https://doi.org/10.1145/3526241.3530370","url":null,"abstract":"The IEEE 1687 standard, which is commonly used for efficient access of on-chip instruments, could be exploited by an intruder and thus needs to be secured. One of the techniques to alleviate the vulnerability of 1687 network is to use a secure access protocol that is based on licensed access software, Chip ID and locking SIB. A licensed access software is generally used to gain control of the embedded instruments and use them as per requirement. In this paper, a successful attack using various machine learning algorithms has been instigated on secure access protocol scheme. It is demonstrated that machine learning algorithms have the potential of breaching the secure communication between the access software and the board and hence access the sensitive instruments. Furthermore, Random Forest significantly outperforms the other models in terms of breaking the security.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134460618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ransomware has become a serious threat in cyberspace. Existing software pattern-based malware detectors are specific to certain ransomware and may not capture new variants. Ransomware shares a common essential behavior: it employs local cryptographic software for malicious encryption and therefore leaves footprints on the victim machine's caches. Building on this observation, this work proposes an anti-ransomware methodology, Ran$Net, based on hardware activities. It consists of a passive cache monitor that logs suspicious cache activities and a follow-on non-profiled deep learning analysis strategy that retrieves the secret cryptographic key from the timing traces generated by the monitor. We implement the first-of-its-kind tool to combat an open-source ransomware and successfully recover the secret key.
{"title":"Ran$Net: An Anti-Ransomware Methodology based on Cache Monitoring and Deep Learning","authors":"Xiang Zhang, Ziyue Zhang, Ruyi Ding, Gongye Cheng, A. Ding, Yunsi Fei","doi":"10.1145/3526241.3530830","DOIUrl":"https://doi.org/10.1145/3526241.3530830","url":null,"abstract":"Ransomware has become a serious threat in the cyberspace. Existing software pattern-based malware detectors are specific for certain ransomware and may not capture new variants. Recognizing a common essential behavior of ransomware - employing local cryptographic software for malicious encryption and therefore leaving footprints on the victim machine's caches, this work proposes an anti-ransomware methodology, Ran$Net, based on hardware activities. It consists of a passive cache monitor to log suspicious cache activities, and a follow-on non-profiled deep learning analysis strategy to retrieve the secret cryptographic key from the timing traces generated by the monitor. We implement the first of its kind tool to combat an open-source ransomware and successfully recover the secret key.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133793559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 4A: Testing,Reliability and Fault Tolerance","authors":"Mark Zwolinski","doi":"10.1145/3542688","DOIUrl":"https://doi.org/10.1145/3542688","url":null,"abstract":"","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125116154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Session details: Session 1A: Hardware Security","authors":"K. Gaj","doi":"10.1145/3542682","DOIUrl":"https://doi.org/10.1145/3542682","url":null,"abstract":"","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133544461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In technology and engineering education, there is considerable uncertainty about what the future trends will be. Institutions are preparing and training their students for jobs that they have not even explored yet. To overcome this uncertainty, new domains with overlapping skill sets are constantly integrated to engage students with technological development for the future computing era. Robotics and the Internet of Things have been popular areas of interest among electrical and computer engineers, with high global value. Soft robots can be described as a form of biomimicry in which traditional hard robotics are replaced by a more sophisticated model that imitates human, animal, and plant life. In this article, we discuss a problem-based learning approach to integrating key concepts of soft robotics into undergraduate electrical engineering curricula. The proposed module can be easily integrated into any IoT and robotics curriculum.
{"title":"IoT-enabled Soft Robotics for Electrical Engineers","authors":"P. Sundaravadivel, P. Ghosh, Bikal Suwal","doi":"10.1145/3526241.3530369","DOIUrl":"https://doi.org/10.1145/3526241.3530369","url":null,"abstract":"In the field of technology and engineering education, there is a lot of uncertainty as to what the future trends are going to be. The institutions are preparing and training their students for jobs that they haven't even explored yet. To overcome this uncertainty, new domains with overlapping skill sets are constantly integrated to engage students with technological development for the future computing era. Robotics and the Internet of Things have been a popular area of interest amongst Electrical and Computer Engineers with high global value. Soft robots can be described as a form of biomimicry in which traditional hard robotics are replaced by a more sophisticated model that imitates human, animal, and plant life. In this article, we discuss a problem-based learning approach to integrate key concepts of soft robotics into the undergraduate electrical engineering curricula. The proposed module can be easily integrated into any IoT and Robotics curriculum.","PeriodicalId":188228,"journal":{"name":"Proceedings of the Great Lakes Symposium on VLSI 2022","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133333590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}