{"title":"XNOR-POP:一种用于宽io2 dram的二进制卷积神经网络的内存处理架构","authors":"Lei Jiang, Minje Kim, Wujie Wen, Danghui Wang","doi":"10.1109/ISLPED.2017.8009163","DOIUrl":null,"url":null,"abstract":"It is challenging to adopt computing-intensive and parameter-rich Convolutional Neural Networks (CNNs) in mobile devices due to limited hardware resources and low power budgets. To support multiple concurrently running applications, one mobile device needs to perform multiple CNN tests simultaneously in real-time. Previous solutions cannot guarantee a high enough frame rate when serving multiple applications with reasonable hardware and power cost. In this paper, we present a novel process-in-memory architecture to process emerging binary CNN tests in Wide-IO2 DRAMs. Compared to state-of-the-art accelerators, our design improves CNN test performance by 4× ∼ 11× with small hardware and power overhead.","PeriodicalId":385714,"journal":{"name":"2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"52","resultStr":"{\"title\":\"XNOR-POP: A processing-in-memory architecture for binary Convolutional Neural Networks in Wide-IO2 DRAMs\",\"authors\":\"Lei Jiang, Minje Kim, Wujie Wen, Danghui Wang\",\"doi\":\"10.1109/ISLPED.2017.8009163\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"It is challenging to adopt computing-intensive and parameter-rich Convolutional Neural Networks (CNNs) in mobile devices due to limited hardware resources and low power budgets. To support multiple concurrently running applications, one mobile device needs to perform multiple CNN tests simultaneously in real-time. Previous solutions cannot guarantee a high enough frame rate when serving multiple applications with reasonable hardware and power cost. In this paper, we present a novel process-in-memory architecture to process emerging binary CNN tests in Wide-IO2 DRAMs. Compared to state-of-the-art accelerators, our design improves CNN test performance by 4× ∼ 11× with small hardware and power overhead.\",\"PeriodicalId\":385714,\"journal\":{\"name\":\"2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-07-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"52\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISLPED.2017.8009163\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISLPED.2017.8009163","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
XNOR-POP: A processing-in-memory architecture for binary Convolutional Neural Networks in Wide-IO2 DRAMs
It is challenging to adopt computing-intensive and parameter-rich Convolutional Neural Networks (CNNs) in mobile devices due to limited hardware resources and low power budgets. To support multiple concurrently running applications, one mobile device needs to perform multiple CNN tests simultaneously in real-time. Previous solutions cannot guarantee a high enough frame rate when serving multiple applications with reasonable hardware and power cost. In this paper, we present a novel process-in-memory architecture to process emerging binary CNN tests in Wide-IO2 DRAMs. Compared to state-of-the-art accelerators, our design improves CNN test performance by 4× ∼ 11× with small hardware and power overhead.