MN-Core - A Highly Efficient and Scalable Approach to Deep Learning
Ken Namura, Johannes Maximilian Kühn, T. Adachi, H. Imachi, H. Kaneko, T. Kato, Go Watanabe, Naoto Tanaka, S. Kashihara, Hiroshi Miyashita, Y. Tomonaga, Ryosuke Okuta, Takuya Akiba, Brian K. Vogel, S. Kitajo, F. Osawa, K. Takahashi, Y. Takatsukasa, K. Mizumaru, T. Yamauchi, J. Ono, A. Takahashi, Tanvir Ahmed, Y. Doi, K. Hiraki, J. Makino
2021 Symposium on VLSI Circuits, published 2021-06-13
DOI: 10.23919/VLSICircuits52068.2021.9492395 (https://doi.org/10.23919/VLSICircuits52068.2021.9492395)
Citations: 1
Abstract
MN-Core is a highly efficient deep learning training accelerator that exceeds 1 TFLOPS/W (half-precision) at board level in real-world mixed-precision workloads. To reach and sustain this level of performance, the design is partitioned and packaged as a four-die MCM package exceeding 3000 mm² of die area.