
2021 Symposium on VLSI Circuits: Latest Papers

Multiplex PCR CMOS Biochip for Detection of Upper Respiratory Pathogens including SARS-CoV-2
Pub Date : 2021-06-13 DOI: 10.23919/VLSICircuits52068.2021.9492353
Arun Manickam, Kirsten A. Johnson, Rituraj Singh, Nicholas Wood, Edmond Ku, A. Cuppoletti, M. McDermott, A. Hassibi
A 1024-pixel CMOS biochip for multiplex polymerase chain reaction (PCR) applications is presented. The biosensing pixels include 137 dB DDR photosensors and an integrated emission filter with OD ~6 to perform real-time fluorescence-based measurements while the reaction chamber is thermocycled at heating and cooling rates of > ±10 °C/s. The surface of the CMOS IC is biofunctionalized with DNA capture probes. The biochip is integrated into a fluidic consumable that enables loading of extracted nucleic acid samples and detection of upper respiratory pathogens, including SARS-CoV-2.
Citations: 1
A Fully Integrated Switched-Capacitor Voltage Regulator with Multi-Rate Successive Approximation Achieving 190 ps Transient FoM and 83.7% Conversion Efficiency
Pub Date : 2021-06-13 DOI: 10.23919/VLSICircuits52068.2021.9492333
Bing-Chen Wu, Tsung-Te Liu
This paper presents a fully integrated switched-capacitor DC-DC voltage regulator (SCVR) in standard 28 nm CMOS, using a proposed multi-rate successive approximation (MRSA) regulation algorithm and several conversion-efficiency enhancement techniques. The proposed SCVR achieves a 190 ps transient FoM, a peak conversion efficiency of 83.7% at 114.2 mA/mm², and a 110× supported load range of 80 μA to 8.8 mA.
Citations: 2
A 5.1ms Low-Latency Face Detection Imager with In-Memory Charge-Domain Computing of Machine-Learning Classifiers
Pub Date : 2021-06-13 DOI: 10.23919/VLSICircuits52068.2021.9492432
Hyunsoo Song, Sungjin Oh, Juan Salinas, Sung-Yun Park, E. Yoon
We present a CMOS imager for low-latency face detection enabled by parallel imaging and computing of machine-learning (ML) classifiers. The energy-efficient parallel operation and multi-scale detection eliminate image capture delay and significantly reduce backend computational load. The proposed pixel architecture, composed of dynamic samplers in a global shutter (GS) pixel array, allows energy-efficient in-memory charge-domain computing of feature extraction and classification. Illumination-invariant detection is realized using log-Haar features. A prototype 240×240 imager achieves an on-chip face detection latency of 5.1 ms with a 97.9% true positive rate and a 2% false positive rate at 120 fps. Moreover, the dynamic nature of in-memory computing allows an energy efficiency of 419 pJ/pixel for feature extraction and classification, leading to the smallest latency-energy product of 3.66 ms·nJ/pixel with digital backend processing.
Citations: 2
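The abstract above does not define its log-Haar features. As a rough illustration only, the sketch below assumes they are ordinary two-rectangle Haar-like features computed on log-intensities, which would explain the illumination invariance: a global gain g adds log g to every pixel, and the difference of region means cancels it. The function names, pixel values, and gain are all hypothetical.

```python
import math


def haar_feature(region_a, region_b):
    """Two-rectangle Haar-like feature: difference of the two region means."""
    return sum(region_a) / len(region_a) - sum(region_b) / len(region_b)


def log_haar_feature(region_a, region_b):
    """The same feature evaluated on log-intensities (assumed definition)."""
    log_a = [math.log(p) for p in region_a]
    log_b = [math.log(p) for p in region_b]
    return haar_feature(log_a, log_b)


# Hypothetical pixel intensities for a bright and a dark rectangle.
bright = [40.0, 50.0, 60.0, 50.0]
dark = [10.0, 12.0, 8.0, 10.0]
gain = 3.0  # hypothetical global illumination change

bright_scaled = [gain * p for p in bright]
dark_scaled = [gain * p for p in dark]

# The linear feature scales with the gain; the log feature does not.
linear_before = haar_feature(bright, dark)
linear_after = haar_feature(bright_scaled, dark_scaled)
log_before = log_haar_feature(bright, dark)
log_after = log_haar_feature(bright_scaled, dark_scaled)
```

Under this assumption a fixed classifier threshold on the log feature stays valid when the scene gets uniformly brighter or darker, whereas the linear feature would need rescaling.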
A 32A 5V-Input, 94.2% Peak Efficiency High-Frequency Power Converter Module Featuring Package-Integrated Low-Voltage GaN NMOS Power Transistors
Pub Date : 2021-06-13 DOI: 10.23919/VLSICircuits52068.2021.9492350
Nachiket V. Desai, H. Krishnamurthy, William J. Lambert, Jingshu Yu, H. Then, N. Butzen, Sheldon Weng, C. Schaef, N. Nidhi, M. Radosavljevic, J. Rode, J. Sandford, K. Radhakrishnan, K. Ravichandran, B. Sell, J. Tschanz, V. De
A 5 V-input, high-frequency, high-density (9 A/mm²) buck converter featuring a low-voltage GaN power transistor (with 5-10× better FoM than Si) with on-die gate clamps, integrated with a CMOS companion die in a 4 mm × 4 mm package, achieves 94.2% peak efficiency for 5 V input and 1 V output at a 3 MHz switching frequency with a 40 nH inductor.
Citations: 2
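For a sense of scale, the operating point quoted above (5 V in, 1 V out, 3 MHz, 40 nH) can be plugged into the textbook ideal buck-converter relations. This is a back-of-the-envelope sketch, not the paper's analysis: it assumes a single phase in continuous conduction with lossless switches, and the abstract does not say whether the 40 nH figure is per phase.

```python
def buck_duty_cycle(v_in, v_out):
    """Ideal buck converter duty cycle in continuous conduction: D = Vout / Vin."""
    return v_out / v_in


def inductor_ripple_pp(v_in, v_out, inductance_h, f_sw_hz):
    """Ideal peak-to-peak inductor current ripple: (Vin - Vout) * D / (L * f_sw)."""
    duty = buck_duty_cycle(v_in, v_out)
    return (v_in - v_out) * duty / (inductance_h * f_sw_hz)


# Values taken from the abstract; the single-phase assumption is ours.
duty = buck_duty_cycle(5.0, 1.0)
ripple_a = inductor_ripple_pp(5.0, 1.0, 40e-9, 3e6)

print(f"duty cycle = {duty:.2f}, ripple = {ripple_a:.2f} A peak-to-peak")
```

With these idealized formulas the duty cycle is 0.2 and the ripple is roughly 6.7 A peak-to-peak, which illustrates why a small inductor calls for a multi-megahertz switching frequency.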
A 1.15μW 5.54mm3 Implant with a Bidirectional Neural Sensor and Stimulator SoC utilizing Bi-Phasic Quasi-static Brain Communication achieving 6kbps-10Mbps Uplink with Compressive Sensing and RO-PUF based Collision Avoidance
Pub Date : 2021-06-13 DOI: 10.23919/VLSICircuits52068.2021.9492445
Baibhab Chatterjee, K. G. Kumar, Mayukh Nath, Shulan Xiao, Nirmoy Modak, D. Das, Jayant Krishna, Shreyas Sen
To solve the challenge of powering and communicating with a brain implant at low end-to-end energy loss, we present Bi-Phasic Quasi-static Brain Communication (BP-QBC). By blocking DC current paths through the brain tissue, BP-QBC achieves < 60 dB worst-case channel loss and ~41× lower power than traditional galvanic body channel communication (G-BCC) at a carrier frequency of 1 MHz (~6× lower power than G-BCC at 10 MHz). An additional 16× improvement in net energy efficiency (pJ/b) is achieved through compressive sensing (CS), allowing a scalable (6 kbps to 10 Mbps) duty-cycled uplink (UL) from the implant to an external wearable, while reducing the active power consumption to 0.52 μW at 10 Mbps, i.e. within the range of harvested body-coupled power in the downlink (DL), with externally applied electric currents < 1/5 of the ICNIRP safety limits. BP-QBC eliminates the need for sub-cranial interrogators by using quasi-static electrical signals for end-to-end BCC, avoiding transduction losses.
Citations: 13
A 5nm Fin-FET 2G-search/s 512-entry x 220-bit TCAM with Single Cycle Entry Update Capability for Data Center ASICs
Pub Date : 2021-06-13 DOI: 10.23919/VLSICircuits52068.2021.9492464
Chetan Deshpande, Ritesh Garg, Gajanan Jedhe, Gaurang Narvekar, Sushil Kumar
This paper presents a 2G-search/s embedded Ternary Content Addressable Memory (TCAM) design in 5 nm Fin-FET technology with the ability to update both SRAM words in a TCAM entry in a single clock cycle. This reduces TCAM update latency by 50% for data center Application Specific Integrated Circuits (ASICs) with only 1% area overhead and no search power penalty. We present a novel time-multiplexed input bus interface on a single-port TCAM cell array and a new architecture to enable fast updates. Silicon measurement shows the highest reported search rate of 2G-search/s at a memory density of 3.48 Mb/mm², including all global peripheral circuitry, for a 512-entry, 220-bit-wide, 110 Kb TCAM.
Citations: 0
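To make the terms in the abstract above concrete, here is a minimal software model of a ternary match. It is a sketch under stated assumptions, not the paper's design: each entry is modeled as two words (a data word and a care mask, standing in for the two SRAM words the abstract mentions), the class and method names are hypothetical, and lowest-index priority is assumed. Real TCAM hardware compares all entries in parallel; the loop only models the result.

```python
ENTRIES = 512
WIDTH_BITS = 220


def capacity_kbit(entries, width_bits):
    """Array capacity in Kb (1 Kb = 1024 bits)."""
    return entries * width_bits / 1024


class TernaryCam:
    """Software model of a TCAM; each row holds (data_word, care_mask)."""

    def __init__(self, entries, width_bits):
        self.width_mask = (1 << width_bits) - 1
        self.rows = [None] * entries

    def update(self, index, data_word, care_mask):
        # Both words of the entry are written together, mirroring the
        # single-cycle entry update described in the abstract.
        self.rows[index] = (data_word & self.width_mask,
                            care_mask & self.width_mask)

    def search(self, key):
        # A row matches when key and data agree on every bit the mask cares
        # about; bits with care_mask = 0 are "don't care".
        for index, row in enumerate(self.rows):
            if row is None:
                continue
            data_word, care_mask = row
            if (key ^ data_word) & care_mask == 0:
                return index  # assumed lowest-index priority
        return None


tcam = TernaryCam(ENTRIES, WIDTH_BITS)
tcam.update(3, data_word=0b1010, care_mask=0b1110)  # bit 0 is don't care
hit = tcam.search(0b1011)
miss = tcam.search(0b0011)
```

The capacity helper also checks the abstract's arithmetic: 512 entries of 220 bits is 112,640 bits, or 110 Kb.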
Technical Session
Pub Date : 2021-06-13 DOI: 10.23919/vlsicircuits52068.2021.9492352
Ghazali Md Zin
Citations: 0
A Machine Learning Inspired Transceiver with ISI-Resilient Data Encoding: Hybrid-Ternary Coding + 2-Tap FFE + CTLE + Feature Extraction and Classification for 44.7dB Channel Loss in 7.3pJ/bit
Pub Date : 2021-06-13 DOI: 10.23919/VLSICircuits52068.2021.9492510
Zhiping Wang, M. Megahed, Yusang Chun, Tejasvi Anand
This paper presents a machine-learning-inspired, energy-efficient transceiver targeting long-reach channels, using ISI-resilient hybrid-ternary encoding on the transmitter and feature extraction and classification on the receiver. In addition to data encoding, the proposed transceiver employs a 2-tap FFE and a CTLE to achieve communication over a 44.7 dB loss FR4 channel with a BER below 1×10⁻⁶ and an energy efficiency of 7.3 pJ/bit at 13.8 Gb/s in 65 nm CMOS.
Citations: 0
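The abstract above names a 2-tap FFE without giving its coefficients. The sketch below shows the generic idea of a 2-tap feed-forward equalizer as a two-coefficient FIR filter. Everything specific is an assumption for illustration: the tap weights are hypothetical, a post-cursor tap is assumed, and plain two-level symbols are used instead of the paper's hybrid-ternary code.

```python
def ffe_2tap(symbols, main_tap, post_tap):
    """Generic 2-tap FFE: y[n] = main_tap * x[n] + post_tap * x[n-1]."""
    out = []
    previous = 0.0  # the line is assumed idle before the first symbol
    for s in symbols:
        out.append(main_tap * s + post_tap * previous)
        previous = s
    return out


# Hypothetical two-level symbol stream and tap weights.
tx_symbols = [1.0, 1.0, -1.0, -1.0, 1.0]
equalized = ffe_2tap(tx_symbols, main_tap=0.75, post_tap=-0.25)
```

With a negative second tap, symbols that follow a transition are sent at full amplitude while repeated symbols are de-emphasized, which pre-compensates for the low-pass behavior of a lossy channel.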
Design and Technology Solutions for 3D Integrated High Performance Systems
Pub Date : 2021-06-13 DOI: 10.23919/VLSICircuits52068.2021.9492421
G. V. D. Plas, E. Beyne
3D system integration builds on interconnect scaling roadmaps of TSVs (5 µm to 100 nm CD) and fine-pitch bumps/pads (to <1 µm pitch) for D2W and W2W schemes. Si bridges connect chiplets at 9.5 Gbps and 338 fJ/b, while W2W fine-pitch memory-logic functional partitioning improves power/performance by 30% vs. 2D. Impingement coolers, BSPDN, high-density MIMCAPs and integrated magnetics push the power wall to 300 W/cm². On the other hand, 3D design flows require further development. Process optimization, DfT, KGD/S and heterogeneous technology optimization of functionally partitioned 3D-SOCs make high-performance systems cost-effective.
Citations: 9
Fugaku and A64FX: the First Exascale Supercomputer and its Innovative Arm CPU
Pub Date : 2021-06-13 DOI: 10.23919/VLSICircuits52068.2021.9492415
S. Matsuoka
Fugaku is the first exascale supercomputer in the world, designed and built primarily by the RIKEN Center for Computational Science (R-CCS) and Fujitsu Ltd., but involving essentially all the major stakeholders in the Japanese HPC community. 'Fugaku' is an alternative name for Mt. Fuji, chosen to signify that the machine seeks not only very high performance but also a broad base of users and applicability. The heart of Fugaku is the new Fujitsu A64FX Arm processor, which is 100% compliant with the AArch64 specifications yet embodies technologies realized for the first time in a major general-purpose server CPU, such as 7 nm process technology, on-package integrated HBM2 and terabyte-class SVE streaming capabilities, an on-die embedded TOFU-D high-performance network including the network switch, and a so-called 'disaggregated architecture' that allows separation and arbitrary combination of CPU core, memory, and network functions. Fugaku uses 158,974 A64FX CPUs in a single-socket node configuration, making it the largest and fastest supercomputer ever created, as shown by its groundbreaking results in major HPC benchmarks and its societal results in COVID-19 applications.
Citations: 6