Progressive scaling demands effort at both the circuit and the device level to cope with circuit variability and reliability issues. The advent of FinFET technology has suppressed short-channel effects and variability, but FinFETs still suffer from self-heating, which in turn increases temporal degradation. In this paper, we investigate the severity of Negative Bias Temperature Instability (NBTI) and propose an adaptable trip-point-sensing-based compensation technique to satisfy performance metrics for NBTI-aware Independent Gate (IG) FinFET based SRAM. Simulation results obtained using HSPICE with PTM 32nm IG-FinFET technology demonstrate that the threshold voltage deviates from its nominal value by 17%, causing 6% and 13% degradation in SNM and RNM, respectively, under NBTI degradation at 125°C for 3 years. The proposed technique yields 42% fewer read failures under NBTI. Thus, the proposed approach improves the stability of the SRAM array during its operational life and hence the reliability of the system.
{"title":"NBTI aware IG-FinFET based SRAM design using adaptable trip-point sensing technique","authors":"N. Yadav, Shikha Jain, M. Pattanaik, G. K. Sharma","doi":"10.1145/2770287.2770316","DOIUrl":"https://doi.org/10.1145/2770287.2770316","url":null,"abstract":"The progressive scaling demands effort from both the circuit and the device level, to cope with circuit variability and reliability issues. Advent of FinFET technology has suppresses the short channel effects and variability, but still suffers with self heating problem consequently increases temporal degradations. In this paper, we investigate severity of Negative Bias Temperature Instability (NBTI) and proposes an adaptable trip point sensing based compensation technique to satisfy performance metrics for NBTI aware Independent Gate (IG) FinFET based SRAM. Simulation results are carried out using HSPICE with PTM 32nm IG-FinFET technology demonstrate that threshold voltage deviates from its nominal value by 17%, causing 6% and 13% degradation in SNM and RNM, respectively under NBTI degradation at 125°C for 3 years. The proposed technique yields 42% reduced read failures under NBTI. Thus, proposed approach improves the stability of SRAM array during its operational life and hence, reliability of the system.","PeriodicalId":6519,"journal":{"name":"2014 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"40 1","pages":"122-128"},"PeriodicalIF":0.0,"publicationDate":"2014-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82560451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Honghui Sun, Liang Fang, Yao Wang, Yaqing Chi, Rulin Liu
The semiconductor industry has been looking for alternative materials to silicon, since silicon-based devices (planar CMOS devices in most scenarios) are approaching their physical limits. Among the novel materials on the horizon, graphene is considered a very promising alternative for its unique electrical properties. Although it has many promising electrical properties (e.g., high mobility), Graphene-based Field Effect Transistors (G-FETs) must overcome several barriers before they can substitute for Silicon Metal Oxide Semiconductor Field Effect Transistors (Si-MOSFETs). One of the most important engineering barriers to be overcome is parasitic parameters, among which parasitic resistance is considered one of the most critical roadblocks. Contact resistance in G-FETs is relatively high compared to that of conventional Si-MOSFETs. In this paper, we present an experimental demonstration of a new method to reduce the contact resistance in back-gate G-FETs. In the proposed device structure, the source/drain regions are fabricated using multilayer graphene (MLG), so that top and edge contacts are formed between the MLG and the metal electrodes, while the conducting channel is still formed using single-layer graphene (SLG). Due to the high conductivity of MLG and the relatively low conductivity of SLG, the contact resistance is reduced while the controllability of the channel conductivity is preserved.
{"title":"A low contact resistance graphene field effect transistor with single-layer-channel and multi-layer-contact","authors":"Honghui Sun, Liang Fang, Yao Wang, Yaqing Chi, Rulin Liu","doi":"10.1145/2770287.2770321","DOIUrl":"https://doi.org/10.1145/2770287.2770321","url":null,"abstract":"As of today, the semiconductor industry has been looking for possible alternative materials of silicon, since the physical limitation of silicon-based devices, i.e., planar CMOS devices for most of the scenarios, is approaching soon. Among all the novel materials arising from the horizon, graphene is considered to be a very promising alternative for its unique electrical properties. Although all kinds of prospective electrical properties it has(e.g., high mobility), there are barriers for Graphene-based Field Effect Transistors (G-FETs) to overcome, in order to find its way to the substitution of Silicon Metal Oxide Semiconducting Field Effect Transistors (Si-MOSFETs). One of the most important engineering barriers to be overwhelmed is the parasitic parameters, among which the parasitic resistance is considered to be one of the most critical roadblock. Contact resistance in G-FETs is relatively high compared to that of conventional Si-MOSFETs. In this paper, we present an experimental demonstration of a new method to reduce the contact resistance in back gate G-FETs. In the proposed device structure, the source/drain regions are fabricated using multilayer graphene (MLG), thus the top and edge contacts are formed between the MLG and metal electrodes, while the conducting channel is still formed by using single-layer graphene (SLG). 
Due to the high conductivity of MLG and relative low conductivity of SLG, the contact resistance is reduced while the controllability of channel conductivity is preserved.","PeriodicalId":6519,"journal":{"name":"2014 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"35 1","pages":"139-144"},"PeriodicalIF":0.0,"publicationDate":"2014-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84501394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
David B. Dgien, Poovaiah M. Palangappa, N. A. Hunter, Jiayin Li, K. Mohanram
This paper proposes a compression-based architecture for bit-write reduction in emerging non-volatile memories (NVMs). Bit-write reduction has many practical benefits, including lower write latency, lower dynamic energy, and enhanced endurance. The proposed architecture, which is integrated into the NVM module, relies on (i) a frequent pattern compression-decompression engine, (ii) a comparator to reduce bit-writes, and (iii) an opportunistic wear leveler to spread writes and enhance memory endurance by reducing the peak bit-writes/cell. Trace-based simulations of the SPEC CPU2006 benchmarks show a 20× reduction in raw bit-writes on average, which corresponds to a 2-3× improvement over state-of-the-art methods and a 27% reduction in peak cell bit-writes.
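The comparator step in the abstract above can be illustrated with a minimal sketch. It assumes the common "compare before write" idea, in which only the bits that differ between the stored word and the incoming word are programmed. The function name `bit_writes` and the 8-bit example words are hypothetical. The paper's actual comparator, compression engine, and wear leveler are not described in the abstract and are not modeled here.

```python
def bit_writes(old_word: int, new_word: int) -> int:
    """Count the cells that change when only differing bits are written.

    Assumption for illustration: a cell is programmed only if its stored
    bit differs from the incoming bit, so the cost is the popcount of XOR.
    """
    return bin(old_word ^ new_word).count("1")


# Hypothetical 8-bit example: a naive write would program all 8 cells.
old_word = 0b10110000
new_word = 0b10110101
print(bit_writes(old_word, new_word))  # 2 cells differ (bits 0 and 2)
```

Under this assumption, a write of an unchanged word costs zero bit-writes, which is why comparing before writing helps both energy and endurance.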
{"title":"Compression architecture for bit-write reduction in non-volatile memory technologies","authors":"David B. Dgien, Poovaiah M. Palangappa, N. A. Hunter, Jiayin Li, K. Mohanram","doi":"10.1145/2770287.2770300","DOIUrl":"https://doi.org/10.1145/2770287.2770300","url":null,"abstract":"This paper proposes a compression-based architecture for bit-write reduction in emerging non-volatile memories (NVMs). Bit-write reduction has many practical benefits, including lower write latency, lower dynamic energy, and enhanced endurance. The proposed architecture, which is integrated into the NVM module, relies on (i) a frequent pattern compression-decompression engine, (ii) a comparator to reduce bit-writes, and (iii) an opportunistic wear leveler to spread writes and enhance memory endurance by reducing the peak bit-writes/cell. Trace-based simulations of the SPEC CPU2006 benchmarks show a 20× reduction in raw bit-writes on average, which corresponds to a 2-3× improvement over state-of-the-art methods and a 27% reduction in peak cell bit-writes.","PeriodicalId":6519,"journal":{"name":"2014 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"16 1","pages":"51-56"},"PeriodicalIF":0.0,"publicationDate":"2014-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88915894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a new HSPICE macromodel of a Programmable Metallization Cell (PMC). The electrical characteristics of a PMC are simulated by using a geometric model that considers the vertical and lateral growth/dissolution of the metallic filament. The selection of the parameters is based on operational features, so the electrical characterization of the PMC is simple, easy to simulate and intuitive. The I-V and R-V plots of a PMC are generated with a very small error compared with experimental data; the proposed model also shows a small error for the relationship between the switching time and the pulse amplitude. The use of a PMC as a resistive element in a crossbar memory is also presented; it is shown that a PMC-based crossbar offers substantial improvements over other resistive technologies.
{"title":"HSPICE macromodel of a Programmable Metallization Cell (PMC) and its application to memory design","authors":"P. Junsangsri, F. Lombardi, Jie Han","doi":"10.1145/2770287.2770299","DOIUrl":"https://doi.org/10.1145/2770287.2770299","url":null,"abstract":"This paper presents a new HSPICE macromodel of a Programmable Metallization Cell (PMC). The electrical characteristics of a PMC are simulated by using a geometric model that considers the vertical and lateral growth/dissolution of the metallic filament. The selection of the parameters is based on operational features, so the electrical characterization of the PMC is simple, easy to simulate and intuitive. The I-V and R-V plots of a PMC are generated at a very small error compared with experimental data; the proposed model also shows a small error for the relationship between the switching time and the pulse amplitude. The use of a PMC as resistive element in a crossbar memory is also presented; it is shown that a PMC-based crossbar offers substantial improvements over other resistive technologies.","PeriodicalId":6519,"journal":{"name":"2014 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"37 1","pages":"45-50"},"PeriodicalIF":0.0,"publicationDate":"2014-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90102957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mostafizur Rahman, Mingyu Li, Jiajun Shi, S. Khasanvis, C. A. Moritz
Maintaining the power scaling trend and cell stability are critical challenges facing CMOS SRAM at sub-20nm technologies. These challenges primarily stem from the fundamental limitations of MOSFETs, and the rigid device doping and sizing requirements of the underlying SRAM design. In this paper, we propose a new volatile memory architecture called Tunnel FET based Random Access Memory (TNRAM) that solves CMOS SRAM scaling challenges through integration of ultra-low power Tunnel FETs (TFETs) in a novel circuit style. It is designed to operate with uniform transistors of a single type to eliminate nanoscale device sizing requirements, and is customized to prevent SRAM-like stability concerns. Analytical projections show significant power benefits: 6T-TNRAM has 4.38x lower active power and 174x lower leakage power than HP 6T-SRAM at the 16nm technology node.
{"title":"A new Tunnel-FET based RAM concept for ultra-low power applications","authors":"Mostafizur Rahman, Mingyu Li, Jiajun Shi, S. Khasanvis, C. A. Moritz","doi":"10.1145/2770287.2770301","DOIUrl":"https://doi.org/10.1145/2770287.2770301","url":null,"abstract":"Maintaining power scaling trend and cell stability are critical challenges facing CMOS SRAM at sub-20nm technologies. These challenges primarily stem from the fundamental limitations of MOSFETs, and the rigid device doping and sizing requirements of underlying SRAM design. In this paper, we propose a new volatile memory architecture called Tunnel FET based Random Access Memory (TNRAM) that solves CMOS SRAM scaling challenges through integration of ultra-low power Tunnel FETs (TFETs) in a novel circuit style. It is designed to operate with single type uniform transistors to eliminate nanoscale device sizing requirements, and is customized to prevent SRAM like stability concerns. Analytical projections show significant power benefits; 6T-TNRAM has 4.38x lower active power and 174x lower leakage power over HP 6T-SRAM at 16nm technology node.","PeriodicalId":6519,"journal":{"name":"2014 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"127 1","pages":"57-58"},"PeriodicalIF":0.0,"publicationDate":"2014-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74613559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we introduce Bayesian network methods in order to evaluate the reliability of an application mapped onto the Sea-of-Tiles fabric based on DGFET nano devices. By using these methods, we show some interesting features of this kind of fabric at the functional level: the reliability of one tile of this fabric does not depend on the values of the control and the polarity gates, and the diagnosis of a defective tile is possible with the input vector G = H = 1 (or G = H = 0) and an observation of the value of the output F. Nevertheless, these features should also be checked at the device level to be more accurate. Bayesian networks give us the opportunity to estimate the reliability of a whole application mapped onto this fabric and to test the defective behaviour of tiles before the place-and-route procedure.
{"title":"Stochastic reliability evaluation of Sea-of-Tiles based on Double Gate controllable-polarity FETs","authors":"C. Dezan, Sara Zermani","doi":"10.1145/2770287.2770328","DOIUrl":"https://doi.org/10.1145/2770287.2770328","url":null,"abstract":"In this paper, we introduce Bayesian network methods in order to evaluate the reliability of an application mapped onto the Sea-of-Tiles fabric based on DGFET nano devices. By using these methods, we show some interesting features of this kind of fabric at the functional level; the reliability of one tile of this fabric does not depend on the values of the control and the polarity gates, the diagnosis of a defective tile is possible with the input vector G = H = 1 (or G = H = 0) and with an observation on the value of the output F. Nevertheless, these features should also be checked at the device level to be more accurate. Bayesian networks give us the opportunity to estimate the reliability of a whole application mapped onto this fabric and to test the defective behaviour of tiles before the place-and-route procedure.","PeriodicalId":6519,"journal":{"name":"2014 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"28 1","pages":"169-170"},"PeriodicalIF":0.0,"publicationDate":"2014-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88012640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we propose a reversible quantum 2^n-to-n BCD priority encoder circuit, where n is the number of output bits. The proposed 2^n-to-n BCD priority encoder circuit is composed of quantum circuits for the OR operation and quantum NOT gates. We present an algorithm to construct a minimized quantum 2^n-to-n BCD priority encoder circuit. A technique to calculate the quantum gate complexity of quantum circuits is also proposed. Our circuit performs better than existing ones in terms of quantum gates, delays, garbage outputs, constant inputs, quantum gate calculation complexity, area and power; e.g., the proposed quantum 8-to-3 BCD priority encoder circuit improves on the existing circuit by 41.25% in the number of quantum gates, 46.05% in delay, 48% in garbage outputs, 60% in constant inputs and 41.25% in area and power. We also simulate the proposed quantum BCD priority encoder circuit using Microwind DSCH 2.7, which shows the functional correctness of the circuit.
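For readers unfamiliar with priority encoders, the following is a purely classical behavioural sketch of the function such a circuit computes in the 8-to-3 case mentioned above. It is not the paper's reversible quantum design. The function name `priority_encode`, the convention that the highest-numbered active line wins, and the extra "valid" flag are assumptions made for illustration.

```python
from typing import List, Tuple


def priority_encode(lines: List[int]) -> Tuple[int, bool]:
    """Return (index, valid) for the highest-numbered active input line.

    Assumption: line len(lines)-1 has the highest priority. If no line is
    active, the index is 0 and valid is False.
    """
    for index in range(len(lines) - 1, -1, -1):
        if lines[index]:
            return index, True
    return 0, False


# Hypothetical 8-to-3 example: lines 1 and 3 are active, so line 3 wins.
index, valid = priority_encode([0, 1, 0, 1, 0, 0, 0, 0])
print(format(index, "03b"), valid)  # 011 True
```

A reference model like this is mainly useful for checking the truth table of a gate-level or reversible implementation against the intended behaviour.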
{"title":"Minimization of a reversible quantum 2n-to-n BCD priority encoder","authors":"N. J. Lisa, H. Babu","doi":"10.1145/2770287.2770307","DOIUrl":"https://doi.org/10.1145/2770287.2770307","url":null,"abstract":"In this paper, we propose a reversible quantum 2n_ to-n BCD priority encoder circuit, where n is the number of output bits. The proposed design of the 2n-to-n BCD priority encoder circuit shows that it is composed of quantum circuits for OR operation and quantum NOT gates. We present an algorithm to construct a minimized quantum 2n-to-n BCD priority encoder circuit. A technique to calculate the quantum gate complexity of quantum circuits has also been proposed in the paper. Our circuit performs better than the existing ones in terms of quantum gates, delays, garbage outputs, constant inputs, quantum gate calculation complexity, area and power, e.g., the proposed quantum 8-to-3 BCD priority encoder circuit improves 41.25% on the number of quantum gates, 46.05% on delays, 48% on garbage outputs, 60% on constant inputs and 41.25% on area and power than the existing circuit. We also simulate the proposed quantum BCD priority encoder circuit using Microwind DSCH 2.7 which shows the functional correctness of the circuit.","PeriodicalId":6519,"journal":{"name":"2014 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"22 1","pages":"77-82"},"PeriodicalIF":0.0,"publicationDate":"2014-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89975648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We discuss a novel application of a photonic circuit for integrated high-performance neuromorphic signal processing. Large fan-in is an especially important capability in distributed systems; however, electronic physics imposes tradeoffs between bandwidth performance and fan-in degree. Wavelength(λ)-fan-in, a circuit developed in the field of radio frequency (RF) photonics, does not exhibit a corresponding tradeoff and can circumvent prior challenges to fan-in in optical distributed processing applications.
{"title":"Applications of wavelength-fan-in for high-performance distributed processing systems","authors":"A. Tait, P. Prucnal","doi":"10.1145/2770287.2770331","DOIUrl":"https://doi.org/10.1145/2770287.2770331","url":null,"abstract":"We discuss a novel application of a photonic circuit for integrated high-performance neuromorphic signal processing. Large fan-in is an especially important capability in distributed systems; however, electronic physics impose tradeoffs between bandwidth performance and fan-in degree. A circuit developed in the field of radio frequency (RF) photonics, wavelength(λ)-fan-in does not exhibit a corresponding tradeoff and can circumvent prior challenges to fan-in in optical distributed processing applications.","PeriodicalId":6519,"journal":{"name":"2014 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"3 1","pages":"177-178"},"PeriodicalIF":0.0,"publicationDate":"2014-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75687287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper we propose to utilise 3D-stacked hybrid memories as an alternative to traditional CMOS SRAMs in L1 and L2 cache implementations and analyse the potential implications of this approach on the processor performance, measured in terms of Instructions-per-Cycle (IPC) and energy consumption. The 3D hybrid memory cell relies on: (i) a Short Circuit Current Free Nano-Electro-Mechanical Field Effect Transistor (SCCF NEMFET) based inverter for data storage; and (ii) adjacent CMOS-based logic for read/write operations and data preservation. We compare 3D Stacked Hybrid NEMFET-CMOS Caches (3DS-HNCC) of various capacities against state-of-the-art 45 nm low-power CMOS SRAM counterparts (2D-CC). All the proposed implementations provide a two-orders-of-magnitude static energy reduction (due to the NEMFET's extremely low OFF current) and a slightly increased dynamic energy consumption, while requiring an approximately 55% larger footprint. The read access time is equivalent, while the write access time is about 3 ns higher, as it is dominated by the mechanical movement of the NEMFET's suspended gate. In order to determine whether the write latency overhead inflicts any performance penalty, we consider as an evaluation vehicle a state-of-the-art mobile out-of-order processor core equipped with 32-kB instruction and data L1 caches, and a unified 2-MB L2 cache. We evaluate different scenarios, utilizing both 3DS-HNCC and 2D-CC at different hierarchy levels, on a set of SPEC 2000 benchmarks. Our simulations indicate that for the considered applications, despite their increased write access time, 3DS-HNCC L2 caches inflict an insignificant IPC penalty while providing, on average, 38% energy savings when compared with 2D-CC. For L1 instruction caches the IPC penalty is also almost insignificant, while for L1 data caches IPC decreases of between 1% and 12% were measured.
{"title":"Energy effective 3D stacked hybrid NEMFET-CMOS caches","authors":"M. Lefter, M. Enachescu, G. Voicu, S. Cotofana","doi":"10.1145/2770287.2770324","DOIUrl":"https://doi.org/10.1145/2770287.2770324","url":null,"abstract":"In this paper we propose to utilise 3D-stacked hybrid memories as alternative to traditional CMOS SRAMs in L1 and L2 cache implementations and analyse the potential implications of this approach on the processor performance, measured in terms of Instructions-per-Cycle (IPC) and energy consumption. The 3D hybrid memory cell relies on: (i) a Short Circuit Current Free Nano-Electro-Mechanical Field Effect Transistor (SCCF NEMFET) based inverter for data storage; and (ii) adjacent CMOS-based logic for read/write operations and data preservation. We compare 3D Stacked Hybrid NEMFET-CMOS Caches (3DS-HNCC) of various capacities against state of the art 45 nm low power CMOS SRAM counterparts (2D-CC). All the proposed implementations provide two orders of magnitude static energy reduction (due to NEMFET's extremely low OFF current), a slightly increased dynamic energy consumption, while requiring an approximately 55% larger footprint. The read access time is equivalent, while for write operations it is with about 3 ns higher, as it is dominated by the mechanical movement of the NEMFET's suspended gate. In order to determine if the write latency overhead inflicts any performance penalty, we consider as evaluation vehicle a state of the art mobile out-of-order processor core equipped with 32-kB instruction and data L1 caches, and a unified 2-MB L2 cache. We evaluate different scenarios, utilizing both 3DS-HNCC and 2D-CC at different hierarchy levels, on a set of SPEC 2000 benchmarks. Our simulations indicate that for the considered applications, despite of their increased write access time, 3DS-HNCC L2 caches inflict insignificant IPC penalty while providing, on average, 38% energy savings, when compared with 2D-CC. 
For L1 instruction caches the IPC penalty is also almost insignificant, while for L1 data caches IPC decreases between 1% to 12% were measured.","PeriodicalId":6519,"journal":{"name":"2014 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"62 1","pages":"151-156"},"PeriodicalIF":0.0,"publicationDate":"2014-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89208801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Two sub-crosspoint physical topologies are proposed that place the decode circuitry beneath the metal-oxide RRAM crosspoint array. The first topology integrates only the row decode circuitry, while the second integrates both the row and column decoders. The topology for sub-crosspoint row decoding reduces area by up to 38.6% over the standard peripheral approach, with an improvement in area efficiency of 21.6% for small arrays. Sub-crosspoint row and column decoding reduces the RRAM crosspoint area by 27.1% and improves area efficiency to nearly 100%.
{"title":"Sub-crosspoint RRAM decoding for improved area efficiency","authors":"Ravi Patel, E. Friedman","doi":"10.1145/2770287.2770312","DOIUrl":"https://doi.org/10.1145/2770287.2770312","url":null,"abstract":"Two sub-crosspoint physical topologies are proposed that places the decode circuitry beneath the metal-oxide RRAM crosspoint array. The first topology integrates only the row decode circuitry, while the second integrates both the row and column decoder. The topology for sub-crosspoint row decoding reduces area by up to 38.6% over the standard peripheral approach, with an improvement in area efficiency of 21.6% for small arrays. Sub-crosspoint row and column decoding reduces the RRAM crosspoint area by 27.1% and improves area efficiency to nearly 100%.","PeriodicalId":6519,"journal":{"name":"2014 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)","volume":"8 1","pages":"98-103"},"PeriodicalIF":0.0,"publicationDate":"2014-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75110454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}