Minesh Patel, Jeremie S. Kim, Hasan Hassan, O. Mutlu
2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), published 2019-06-01. DOI: 10.1109/DSN.2019.00017 (https://doi.org/10.1109/DSN.2019.00017)
Understanding and Modeling On-Die Error Correction in Modern DRAM: An Experimental Study Using Real Devices
Experimental characterization of DRAM errors is a powerful technique for understanding DRAM behavior and provides valuable insights for improving overall system performance, energy efficiency, and reliability. Unfortunately, recent DRAM technology scaling issues are forcing manufacturers to adopt on-die error-correction codes (ECC), which pose a significant challenge for DRAM error characterization studies by obfuscating raw error distributions using undocumented, proprietary, and opaque error-correction hardware. As we show in this work, errors observed in devices with on-die ECC no longer follow expected, well-studied distributions (e.g., lognormal retention times) but rather depend on the particular ECC scheme used.
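To make the last claim concrete, the following is a minimal sketch, not the authors' model, of how an error-correction stage can reshape the error rate seen from outside the chip. It assumes a hypothetical idealized single-error-correcting (SEC) code, independent bit errors, and a 136-bit codeword; none of these specifics come from the text above. Words with exactly one raw error are fully corrected, and words with two or more raw errors are left unchanged (real decoders may also miscorrect, which this sketch ignores).

```python
from math import comb


def post_correction_ber(raw_ber: float, n: int = 136) -> float:
    """Expected post-correction bit error rate for an idealized SEC code.

    Assumptions (illustrative, not from the paper excerpt):
      - each of the n codeword bits fails independently with probability raw_ber
      - a word with exactly one raw error is fully corrected
      - a word with k >= 2 raw errors keeps all k errors (no miscorrection)
    """
    expected_errors = 0.0
    for k in range(2, n + 1):
        # Probability that exactly k of the n bits are in error (binomial).
        p_k = comb(n, k) * raw_ber**k * (1.0 - raw_ber) ** (n - k)
        expected_errors += k * p_k
    # Normalize expected erroneous bits per word to a per-bit rate.
    return expected_errors / n


if __name__ == "__main__":
    # Hypothetical raw bit error rates, chosen only for illustration.
    for raw in (1e-6, 1e-5, 1e-4, 1e-3):
        observed = post_correction_ber(raw)
        print(f"raw={raw:.0e}  observed={observed:.3e}  ratio={observed / raw:.3e}")
```

Under these assumptions the observed rate is not a constant fraction of the raw rate: for small raw rates it grows roughly with the square of the raw rate. An observer who sees only post-correction errors therefore measures a distribution shaped by the code as well as by the underlying cells.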