Exascale computer architecture: adjusting to the “New normal” for computing
J. Shalf
2013 Third Berkeley Symposium on Energy Efficient Electronic Systems (E3S), October 2013
DOI: 10.1109/E3S.2013.6705854
Citations: 1
Abstract
The current MPI+Fortran ecosystem has sustained HPC application software development for the past decade, but it was architected for coarse-grained concurrency largely dominated by bulk-synchronous algorithms. Trends in computer architecture have turned our model of how to get good performance from computing systems upside down, and they will require rethinking our entire programming environment and algorithm designs to align with the new cost metrics of emerging hardware architectures. Promising avenues of exploration to mitigate these effects are already underway. Future hardware constraints on bandwidth and memory capacity, together with exponential growth in explicit on-chip parallelism, will likely force a mass migration to new algorithms and software architectures that is as broad and disruptive as the migration from vector to parallel computing systems that occurred 15 years ago. The challenge is to express massive parallelism and hierarchical data locality efficiently without subjecting the programmer to overwhelming complexity. The author covers how changes in hardware, governed by the fundamental physics of silicon-based CMOS technology, are breaking our existing abstract machine models, and describes DOE's program to overcome these obstacles to continued performance improvement. He examines potential approaches ranging from revolutionary asynchronous and dataflow models of computation to evolutionary extensions of existing APIs and OpenMP directives.