{"title":"Toucan — A Translator for Communication Tolerant MPI Applications","authors":"Sergio M. Martin, M. Berger, S. Baden","doi":"10.1109/IPDPS.2017.44","DOIUrl":null,"url":null,"abstract":"We discuss early results with Toucan, a source-to-source translatorthat automatically restructures C/C++ MPI applications tooverlap communication with computation. We co-designed thetranslator and runtime system to enable dynamic, dependence-drivenexecution of MPI applications, and require only a modest amount ofprogrammer annotation. Co-design was essential to realizingoverlap through dynamic code block reordering and avoiding the limitations of static code relocation and inlining. We demonstrate that Toucan hides significantcommunication in four representative applications running on up to 24Kcores of NERSC's Edison platform. Using Toucan, we have hidden from 33% to 85% of the communication overhead, with performance meeting or exceeding that of painstakingly hand-written overlap variants.","PeriodicalId":209524,"journal":{"name":"2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2017-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2017.44","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7
Abstract
We discuss early results with Toucan, a source-to-source translatorthat automatically restructures C/C++ MPI applications tooverlap communication with computation. We co-designed thetranslator and runtime system to enable dynamic, dependence-drivenexecution of MPI applications, and require only a modest amount ofprogrammer annotation. Co-design was essential to realizingoverlap through dynamic code block reordering and avoiding the limitations of static code relocation and inlining. We demonstrate that Toucan hides significantcommunication in four representative applications running on up to 24Kcores of NERSC's Edison platform. Using Toucan, we have hidden from 33% to 85% of the communication overhead, with performance meeting or exceeding that of painstakingly hand-written overlap variants.
我们用Toucan讨论了早期的结果,Toucan是一个源到源的翻译器,可以自动重组C/ c++ MPI应用程序,使通信与计算重叠。我们共同设计了转换器和运行时系统,以实现MPI应用程序的动态,依赖驱动的执行,并且只需要少量的程序员注释。协同设计是通过动态代码块重排序和避免静态代码重定位和内联的限制来实现重叠的关键。我们展示了Toucan在四个典型应用程序中隐藏了重要的通信,这些应用程序运行在NERSC的Edison平台的高达24k的内核上。使用Toucan,我们隐藏了33%到85%的通信开销,性能达到或超过了辛苦编写的重叠变体。