Xin Liu, Jingyu Zhou, Daqiang Zhang, Yao Shen, M. Guo
Title: A Parallel Skeleton Library for Embedded Multicores
Published in: 2010 39th International Conference on Parallel Processing Workshops, 2010-09-13
DOI: 10.1109/ICPPW.2010.21
Citations: 6
Abstract
Many SoCs adopt multicore architectures. As a result, embedded programmers now face the challenge of parallel programming as well. We propose a parallel skeleton library that can be used on embedded multicores. The library is implemented in standard C++ using template features and provides two parallel skeletons that support common programming patterns on multicores. With our skeleton library, programmers can choose among underlying parallel implementations without changing their code. Experimental results show that many applications can use these two skeletons to improve performance, in some cases exceeding that of hand-parallelized code.
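To make the idea of a template-based skeleton with interchangeable parallel implementations concrete, here is a minimal sketch in standard C++. It is not the library's actual API: the abstract does not name the two skeletons or describe their interfaces, so the map-style skeleton `map_skeleton` and the backends `SequentialBackend` and `ThreadBackend` are all hypothetical names chosen for illustration. The sketch only shows how a template parameter might select the parallel implementation while the code that applies the skeleton stays the same.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical backend (not from the paper): runs the loop body sequentially.
struct SequentialBackend {
    template <typename Body>
    static void run(std::size_t n, Body body) {
        for (std::size_t i = 0; i < n; ++i) body(i);
    }
};

// Hypothetical backend (not from the paper): splits the index range into
// contiguous chunks and runs each chunk on its own std::thread.
struct ThreadBackend {
    template <typename Body>
    static void run(std::size_t n, Body body) {
        if (n == 0) return;
        std::size_t workers = std::thread::hardware_concurrency();
        if (workers == 0) workers = 2;  // the standard allows 0 for "unknown"
        workers = std::min(workers, n);
        const std::size_t chunk = (n + workers - 1) / workers;

        std::vector<std::thread> pool;
        for (std::size_t w = 0; w < workers; ++w) {
            const std::size_t begin = w * chunk;
            const std::size_t end = std::min(n, begin + chunk);
            if (begin >= end) break;
            pool.emplace_back([=] {
                for (std::size_t i = begin; i < end; ++i) body(i);
            });
        }
        for (auto& t : pool) t.join();
    }
};

// Hypothetical map-style skeleton: applies f to every element of `in`.
// The Backend template parameter selects the implementation at compile time.
template <typename Backend, typename T, typename F>
std::vector<T> map_skeleton(const std::vector<T>& in, F f) {
    std::vector<T> out(in.size());
    // Each index writes a distinct element, so no locking is needed.
    Backend::run(in.size(), [&](std::size_t i) { out[i] = f(in[i]); });
    return out;
}

// Usage: the call is identical apart from the backend argument.
inline void check_backends_agree() {
    const std::vector<int> data = {1, 2, 3, 4, 5, 6, 7, 8};
    auto square = [](int x) { return x * x; };
    auto seq = map_skeleton<SequentialBackend>(data, square);
    auto par = map_skeleton<ThreadBackend>(data, square);
    assert(seq == par);
}
```

In this sketch the backend is still spelled out at the call site. A library aiming for "no code changes" could hide that choice behind a single type alias or a build-time setting, but how the authors' library actually does this is not stated in the abstract.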