High performance cluster computing with 3-D nonlinear diffusion filters
Andrés Bruhn, Tobias Jakob, Markus Fischer, Timo Kohlberger, Joachim Weickert, Ulrich Brüning, Christoph Schnörr
Real-Time Imaging, Volume 10, Issue 1, Pages 41-51, February 2004
DOI: 10.1016/j.rti.2003.12.002
Citation count: 14
Abstract
This paper deals with parallelization and implementation aspects of partial differential equation (PDE)-based image processing models for large cluster environments with distributed memory. As an example, we focus on nonlinear diffusion filtering, which we discretize by means of an additive operator splitting (AOS). We start by decomposing the algorithm into small modules that are parallelized separately. For this purpose, image partitioning strategies are discussed and their impact on the communication pattern and volume is analyzed. Based on the results, we develop an algorithmic implementation with excellent scaling properties on massively connected low-latency networks. Test runs on two different high-end Myrinet clusters yield almost linear speedup factors of up to 209 for 256 processors. This results in typical denoising times of 0.4 s for five iterations on a 256×256×128 data cube.
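To make the AOS discretization named in the abstract more concrete, the following is a minimal sketch of a single AOS iteration, u_new = (1/m) Σ_l (I − m·τ·A_l(u))⁻¹ u, where A_l is the 1-D diffusion operator along axis l. It is not the authors' implementation. Several choices are assumptions made only for illustration: the example is 2-D (m = 2) rather than 3-D to stay short, the diffusivity is of Perona-Malik type, the grid size is 1, boundaries are reflecting, and all function names (`thomas`, `implicit_line`, `diffusivity`, `aos_step`) are hypothetical.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are unused."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_line(u, g, step):
    """Solve (I - step * A(g)) x = u along one grid line (grid size 1)."""
    n = len(u)
    lower, diag, upper = [0.0] * n, [1.0] * n, [0.0] * n
    for i in range(n - 1):
        w = step * 0.5 * (g[i] + g[i + 1])  # coupling between i and i+1
        upper[i] = -w
        lower[i + 1] = -w
        diag[i] += w
        diag[i + 1] += w
    return thomas(lower, diag, upper, u)


def diffusivity(u, lam):
    """Assumed Perona-Malik-type diffusivity g = 1 / (1 + |grad u|^2 / lam^2)."""
    rows, cols = len(u), len(u[0])
    g = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # central differences with mirrored (reflecting) boundaries
            ux = 0.5 * (u[i][min(j + 1, cols - 1)] - u[i][max(j - 1, 0)])
            uy = 0.5 * (u[min(i + 1, rows - 1)][j] - u[max(i - 1, 0)][j])
            g[i][j] = 1.0 / (1.0 + (ux * ux + uy * uy) / (lam * lam))
    return g


def aos_step(u, tau, lam):
    """One AOS iteration in 2-D: average of two 1-D implicit solves."""
    rows, cols = len(u), len(u[0])
    m = 2  # number of dimensions; the paper's 3-D setting would use m = 3
    g = diffusivity(u, lam)
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):  # diffusion along rows
        x = implicit_line(u[i], g[i], m * tau)
        for j in range(cols):
            out[i][j] += x[j] / m
    for j in range(cols):  # diffusion along columns
        col_u = [u[i][j] for i in range(rows)]
        col_g = [g[i][j] for i in range(rows)]
        y = implicit_line(col_u, col_g, m * tau)
        for i in range(rows):
            out[i][j] += y[i] / m
    return out
```

In this sketch every row solve and every column solve is an independent tridiagonal system. That independence is what makes the scheme attractive for distribution across processors, while the need to solve along every axis in turn is what makes the choice of image partitioning affect communication. Calling `aos_step(image, tau, lam)` repeatedly on a nested list of floats performs successive filtering iterations.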