A task adaptive parallel graphics renderer
Author: S. Whitman
Published in: Proceedings of 1993 IEEE Parallel Rendering Symposium, 1993-11-01
DOI: 10.1145/166181.166185 (https://doi.org/10.1145/166181.166185)
Citation count: 33
Abstract
This paper presents a graphics renderer that incorporates new methodologies for partitioning memory and work for efficient execution on a parallel computer. The task adaptive domain decomposition scheme is an image-space method involving dynamic partitioning of rectangular pixel-area tasks. We show that this method requires little overhead, allows coherence within a parallel context, handles worst-case scenarios with reasonable speedup, executes efficiently, and requires minimal processor synchronization. The implementation analysis indicates that load imbalance is the major cause of performance degradation at higher processor counts. Even so, on a variety of test scenes, an average rendering speedup of 79 was achieved using 96 processors on the BBN TC2000 multiprocessor, with processor efficiency ranging from 66% to 94%.
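The abstract does not spell out the partitioning algorithm, so the following is only a minimal sketch of the general idea of dynamically partitioning rectangular pixel-area tasks, not the paper's actual method. The names `Rect`, `split`, `steal_work`, and `efficiency`, the "split along the longer side" policy, and the `min_area` threshold are all assumptions made for illustration. The sketch assumes that a worker that runs out of pixels takes half of the largest region still unfinished.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """A rectangular pixel-area task covering [x0, x1) by [y0, y1)."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def width(self) -> int:
        return self.x1 - self.x0

    @property
    def height(self) -> int:
        return self.y1 - self.y0

    @property
    def area(self) -> int:
        return self.width * self.height


def split(task: Rect) -> tuple[Rect, Rect]:
    """Split a task in two along its longer side (hypothetical policy)."""
    if task.width >= task.height:
        mid = task.x0 + task.width // 2
        return (Rect(task.x0, task.y0, mid, task.y1),
                Rect(mid, task.y0, task.x1, task.y1))
    mid = task.y0 + task.height // 2
    return (Rect(task.x0, task.y0, task.x1, mid),
            Rect(task.x0, mid, task.x1, task.y1))


def steal_work(remaining: dict[int, Rect], idle: int, min_area: int = 64) -> bool:
    """Give the idle worker half of the largest unfinished region.

    `remaining` maps a worker id to the region it has not rendered yet.
    Returns False when no region is large enough to be worth splitting
    (the `min_area` cutoff is an assumed tuning parameter).
    """
    if not remaining:
        return False
    victim = max(remaining, key=lambda w: remaining[w].area)
    if remaining[victim].area < min_area:
        return False
    kept, given = split(remaining[victim])
    remaining[victim] = kept
    remaining[idle] = given
    return True


def efficiency(speedup: float, processors: int) -> float:
    """Processor efficiency as speedup divided by processor count."""
    return speedup / processors
```

With the figures quoted in the abstract, `efficiency(79, 96)` evaluates to roughly 0.82, so the average speedup of 79 on 96 processors corresponds to about 82% average efficiency, inside the reported 66% to 94% range.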