Proceedings of 1993 IEEE Parallel Rendering Symposium: Latest Articles

Integrating volume data analysis and rendering on distributed memory architectures
Pub Date : 1993-11-01 DOI: 10.1109/PRS.1993.586092
E. Camahort, I. Chakravarty
The ability to generate visual representations of data, and the ability to enhance data into a suitable form for the purpose of visual representation, form two key components in a scientific visualization system. By a visual representation we mean the ability to render the data, using visual cues, such that the important features are readily perceived by the user. By the ability to enhance data we mean the ability to apply transformations to the data so that salient features embedded in the data become discernible and quantifiable. The rendering of data (computer graphics) and the enhancement of data (image processing) have emerged over the last twenty years as separate scientific disciplines. However, in scientific visualization and other applications of empirical data interpretation, we are increasingly confronted with the need to combine both data rendering and data transformation capabilities under one system framework. This paper describes the design issues and implementation of a program for visualizing and enhancing volume data on distributed memory architectures. Our design is motivated by the desire to interactively view, transform, and interpret volume data acquired using seismic imaging techniques. Experimental results derived from an implementation on the Connection Machine CM-5 are described.
Citations: 41
Parallel volume-rendering algorithm performance on mesh-connected multicomputers
Pub Date : 1993-11-01 DOI: 10.1145/166181.166196
U. Neumann
This work examines the network performance of mesh-connected multicomputers applied to parallel volume rendering algorithms. This issue has not been addressed in papers describing particular parallel implementations, but is pertinent to anyone designing or implementing parallel rendering algorithms. Parallel volume rendering algorithms fall into two main classes: image partitions and object partitions. Communication requirements for algorithms in these classes are analyzed. Network performance for these algorithms is estimated by using an existing model of mesh network behavior. The performance estimates are verified by tests on the Touchstone Delta. The results indicate that, for a fixed screen size, the performance of 2D mesh networks scales very well when used with object partition algorithms; the time required for communication actually decreases as the data and system sizes increase. A Touchstone Delta implementation of an object partition algorithm is briefly described to illustrate the algorithm's low communication requirements.
Citations: 92
A multicomputer polygon rendering algorithm for interactive applications
Pub Date : 1993-11-01 DOI: 10.1145/166181.166187
D. Ellsworth
This paper presents a new multicomputer polygon rendering algorithm that is specialized for interactive applications. The algorithm differs from previous algorithms in two ways. First, it load balances the rasterization once per frame, instead of as the frame progresses, using the previous frame's distribution of polygons on the screen as input to the load-balancing algorithm. Second, it uses a new message sending scheme that reduces the number of messages required. These characteristics mean that the algorithm only requires global synchronization between frames, which allows for higher frame rates. The algorithm was selected using a simulator, which confirmed that using the previous frame's polygon distribution on the screen is nearly as good as using the current frame's distribution. The algorithm is implemented on Caltech's Intel Touchstone Delta, a 512-processor multicomputer system, and preliminary performance figures are given. The highest performance achieved to date is 930,000 triangles per second using 256 processors and an 806,640-triangle data set.
Citations: 17
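The once-per-frame load balancing described in the Ellsworth abstract above can be illustrated with a minimal sketch. It assumes the screen is divided into fixed regions and that a greedy "heaviest region first" heuristic is used; the abstract does not name the actual balancing heuristic, so `assign_regions`, its parameters, and the greedy rule are all hypothetical.

```python
import heapq

def assign_regions(prev_counts, num_procs):
    """Assign screen regions to processors using last frame's polygon counts.

    prev_counts[r] is the number of polygons that fell in region r during the
    previous frame (hypothetical input). Returns (assignment, loads), where
    assignment[r] is the processor chosen for region r.
    """
    # Visit regions from heaviest to lightest (stable for equal counts).
    order = sorted(range(len(prev_counts)),
                   key=lambda r: prev_counts[r], reverse=True)
    # Min-heap of (current load, processor id).
    heap = [(0, p) for p in range(num_procs)]
    heapq.heapify(heap)
    assignment = [None] * len(prev_counts)
    for r in order:
        load, p = heapq.heappop(heap)      # least-loaded processor so far
        assignment[r] = p
        heapq.heappush(heap, (load + prev_counts[r], p))
    # Recompute the estimated per-processor loads from the assignment.
    loads = [0] * num_procs
    for r, p in enumerate(assignment):
        loads[p] += prev_counts[r]
    return assignment, loads

assignment, loads = assign_regions([50, 10, 40, 20, 30, 30], 3)
print(assignment, loads)
```

Because the assignment is computed once from the previous frame's counts, processors need not renegotiate work while the frame is being rasterized. As a side calculation, 930,000 triangles per second on an 806,640-triangle data set corresponds to roughly 930,000 / 806,640 ≈ 1.15 frames per second.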
Pixel merging for object-parallel rendering: A distributed snooping algorithm
Pub Date : 1993-11-01 DOI: 10.1145/166181.166188
M. Cox, P. Hanrahan
In the purely object-parallel approach to multiprocessor rendering, each processor is assigned responsibility to render a subset of the graphics database. When rendering is complete, pixels from the processors must be merged and globally z-buffered. On an arbitrary multiprocessor interconnection network, the straightforward algorithm for pixel merging requires $\bar{d}A$ total network bandwidth per frame, where $\bar{d}$ is the depth complexity of the scene and $A$ is the area of the screen or window. This algorithm is used by the Kubota Pacific Denali and appears to be used by the Evans and Sutherland Freedom series. An alternative algorithm, the PixelFlow algorithm, requires $nA$ network bandwidth per frame, where $n$ is the number of processors. But the merging is pipelined in PixelFlow so that each network link need support only $A$ bandwidth per frame. However, that algorithm requires a separate special-purpose network for pixel merging. In this paper we present and analyze an expected-case $\log(\bar{d})A$ algorithm for pixel merging that uses network broadcast, and we discuss the algorithm's applicability to shared-memory bus architectures.
Citations: 28
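To make the bandwidth figures in the Cox and Hanrahan abstract above concrete, here is a small sketch that evaluates the three per-frame estimates and shows the basic z-buffered merge of two partial frames. Several things are assumptions rather than facts from the abstract: bandwidth is counted in pixels, constant factors are dropped, the logarithm is taken as natural, and a smaller depth value means nearer to the viewer. The function names are hypothetical, and the snooping algorithm itself is not reproduced here.

```python
import math

def merge_bandwidth(depth_complexity, screen_area, num_procs):
    """Per-frame pixel-merging traffic (in pixels) for the schemes named above."""
    return {
        # Straightforward merge: d_bar * A in total.
        "straightforward_total": depth_complexity * screen_area,
        # PixelFlow-style pipeline: n * A in total, but only A per link.
        "pixelflow_total": num_procs * screen_area,
        "pixelflow_per_link": screen_area,
        # Expected case for the broadcast-based algorithm: log(d_bar) * A.
        "snooping_expected": math.log(depth_complexity) * screen_area,
    }

def z_merge(frame_a, frame_b):
    """Merge two partial frames of (depth, color) pixels, keeping the nearer one."""
    return [a if a[0] <= b[0] else b for a, b in zip(frame_a, frame_b)]

print(merge_bandwidth(depth_complexity=4, screen_area=1000, num_procs=16))
print(z_merge([(1.0, "red"), (5.0, "green")], [(2.0, "blue"), (3.0, "white")]))
```

With these example values the expected-case estimate is smaller than the straightforward total, which is the relationship the abstract emphasizes.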
A MIMD rendering algorithm for distributed memory architectures
Pub Date : 1993-11-01 DOI: 10.1145/166181.166186
T. Crockett, T. Orloff
We present a parallel rendering algorithm targeted to MIMD distributed-memory message-passing architectures. For maximum performance, the algorithm exploits both object-level and image-level parallelism. The behavior of the algorithm is examined both analytically and experimentally. The results show that the choice of message size has a significant impact on performance. Scalability to large numbers of processors is found to be limited primarily by communication overheads. An experimental implementation for the Intel iPSC/860 confirms the analytical results and demonstrates increasing performance from 1 to 128 processors across a wide range of scene complexities.
Citations: 41
Progressive refinement radiosity on ring-connected multicomputers
Pub Date : 1993-11-01 DOI: 10.1145/166181.166192
T. Çapin, C. Aykanat, B. Özgüç
The progressive refinement method is investigated for parallelization on ring-connected multicomputers. A synchronous scheme, based on static task assignment, is proposed in order to achieve better coherence during the parallel light distribution computations. An efficient global circulation scheme is proposed for the parallel light distribution computations, which reduces the total volume of concurrent communication by an asymptotic factor. The proposed parallel algorithm is implemented on a ring-embedded Intel iPSC/2 hypercube multicomputer. The load balance quality of the proposed static assignment schemes is evaluated experimentally. The effect of coherence in the parallel light distribution computations on the shooting patch selection sequence is also investigated.
Citations: 10
A task adaptive parallel graphics renderer
Pub Date : 1993-11-01 DOI: 10.1145/166181.166185
S. Whitman
This paper presents a graphics renderer which incorporates new partitioning methodologies of memory and work for efficient execution on a parallel computer. The task adaptive domain decomposition scheme is an image space method involving dynamic partitioning of rectangular pixel area tasks. We show that this method requires little overhead, allows coherence within a parallel context, handles worst case scenarios with reasonable speedup, executes efficiently, and requires minimal processor synchronization. The implementation analysis indicates that load imbalance is the major cause of performance degradation at the higher processor counts. Even so, on a variety of test scenes, an average rendering speedup of 79 was achieved utilizing 96 processors on the BBN TC2000 multiprocessor with processor efficiency ranging from 66% to 94%.
Citations: 33
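The speedup and efficiency figures in the Whitman abstract above are related by the usual definition, efficiency = speedup / processor count. The short sketch below applies that definition to the reported average; the function name is hypothetical, and the assumption is that the abstract uses this standard definition.

```python
def parallel_efficiency(speedup, num_procs):
    """Standard parallel efficiency: speedup divided by processor count."""
    return speedup / num_procs

# Average speedup of 79 on 96 processors, as reported in the abstract.
avg = parallel_efficiency(79, 96)
print(f"{avg:.1%}")
```

Under that definition the average works out to about 82%, which lies inside the 66% to 94% range the abstract reports across test scenes.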
An efficient parallel ray tracing scheme for distributed memory parallel computers
Pub Date : 1993-11-01 DOI: 10.1145/166181.166193
W. Lefer
The ray-tracing algorithm produces high quality images by taking multiple luminous effects into account. Hence, it requires many computations and a large memory capacity. Using parallel machines is one way to significantly reduce the synthesis time. Distributed Memory Parallel Computers offer an interesting performance/cost ratio but require computations and data to be distributed. This paper is a study of the implementation of the ray-tracing algorithm on a Distributed Memory Parallel Computer. An original solution, based on combining a data parallelism approach with a task parallelism approach, is presented. A dynamic load redistribution mechanism allows us to ensure a good load balance during the synthesis phase. At the end of the paper, some results of our transputer implementation are presented.
Citations: 21
Scalable parallel volume raycasting for nonrectilinear computational grids
Pub Date : 1993-11-01 DOI: 10.1145/166181.166194
J. Challinger
A scalable approach to parallel volume raycasting of structured and unstructured computational grids is presented. The algorithm is general enough to handle non-convex grids and cells, grids with voids, grids constructed from multiple grids, and embedded geometrical primitives. The algorithm is designed for a highly parallel MIMD architecture which features both local memory and shared memory with nonuniform access times. It has been implemented on a BBN TC2000 and benchmarked on several datasets. A variation of the algorithm which provides fast image updates for a changing transfer function is also presented. A distributed approach to controlling the execution of the volume renderer is used, and the graphical user interface designed for this purpose is briefly described.
Citations: 55
A data distributed, parallel algorithm for ray-traced volume rendering
Pub Date : 1993-11-01 DOI: 10.1145/166181.166183
K. Ma, J. Painter, C. Hansen, M. Krogh
This paper presents a divide-and-conquer ray-traced volume rendering algorithm and a parallel image compositing method, along with their implementation and performance on the Connection Machine CM-5 and networked workstations. This algorithm distributes both the data and the computations to individual processing units to achieve fast, high-quality rendering of high-resolution data. The volume data, once distributed, is left intact. The processing nodes perform local ray tracing of their subvolumes concurrently. No communication between processing units is needed during this local ray-tracing process. A subimage is generated by each processing unit and the final image is obtained by compositing subimages in the proper order, which can be determined a priori. Test results on the CM-5 and a group of networked workstations demonstrate the practicality of our rendering algorithm and compositing method.
Citations: 128
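The final step in the Ma, Painter, Hansen and Krogh abstract above, compositing subimages in a known order, can be sketched with the standard "over" operator. This is an illustration under stated assumptions, not the paper's parallel compositing method, which the abstract does not detail: pixels are single-channel with premultiplied alpha, subimages arrive already sorted front to back, and the helper names `over` and `composite_subimages` are hypothetical.

```python
def over(front, back):
    """Composite one premultiplied (color, alpha) sample over another."""
    fc, fa = front
    bc, ba = back
    # The back sample is attenuated by whatever the front sample lets through.
    return (fc + (1.0 - fa) * bc, fa + (1.0 - fa) * ba)

def composite_subimages(subimages):
    """Composite equally sized subimages given in front-to-back order.

    Each subimage is a list of premultiplied (color, alpha) pixels.
    """
    result = subimages[0]
    for sub in subimages[1:]:
        result = [over(f, b) for f, b in zip(result, sub)]
    return result

# Two one-pixel subimages: a half-transparent front over an opaque back.
print(composite_subimages([[(0.5, 0.5)], [(1.0, 1.0)]]))
```

The "over" operator is associative, so adjacent subimages can be combined pairwise on different processors as long as the front-to-back order is preserved, which is why an order known a priori is useful.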