Pub Date: 2012-06-25
DOI: 10.2312/EGGH/HPG12/001-011
J. Nilsson, Petrik Clarberg, Björn A. Johnsson, Jacob Munkberg, J. Hasselgren, Róbert Tóth, Marco Salvi, T. Akenine-Möller
This paper assumes the availability of a very fast higher-dimensional rasterizer in future graphics processors. Working in up to five dimensions, i.e., adding time and lens parameters, it is well known that such a rasterizer can render scenes with both motion blur and depth of field. Our hypothesis is that it can also serve as a flexible tool for other, less conventional uses, much as the two-dimensional rasterizer in contemporary graphics processors has been used for purposes far removed from its original intent. We show six such examples: continuous collision detection, caustics rendering, higher-dimensional sampling, glossy reflections and refractions, motion-blurred soft shadows, and multi-view rendering. The insights gained from these examples are used to put together a coherent model of what a future graphics pipeline that supports these and other use cases should look like. Our work intends to provide inspiration and motivation for hardware and API design, as well as continued research in higher-dimensional rasterization and its uses.
Title: Design and novel uses of higher-dimensional rasterization. Venue: EGGH-HPG'12.
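The five-dimensional setting in the abstract above (screen position plus time and two lens parameters) can be illustrated with a minimal coverage test. This is only a sketch under stated assumptions: vertices move linearly between `t = 0` and `t = 1`, and the lens is modelled with a simplified circle-of-confusion term. The function names (`edge`, `project`, `covers`), the `aperture` and `focus` parameters, and the formula `aperture * (z - focus) / z` are illustrative choices, not taken from the paper.

```python
def edge(a, b, p):
    """Signed area term: its sign tells on which side of edge a->b the point p lies."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def project(v_t0, v_t1, u, v, t, aperture, focus):
    """Project one moving vertex for a given time t and lens sample (u, v).

    Assumptions: linear motion between the two key positions, and a
    simplified thin-lens shift that vanishes at the focus distance.
    """
    x, y, z = ((1.0 - t) * a + t * b for a, b in zip(v_t0, v_t1))
    coc = aperture * (z - focus) / z  # hypothetical circle-of-confusion model
    return (x / z + u * coc, y / z + v * coc)


def covers(tri_t0, tri_t1, sample, aperture=0.0, focus=1.0):
    """Return True if the 5D sample (x, y, u, v, t) is covered by the triangle."""
    x, y, u, v, t = sample
    a, b, c = (project(p0, p1, u, v, t, aperture, focus)
               for p0, p1 in zip(tri_t0, tri_t1))
    w0 = edge(b, c, (x, y))
    w1 = edge(c, a, (x, y))
    w2 = edge(a, b, (x, y))
    # Accept either winding order: all edge terms must share a sign.
    return (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)


# A triangle that slides five units to the right during the shutter interval.
moving_t0 = [(-1.0, -1.0, 1.0), (1.0, -1.0, 1.0), (0.0, 1.0, 1.0)]
moving_t1 = [(4.0, -1.0, 1.0), (6.0, -1.0, 1.0), (5.0, 1.0, 1.0)]

# A static, out-of-focus triangle at depth 2 (focus plane at depth 1).
static = [(-1.0, -1.0, 2.0), (1.0, -1.0, 2.0), (0.0, 1.0, 2.0)]
```

With this setup, the same pixel centre is covered at the start of the shutter interval but not at the end, and an out-of-focus triangle is covered for some lens samples and missed for others, which is the behaviour that produces motion blur and depth of field when many samples are averaged.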
Pub Date: 2012-06-25
DOI: 10.2312/EGGH/HPG12/049-055
Nicolas Feltman, Minjae Lee, K. Fatahalian
We derive the Shadow Ray Distribution Heuristic (SRDH), an accurate cost estimator for shadow ray traversal through a bounding volume hierarchy (BVH). The SRDH leverages up-front knowledge of the distribution and intersection results of previously traced shadow rays to construct a shadow-ray-specialized BVH and to choose an associated traversal order policy, which together promote early termination by quickly finding occlusions. In scenes containing large amounts of occlusion, SRDH reduces the number of BVH node traversal steps needed for shadow computations by between 22% and 56% compared to average-case traversal through SAH-constructed trees. Evaluating the SRDH using a sparse shadow ray set recorded from a 16 × 16 pixel rendering of the scene consistently produces BVHs whose traversal cost is within 6% of those built when all shadow rays are available to the metric at the time of construction. The benefits of the SRDH come at the cost of storing an additional BVH in memory and a 2.4× increase (on average) in BVH construction time.
Title: SRDH: specializing BVH construction and traversal order using representative shadow ray sets. Venue: EGGH-HPG'12.
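The idea of choosing a traversal order from previously traced shadow rays can be illustrated with a much-simplified sketch. It is not the SRDH estimator from the paper: it only compares the two possible visiting orders of a single two-child node, assuming each recorded ray already stores whether it hits each child's bounds and whether it finds an occluder inside. The names `order_cost` and `best_order`, the child labels `"A"` and `"B"`, and the per-child `cost` values are hypothetical.

```python
def order_cost(rays, first, second, cost):
    """Average traversal cost when visiting child `first` before child `second`.

    Shadow rays only need any occluder, so a ray stops as soon as the
    first visited child reports an occlusion (early termination).
    """
    if not rays:
        return 0.0
    total = 0.0
    for ray in rays:
        if ray["hits"][first]:
            total += cost[first]
            if ray["occluded"][first]:
                continue  # occluder found, the second child is never visited
        if ray["hits"][second]:
            total += cost[second]
    return total / len(rays)


def best_order(rays, cost):
    """Pick the cheaper of the two visiting orders for the recorded ray set."""
    ab = order_cost(rays, "A", "B", cost)
    ba = order_cost(rays, "B", "A", cost)
    return ("A", "B", ab) if ab <= ba else ("B", "A", ba)


# Hypothetical recorded ray set: three rays overlap both children but are
# only occluded inside B; one ray overlaps A alone and is unoccluded.
recorded = [
    {"hits": {"A": True, "B": True}, "occluded": {"A": False, "B": True}},
    {"hits": {"A": True, "B": True}, "occluded": {"A": False, "B": True}},
    {"hits": {"A": True, "B": True}, "occluded": {"A": False, "B": True}},
    {"hits": {"A": True, "B": False}, "occluded": {"A": False, "B": False}},
]
subtree_cost = {"A": 4.0, "B": 4.0}
```

For this ray set, visiting B first lets three of the four rays terminate without ever entering A, so the B-first order has the lower average cost even though both subtrees are equally expensive.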
Pub Date: 2012-06-25
DOI: 10.2312/EGGH/HPG12/087-096
Ola Olsson, M. Billeter, Ulf Assarsson
This paper presents and investigates Clustered Shading for deferred and forward rendering. In Clustered Shading, view samples with similar properties (e.g. 3D position and/or normal) are grouped into clusters. This is comparable to tiled shading, where view samples are grouped into tiles based on 2D position only. We show that Clustered Shading creates a better mapping of light sources to view samples than tiled shading, resulting in a significant reduction of lighting computations during shading. Additionally, Clustered Shading enables using normal information to perform per-cluster back-face culling of lights, again reducing the number of lighting computations. We also show that Clustered Shading not only outperforms tiled shading in many scenes, but also exhibits better worst-case behaviour under tricky conditions (e.g. when looking at high-frequency geometry with large discontinuities in depth). Finally, Clustered Shading enables real-time scenes with two to three orders of magnitude more lights than previously feasible (up to around one million light sources).
Title: Clustered deferred and forward shading. Venue: EGGH-HPG'12.
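The difference between tiles (2D position only) and clusters (3D position) described in the abstract above can be sketched as a key computation. This is a minimal illustration, assuming screen-space tiles combined with logarithmically spaced depth slices. The function names and the parameters `tile_size`, `near`, `far`, and `depth_slices` are assumptions made for this example, and the exact subdivision used in the paper may differ.

```python
import math
from collections import defaultdict


def cluster_key(px, py, view_depth, tile_size=64, near=0.1, far=100.0,
                depth_slices=16):
    """Map a view sample to an integer cluster key (i, j, k).

    (i, j) is the screen tile, as in tiled shading; k is a depth slice,
    spaced logarithmically between the assumed near and far distances.
    """
    i = px // tile_size
    j = py // tile_size
    d = min(max(view_depth, near), far)  # clamp into the valid depth range
    k = int(math.log(d / near) / math.log(far / near) * depth_slices)
    k = min(k, depth_slices - 1)  # d == far would otherwise land in slice N
    return (i, j, k)


def group_samples(samples, **params):
    """Group sample indices by cluster key; samples are (px, py, depth) tuples."""
    clusters = defaultdict(list)
    for index, (px, py, depth) in enumerate(samples):
        clusters[cluster_key(px, py, depth, **params)].append(index)
    return dict(clusters)


# Two samples in the same screen tile on either side of a depth discontinuity,
# plus a third sample close to the first one.
samples = [(10, 20, 1.0), (12, 22, 50.0), (11, 21, 1.1)]
clusters = group_samples(samples)
```

Samples 0 and 2 end up in the same cluster while sample 1 gets its own, so lights that only reach the distant surface need not be evaluated for the nearby one. A purely 2D tile would have put all three samples in one group.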
Pub Date: 2012-06-25
DOI: 10.2312/EGGH/HPG12/097-103
M. McGuire, Michael Mara, D. Luebke
This paper presents a set of architecture-aware performance and integration improvements for a recent screen-space ambient obscurance algorithm. These improvements collectively produce a 7× performance increase at 2560 × 1600, generalize the algorithm to both forward and deferred renderers, and eliminate the radius- and scene-dependence of the previous algorithm to provide a hard real-time guarantee of fixed execution time. The optimizations build on three strategies: pre-filter the depth buffer to maximize memory hierarchy efficiency; reduce total bandwidth by carefully reconstructing positions and normals at high precision from a depth buffer; and exploit low-level intra- and inter-thread techniques for parallel, floating-point architectures.
Title: Scalable ambient obscurance. Venue: EGGH-HPG'12.
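Reconstructing positions and normals from a depth buffer, one of the strategies named in the abstract above, can be sketched in a few lines. This is an illustrative version only, assuming a pinhole camera at the origin looking down the negative z axis, a buffer of positive linear depths indexed as `depth[py][px]`, and a vertical field of view `fov_y` in radians. The function names and conventions are assumptions for this example and do not reproduce the paper's shader code.

```python
import math


def reconstruct_position(px, py, depth, width, height, fov_y):
    """View-space position of the pixel centre (px, py) at linear depth `depth`."""
    tan_half = math.tan(fov_y / 2.0)
    aspect = width / height
    ndc_x = (px + 0.5) / width * 2.0 - 1.0   # -1 at left edge, +1 at right edge
    ndc_y = 1.0 - (py + 0.5) / height * 2.0  # +1 at top edge, -1 at bottom edge
    return (ndc_x * aspect * tan_half * depth, ndc_y * tan_half * depth, -depth)


def reconstruct_normal(px, py, depth, fov_y):
    """Face normal from the positions of a pixel, its right and its upper neighbour.

    Requires px + 1 and py - 1 to be valid indices into the depth buffer.
    """
    height, width = len(depth), len(depth[0])
    p = reconstruct_position(px, py, depth[py][px], width, height, fov_y)
    r = reconstruct_position(px + 1, py, depth[py][px + 1], width, height, fov_y)
    u = reconstruct_position(px, py - 1, depth[py - 1][px], width, height, fov_y)
    ax, ay, az = (r[0] - p[0], r[1] - p[1], r[2] - p[2])
    bx, by, bz = (u[0] - p[0], u[1] - p[1], u[2] - p[2])
    # Cross product of the two screen-aligned tangents.
    n = (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)
    length = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    return (n[0] / length, n[1] / length, n[2] / length)


# A flat wall two units in front of the camera, seen through a 3 x 3 buffer.
flat_wall = [[2.0] * 3 for _ in range(3)]
centre = reconstruct_position(1, 1, flat_wall[1][1], 3, 3, math.pi / 2)
normal = reconstruct_normal(1, 1, flat_wall, math.pi / 2)
```

Storing only depth and rebuilding position and normal on the fly trades a little arithmetic for reduced memory traffic, which is the bandwidth argument the abstract makes. For the flat wall, the centre pixel reconstructs to a point on the view axis and the normal points back toward the camera.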