
Latest publications from EGGH-HPG'12

Maximizing parallelism in the construction of BVHs, octrees, and k-d trees
Pub Date : 2012-06-25 DOI: 10.2312/EGGH/HPG12/033-037
Tero Karras
A number of methods for constructing bounding volume hierarchies and point-based octrees on the GPU are based on the idea of ordering primitives along a space-filling curve. A major shortcoming with these methods is that they construct levels of the tree sequentially, which limits the amount of parallelism that they can achieve. We present a novel approach that improves scalability by constructing the entire tree in parallel. Our main contribution is an in-place algorithm for constructing binary radix trees, which we use as a building block for other types of trees.
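For context on the space-filling-curve ordering these methods share: the curve is typically a Morton (Z-order) curve, obtained by quantizing each coordinate and interleaving the bits. A sketch of the standard construction (illustrative, not code from the paper; `morton3d` and the 10-bit quantization are the common convention, not the paper's exact API):

```python
def expand_bits(v):
    # Spread the lower 10 bits of v two positions apart
    # (standard bit-interleaving magic constants).
    v = (v * 0x00010001) & 0xFF0000FF
    v = (v * 0x00000101) & 0x0F00F00F
    v = (v * 0x00000011) & 0xC30C30C3
    v = (v * 0x00000005) & 0x49249249
    return v

def morton3d(x, y, z):
    # 30-bit Morton code for a point in the unit cube [0, 1).
    q = lambda c: min(max(int(c * 1024.0), 0), 1023)
    return (expand_bits(q(x)) << 2) | (expand_bits(q(y)) << 1) | expand_bits(q(z))

# Sorting primitives by the Morton code of their centroids orders them
# along the Z-order curve; hierarchy levels then correspond to common
# prefixes of adjacent codes.
```

Sorting by these codes is the sequential-levels baseline the abstract improves on; the in-place binary radix tree construction itself is the paper's contribution.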
Citations: 205
High-quality parallel depth-of-field using line samples
Pub Date : 2012-06-25 DOI: 10.2312/EGGH/HPG12/023-031
Stanley Tzeng, Anjul Patney, A. Davidson, Mohamed S. Ebeida, S. Mitchell, John Douglas Owens
We present a parallel method for rendering high-quality depth-of-field effects using continuous-domain line samples, and demonstrate its high performance on commodity GPUs. Our method runs at interactive rates and has very low noise. Our exploration of the problem carefully considers implementation alternatives, and transforms an originally unbounded storage requirement to a small fixed requirement using heuristics to maintain quality. We also propose a novel blur-dependent level-of-detail scheme that helps accelerate rendering without undesirable artifacts. Our method consistently runs 4 to 5x faster than an equivalent point sampler, with better image quality. Our method draws parallels to related work in rendering multi-fragment effects.
Citations: 19
Adaptive scalable texture compression
Pub Date : 2012-06-25 DOI: 10.2312/EGGH/HPG12/105-114
J. Nystad, Anders Lassen, Andrew Pomianowski, Sean Ellis, T. Olson
We describe a fixed-rate, lossy texture compression system that is designed to offer an unusual degree of flexibility and to support a very wide range of use cases, while providing better image quality than most formats in common use today. The system supports both 2D and 3D textures, at both standard and high dynamic range, at bit rates ranging from eight bits per pixel down to less than one bit per pixel in very fine steps. At any bit rate, texels can have from one to four color components. The system's flexibility results from a number of novel features. Color spaces and weights are represented using an encoding scheme that allows flexible allocation of bits between different types of information. The system uses bilinear interpolation to derive color space coordinates for a texel from sparse samples, and uses a procedural partition function to map texels to color spaces.
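The format described here was later standardized as ASTC. Each compressed block occupies a fixed 128 bits, so the bit rate is determined purely by the block footprint, which is where the fine-grained range from eight bits per pixel down to less than one comes from. A quick illustration of that arithmetic (footprint list taken from the standard 2D square footprints, as an assumption about which sizes the abstract refers to):

```python
BLOCK_BITS = 128  # each compressed block is 128 bits regardless of footprint

def bits_per_pixel(width, height):
    # Bit rate follows directly from the block footprint.
    return BLOCK_BITS / (width * height)

# Square 2D footprints span 4x4 (8 bpp) down to 12x12 (~0.89 bpp).
for w, h in [(4, 4), (5, 5), (6, 6), (8, 8), (10, 10), (12, 12)]:
    print(f"{w}x{h}: {bits_per_pixel(w, h):.2f} bpp")
```

Varying the footprint rather than the block size is what lets decoders keep fixed-rate random access while offering many rates.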
Citations: 86
kANN on the GPU with shifted sorting
Pub Date : 2012-06-25 DOI: 10.2312/EGGH/HPG12/039-047
Shengren Li, L. Simons, Jagadeesh Bhaskar Pakaravoor, Fatemeh Abbasinejad, John Douglas Owens, N. Amenta
We describe the implementation of a simple method for finding k approximate nearest neighbors (ANNs) on the GPU. While the performance of most ANN algorithms depends heavily on the distributions of the data and query points, our approach has a very regular data access pattern. It performs as well as state of the art methods on easy distributions with small values of k, and much more quickly on more difficult problem instances. Irrespective of the distribution, and roughly irrespective of the size of the input data set, we can find 50 ANNs for 1M queries at a rate of about 1200 queries/ms.
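The "shifted sorting" idea behind this regular access pattern: under a few random translations of the data, points close in space tend to be close in Morton order, so candidate neighbors can be read from a small window of a sorted array. A CPU sketch of that scheme in 2D (the name `kann` and the `shifts`/`window` parameters are our illustration, not the paper's GPU implementation; coordinates are assumed to lie in [0, 1)):

```python
import random
from bisect import bisect_left

def interleave2(x, y):
    # 2D Morton code for 16-bit integer coordinates.
    code = 0
    for i in range(16):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

def kann(points, queries, k, shifts=4, window=8, seed=1):
    # For each random shift, sort points by Morton code and collect, per
    # query, the points in a small window around its sorted position.
    rng = random.Random(seed)
    candidates = [set() for _ in queries]
    for _ in range(shifts):
        sx, sy = rng.random(), rng.random()
        code = lambda p: interleave2(int((p[0] + sx) * 32768),
                                     int((p[1] + sy) * 32768))
        order = sorted(range(len(points)), key=lambda i: code(points[i]))
        keys = [code(points[i]) for i in order]
        for qi, q in enumerate(queries):
            pos = bisect_left(keys, code(q))
            candidates[qi].update(order[max(0, pos - window):pos + window])
    dist2 = lambda a, b: (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    # Exact k-NN restricted to the candidate set -> approximate k-NN overall.
    return [sorted(c, key=lambda i: dist2(points[i], q))[:k]
            for q, c in zip(queries, candidates)]
```

The sort and the fixed-size window scans are what map well to the GPU: every query touches the same amount of data regardless of the point distribution.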
Citations: 20
Representing appearance and pre-filtering subpixel data in sparse voxel octrees
Pub Date : 2012-06-25 DOI: 10.2312/EGGH/HPG12/125-134
E. Heitz, Fabrice Neyret
Sparse Voxel Octrees (SVOs) efficiently represent complex geometry on current GPUs. Although LoDs come naturally with octrees, interpolating and filtering SVOs remain open issues in current approaches. In this paper, we propose a representation for the appearance of a detailed surface with associated attributes stored within a voxel octree. We store macro- and micro-descriptors of the surface shape and associated attributes in each voxel. We represent the surface macroscopically with a signed distance field and we encode subvoxel microdetails with Gaussian descriptors of the surface and attributes within the voxel. Our voxels form a continuous field interpolated through space and scales, through which we cast conic rays. Within the ray marching steps, we compute the occlusion distribution produced by the macro-surface inside a pixel footprint, we use the microdescriptors to reconstruct light- and view-dependent shading, and we combine fragments in an A-buffer way. Our representation efficiently accounts for various subpixel effects. It can be continuously interpolated and filtered, it is scalable, and it allows for efficient depth-of-field. We illustrate the quality of these various effects by displaying surfaces at different scales, and we show that the timings per pixel are scale-independent.
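Querying a per-voxel signed distance field like the macroscopic surface above comes down to trilinearly interpolating the eight corner distances of a voxel, with the surface lying at the zero crossing. A minimal sketch of that interpolation, independent of the paper's GPU data layout (the corner-array convention here is our own):

```python
def trilerp(c, x, y, z):
    # c[i][j][k]: signed distance at voxel corner (i, j, k), i/j/k in {0, 1};
    # (x, y, z): local coordinates inside the voxel, each in [0, 1].
    lerp = lambda a, b, t: a + (b - a) * t
    front = lerp(lerp(c[0][0][0], c[0][0][1], z),
                 lerp(c[0][1][0], c[0][1][1], z), y)
    back = lerp(lerp(c[1][0][0], c[1][0][1], z),
                lerp(c[1][1][0], c[1][1][1], z), y)
    return lerp(front, back, x)

# A planar field d(x) = x - 0.5: corners at i=0 hold -0.5, at i=1 hold +0.5,
# so the interpolated distance crosses zero at x = 0.5.
plane = [[[-0.5, -0.5], [-0.5, -0.5]], [[0.5, 0.5], [0.5, 0.5]]]
```

Because the interpolant is continuous across voxel faces when corners are shared, the field can be filtered and sampled at any scale, which is the property the abstract relies on.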
Citations: 39
Adaptive image space shading for motion and defocus blur
Pub Date : 2012-06-25 DOI: 10.2312/EGGH/HPG12/013-021
K. Vaidyanathan, Róbert Tóth, Marco Salvi, S. Boulos, A. Lefohn
We present a novel anisotropic sampling algorithm for image space shading which builds upon recent advancements in decoupled sampling for stochastic rasterization pipelines. First, we analyze the frequency content of a pixel in the presence of motion and defocus blur. We use this analysis to derive bounds for the spectrum of a surface defined over a two-dimensional and motion-aligned shading space. Second, we present a simple algorithm that uses the new frequency bounds to reduce the number of shaded quads and the size of decoupling cache respectively by 2X and 16X, while largely preserving image detail and minimizing additional aliasing.
Citations: 20
Parallel patch-based texture synthesis
Pub Date : 2012-06-25 DOI: 10.2312/EGGH/HPG12/115-124
A. Lasram, S. Lefebvre
Fast parallel algorithms exist for pixel-based texture synthesizers. Unfortunately, these synthesizers often fail to preserve structures from the exemplar without the user specifying additional feature information. In contrast, patch-based synthesizers are better at capturing and preserving structural patterns. However, they require relatively slow algorithms to lay out the patches and stitch them together. We present a parallel patch-based texture synthesis technique that achieves a high degree of parallelism. Our synthesizer starts from a low-quality result and adds several patches in parallel to improve it. It selects patches that blend in a seamless way with the existing result, and that hide existing visual artifacts. This is made possible through two main algorithmic contributions: An algorithm to quickly find a good cut around a patch, and a deformation algorithm to further align features crossing the patch boundary. We show that even with a uniform parallel random sampling of the patches, our improved patch stitching achieves high quality synthesis results. We discuss several synthesis strategies, such as using patches of decreasing size or using various amounts of deformation during the optimization. We propose a complete implementation tuned to take advantage of massive GPU parallelism.
Citations: 31
Power efficiency for software algorithms running on graphics processors
Pub Date : 2012-06-25 DOI: 10.2312/EGGH/HPG12/067-075
Björn A. Johnsson, P. Ganestam, M. Doggett, T. Akenine-Möller
Power efficiency has become the most important consideration for many modern computing devices. In this paper, we examine power efficiency of a range of graphics algorithms on different GPUs. To measure power consumption, we have built a power measuring device that samples currents at a high frequency. Comparing power efficiency of different graphics algorithms is done by measuring power and performance of three different primary rendering algorithms and three different shadow algorithms. We measure these algorithms' power signatures on a mobile phone, on an integrated CPU and graphics processor, and on high-end discrete GPUs, and then compare power efficiency across both algorithms and GPUs. Our results show that power efficiency is not always proportional to rendering performance and that, for some algorithms, power efficiency varies across different platforms. We also show that for some algorithms, energy efficiency is similar on all platforms.
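Given the high-frequency current samples such a measuring device produces, energy is the numeric integral of power over the run, and an efficiency figure is work per joule. A sketch with made-up numbers (the 12 V rail, 10 kHz sample rate, and frame count below are hypothetical, not measurements from the paper):

```python
def energy_joules(current_samples_a, voltage_v, sample_rate_hz):
    # E = integral of P dt, with P = V * I and dt = 1 / sample_rate.
    dt = 1.0 / sample_rate_hz
    return sum(voltage_v * i * dt for i in current_samples_a)

# Hypothetical trace: a 12 V rail drawing a constant 2 A for one second,
# sampled at 10 kHz -> about 24 J; rendering 30 frames in that second
# gives an efficiency of 1.25 frames per joule.
samples = [2.0] * 10_000
joules = energy_joules(samples, 12.0, 10_000)
frames_per_joule = 30 / joules
print(joules, frames_per_joule)
```

Dividing work (frames, shadow queries, etc.) by joules rather than by seconds is what makes results comparable across a phone, an integrated GPU, and a discrete GPU.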
Citations: 22
Algorithm and VLSI architecture for real-time 1080p60 video retargeting
Pub Date : 2012-06-25 DOI: 10.2312/EGGH/HPG12/057-066
P. Greisen, Manuel Lang, Simon Heinzle, A. Smolic
Aspect ratio retargeting for streaming video has actively been researched in the past years. While the mobile market with its huge diversity of screen formats is one of the most promising application areas, no existing algorithm is efficient enough to be embedded in such devices. In this work, we devise an efficient video retargeting algorithm by following an algorithm-architecture co-design approach and we present the first FPGA implementation that is able to retarget full HD 1080p video at up to 60 frames per second. We furthermore show that our algorithm can be implemented on embedded processors at interactive framerates. Our hardware architecture only requires a modest amount of hardware resources, and is portable to a dedicated ASIC for the use in consumer electronic devices such as displays or mobile phones.
Citations: 19
Reducing aliasing artifacts through resampling
Pub Date : 2012-06-25 DOI: 10.2312/EGGH/HPG12/077-086
A. Reshetov
Post-processing antialiasing methods are well suited for deferred shading because they decouple antialiasing from the rest of the graphics pipeline. In morphological methods, the final image is filtered with a data-dependent filter. The filter coefficients are computed by analyzing the non-local neighborhood of each pixel. Though very simple and efficient, such methods have intrinsic quality limitations due to spatial undersampling and temporal aliasing. We explore an alternative formulation in which filter coefficients are computed locally for each pixel by supersampling geometry, while shading is still done only once per pixel. During pre-processing, each geometric subsample is converted to a single bit indicating whether the subsample is different from the central one. The ensuing binary mask is then used in the post-processing step to retrieve filter coefficients, which were precomputed for all possible masks. For a typical 8 subsamples, this results in sub-millisecond performance, while improving the image quality by about 10 dB. To compare subsamples, we use a novel symmetric angular measure, which has a simple geometric interpretation. We propose to use this measure in a variety of applications that assess the difference between geometric samples (rendering, mesh simplification, geometry encoding, adaptive tessellation).
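The pre-/post-process split described above can be sketched as follows: each of the (typically 8) geometric subsamples contributes one bit saying whether it differs from the central sample, and the resulting 8-bit mask indexes a table of coefficients precomputed for all 256 masks. The `differ` predicate and the table contents below are placeholders (the paper's comparison uses its symmetric angular measure, which is not reproduced here):

```python
def subsample_mask(subsamples, center, differ):
    # Pack one bit per geometric subsample: 1 if it differs from the center.
    mask = 0
    for i, s in enumerate(subsamples):
        if differ(s, center):
            mask |= 1 << i
    return mask

# Placeholder table: one filter weight per possible 8-bit mask, here simply
# the fraction of subsamples that differ (a real table is precomputed
# offline from the mask's edge configuration).
WEIGHTS = [bin(m).count("1") / 8.0 for m in range(256)]

def edge_weight(subsamples, center, differ):
    # Post-process lookup: the mask selects a precomputed coefficient.
    return WEIGHTS[subsample_mask(subsamples, center, differ)]
```

Reducing each subsample to one bit is what keeps the post-process cheap: the expensive geometric comparison happens once, and filtering becomes a table lookup.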
Citations: 11