Paper title
Fast GPU bounding boxes on tree-structured scenes
Paper authors
Paper abstract
Computation of bounding boxes is a fundamental problem in high-performance rendering, as it is an input to visibility culling and binning operations. In a scene description structured as a tree, clip nodes and blend nodes entail intersection and union of bounding boxes, respectively. These are straightforward to compute on the CPU using a sequential algorithm, but an efficient, parallel GPU algorithm is more elusive. This paper presents a fast and practical solution, with a new algorithm for the classic parentheses matching problem at its core. The core algorithm is presented abstractly (in terms of a PRAM abstraction), then with a concrete mapping to the thread, workgroup, and dispatch levels of real GPU hardware. The algorithm is implemented portably using compute shaders, and performance results show a dramatic speedup over a sequential CPU version, reaching a reasonable fraction of the maximum theoretical throughput of the GPU hardware. The immediate motivating application is 2D rendering, but the algorithms generalize to other domains, and the core parentheses matching problem has other applications, including parsing.
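For orientation, the sequential CPU baseline that the abstract calls straightforward might look like the minimal Python sketch below. It is not the paper's code or its GPU algorithm. The flattened stream encoding (`push_clip`, `push_blend`, `draw`, `pop`), the rectangle layout `(x0, y0, x1, y1)`, and all function names are assumptions made for illustration. A stack matches each `pop` to its opening element (the parentheses matching step). A blend node takes the union of its children's boxes, and a clip node intersects that union with its clip rectangle.

```python
from math import inf

# Hypothetical encoding: rectangles are (x0, y0, x1, y1) tuples.
EMPTY = (inf, inf, -inf, -inf)  # identity element for union


def union(a, b):
    """Smallest rectangle containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def intersect(a, b):
    """Overlap of a and b, normalized to EMPTY when they do not overlap."""
    r = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return r if r[0] < r[2] and r[1] < r[3] else EMPTY


def scene_bboxes(stream):
    """Sequential pass over a flattened scene tree (assumed encoding).

    Returns (matches, root_bbox), where matches maps the index of each
    opening element to (index of its matching pop, bounding box).
    """
    stack = []    # entries: [open_index, clip_rect or None, accumulated bbox]
    matches = {}
    root = EMPTY
    for i, element in enumerate(stream):
        kind = element[0]
        if kind == "push_clip":
            stack.append([i, element[1], EMPTY])
            continue
        if kind == "push_blend":
            stack.append([i, None, EMPTY])
            continue
        if kind == "draw":
            bbox = element[1]
        elif kind == "pop":
            if not stack:
                raise ValueError(f"unmatched pop at index {i}")
            open_index, clip, acc = stack.pop()
            # Blend: union of children. Clip: that union cut to the clip rect.
            bbox = acc if clip is None else intersect(acc, clip)
            matches[open_index] = (i, bbox)
        else:
            raise ValueError(f"unknown element kind {kind!r}")
        # Propagate the finished box to the enclosing node, if any.
        if stack:
            stack[-1][2] = union(stack[-1][2], bbox)
        else:
            root = union(root, bbox)
    if stack:
        raise ValueError("unclosed node in stream")
    return matches, root


stream = [
    ("push_blend",),
    ("draw", (0, 0, 10, 10)),
    ("push_clip", (5, 5, 20, 20)),
    ("draw", (0, 0, 8, 8)),
    ("draw", (15, 15, 30, 30)),
    ("pop",),
    ("pop",),
]
matches, root = scene_bboxes(stream)
```

Each step of this loop depends on the stack state left by the previous one, and that serial dependency is what makes a parallel formulation harder to find.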