Cache-Efficient Layouts of Bounding Volume Hierarchies (BVHs)
Sung-Eui Yoon
Lawrence Livermore National Laboratory
Dinesh Manocha
Univ. of North Carolina at Chapel Hill
Goal
● Compute cache-coherent layouts of bounding volume hierarchies (BVHs)
  ● For various geometric applications
  ● Handles any kind of BVH and spatial partitioning hierarchy (e.g., kd-tree)
Bounding Volume Hierarchies (BVHs)
● Widely used data structures in:
  ● Ray tracing
  ● Collision detection
  ● Visibility culling
[Figures: ray tracing; dynamic simulation]
Bounding Volumes (BVs)
● Axis-aligned bounding boxes (AABBs)
● Oriented bounding boxes (OBBs) [Gottschalk et al. 96]
● Spheres [Hubbard 93]
● Discrete orientation polytopes (k-DOPs) [Klosowski et al. 98]
[Figure: triangles of a mesh]
Layout of BVHs
● Nodes (and triangles) of BVHs are stored in arrays
  ● What is a good layout?
  ● How to compute cache-efficient layouts?
[Figure: a layout method maps the BVH nodes A, B, C, D, E to a 1D layout of nodes: A B C D E]
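As a small illustration of what "layout" means here, the sketch below stores the nodes of a tree in a 1D array in two common orders and counts how many cache blocks a root-to-leaf path touches. The tree shape, the `children` table, and the helper `blocks_touched` are hypothetical and chosen only for illustration; they are not taken from the talk.

```python
from collections import deque

# Hypothetical 5-node hierarchy; the shape is an assumption for illustration.
children = {"A": ["B", "C"], "B": ["D", "E"], "C": [], "D": [], "E": []}

def breadth_first_layout(root):
    """Return the nodes in breadth-first (level) order."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(children[node])
    return order

def depth_first_layout(root):
    """Return the nodes in depth-first (pre-order) order."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the left child is visited first.
        stack.extend(reversed(children[node]))
    return order

def blocks_touched(layout, accessed, block_size):
    """Count distinct blocks holding the accessed nodes (block_size nodes per block)."""
    index = {node: i for i, node in enumerate(layout)}
    return len({index[n] // block_size for n in accessed})

bfl = breadth_first_layout("A")
dfl = depth_first_layout("A")
path = ["A", "B", "D"]  # one root-to-leaf traversal
print(bfl, blocks_touched(bfl, path, block_size=3))
print(dfl, blocks_touched(dfl, path, block_size=3))
```

For this particular path and block size the depth-first order keeps the accessed nodes in fewer blocks; other access patterns can favor other orders, which is why the layout is worth choosing carefully.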
Motivation
● Lower growth rate of data access speed
[Figure: growth rate during 1993–2004: disk access speed 1.5X, RAM access speed 20X, CPU speed 46X]
Courtesy: http://www.hcibook.com/e3/online/moores-law/
Memory Hierarchies and Caches
[Figure: CPU, fast memory or cache, slow memory, and disk, connected by block transfers; access times shown: 10^-6 sec, 10^-4 sec, 1 sec]
Main Contributions
● An algorithm computing cache-efficient layouts of BVHs
  ● Probabilistic model
  ● Simple layout construction method
● Applicable to spatial partitioning hierarchies
Related Work
● Mesh layouts
  ● Cache-coherent layouts of meshes and graphs [Yoon et al. 05, Yoon and Lindstrom 06]
  ● Require an input graph that represents access patterns on a BVH
● Layouts of search trees
  ● [Gil and Itai 99, Alstrup et al. 03]
  ● Require the probability that each node will be accessed
● Layouts of BVHs
  ● Studied in collision detection [Ericson 04] and ray tracing [Havran 97]
  ● Blocking-based layouts [Terdiman 03, van Emde Boas 77]
Outline
● Probabilistic model
● Layout computation
● Results
Traversals of Collision Queries on BVHs
● Takes two objects
  ● Two 3D objects for collision detection
  ● One 3D object and one ray for ray tracing
[Figure: BVH 1 and BVH 2]
Two Localities
● Parent-child locality
● Spatial locality
Parent-child Locality
[Figure: nodes A and B in BVH 1 and BVH 2]
Spatial Locality
[Figure: nodes C, D, and E in BVH 1 and BVH 2]
Probabilistic Model
● Quantify localities in a uniform way
● Measure the probability for localities
● Based on geometric relationships between bounding volumes
Probabilistic Model
● Pr(n)
  ● Probability that a node, n, will be accessed during runtime traversal
● Two major factors
  ● Prob. that p is accessed
  ● Conditional prob. that p is also intersected given g is intersected
$$\Pr(n) = \Pr(p)\,\Pr(X_p = 1 \mid X_g = 1),$$
where $X_p$ (or $X_g$) is a boolean random variable indicating collision between p (or g) and b
[Figure: nodes g, p, and n with an object b; g is intersected, p is accessed and intersected]
Probability Computation
● $\Pr(X_p = 1 \mid X_g = 1)$: conditional prob. that p is also intersected given g is intersected
  ● No information about b is known
[Figure: g and p, both intersected by b, and node n]
Contact Space
● Contact space of b against p and g
  ● Denoted as $S_p$ and $S_g$
$$\Pr(X_p = 1 \mid X_g = 1) = \frac{\mathrm{Vol}(S_p \cap S_g)}{\mathrm{Vol}(S_g)}$$
[Figure: contact spaces $S_p$ and $S_g$ of b against the intersected nodes p and g, and their overlap $S_p \cap S_g$]
Contact Space
● Assume b is a sphere
  ● Computed from Minkowski sum
● Configuration space, in general
  ● Too expensive to compute
[Figure: $S_p$, $S_g$, and $S_p \cap S_g$ for a spherical b]
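To make the sphere case concrete, here is a minimal sketch for one special case where the contact space has a closed form: an axis-aligned box and a sphere of radius r. The Minkowski sum of a box with extents (a, b, c) and a ball has volume abc + 2r(ab + bc + ca) + πr²(a + b + c) + (4/3)πr³ (the Steiner formula). The function names are hypothetical, and the sketch assumes that box p lies entirely inside box g, so that the overlap of the two contact spaces is simply the contact space of p.

```python
import math

def contact_space_volume(extents, r):
    """Volume of the Minkowski sum of an axis-aligned box and a ball of radius r."""
    a, b, c = extents
    return (a * b * c                          # the box itself
            + 2.0 * r * (a * b + b * c + c * a)  # slabs over the six faces
            + math.pi * r * r * (a + b + c)      # quarter-cylinders along the edges
            + 4.0 / 3.0 * math.pi * r ** 3)      # sphere octants at the corners

def conditional_probability(p_extents, g_extents, r):
    """Vol(S_p) / Vol(S_g), assuming box p is contained in box g."""
    return contact_space_volume(p_extents, r) / contact_space_volume(g_extents, r)

# With r = 0 the sphere degenerates to a point and the ratio is a plain volume ratio.
print(conditional_probability((1, 1, 1), (2, 2, 2), r=0.0))
print(conditional_probability((1, 1, 1), (2, 2, 2), r=1.0))
```

As r grows, the ratio moves toward 1, since a large sphere that touches g almost always touches p as well.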
Approximate Probability Computation
● Assumes “b” to be a point, a degenerate case
  ● Exact value is not required
  ● Only 5% incorrect decisions compared to considering many other cases
● Surface area heuristics (SAH) [MacDonald and Booth 90, Havran 00]
  ● Equivalent to our approximation
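One way to read the point approximation: when b is a point, the contact space of a bounding volume is the volume itself, so the conditional probability reduces to a volume ratio. The sketch below propagates Pr(n) = Pr(p) · Pr(X_p = 1 | X_g = 1) top-down under that reading. The `Node` class, the use of plain volumes, and the convention that the root is always accessed and intersected are assumptions made for illustration, not details confirmed by the talk.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    volume: float                       # volume of this node's bounding volume
    children: list = field(default_factory=list)
    prob: float = 0.0                   # estimated access probability Pr(n)

def assign_probabilities(node, parent=None, grandparent=None):
    """Fill in node.prob for the whole subtree, top-down."""
    if parent is None:
        node.prob = 1.0                 # assumption: the root is always accessed
    elif grandparent is None:
        node.prob = parent.prob         # assumption: the root is always intersected
    else:
        # Pr(n) = Pr(p) * Vol(p) / Vol(g) under the point approximation.
        node.prob = parent.prob * parent.volume / grandparent.volume
    for child in node.children:
        assign_probabilities(child, node, parent)

# Hypothetical chain: root (vol 8) -> a (vol 4) -> c (vol 1) -> d (vol 0.5)
d = Node(0.5)
c = Node(1.0, [d])
a = Node(4.0, [c])
root = Node(8.0, [a])
assign_probabilities(root)
print(root.prob, a.prob, c.prob, d.prob)
```

Only the relative order of these values matters for the clustering decisions that follow, which is consistent with the slide's remark that an exact value is not required.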
Overview of Layout Algorithm
● Cache-oblivious layout computation
  ● Does not assume any particular cache block size
  ● Designed to work well with various (geometric) block sizes [Yoon and Lindstrom 06]
● Two main steps in recursion
  ● Cluster construction w/ parent-child locality
  ● Layout of clusters w/ spatial locality
Clustering
● Minimize the working set size during collision queries
● Maximize the sum of probabilities of nodes in a cluster
● NP-complete even for cache-aware layout given a search query [Gil and Itai 99]
Greedy Clustering
● Employ top-down greedy clustering
● Compute balanced-size clusters
● Maintain convexity [Gil and Itai 99]
[Figure: a cluster over tree nodes with probabilities 0.9, 0.5, 0.8, 0.1]
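A simplified sketch of top-down greedy clustering: starting from a root, repeatedly pull in the frontier node with the highest access probability until the cluster is full. Because a node can only enter after its parent, the cluster stays a connected subtree (one reading of "convexity" here). The function name, the dictionary-based tree, and the fixed `max_size` are assumptions for illustration; the balancing of cluster sizes mentioned on the slide is not modeled.

```python
import heapq

def greedy_cluster(root, children, prob, max_size):
    """Grow one cluster from root; return (cluster nodes, roots of remaining subtrees)."""
    cluster = []
    frontier = [(-prob[root], root)]    # max-heap via negated probabilities
    while frontier and len(cluster) < max_size:
        _, node = heapq.heappop(frontier)
        cluster.append(node)
        for child in children.get(node, []):
            heapq.heappush(frontier, (-prob[child], child))
    # Nodes left on the frontier become roots of the next clusters.
    remaining_roots = [node for _, node in sorted(frontier)]
    return cluster, remaining_roots

# Hypothetical tree reusing the probabilities shown in the figure.
children = {"r": ["a", "b"], "a": ["c", "d"]}
prob = {"r": 1.0, "a": 0.9, "b": 0.5, "c": 0.8, "d": 0.1}
cluster, rest = greedy_cluster("r", children, prob, max_size=3)
print(cluster, rest)
```

Applying the same function to each remaining root would yield the child clusters, which is where the recursion in the overall layout method would continue.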
Layout of Clusters
● Uses cache-oblivious layouts of meshes [Yoon et al. 05]
[Figure: spatial locality]
Results
● Collision detection
  ● Use oriented bounding boxes (OBBs) [Gottschalk et al. 96]
  ● Breadth-first tree traversal
● Ray tracing
  ● Use kd-tree [Wald 04]
  ● Depth-first tree traversal
Collision Detection – Robot and Power Plant Models
[Figure: robot model (20k triangles) and power plant model (1M triangles)]
Collision Detection – Performance Comparison I
● 41% ~ 500% performance improvement
[Figure: working set size (KB) and collision time (ms/100) for different layouts: COLBVH (our cache-oblivious layout), VEB (van Emde Boas layout), BFL (breadth-first layout), COML (cache-oblivious mesh layout), DFL (depth-first layout)]
Collision Detection – Performance Comparison II
● 35% ~ 2600% performance improvement
[Figure: working set size (KB) and collision time (ms/100) for different layouts]
Cache-Oblivious Layout vs Cache-Aware Layout
● Cache-aware layouts
  ● Take advantage of block size information (4KB)
● Minor performance degradation
  ● 8% compared to cache-aware layouts
Ray Tracing – Lucy Model
● 28 million triangles
● Pentium IV with 1GB
Ray Tracing – Performance Comparison
● 77% ~ 180% performance improvement
[Figure: render time (sec) and working set size (MB) for different layouts, including BFL (breadth-first layout)]
Major Differences over Other Layouts
● Commonly used layouts
  ● Consider connectivity of trees
● Two improvements of our layouts
  ● Probabilistic model based on geometry
  ● Layout method considering two different localities
Limitations
● No guarantee that our layout always improves the performance
● May not improve the performance of computationally intensive queries (e.g., exact penetration depth computation)
● Assumes that the collision algorithm does not use front tracking
Advantages
● Generality
  ● Works with any geometric hierarchy
  ● Does not require cache parameters
● Usability
  ● Can gain performance improvement without modifying code
  ● Replaces only data layouts
Conclusion
● Cache-efficient layouts of BVHs
  ● Probabilistic model
  ● Simple layout construction method
● Applied to collision detection and ray tracing
Ongoing and Future Work
● Extend to other proximity and LOD queries [Yoon et al. 06]
● Investigate other geometric hierarchies
● Improve the quality of hierarchies
● Apply to deforming models [Lauterbach et al. 06]
Acknowledgements
● Model contributors
● Funding agencies
  ● Army Research Office
  ● DARPA
  ● Intel
  ● Lawrence Livermore National Laboratory
  ● Microsoft
  ● National Science Foundation
  ● Office of Naval Research
  ● RDECOM
Acknowledgements
● Russ Gayle
● Ted Kim
● Ming Lin
● Peter Lindstrom
● Brandon Lloyd
● Valerio Pascucci
● Stephane Redon
● LLNL data analysis group members
● Anonymous reviewers
Questions?
Thanks!
UCRL-PRES-223220
This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-ENG-48.
Note: this talk is not supported or sanctioned by DoE, UC, LLNL, CASC
Additional slides
BVHs of Massive Models
● Complex and massive models
● High memory requirement
  ● Can have gigabyte data size
[Figures: Double eagle tanker (82M triangles); Isosurface (472M); St. Matthew (372M)]
Memory Hierarchies
Level          Size     Speed
Register       1KB      10^0 ns
Caches         1MB      10^1 ns
Main memory    1GB      10^2 ns
Disk storage   > 1GB    10^4 ns
Mesh Layouts
● Rendering sequences
  ● Triangle strips
  ● [Deering 95, Hoppe 99, Bogomjakov and Gotsman 02]
● Processing sequences
  ● [Isenburg and Gumhold 03, Isenburg and Lindstrom 04]
● Assume that the access pattern globally follows the layout order!
Mesh Layouts
● Cache-aware and cache-oblivious layouts of meshes and graphs [Yoon et al. 05, Yoon and Lindstrom 06]
● Require an input graph that represents access patterns on a BVH
Layouts of Search Trees
● Cache-aware layout of search trees [Gil and Itai 99]
● Cache-oblivious search tree layout [Alstrup et al. 03]
● Require the probability that each node is accessed
Layouts of BVHs
● Real-time collision detection book [Ericson 04]
● Layout analysis in ray tracing [Havran 97]
● Opcode [Terdiman 03]
  ● Uses blocking
● van Emde Boas layout [van Emde Boas 77]
  ● Uses recursive blocking
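To show what "recursive blocking" looks like, here is a minimal sketch of a van Emde Boas style ordering for a complete binary tree. It assumes heap-style numbering (root 1, children 2i and 2i + 1) and splits the height so that the top part gets the smaller half; both choices are assumptions for illustration, and other conventions for the split exist.

```python
def veb_layout(root, height):
    """Return heap indices of a complete binary subtree in van Emde Boas order."""
    if height == 1:
        return [root]
    top_h = height // 2            # height of the top subtree
    bottom_h = height - top_h      # height of each bottom subtree
    # Lay out the top subtree first, as one contiguous block.
    order = veb_layout(root, top_h)
    # Bottom subtrees are rooted at the descendants top_h levels below root.
    first = root << top_h
    for sub_root in range(first, first + (1 << top_h)):
        order.extend(veb_layout(sub_root, bottom_h))
    return order

print(veb_layout(1, 4))
```

Each recursive call places a whole subtree contiguously, so nodes that are close in the tree tend to share a block regardless of the actual block size.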