Afrigraph 2004 Interactive Ray-Tracing of Free-Form Surfaces Carsten Benthin Ingo Wald Philipp Slusallek Computer Graphics Lab Saarland University, Germany http://graphics.cs.uni-sb.de.

Afrigraph 2004
Interactive Ray-Tracing of
Free-Form Surfaces
Carsten Benthin Ingo Wald Philipp Slusallek
Computer Graphics Lab
Saarland University, Germany
http://graphics.cs.uni-sb.de
Motivation
Why ray-trace free-form surfaces?
Tessellating surfaces has disadvantages:
• Scene size complexity
• Accuracy
• Time complexity (preprocessing step)
• Workflow (Animation Tools, CAD)
Goal: directly ray-trace splines,
NURBS, subdivision surfaces, ...
Previous Approaches
Splines and NURBS
• Bezier Clipping
– Nishita, Campagna, Wang
• Newton Iteration
– Wang, Martin
Subdivision Surfaces
• Adaptive refinement
– Kobbelt, Mueller
→ Too slow for interactive use
Interactive Ray-Tracing (IRT)
OpenRT
• Wald, Benthin, Dietrich
• Interactive performance
even on a commodity PC
• Limited to triangles
Utah RTRT
• Shirley, Parker
• Supports even spline surfaces
• Needs a large parallel supercomputer
for interactive performance
Outline
Analysis of existing approaches (ray/primitive test)
Our simple and generic approach
Implementation: Bicubic Bezier, Loop Subdivision
Complex Scenes
Results
Conclusion & Future Work
Analysis I
Performance "killers" of current CPU architectures
• Memory latency → cache misses
• Long pipelines → branch mispredictions
• Complex control flow → low instruction-level parallelism
Code Optimization
• Ensure data locality & coherent memory access patterns
• Simple, streamlined control flow
• Data-level parallelism (SIMD), e.g. Intel's SSE instructions
Analysis II
Analytical/Iterative algorithms
• Algorithmic complexity
• Numerical problems
• Handling many special cases
• Complex control flow
Adaptive refinement algorithms
• Test for required refinement is costly
• Crack prevention → additional book-keeping data structures
• Complex control flow
Our Approach I
Simple and generic "divide and conquer"
Fixed number of refinement steps
Our Approach II
• Few core operations
– Refinement
– Pruning
– Final Intersection
• No refinement test
• No crack handling
• Constant memory costs
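The three core operations above can be sketched as a fixed-depth recursion. This is a minimal illustration, not the authors' implementation: the function `intersect` and its callback parameters `refine`, `prune`, and `final_hit` are hypothetical names, and the toy "primitive" is a 1D interval rather than a surface patch.

```python
def intersect(prim, steps, refine, prune, final_hit):
    """Fixed-depth divide and conquer: no refinement test, no crack handling.

    All names here are illustrative assumptions, not from the slides.
    Returns the smallest hit value found, or None if everything was pruned.
    """
    if prune(prim):                 # Pruning: reject parts the ray cannot hit
        return None
    if steps == 0:                  # Fixed number of steps reached
        return final_hit(prim)      # Final intersection on the small piece
    best = None
    for child in refine(prim):      # Refinement: split into smaller pieces
        hit = intersect(child, steps - 1, refine, prune, final_hit)
        if hit is not None and (best is None or hit < best):
            best = hit
    return best


# Toy stand-in: the primitive is an interval, the "ray" hits position 0.3.
target = 0.3
refine = lambda iv: [(iv[0], (iv[0] + iv[1]) / 2), ((iv[0] + iv[1]) / 2, iv[1])]
prune = lambda iv: not (iv[0] <= target <= iv[1])
final_hit = lambda iv: (iv[0] + iv[1]) / 2

result = intersect((0.0, 1.0), 10, refine, prune, final_hit)
```

Because the depth is fixed in advance, the recursion stack has a known maximum size, which is one way to read the "constant memory costs" bullet.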
Implementation for
Bicubic Bezier Surfaces
• 4x4 control points, suited for 4-way-SIMD
• Data layout optimized to maximize SIMD performance
– SoA (structure of arrays) instead of AoS (array of structures)
• Pruning: Ray as intersection of two planes (half-space criteria)
• Refinement: de Casteljau algorithm (alternating directions)
• Final Intersection: Triangulation + Ray/Triangle-Tests
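The pruning and refinement bullets above can be sketched in scalar code as follows. This is a hedged sketch rather than the SSE implementation from the talk: the helper names (`split_cubic`, `split_patch`, `prune`) are hypothetical, the patch is stored SoA-style as three separate 4x4 grids of x, y, and z coordinates, and the pruning test checks against the full line defined by the two planes rather than only the ray's forward half.

```python
def split_cubic(p0, p1, p2, p3):
    """de Casteljau split of one cubic Bezier coordinate row at t = 0.5."""
    a, b, c = (p0 + p1) / 2, (p1 + p2) / 2, (p2 + p3) / 2
    d, e = (a + b) / 2, (b + c) / 2
    m = (d + e) / 2
    return [p0, a, d, m], [m, e, c, p3]

def transpose(grid):
    return [list(col) for col in zip(*grid)]

def split_patch(patch, along_rows):
    """Split a patch (X, Y, Z grids) in one parametric direction."""
    first, second = [], []
    for grid in patch:
        g = grid if along_rows else transpose(grid)
        halves = [split_cubic(*row) for row in g]
        a = [h[0] for h in halves]
        b = [h[1] for h in halves]
        first.append(a if along_rows else transpose(a))
        second.append(b if along_rows else transpose(b))
    return tuple(first), tuple(second)

def plane_dists(patch, plane):
    """Signed distances of all 16 control points to plane (normal, offset)."""
    (nx, ny, nz), d = plane
    X, Y, Z = patch
    return [nx * X[i][j] + ny * Y[i][j] + nz * Z[i][j] - d
            for i in range(4) for j in range(4)]

def prune(patch, plane_a, plane_b):
    """True if all control points lie strictly on one side of either plane.

    The patch lies inside the convex hull of its control points, so in that
    case the line where the two planes meet cannot touch the patch.
    """
    for plane in (plane_a, plane_b):
        ds = plane_dists(patch, plane)
        if all(v > 0 for v in ds) or all(v < 0 for v in ds):
            return True
    return False

# Example: flat patch over [0,3] x [0,3]; ray along z through (0.5, 1.5).
X = [[0.0, 1.0, 2.0, 3.0] for _ in range(4)]
Y = [[float(i)] * 4 for i in range(4)]
Z = [[0.0] * 4 for _ in range(4)]
patch = (X, Y, Z)
plane_x = ((1.0, 0.0, 0.0), 0.5)   # plane x = 0.5
plane_y = ((0.0, 1.0, 0.0), 1.5)   # plane y = 1.5
left, right = split_patch(patch, True)
```

Storing each coordinate in its own grid keeps the 16 values of one coordinate contiguous, which is the property a 4-way SIMD version would exploit. Alternating `along_rows` between `True` and `False` on successive steps corresponds to the alternating-direction refinement.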
Implementation for
Loop Subdivision Surfaces
• AOS data layout
• Regular/Irregular Triangles
– One refinement step → max one irregular vertex
• Pruning: Two-plane criterion (1-neighborhood)
• Refinement: Loop subdivision → four new triangles
• Final Intersection: Triangle test
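For reference, the refinement step relies on the standard Loop subdivision rules for interior vertices, sketched below. The function names are hypothetical, boundary and crease rules are omitted, and the weight used is Loop's original formula; the slides do not state which variant the authors used.

```python
import math

def edge_point(a, b, c, d):
    """New vertex on interior edge (a, b); c and d are the two opposite vertices."""
    return tuple(0.375 * (a[k] + b[k]) + 0.125 * (c[k] + d[k]) for k in range(3))

def loop_beta(n):
    """Loop's original neighbor weight for a vertex of valence n."""
    return (0.625 - (0.375 + 0.25 * math.cos(2 * math.pi / n)) ** 2) / n

def vertex_point(v, ring):
    """Updated position of an existing vertex v given its 1-ring neighbors."""
    n = len(ring)
    beta = loop_beta(n)
    return tuple((1 - n * beta) * v[k] + beta * sum(p[k] for p in ring)
                 for k in range(3))

# A regular vertex has valence 6; any other valence is irregular.
hexagon = [(math.cos(i * math.pi / 3), math.sin(i * math.pi / 3), 0.0)
           for i in range(6)]
center = vertex_point((0.0, 0.0, 0.0), hexagon)
mid = edge_point((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, -1.0, 0.0))
```

Each triangle gets three new edge points, which together with the three updated corner vertices form the four child triangles.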
Core Performance (Cycles)
                      Bicubic Bezier    Loop Subdivision
Pruning               86                172/222
Refinement            168/244           405/600
Final Intersection    294-366           169
Subdivision core operations more complex → higher cost
Free-Form Objects
Reduce tested primitives/ray
• Use spatial acceleration structure (KD-Tree)
– Construct KD-Tree out of AABBs
– Use fast traversal from IRT
• Surface Area Heuristics → better KD-Tree
• Mailboxing
• Reduction: up to 92% (78%)
Dynamic Objects
• Fewer primitives → KD-Tree reconstruction every frame
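Two of the acceleration ideas above are easy to sketch. The first relies on the convex hull property: a Bezier patch lies inside the bounding box of its control points, so that box can be handed to the KD-tree builder. The second is mailboxing, which avoids testing the same primitive twice for one ray when the primitive overlaps several KD-tree leaves. The names `control_aabb` and `Mailbox` are hypothetical and not taken from OpenRT.

```python
def control_aabb(points):
    """Axis-aligned bounding box (min corner, max corner) of control points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


class Mailbox:
    """Remembers, per primitive, the id of the last ray tested against it."""

    def __init__(self, num_primitives):
        self.last_ray = [-1] * num_primitives

    def first_visit(self, prim_index, ray_id):
        """Return True only the first time this ray meets this primitive."""
        if self.last_ray[prim_index] == ray_id:
            return False
        self.last_ray[prim_index] = ray_id
        return True


box = control_aabb([(0, 0, 0), (1, 2, -1), (3, -1, 0.5)])
mailbox = Mailbox(2)
```

Since the costly ray/patch test runs only when `first_visit` returns `True`, mailboxing matters more here than for plain triangles.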
Results (Bezier Surfaces)
P4 CPU 2.2 GHz, 640x480, #tris = #primitives * 2^steps * 18
Results (Subdivision Surfaces)
P4 CPU 2.2 GHz, 640x480, #tris = #primitives * 4^steps
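The two equivalent-triangle formulas on the results slides can be evaluated directly. The function names are hypothetical. The factor 18 is presumably the 3x3 quads of a 4x4 control mesh at two triangles each, and the factor 2 per step matches a split in one direction per refinement step; both readings are interpretations, not statements from the slides.

```python
def bezier_equivalent_tris(primitives, steps):
    """#tris = #primitives * 2^steps * 18 (formula from the Bezier results slide)."""
    return primitives * 2 ** steps * 18


def loop_equivalent_tris(primitives, steps):
    """#tris = #primitives * 4^steps (formula from the subdivision results slide)."""
    return primitives * 4 ** steps


# Hypothetical scene sizes, chosen only to exercise the formulas.
bezier_count = bezier_equivalent_tris(32, 4)
loop_count = loop_equivalent_tris(100, 3)
```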
Video
Conclusion
Pros
• Simple and robust approach
• Interactive performance on single CPU
• Memory-efficient, compute-bound ray/primitive test
• Support for trimming curves
• Half the speed of the triangle-based approach
• Can easily be integrated into existing ray-tracers (OpenRT)
Cons
• Over-refinement
• Subdivision Surface implementation not optimal
Current & Future Work
• Key to performance: tracing ray bundles (e.g. 16 rays)
– Amortize core operation costs over rays
• Different variants for final intersection step
– Plücker triangle test
– Bilinear Patch
• Very fast KD-Tree reconstruction code
– Multithreading
• Higher-order Bezier patches or even direct NURBS support
• Improve implementation for Subdivision Surfaces
Questions ?