GPU Gems 3 - Graphically Rich Book
Table of Contents
Copyright
Foreword
Preface
Contributors
Part I: Geometry
Chapter 1. Generating Complex Procedural Terrains Using the GPU
Section 1.1. Introduction
Section 1.2. Marching Cubes and the Density Function
Section 1.3. An Overview of the Terrain Generation System
Section 1.4. Generating the Polygons Within a Block of Terrain
Section 1.5. Texturing and Shading
Section 1.6. Considerations for Real-World Applications
Section 1.7. Conclusion
Section 1.8. References
Chapter 2. Animated Crowd Rendering
Section 2.1. Motivation
Section 2.2. A Brief Review of Instancing
Section 2.3. Details of the Technique
Section 2.4. Other Considerations
Section 2.5. Conclusion
Section 2.6. References
Chapter 3. DirectX 10 Blend Shapes: Breaking the Limits
Section 3.1. Introduction
Section 3.2. How Does It Work?
Section 3.3. Running the Sample
Section 3.4. Performance
Section 3.5. References
Chapter 4. Next-Generation SpeedTree Rendering
Section 4.1. Introduction
Section 4.2. Silhouette Clipping
Section 4.3. Shadows
Section 4.4. Leaf Lighting
Section 4.5. High Dynamic Range and Antialiasing
Section 4.6. Alpha to Coverage
Section 4.7. Conclusion
Section 4.8. References
Chapter 5. Generic Adaptive Mesh Refinement
Section 5.1. Introduction
Section 5.2. Overview
Section 5.3. Adaptive Refinement Patterns
Section 5.4. Rendering Workflow
Section 5.5. Results
Section 5.6. Conclusion and Improvements
Section 5.7. References
Chapter 6. GPU-Generated Procedural Wind Animations for Trees
Section 6.1. Introduction
Section 6.2. Procedural Animations on the GPU
Section 6.3. A Phenomenological Approach
Section 6.4. The Simulation Step
Section 6.5. Rendering the Tree
Section 6.6. Analysis and Comparison
Section 6.7. Summary
Section 6.8. References
Chapter 7. Point-Based Visualization of Metaballs on a GPU
Section 7.1. Metaballs, Smoothed Particle Hydrodynamics, and Surface Particles
Section 7.2. Constraining Particles
Section 7.3. Local Particle Repulsion
Section 7.4. Global Particle Dispersion
Section 7.5. Performance
Section 7.6. Rendering
Section 7.7. Conclusion
Section 7.8. References
Part II: Light and Shadows
Chapter 8. Summed-Area Variance Shadow Maps
Section 8.1. Introduction
Section 8.2. Related Work
Section 8.3. Percentage-Closer Filtering
Section 8.4. Variance Shadow Maps
Section 8.5. Summed-Area Variance Shadow Maps
Section 8.6. Percentage-Closer Soft Shadows
Section 8.7. Conclusion
Section 8.8. References
Chapter 9. Interactive Cinematic Relighting with Global Illumination
Section 9.1. Introduction
Section 9.2. An Overview of the Algorithm
Section 9.3. Gather Samples
Section 9.4. One-Bounce Indirect Illumination
Section 9.5. Wavelets for Compression
Section 9.6. Adding Multiple Bounces
Section 9.7. Packing Sparse Matrix Data
Section 9.8. A GPU-Based Relighting Engine
Section 9.9. Results
Section 9.10. Conclusion
Section 9.11. References
Chapter 10. Parallel-Split Shadow Maps on Programmable GPUs
Section 10.1. Introduction
Section 10.2. The Algorithm
Section 10.3. Hardware-Specific Implementations
Section 10.4. Further Optimizations
Section 10.5. Results
Section 10.6. Conclusion
Section 10.7. References
Chapter 11. Efficient and Robust Shadow Volumes Using Hierarchical Occlusion Culling and Geometry Shaders
Section 11.1. Introduction
Section 11.2. An Overview of Shadow Volumes
Section 11.3. Our Implementation
Section 11.4. Conclusion
Section 11.5. References
Chapter 12. High-Quality Ambient Occlusion
Section 12.1. Review
Section 12.2. Problems
Section 12.3. A Robust Solution
Section 12.4. Results
Section 12.5. Performance
Section 12.6. Caveats
Section 12.7. Future Work
Section 12.8. References
Chapter 13. Volumetric Light Scattering as a Post-Process
Section 13.1. Introduction
Section 13.2. Crepuscular Rays
Section 13.3. Volumetric Light Scattering
Section 13.4. The Post-Process Pixel Shader
Section 13.5. Screen-Space Occlusion Methods
Section 13.6. Caveats
Section 13.7. The Demo
Section 13.8. Extensions
Section 13.9. Summary
Section 13.10. References
Part III: Rendering
Chapter 14. Advanced Techniques for Realistic Real-Time Skin Rendering
Section 14.1. The Appearance of Skin
Section 14.2. An Overview of the Skin-Rendering System
Section 14.3. Specular Surface Reflectance
Section 14.4. Scattering Theory
Section 14.5. Advanced Subsurface Scattering
Section 14.6. A Fast Bloom Filter
Section 14.7. Conclusion
Section 14.8. References
Chapter 15. Playable Universal Capture
Section 15.1. Introduction
Section 15.2. The Data Acquisition Pipeline
Section 15.3. Compression and Decompression of the Animated Textures
Section 15.4. Sequencing Performances
Section 15.5. Conclusion
Section 15.6. References
Chapter 16. Vegetation Procedural Animation and Shading in Crysis
Section 16.1. Procedural Animation
Section 16.2. Vegetation Shading
Section 16.3. Conclusion
Section 16.4. References
Chapter 17. Robust Multiple Specular Reflections and Refractions
Section 17.1. Introduction
Section 17.2. Tracing Secondary Rays
Section 17.3. Reflections and Refractions
Section 17.4. Results
Section 17.5. Conclusion
Section 17.6. References
Chapter 18. Relaxed Cone Stepping for Relief Mapping
Section 18.1. Introduction
Section 18.2. A Brief Review of Relief Mapping
Section 18.3. Cone Step Mapping
Section 18.4. Relaxed Cone Stepping
Section 18.5. Conclusion
Section 18.6. References
Chapter 19. Deferred Shading in Tabula Rasa
Section 19.1. Introduction
Section 19.2. Some Background
Section 19.3. Forward Shading Support
Section 19.4. Advanced Lighting Features
Section 19.5. Benefits of a Readable Depth and Normal Buffer
Section 19.6. Caveats
Section 19.7. Optimizations
Section 19.8. Issues
Section 19.9. Results
Section 19.10. Conclusion
Section 19.11. References
Chapter 20. GPU-Based Importance Sampling
Section 20.1. Introduction
Section 20.2. Rendering Formulation
Section 20.3. Quasirandom Low-Discrepancy Sequences
Section 20.4. Mipmap Filtered Samples
Section 20.5. Performance
Section 20.6. Conclusion
Section 20.7. Further Reading and References
Part IV: Image Effects
Chapter 21. True Impostors
Section 21.1. Introduction
Section 21.2. Algorithm and Implementation Details
Section 21.3. Results
Section 21.4. Conclusion
Section 21.5. References
Chapter 22. Baking Normal Maps on the GPU
Section 22.1. The Traditional Implementation
Section 22.2. Acceleration Structures
Section 22.3. Feeding the GPU
Section 22.4. Implementation
Section 22.5. Results
Section 22.6. Conclusion
Section 22.7. References
Chapter 23. High-Speed, Off-Screen Particles
Section 23.1. Motivation
Section 23.2. Off-Screen Rendering
Section 23.3. Downsampling Depth
Section 23.4. Depth Testing and Soft Particles
Section 23.5. Alpha Blending
Section 23.6. Mixed-Resolution Rendering
Section 23.7. Results
Section 23.8. Conclusion
Section 23.9. References
Chapter 24. The Importance of Being Linear
Section 24.1. Introduction
Section 24.2. Light, Displays, and Color Spaces
Section 24.3. The Symptoms
Section 24.4. The Cure
Section 24.5. Conclusion
Section 24.6. Further Reading
Chapter 25. Rendering Vector Art on the GPU
Section 25.1. Introduction
Section 25.2. Quadratic Splines
Section 25.3. Cubic Splines
Section 25.4. Triangulation
Section 25.5. Antialiasing
Section 25.6. Code
Section 25.7. Conclusion
Section 25.8. References
Chapter 26. Object Detection by Color: Using the GPU for Real-Time Video Image Processing
Section 26.1. Image Processing Abstracted
Section 26.2. Object Detection by Color
Section 26.3. Conclusion
Section 26.4. Further Reading
Chapter 27. Motion Blur as a Post-Processing Effect
Section 27.1. Introduction
Section 27.2. Extracting Object Positions from the Depth Buffer
Section 27.3. Performing the Motion Blur
Section 27.4. Handling Dynamic Objects
Section 27.5. Masking Off Objects
Section 27.6. Additional Work
Section 27.7. Conclusion
Section 27.8. References
Chapter 28. Practical Post-Process Depth of Field
Section 28.1. Introduction
Section 28.2. Related Work
Section 28.3. Depth of Field
Section 28.4. Evolution of the Algorithm
Section 28.5. The Complete Algorithm
Section 28.6. Conclusion
Section 28.7. Limitations and Future Work
Section 28.8. References
Part V: Physics Simulation
Chapter 29. Real-Time Rigid Body Simulation on GPUs
Section 29.1. Introduction
Section 29.2. Rigid Body Simulation on the GPU
Section 29.3. Applications
Section 29.4. Conclusion
Section 29.5. Appendix
Section 29.6. References
Chapter 30. Real-Time Simulation and Rendering of 3D Fluids
Section 30.1. Introduction
Section 30.2. Simulation
Section 30.3. Rendering
Section 30.4. Conclusion
Section 30.5. References
Chapter 31. Fast N-Body Simulation with CUDA
Section 31.1. Introduction
Section 31.2. All-Pairs N-Body Simulation
Section 31.3. A CUDA Implementation of the All-Pairs N-Body Algorithm
Section 31.4. Performance Results
Section 31.5. Previous Methods Using GPUs for N-Body Simulation
Section 31.6. Hierarchical N-Body Methods
Section 31.7. Conclusion
Section 31.8. References
Chapter 32. Broad-Phase Collision Detection with CUDA
Section 32.1. Broad-Phase Algorithms
Section 32.2. A CUDA Implementation of Spatial Subdivision
Section 32.3. Performance Results
Section 32.4. Conclusion
Section 32.5. References
Chapter 33. LCP Algorithms for Collision Detection Using CUDA
Section 33.1. Parallel Processing
Section 33.2. The Physics Pipeline
Section 33.3. Determining Contact Points
Section 33.4. Mathematical Optimization
Section 33.5. The Convex Distance Calculation
Section 33.6. The Parallel LCP Solution Using CUDA
Section 33.7. Results
Section 33.8. References
Chapter 34. Signed Distance Fields Using Single-Pass GPU Scan Conversion of Tetrahedra
Section 34.1. Introduction
Section 34.2. Leaking Artifacts in Scan Methods
Section 34.3. Our Tetrahedra GPU Scan Method
Section 34.4. Results
Section 34.5. Conclusion
Section 34.6. Future Work
Section 34.7. Further Reading
Section 34.8. References
Part VI: GPU Computing
Chapter 35. Fast Virus Signature Matching on the GPU
Section 35.1. Introduction
Section 35.2. Pattern Matching
Section 35.3. The GPU Implementation
Section 35.4. Results
Section 35.5. Conclusions and Future Work
Section 35.6. References
Chapter 36. AES Encryption and Decryption on the GPU
Section 36.1. New Functions for Integer Stream Processing
Section 36.2. An Overview of the AES Algorithm
Section 36.3. The AES Implementation on the GPU
Section 36.4. Performance
Section 36.5. Considerations for Parallelism
Section 36.6. Conclusion and Future Work
Section 36.7. References
Chapter 37. Efficient Random Number Generation and Application Using CUDA
Section 37.1. Monte Carlo Simulations
Section 37.2. Random Number Generators
Section 37.3. Example Applications
Section 37.4. Conclusion
Section 37.5. References
Chapter 38. Imaging Earth's Subsurface Using CUDA
Section 38.1. Introduction
Section 38.2. Seismic Data
Section 38.3. Seismic Processing
Section 38.4. The GPU Implementation
Section 38.5. Performance
Section 38.6. Conclusion
Section 38.7. References
Chapter 39. Parallel Prefix Sum (Scan) with CUDA
Section 39.1. Introduction
Section 39.2. Implementation
Section 39.3. Applications of Scan
Section 39.4. Conclusion
Section 39.5. References
Chapter 40. Incremental Computation of the Gaussian
Section 40.1. Introduction and Related Work
Section 40.2. Polynomial Forward Differencing
Section 40.3. The Incremental Gaussian Algorithm
Section 40.4. Error Analysis
Section 40.5. Performance
Section 40.6. Conclusion
Section 40.7. References
Chapter 41. Using the Geometry Shader for Compact and Variable-Length GPU Feedback
Section 41.1. Introduction
Section 41.2. Why Use the Geometry Shader?
Section 41.3. Dynamic Output with the Geometry Shader
Section 41.4. Algorithms and Applications
Section 41.5. Benefits: GPU Locality and SLI
Section 41.6. Performance and Limits
Section 41.7. Conclusion
Section 41.8. References
Addison-Wesley Warranty on the DVD
Addison-Wesley Warranty on the DVD
NVIDIA Statement on the Software
DVD System Requirements
Inside Back Cover
Geometry
Light and Shadows
Rendering
Image Effects
Physics Simulation
GPU Computing
Index
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Z
GPU Gems 3
by Hubert Nguyen
Publisher: Addison Wesley Professional
Pub Date: August 02, 2007
Print ISBN-10: 0-321-51526-9
Print ISBN-13: 978-0-321-51526-1
eText ISBN-10: 0-321-54542-7
eText ISBN-13: 978-0-321-54542-8
Pages: 1008
Overview
"The GPU Gems series features a collection of the most essential algorithms required by Next-Generation 3D Engines."
—Martin Mittring, Lead Graphics Programmer, Crytek
This third volume of the best-selling GPU Gems series provides a snapshot of today's latest Graphics Processing Unit (GPU) programming techniques. The programmability of modern GPUs allows developers to not only distinguish themselves from one another but also to use this awesome processing power for non-graphics applications, such as physics simulation, financial analysis, and even virus detection—particularly with the CUDA architecture. Graphics remains the leading application for GPUs, and readers will find that the latest algorithms create ultra-realistic characters, better lighting, and post-rendering compositing effects.
Major topics include
Geometry
Light and Shadows
Rendering
Image Effects
Physics Simulation
GPU Computing
Contributors are from the following corporations and universities:
3Dfacto
Adobe Systems
Apple
Budapest University of Technology and Economics
CGGVeritas
The Chinese University of Hong Kong
Cornell University
Crytek
Czech Technical University in Prague
Dartmouth College
Digital Illusions Creative Entertainment
Eindhoven University of Technology
Electronic Arts
Havok
Helsinki University of Technology
Imperial College London
Infinity Ward
Juniper Networks
LaBRI–INRIA, University of Bordeaux
mental images
Microsoft Research
Move Interactive
NCsoft Corporation
NVIDIA Corporation
Perpetual Entertainment
Playlogic Game Factory
Polytime
Rainbow Studios
SEGA Corporation
UFRGS (Brazil)
Ulm University
University of California, Davis
University of Central Florida
University of Copenhagen
University of Girona
University of Illinois at Urbana-Champaign
University of North Carolina Chapel Hill
University of Tokyo
University of Waterloo
Section Editors include NVIDIA engineers: Cyril Zeller, Evan Hart, Ignacio Castaño, Kevin Bjorke, Kevin Myers, and Nolan Goodnight.
The accompanying DVD includes complementary examples and sample programs.
Copyright
About the Cover: The image on the cover has been rendered in real time in the "Human Head" technology demonstration created by the NVIDIA Demo Team. It illustrates the extreme level of realism achievable with the GeForce 8 Series of GPUs. The demo renders skin by using a physically based model that was previously used only in high-profile prerendered movie projects. Actor Doug Jones is the model represented in the demo. He recently starred as the Silver Surfer in Fantastic Four: Rise of the Silver Surfer.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.
GeForce™, CUDA™, and NVIDIA Quadro® are trademarks or registered trademarks of NVIDIA Corporation.
The authors and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.
NVIDIA makes no warranty or representation that the techniques described herein are free from any Intellectual Property claims. The reader assumes all risk of any such claims based on his or her use of these techniques.
The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact:
U.S. Corporate and Government Sales
(800) 382-3419
For sales outside of the United States, please contact:
International Sales
Visit us on the Web: www.awprofessional.com
GPU gems 3 / edited by Hubert Nguyen.
p. cm.
Includes bibliographical references and index.
ISBN-13: 978-0-321-51526-1 (hardback : alk. paper)
ISBN-10: 0-321-51526-9
1. Computer graphics. 2. Real-time programming. I. Nguyen, Hubert.
T385.G6882 2007 006.6'6—dc22
2007023985
Copyright © 2008 NVIDIA Corporation
All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, write to:
Pearson Education, Inc.
Rights and Contracts Department
501 Boylston Street, Suite 900
Boston, MA 02116
Fax: (617) 671-3447
ISBN-13: 978-0-321-51526-1
Text printed in the United States on recycled paper at Courier in Kendallville, Indiana.
Foreword
Composition, the organization of elemental operations into a nonobvious whole, is the essence of imperative programming. The instruction set architecture (ISA) of a microprocessor is a versatile composition interface, which programmers of software renderers have used effectively and creatively in their quest for image realism. Early graphics hardware increased rendering performance, but often at a high cost in composability, and thus in programmability and application innovation. Hardware with microprocessor-like programmability did evolve (for example, the Ikonas Graphics System), but the dominant form of graphics hardware acceleration has been organized around a fixed sequence of rendering operations, often referred to as the graphics pipeline. Early interfaces to these systems, such as CORE and later PHIGS, allowed programmers to specify rendering results, but they were not designed for composition.
OpenGL, which I helped to evolve from its Silicon Graphics-defined predecessor IRIS GL in the early 1990s, addressed the need for composability by specifying an architecture (informally called the OpenGL Machine) that was accessed through an imperative programmatic interface. Many features (for example, tightly specified semantics; table-driven operations such as stencil and depth-buffer functions; texture mapping exposed as a general 1D, 2D, and 3D lookup function; and required repeatability properties) ensured that programmers could compose OpenGL operations with powerful and reliable results. Some of the useful techniques that OpenGL enabled include texture-based volume rendering, shadow volumes using stencil buffers, and constructive solid geometry algorithms such as capping (the computation of surface planes at the intersections of clipping planes and solid objects defined by polygons). Ultimately, Mark Peercy and the coauthors of the SIGGRAPH 2000 paper "Interactive Multi-Pass Programmable Shading" demonstrated that arbitrary RenderMan shaders could be accelerated through the composition of OpenGL rendering operations.
During this decade, increases in the raw capability of integrated circuit technology allowed the OpenGL architecture (and later, Direct3D) to be extended to expose an ISA interface. These extensions appeared as programmable vertex and fragment shaders within the graphics pipeline and now, with the introduction of CUDA, as a data-parallel ISA in near parity with that of the microprocessor. Although the cycle toward complete microprocessor-like versatility is not complete, the tremendous power of graphics hardware acceleration is more accessible than ever to programmers.
And what computational power it is! At this writing, the NVIDIA GeForce 8800 Ultra performs over 400 billion floating-point operations per second, more than the most powerful supercomputer available a decade ago, and five times more than today's most powerful microprocessor. The data-parallel programming model the Ultra supports allows its computational power to be harnessed without concern for the number of processors employed. This is critical, because while today's Ultra already includes over 100 processors, tomorrow's will include thousands, and then more. With no end in sight to the annual compounding of integrated circuit density known as Moore's Law, massively parallel systems are clearly the future of computing, with graphics hardware leading the way.
GPU Gems 3 is a collection of state-of-the-art GPU programming examples. It
is about putting data-parallel processing to work. The first four sections focus on graphics-specific applications of GPUs in the areas of geometry, lighting and shadows, rendering, and image effects. Topics in the fifth and sixth sections broaden the scope by providing concrete examples of nongraphical applications that can now be addressed with data-parallel GPU technology. These
applications are diverse, ranging from rigid-body simulation to fluid flow simulation, from virus signature matching to encryption and decryption, and from random number generation to computation of the Gaussian.
Where is this all leading? The cover art reminds us that the mind remains the most capable parallel computing system of all. A long-term goal of computer science is to achieve and, ultimately, to surpass the capabilities of the human mind. It's exciting to think that the computer graphics community, as we
identify, address, and master the challenges of massively parallel computing, is contributing to the realization of this dream.
Preface
It has been only three years since the first GPU Gems book was introduced, and some areas of real-time graphics have truly become ultrarealistic. Chapter 14, "Advanced Techniques for Realistic Real-Time Skin Rendering," illustrates this evolution beautifully, describing a skin rendering technique that works so well that the data acquisition and animation will become the most challenging problem in rendering human characters for the next couple of years.
All this progress has been fueled by a sustained rhythm of GPU innovation. These processing units continue to become faster and more flexible in their use. Today's GPUs can process enormous amounts of data and are used not only for rendering 3D scenes, but also for processing images or performing massively parallel computing, such as financial statistics or terrain analysis for finding new oil fields.
Whether they are used for computing or graphics, GPUs need a software interface to drive them, and we are in the midst of an important transition. The new generation of APIs brings additional orthogonality and exposes new capabilities such as generating geometry programmatically. On the computing side, the CUDA architecture lets developers use a C-like language to perform computing tasks rather than forcing the programmer to use the graphics
pipeline. This architecture will allow developers without a graphics background to tap into the immense potential of the GPU.
More than 200 chapters were submitted by the GPU programming community, covering a large spectrum of GPU usage ranging from pure 3D rendering to nongraphics applications. Each of them went through a rigorous review process conducted both by NVIDIA's engineers and by external reviewers.
We were able to include 41 chapters, each of which went through another review, during which feedback from the editors and peer reviewers often significantly improved the content. Unfortunately, we could not include some excellent chapters, simply due to the space restriction of the book. It was difficult to establish the final table of contents, but we would like to thank everyone who sent a submission.
Intended Audience
For the graphics-related chapters, we expect the reader to be familiar with the fundamentals of computer graphics including graphics APIs such as DirectX and OpenGL, as well as their associated high-level programming languages,
will find in this book a wealth of applicable techniques for today's and tomorrow's GPUs.
Readers interested in computing and CUDA will benefit from familiarity with parallel computing concepts. C programming knowledge is also expected.
Trying the Code Samples
GPU Gems 3 comes with a disc that includes samples, movies, and other
demonstrations of the techniques described in this book. You can also go to the book's Web page to find the latest updates and supplemental materials:
developer.nvidia.com/gpugems3.
Acknowledgments
This book represents the dedication of many people—especially the numerous authors who submitted their most recent work to the GPU community by
contributing to this book. Without a doubt, these inspirational and powerful chapters will help thousands of developers push the envelope in their
applications.
Our section editors—Cyril Zeller, Evan Hart, Ignacio Castaño Aguado, Kevin Bjorke, Kevin Myers, and Nolan Goodnight—took on an invaluable role,
providing authors with feedback and guidance to make the chapters as good as they could be. Without their expertise and contributions above and beyond their usual workload, this book could not have been published.
Ensuring the clarity of GPU Gems 3 required numerous diagrams, illustrations, and screen shots. A lot of diligence went into unifying the graphic style of
about 500 figures, and we thank Michael Fornalski and Jim Reed for their wonderful work on these. We are grateful to Huey Nguyen and his team for their support for many of our projects. We also thank Rory Loeb for his contribution to the amazing book cover design and many other graphic elements of the book.
We would also like to thank Catherine Kilkenny and Teresa Saffaie for tremendous help with copyediting as chapters were being worked on. Randy Fernando, the editor of the previous GPU Gems books, shared his wealth of experience acquired in producing those volumes.
We are grateful to Kurt Akeley for writing our insightful and forward-looking foreword.
managed this project to completion before handing the marketing aspect to Curt Johnson. Christopher Keane did fantastic work on the copyediting and typesetting.
The support from many executive staff members from NVIDIA was critical to this endeavor: Tony Tamasi and Dan Vivoli continually value the creation of educational material and provided the resources necessary to accomplish this project.
We are grateful to Jen-Hsun Huang for his continued support of the GPU Gems series and for creating an environment that encourages innovation and
teamwork.
We also thank everyone at NVIDIA for their support and for continually
building the technology that changes the way people think about computing.
Contributors
Thomas Alexander, Polytime
Thomas Alexander cofounded Exapath, a startup focused on mapping
networking algorithms onto GPGPUs. Previously he was at Juniper Networks working in the Infrastructure Product Group building core routers. Thomas has a Ph.D. in electrical engineering from Duke University, where he also worked on a custom-built parallel machine for ray casting.
Kavita Bala, Cornell University
Kavita Bala is an assistant professor in the Computer Science Department and Program of Computer Graphics at Cornell University. Bala specializes in
scalable rendering for high-complexity illumination, interactive global
illumination, perceptually based rendering, and image-based texturing. Bala has published research papers and served on the program committees of several conferences, including SIGGRAPH. In 2005, Bala cochaired the Eurographics Symposium on Rendering. She has coauthored the graduate-level textbook Advanced Global Illumination, 2nd ed. (A K Peters, 2006). Before Cornell, Bala received her S.M. and Ph.D. from the Massachusetts Institute of Technology, and her B.Tech. from the Indian Institute of
Technology Bombay.
Kevin Bjorke, NVIDIA Corporation
Kevin Bjorke is a member of the Technology Evangelism group at NVIDIA, and continues his roles as editor and contributor from the previous volumes of GPU
Gems. He has a broad background in production of both live-action and
games. Kevin has been a regular speaker at events such as SIGGRAPH and GDC since the mid-1980s. His current work focuses on applying NVIDIA's horsepower and expertise to help developers fulfill their individual ambitions.
Jean-Yves Blanc, CGGVeritas
Jean-Yves Blanc received a Ph.D. in applied mathematics in 1991 from the Institut National Polytechnique de Grenoble, France. He joined CGG in 1992, where he introduced and developed parallel processing for high-performance computing seismic applications. He is now in charge of IT strategy for the Processing and Reservoir product line.
Jim Blinn, Microsoft Research
Jim Blinn began doing computer graphics in 1968 while an undergraduate at the University of Michigan. In 1974 he became a graduate student at the University of Utah, where he did research in specular lighting models, bump mapping, and environment/reflection mapping and received a Ph.D. in 1977. He then went to JPL and produced computer graphics animations for various space missions to Jupiter, Saturn, and Uranus, as well as for Carl Sagan's PBS series "Cosmos" and for the Annenberg/CPB-funded project "The Mechanical Universe," a 52-part telecourse to teach college-level physics. During these productions he developed several other techniques, including work in cloud simulation, displacement mapping, and a modeling scheme variously called
blobbies or metaballs. Since 1987 he has written a regular column in the IEEE Computer Graphics and Applications journal, where he describes mathematical
techniques used in computer graphics. He has just published his third volume of collected articles from this series. In 1995 he joined Microsoft Research as a Graphics Fellow. He is a MacArthur Fellow, a member of the National Academy of Engineering, has an honorary Doctor of Fine Arts degree from Otis Parsons School of Design, and has received both the SIGGRAPH Computer Graphics Achievement Award (1983) and the Steven A. Coons Award (1999).
George Borshukov, Electronic Arts
George Borshukov is a CG supervisor at Electronic Arts. He holds an M.S. from the University of California, Berkeley, where he was one of the creators of The
Campanile Movie and real-time demo (1997). He was technical designer for the
"bullet time" sequences in The Matrix (1999) and received an Academy Scientific and Technical Achievement Award for the image-based rendering technology used in the film. Borshukov led the development of photoreal digital actors for The Matrix sequels (2003) and received a Visual Effects
Society Award for the design and application of the Universal Capture system in those films. Other film credits include What Dreams May Come (1998),
Mission: Impossible 2 (2000), and Michael Jordan to the Max (2000). He is also
a co-inventor of the UV pelting approach for parameterization and seamless texturing of polygonal or subdivision surfaces. He joined Electronic Arts in 2004 to focus on setting a new standard for facial capture, animation, and rendering in next-generation interactive entertainment. He conceived the
Fight Night Round 3 concept and the Tiger Woods tech demos presented at
Sony's E3 events in 2005 and 2006.
Tamy Boubekeur, LaBRI–INRIA, University of Bordeaux
Tamy Boubekeur is a third-year Ph.D. student in computer science at INRIA in Bordeaux, France. He received an M.Sc. in computer science from the
University of Bordeaux in 2004. His current research focuses on 3D geometry processing and real-time rendering. He has developed new algorithms and data structures for the 3D acquisition pipeline, publishing several scientific papers in the fields of efficient processing and interactive editing of large 3D objects, hierarchical space subdivision structures, point-based graphics, and real-time surface refinement methods. He also teaches geometric modeling and virtual reality at the University of Bordeaux.
Ralph Brunner, Apple
Ralph Brunner graduated from the Swiss Federal Institute of Technology (ETH) Zürich with an M.Sc. degree in computer science. He left the country after the bear infestation made the major cities uninhabitable and has been working in California on the graphics stack of Mac OS X since then.
Iain Cantlay, NVIDIA Corporation
Iain started his career in flight simulation, when 250 polys per frame was state of the art. With the advent of consumer-level 3D hardware, he moved to
writing game engines, with published titles including Machines and MotoGP 3. In 2005 he moved to the Developer Technology group at NVIDIA, which is the perfect place to combine his passions for games and 3D graphics.
Ignacio Castaño Aguado, NVIDIA Corporation
Ignacio Castaño Aguado is an engineer in the Developer Technology group at NVIDIA. When not playing Go against his coworkers or hiking across the Santa Cruz Mountains with his son, Ignacio spends his time solving computer
graphics problems that fascinate him and helping developers take advantage of the latest GPU technology. Before joining NVIDIA, Ignacio worked for several game companies, including Crytek, Relic Entertainment, and Oddworld
Inhabitants.
Mark Colbert, University of Central Florida
Mark Colbert is a Ph.D. student at the University of Central Florida working in the Media Convergence Lab. He received both his B.S. and his M.S. in
computer science from the University of Central Florida in 2004 and 2006. His current research focuses on user interfaces for interactive material and
lighting design.
Keenan Crane, University of Illinois
Keenan recently completed a B.S. in computer science at the University of Illinois at Urbana-Champaign, where he did research on GPU algorithms, mesh parameterization, and motion capture. As an intern on the NVIDIA Demo
Team, he worked on the "Mad Mod Mike" and "Smoke in a Box" demos. His foray into graphics programming took place in 1991 at Nishimachi
International School in Tokyo, Japan, where he studied the nuances of the
LogoWriter turtle language. This summer he will travel to Kampala, Uganda, to participate in a service project through Volunteers for Peace.
Eugene d'Eon, NVIDIA Corporation
Eugene d'Eon has been writing demos at NVIDIA since 2000, when he first joined the team as an intern, spending three months modeling, rigging, and rotoscoping the short film "Luxo Jr." for a real-time demo that was only shown once. After quickly switching to a more forgiving programming position, he has since been employing the most mathematical, overly sophisticated models
available to solve the simplest of shading and simulation problems in NVIDIA's real-time demos. He constantly struggles between writing a physically correct shader and just settling for what "looks good." Eugene received an Honours B.Math. from the University of Waterloo, applied mathematics and computer science double major, and is occasionally known for his musical abilities (piano and Guitar Hero) and ability to juggle "Eric's Extension." Research interests include light transport, scattering, reflectance models, skin shading, theoretical physics, and mathematical logic. He never drives faster than c, and unlike
most particles in the universe, neither his position nor his momentum can be known with any certainty. He never votes for someone who doesn't have a
clear stance on the Axiom of Choice. Eugene uses Elixir guitar strings.
Bernard Deschizeaux, CGGVeritas
Bernard Deschizeaux received a master's degree in high energy physics in 1988 and a Ph.D. in particle physics in 1991. Since then he has worked for CGG, a French service company for the oil and gas industry, where he applies his high-performance computing skills and physics knowledge to solve seismic processing challenges. His positions within CGG have varied from development to high-performance computing and algorithm research. He is now in charge of a GPGPU project developing an industrial solution based on GPU clusters.
Franck Diard, NVIDIA Corporation
Franck Diard is a senior software architect at NVIDIA. He received a Ph.D. in computer science from the University of Nice Sophia Antipolis (France) in 1998. Starting with vector balls and copper lists on Amiga in the late 1980s, he then programmed on UNIX for a decade with Reyes rendering, ray tracing, and computer vision before transitioning to Windows kernel drivers at NVIDIA. His interests have always been around scalability (programming multi-core, multi-GPU render farms) applied to image processing and graphics rendering. His main contribution to NVIDIA has been the SLI technology.
Frank Doepke, Apple
After discovering that one can make more people's lives miserable by writing buggy software than becoming a tax collector, Frank Doepke decided to
become a software developer. Realizing that evil coding was wrong, he set sail from Germany to the New World and has since been tracking graphic gems at Apple.
Henrik Dohlmann, 3Dfacto R&D
From 1999 to 2002, Henrik Dohlmann worked as a research assistant in the Image Group at the Department of Computer Science, University of
Copenhagen, from which he later received his Cand. Scient. degree in
computer science. Next, he took part in an industrial collaboration between the 3D-Lab at Copenhagen University's School of Dentistry and Image House. He moved to 3Dfacto R&D in 2005, where he now works as a software engineer.
Bryan Dudash, NVIDIA Corporation
Bryan entered the games industry in 1997, working for various companies in Seattle, including Sierra Online and Escape Factory. He has a master's degree from the University of Washington. In 2003 he joined NVIDIA and began
teaching (and learning) high-end, real-time computer graphics. Having studied Japanese since 2000, Bryan convinced NVIDIA in 2004 to move him to Tokyo, where he has been supporting APAC developers ever since. If you are ever in Tokyo, give him a ring.
Kenny Erleben, University of Copenhagen
In 2001 Kenny Erleben received his Cand. Scient. degree in computer science from the Department of Computer Science, University of Copenhagen. He then worked as a fulltime researcher at 3Dfacto A/S before beginning his Ph.D.
studies later in 2001. In 2004 he spent three months at the Department of Mathematics, University of Iowa. He received his Ph.D. in 2005 and soon thereafter was appointed assistant professor at the Department of Computer Science, University of Copenhagen.
Ryan Geiss, NVIDIA Corporation
Ryan has been a pioneer in music visualization for many years. While working at Nullsoft, he wrote many plug-ins for Winamp, most notably the popular
MilkDrop visualizer. More recently, he spent several years as a member of the NVIDIA Demo Team, creating the "GeoForms" and "Cascades" demos and doing other GPU research projects.
Nolan Goodnight, NVIDIA Corporation
Nolan Goodnight is a software engineer at NVIDIA. He works in the CUDA software group doing application and driver development. Before joining
NVIDIA he was a member of the computer graphics group at the University of Virginia, where he did research in GPU algorithms and approximation methods for rendering with precomputed light transport. Nolan's interest in the
fundamentals of computer graphics grew out of his work in geometric modeling for industrial design. He holds a bachelor's degree in physics and a master's degree in computer science.
Larry Gritz, NVIDIA Corporation
Larry Gritz is director and chief architect of NVIDIA's Gelato software, a
hardware-accelerated film-quality renderer. Prior graphics work includes being the author of BMRT; cofounder and vice president of Exluna, Inc. (later
acquired by NVIDIA), and lead developer of their Entropy renderer; head of Pixar's rendering research group; a main contributor to PhotoRealistic
RenderMan; coauthor of the book Advanced RenderMan: Creating CGI for
Motion Pictures; and occasional technical director on several films and
commercials. Larry has a B.S. from Cornell University and an M.S. and Ph.D. from The George Washington University.
John Hable, Electronic Arts
John Hable is a rendering engineer at Electronic Arts. He graduated from Georgia Tech with a B.S. and M.S. in computer science, where he solved the problem of reducing the rendering time of Boolean combinations of triangle meshes from exponential to quadratic time. His recent work focuses on the compression problems raised by trying to render high-quality facial animation in computer games. Currently he is working on a new EA title in Los Angeles.
Earl Hammon, Jr., Infinity Ward
Earl Hammon, Jr., is a lead software engineer at Infinity Ward, where he assisted a team of talented developers to create the multiplatinum and
critically acclaimed titles Call of Duty 2 and Call of Duty. He worked on Medal of
Honor: Allied Assault prior to becoming a founding member of Infinity Ward.
He graduated from Stanford University with an M.S. in electrical engineering, preceded by a B.S.E.E. from the University of Tulsa. His current project is Call
of Duty 4: Modern Warfare.
Takahiro Harada, University of Tokyo
Takahiro Harada is an associate professor at the University of Tokyo. He received an M.S. in engineering from the University of Tokyo in 2006. His current research interests include physically based simulation, real-time simulation, and general-purpose GPU computation.
Mark Harris, NVIDIA Corporation
Mark Harris is a member of the Developer Technology team at NVIDIA in
London, working with software developers all over the world to push the latest in GPU technology for graphics and high-performance computing. His primary research interests include parallel computing, general-purpose computation on GPUs, and physically based simulation. Mark earned his Ph.D. in computer science from the University of North Carolina at Chapel Hill in 2003 and his B.S. from the University of Notre Dame in 1998. Mark founded and maintains
www.GPGPU.org, a Web site dedicated to general-purpose computation on GPUs.
Evan Hart, NVIDIA Corporation
Evan Hart is a software engineer in the Developer Technology group at NVIDIA. Evan got his start in real-time 3D in 1997 working with visual
simulations. Since graduating from The Ohio State University in 1998, he has worked to develop and improve techniques for real-time rendering, having his hands in everything from games to CAD programs, with a bit of drivers on the side. Evan is a frequent speaker at GDC and he has contributed to chapters in the Game Programming Gems and ShaderX series of books.
Miloš Hašan, Cornell University
Miloš Hašan graduated with a degree in computer science from Comenius University in Bratislava, Slovakia. Currently he is a Ph.D. student in the Computer Science Department at Cornell University. His research interests include global illumination, GPU rendering, and numerical computations.
Jared Hoberock, University of Illinois at Urbana-Champaign
Jared Hoberock is a graduate student at the University of Illinois at Urbana-Champaign. He has worked two summers at NVIDIA as an intern and is a two-time recipient of the NVIDIA Graduate Fellowship. He enjoys spending time writing rendering software.
Lee Howes, Imperial College London
Lee Howes graduated with an M.Eng. in computing from Imperial College London in 2005 and is currently working toward a Ph.D. at Imperial. Lee's research relates to computing with FPGAs and GPUs and has included work with FFTs and financial simulation. As a distraction from education and to
dabble in the realms of reality, Lee has worked briefly with Philips and NVIDIA.
Yuntao Jia, University of Illinois at Urbana-Champaign
Yuntao Jia is currently pursuing a Ph.D. in computer science at the University of Illinois at Urbana-Champaign. He is very interested in computer graphics, and his current research interests include realistic rendering (especially on the GPU), video and image processing, and graph visualizations.
Alexander Keller, Ulm University
Alexander Keller studied computer science at the University of Kaiserslautern from 1988 to 1993. He then joined the Numerical Algorithms Group at the same university and defended his Ph.D. thesis on Friday, the 13th of June,
1997. In 1998 he was appointed scientific advisor of mental images. Among four calls in 2003, he chose to become a full professor for computer graphics at the University of Ulm in Germany. His research interests include quasi-Monte Carlo methods, photorealistic image synthesis, ray tracing, and scientific
computing. His 1997 SIGGRAPH paper "Instant Radiosity" can be considered one of the roots of GPGPU computing.
Alexander Kharlamov, NVIDIA Corporation
Alex is an undergraduate in the Department of Computational Mathematics and Cybernetics at the Moscow State University. He became interested in video games at the age of ten and decided that nothing else interested him that much. Currently he works as a member of NVIDIA's Developer Technology team implementing new techniques and effects for games and general-purpose computation on GPUs.
Peter Kipfer, Havok
Peter Kipfer is a software engineer at Havok, where he works as part of the Havok FX team that is pioneering work in large-scale real-time physics
simulation in highly parallel environments, such as multi-core CPUs or GPUs. He received his Ph.D. in computer science from the University of Erlangen-Nürnberg in 2003 for his work in the KONWIHR supercomputing project. He also worked as a postdoctoral researcher at the Technische Universität
München, focusing on general-purpose computing and geometry processing on the GPU.
Rusty Koonce, NCsoft Corporation
in physics. He has worked on multiple shipped video game titles across a wide range of platforms, including console, PC, and Mac. Computer graphics has held his interest since his first computer, a TRS-80. Today he calls Austin, Texas, home, where he enjoys doing his part to "Keep Austin Weird."
Kees van Kooten, Playlogic Game Factory
Kees van Kooten is a software developer for Playlogic Game Factory. In 2006 he graduated summa cum laude with a master's degree from the Eindhoven
University of Technology. The result of his master's project can be found in this book. His interests are closely related to the topics of his master's research: 3D graphics and real-time simulations. After working hours, Kees can often be found playing drums with "real" musicians.
Jaroslav Křivánek, Czech Technical University in Prague
Jaroslav Křivánek is an assistant professor at the Czech Technical University in Prague. He received his Ph.D. from IRISA/INRIA Rennes and the Czech
Technical University (joint degree) in 2005. In 2003 and 2004 he was a
research associate at the University of Central Florida. He received a master's in computer science from the Czech Technical University in Prague in 2001.
Bunny Laden, Apple
Bunny Laden graduated from the University of Washington with a Special Individual Ph.D. in cognitive science and music in 1989. She joined Apple in 1997, where she now writes documentation for Quartz, Core Image, Quartz Composer, and other Mac OS X technologies. She coauthored Programming
with Quartz (Morgan Kaufmann, 2006) and Learning Carbon (O'Reilly, 2001).
musical acoustics, and other assorted topics.
Andrew Lauritzen, University of Waterloo
Andrew Lauritzen recently received his B.Math. in computer science and is now completing a master's degree in computer graphics at the University of
Waterloo. To date, he has completed a variety of research in graphics, as well as theoretical physics. His current research interests include lighting and
shadowing algorithms, deferred rendering, and graphics engine design. Andrew is also a developer at RapidMind, where he works with GPUs and other high-performance parallel computers.
Scott Le Grand, NVIDIA Corporation
Scott is a senior engineer on the CUDA software team at NVIDIA. His previous commercial projects include the game BattleSphere for the Atari Jaguar;
Genesis, the first molecular modeling system for home computers, for the Atari ST; and Folderol, the first distributed computing project targeted at the protein folding problem. Scott has been writing video games since 1971, when he
played a Star Trek game on a mainframe and was instantly hooked. In a former life, he picked up a B.S. in biology from Siena College and a Ph.D. in biochemistry from The Pennsylvania State University. In addition, he wrote a chapter for ShaderX and coedited a book on computational methods of protein structure prediction.
Ignacio Llamas, NVIDIA Corporation
Ignacio Llamas is a software engineer in NVIDIA's Developer Technology group. Before joining NVIDIA, Ignacio was a Ph.D. student at Georgia Tech's College of Computing, where he did research on several topics within computer
graphics. In addition to the exciting work he does at NVIDIA, he also enjoys snowboarding.
Charles Loop, Microsoft Research
Charles Loop works for Microsoft Research in Redmond, Washington. He received an M.S. in mathematics from the University of Utah in 1987 and a Ph.D. in computer science from the University of Washington in 1992. His
graphics research has focused primarily on the representation and rendering of smooth free-form shapes, including subdivision surfaces, polynomial splines and patches, and algebraic curves and surfaces.
Charles also works on interactive modeling and computer vision techniques. Lately, his efforts have gone into GPU algorithms for the display of curved objects.
Tristan Lorach, NVIDIA Corporation
Since graduating in 1995 with a master's in computer science applied to art and aesthetics, Tristan Lorach has developed a series of 3D real-time
interactive installations for exhibitions and events all over the world. From the creation of a specific engine for digging complex galleries into a virtual solid, to the conception of new 3D human interfaces for public events, Tristan has
always wanted to fill the gap between technology and artistic or ergonomic ideas. Most of his projects (such as "L'homme Transformé" and "Le Tunnel sous l'Atlantique") were presented in well-known exhibition centers like Beaubourg and Cité des Sciences in Paris. Now Tristan works at NVIDIA on the Technical Developer Relations team, based in Santa Clara, California.
David Luebke, NVIDIA Corporation
David Luebke is a research scientist at NVIDIA. He received an M.S. and Ph.D. in computer science in 1998 from the University of North Carolina under
Frederick P. Brooks, Jr., following a B.A. in chemistry from the Colorado College. David spent eight years on the faculty of the University of Virginia before leaving in 2006 to help start the NVIDIA Research group. His research interests include real-time rendering, illumination models, and graphics
architecture.
Kenny Mitchell, Electronic Arts
Kenny is a lead engine programmer at Electronic Arts' UK Studio. His Ph.D. introduced the use of real-time 3D for information visualization on consumer hardware, including a novel recursive perspective projection technique. Over the past ten years he has shipped games using high-end graphics technologies including voxels, PN patches, displacement mapping and clipmaps. In between shipping games for EA's flagship Harry Potter franchise, he is also involved in developing new intellectual properties.
Jefferson Montgomery, Electronic Arts
Jefferson Montgomery holds a B.A.Sc. in engineering physics and an M.Sc. in computer science from the University of British Columbia. He is currently a member of the World Wide Visualization Group at Electronic Arts, tasked with adapting advanced techniques to the resource constraints faced by current game teams and producing real-time demonstrations such as those at Sony's E3 presentations in 2005 and 2006.
Kevin Myers, NVIDIA Corporation