9700 Architecture Presentation V1

RADEON 9700 Architecture and 3D Performance

Gordon Elder

What is the RADEON 9700?

• Programmability (SMARTSHADER™ 2.0)
– First Full Floating Point Graphics Pipeline
– Enables Compilation of High Level Shading Languages
• Performance
– High Bandwidth
– Parallelism
– Efficiency
• Image Quality (SMOOTHVISION™ 2.0)
– Multisample Antialiasing
– Anisotropic Texture Filtering

1 May 2020

Image Generation with Image Mapping

1st Generation Programmability

Idea: Texture Mapping, Blinn and Newell 1976
Implementation: SGI VGXT 1990
Hardwired Vertex Processing
Hardwired Fragment Processing with a Single Texture
Result: Environment Mapping and other effects

Blinn, J. F. and Newell, M. E. Texture and reflection in computer generated images. Communications of the ACM Vol. 19, No. 10 (October 1976), 542-547

Image Generation with Texture Composition

2nd Generation Programmability

Idea: Shade trees, R. Cook 1984
Implementation: RADEON™ 8500 2001
Limited Vertex Programmability
Limited Fragment Processing
• Multiple Textures
• Fixed Point Data
• Short Programs
Result: Current generation of effects.

Robert L. Cook. Shade Trees. Computer Graphics Vol. 18, No. 3 (July 1984), 223-231

Image Generation with General Purpose Floating Point Math & Texturing

3rd Generation Programmability

Idea: RenderMan®, Pixar 1987
Implementation: ATI RADEON™ 9700 2002
Advanced Vertex Programmability
Advanced Fragment Programmability
• Floating Point Data
• Rich Instruction Set
• Large Instruction Store
Result: Enabling Cinematic Rendering
Compiling RenderMan®, Maya, etc.

William T. Reeves, David H. Salesin, Robert L. Cook. Rendering Antialiased Shadows with Depth Maps. Computer Graphics Vol. 21, No. 4 (July 1987), 283-291

SMARTSHADER™ 2.0

• Next-generation programmable shader technology
– Enabling cinema-quality effects in real time
• First complete DirectX® 9.0 feature support
– 2.0 Vertex and Pixel Shaders
– Floating Point Pixel Pipelines
– 128-bit Floating Point Texture and Frame Buffer Formats
– Two-Sided Stencil Shadow Acceleration
– High Precision 32-bpp (10:10:10:2) Display Mode
– Higher Order Surface Enhancements
• Full feature set also available for OpenGL®
– OpenGL® Shading Language Support
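To make the 32-bpp (10:10:10:2) display mode concrete, here is a minimal sketch of packing one pixel into a 32-bit word with 10 bits each for red, green, and blue and 2 bits for alpha. The function names and the bit ordering (alpha in the top two bits) are assumptions for illustration only; the slides do not specify the layout.

```python
def _quantize(value, bits):
    """Clamp a normalized value to [0, 1] and map it to an unsigned integer."""
    top = (1 << bits) - 1
    return int(round(min(max(value, 0.0), 1.0) * top))


def pack_rgb10a2(r, g, b, a):
    """Pack normalized channels into one 32-bit word (10:10:10:2).

    Assumed layout (hypothetical): alpha in bits 31-30, red in 29-20,
    green in 19-10, blue in 9-0.
    """
    return (
        (_quantize(a, 2) << 30)
        | (_quantize(r, 10) << 20)
        | (_quantize(g, 10) << 10)
        | _quantize(b, 10)
    )


def unpack_rgb10a2(word):
    """Recover normalized (r, g, b, a) from a packed 10:10:10:2 word."""
    r = ((word >> 20) & 0x3FF) / 1023.0
    g = ((word >> 10) & 0x3FF) / 1023.0
    b = (word & 0x3FF) / 1023.0
    a = ((word >> 30) & 0x3) / 3.0
    return r, g, b, a
```

Each color channel gets 1024 levels instead of the 256 available in an 8:8:8:8 mode, while the pixel still fits in 32 bits.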

Vertex Shaders

(SMARTSHADER ™ 2.0)

• Flow Control
– Loops, jumps and subroutines
– Allow re-use of certain parts of the shader code
– Avoids repetition and saves instructions
• More Instructions, More Complex Effects
– Up to 65,280 instructions per pass
– Vertex shaders can be much more complex than they were in DX8

Pixel Shaders

(SMARTSHADER ™ 2.0)

• More Complex Shaders by an Order of Magnitude
– Up to 160 instructions per pass
  • 32 address ops, 64 color ops, 64 alpha ops
  • Compared with 12 instructions total in DX8.0
– Multi-pass rendering support
  • High precision 128-bit floating point data formats for storing intermediate results between passes
  • Shaders can now effectively be thousands of instructions long; performance is the only limitation
– 24-bit per component floating point precision for all pixel shader operations, necessary for cinema-quality effects
• Allows shaders written in any present or future language to run on hardware with SMARTSHADER™ 2.0
– Even high level languages like RenderMan® can now be compiled to run on RADEON™ 9700 in real time
• Pixel shaders can also implement complex Image Processing algorithms

RADEON 9700 Performance

Key design elements for best performance:

High Bandwidth, Parallelism, & Efficiency

High Bandwidth

• AGP 8x provides 2 GB/sec transfers to or from the CPU or system memory.
• 310 MHz 256-bit DDR Memory Interface provides 20 GB/sec access to the Frame Buffer
• Internal 256-bit data busses for Color, Texture and Z

Parallelism

• 4 Vertex Engines running at 325 MHz provide 325 Mtriangles/sec (4 clocks per vertex per engine)
• 8 Pixels/Clock Rasterization Architecture running at 325 MHz provides a peak fill rate of 2.6 Gpix/sec
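The peak figures above follow from simple arithmetic on the clock rates and widths quoted on the slide. The sketch below reproduces them; the constant and function names are illustrative, and two points are assumptions rather than statements from the slides: that DDR memory transfers data twice per clock, and that the triangle rate counts one new vertex per triangle (as in a long triangle strip).

```python
CORE_CLOCK_HZ = 325_000_000      # engine clock quoted on the slide
MEMORY_CLOCK_HZ = 310_000_000    # memory clock quoted on the slide
MEMORY_BUS_BITS = 256            # memory interface width quoted on the slide
DDR_TRANSFERS_PER_CLOCK = 2      # assumption: double data rate = 2 transfers/clock


def memory_bandwidth_bytes_per_sec():
    """Peak frame buffer bandwidth: clock x transfers per clock x bus width in bytes."""
    return MEMORY_CLOCK_HZ * DDR_TRANSFERS_PER_CLOCK * (MEMORY_BUS_BITS // 8)


def vertex_rate_per_sec(engines=4, clocks_per_vertex=4):
    """Peak vertices per second across all vertex engines."""
    return engines * CORE_CLOCK_HZ // clocks_per_vertex


def fill_rate_pixels_per_sec(pixels_per_clock=8):
    """Peak pixels per second from the rasterizer."""
    return pixels_per_clock * CORE_CLOCK_HZ


if __name__ == "__main__":
    print(memory_bandwidth_bytes_per_sec() / 1e9, "GB/sec")   # 19.84, quoted as 20
    print(vertex_rate_per_sec() / 1e6, "Mvertices/sec")       # 325.0
    print(fill_rate_pixels_per_sec() / 1e9, "Gpix/sec")       # 2.6
```

The memory figure works out to 19.84 GB/sec, which the slide rounds to 20 GB/sec.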

RADEON 9700 Performance

(cont.)

Efficiency

• Graphics systems tend to be memory bandwidth limited, and the RADEON™ 9700 is no exception, so it is important to use the bandwidth efficiently.
• Hierarchical and Early Z checking allow pixels to be rejected before the pixel shader. This is very important when shader programs are long.
• Color, Texture and Z caches reduce memory bandwidth utilization by exploiting spatial and temporal locality.
• Lossless Color and Z data compression reduces memory bandwidth utilization.
• Compressed Textures can be used to reduce memory bandwidth utilization.
• Fast Color and Z clears eliminate the need to access memory for clears.
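The benefit of rejecting pixels before the pixel shader can be illustrated with a small software model. This is a sketch under stated assumptions, not a description of the hardware: `render` and its fragment tuples are hypothetical, only the per-pixel early Z test is modeled (not the hierarchical one), and shader cost is reduced to a count of invocations.

```python
def render(fragments, width, height, early_z):
    """Process (x, y, z) fragments and count pixel shader invocations.

    With early_z=True the depth test runs before shading, so hidden
    fragments never reach the shader. With early_z=False every fragment
    is shaded first and depth-tested afterwards.
    """
    depth = [[float("inf")] * width for _ in range(height)]
    shader_runs = 0
    for x, y, z in fragments:
        if early_z:
            if z >= depth[y][x]:
                continue            # rejected before the pixel shader
            shader_runs += 1        # only visible fragments are shaded
            depth[y][x] = z
        else:
            shader_runs += 1        # shaded regardless of visibility
            if z < depth[y][x]:
                depth[y][x] = z
    return shader_runs, depth


def full_screen_layer(width, height, z):
    """Build one fragment per pixel at a constant depth."""
    return [(x, y, z) for y in range(height) for x in range(width)]
```

Drawing a near layer followed by a far layer over the same pixels, the early Z path shades each pixel once while the late Z path shades it twice; the final depth buffer is identical in both cases. The saving grows with shader length, since each skipped invocation avoids the whole program.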

RADEON 9700 Performance

(cont.)

One more interesting thing...

Scalability

The RADEON ™ 9700 Architecture is capable of scaling up to 256 simultaneous units

Image Quality (SMOOTHVISION™ 2.0)

Performance matters too

Pixel antialiasing and anisotropic texture filtering improve image quality only if they are enabled.

• Just going to higher resolutions isn’t the answer for improved image quality.
– Artifacts due to poor texture sampling remain.
– Dynamic antialiasing artifacts are still very visible.
• Sufficient performance for high resolution display, high quality texture filtering, and antialiasing is needed.

The RADEON ™ 9700 was architected to do all three simultaneously.

Anti-Aliasing

(SMOOTHVISION ™ 2.0)

• Non-Grid Programmable Multi-Sampling
• 2, 4, or 6 samples per pixel
  • Sample positions provide the maximum quality per sample
  • Lossless Z and Color compression minimizes bandwidth cost of higher sample counts.
• Per Sample Gamma Correction
– Takes gamma into account when blending samples
– Creates smoother edge transitions

[Figure: edge gradient comparison for the same input. Panels: Standard Edge Gradient; Gamma Corrected Edge Gradient.]
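One way to read "takes gamma into account when blending samples" is sketched below: the stored sample values are treated as gamma-encoded, converted to linear light, averaged, and re-encoded. This is an illustrative assumption, not the documented hardware behavior; the function names and the display gamma of 2.2 are hypothetical.

```python
ASSUMED_DISPLAY_GAMMA = 2.2  # assumption: typical display gamma, not from the slides


def resolve_standard(samples):
    """Average gamma-encoded sample values directly (no gamma correction)."""
    return sum(samples) / len(samples)


def resolve_gamma_corrected(samples, gamma=ASSUMED_DISPLAY_GAMMA):
    """Average samples in linear light, then re-encode for display."""
    linear = [s ** gamma for s in samples]       # decode to linear intensity
    average = sum(linear) / len(linear)          # blend where light adds linearly
    return average ** (1.0 / gamma)              # encode back for the display


if __name__ == "__main__":
    # An edge pixel with 4 samples: half covered by white, half by black.
    edge = [1.0, 1.0, 0.0, 0.0]
    print(resolve_standard(edge))
    print(resolve_gamma_corrected(edge))
```

For the half-covered edge pixel, the direct average stores 0.5, which a gamma 2.2 display shows as noticeably darker than half brightness. The gamma-aware blend stores a higher value so the displayed intensity is half, which is what gives the smoother edge transition.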

Anisotropic Filtering

(SMOOTHVISION ™ 2.0)

• Improved Adaptive Algorithm
– Up to 16 Trilinear Samples (128-tap)
– Calculates optimal number of samples for each polygon
– Delivers full image quality benefit while conserving memory bandwidth
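The 128-tap figure follows from 16 trilinear samples at 8 texels each (4 bilinear texels on each of 2 mip levels). The sketch below shows that arithmetic, plus one common textbook way to choose an adaptive sample count from the shape of the texture footprint. The selection rule is an assumption for illustration; the slides do not describe the actual algorithm, and `adaptive_sample_count` and its parameters are hypothetical.

```python
import math

MAX_TRILINEAR_SAMPLES = 16       # from the slide
TEXELS_PER_TRILINEAR_SAMPLE = 8  # 4 bilinear texels x 2 mip levels


def max_taps():
    """Maximum texels read per pixel at the highest quality setting."""
    return MAX_TRILINEAR_SAMPLES * TEXELS_PER_TRILINEAR_SAMPLE


def adaptive_sample_count(major_axis, minor_axis, max_samples=MAX_TRILINEAR_SAMPLES):
    """Pick a sample count from the footprint's anisotropy ratio.

    Assumed rule (not from the slides): take roughly one sample per unit
    of stretch along the major axis, capped at max_samples.
    """
    ratio = major_axis / max(minor_axis, 1e-6)   # guard against a zero minor axis
    return max(1, min(max_samples, math.ceil(ratio)))
```

Under this rule a surface facing the viewer (ratio near 1) takes a single trilinear sample, and only steeply angled surfaces pay for the full 16, which is how an adaptive scheme can conserve memory bandwidth.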

RADEON 9700 Demos

Conclusion
