Transcript Document

Designing a Portable Shader Library for Current and Future API's
David Gosselin
3D Application Research Group
Outline
• Introduction & Motivation
• Demos
• Case Study: ATI’s demo shader format
– Design Goals
– Artist’s Interface
– Texture Definitions
– Vertex and Index Buffers
– Sub-shaders and Passes
– Vertex and Pixel Shaders
• Converting from D3D ASM to HLSL
• Example
Motivation
• Share the lessons learned from our shader library
– Shaders in general
– Moving from D3D ASM to HLSL
• Show some advantages to being shader-centric
• Show ways to include fallback paths
• Show cross platform generalizations
• Would like to see more games use shaders and not just target the least common denominator for graphics
Why Shaders?
• Direction of the Industry
– Hardware very shader driven
– Will continue down this path
– Modern engines need to be shader aware
• High level shader languages (HLSL and the OpenGL Shading Language) make shaders easier to write
• Visual flexibility
• Less engine churn
What is a Shader Library?
• Library in the sense of a linkable .lib file or collection of source code
– Not a collection of shaders
• Abstracts Graphics API
– Render States
– Constant Binding
– Shader Binding
– Etc.
• Manages Shaders
• Integrates with preprocessing/export
Engine Block Diagram
[Diagram; box labels: Maya, 3DS Max, Material Plug-ins, Exporter, Preprocessor, Shader Files, Shader Library, File IO, Parsing, Shader/State Management, Runtime Engine, D3D, OGL]
What is in a Shader File?
• Contains all the state and definitions needed to render a piece of geometry.
• Includes
– Artist Instructions
– Preprocessing Directives
– Texture Definitions
– Constant / Variable Definitions
– Stream Definitions (Vertex/Index buffers)
– Fixed Function States (Alpha, Stencil, Z, etc.)
– Vertex Shader
– Pixel Shader
• Potentially multiple LOD/# light variants
Demos Showing the Necessity of a Shader Library
Shader File Design Goals
• Cross platform / cross API
• Reduce the frequency of redesigning the graphics engine
• Expandable to future shader languages and APIs
• Drive preprocessing (optimal VB/IB)
• Fallback shaders
• LOD shaders
• Ability to target a wide range of graphics hardware
• Artist interaction to runtime execution
Artist’s Point of View
• To the RIGHT are our Maya and 3DS MAX plug-ins
• Artists associate a shader with a material within the art tool
• Art tool plug-in pulls instructions from the file for the artists
Artist Notes
StartArtNotes
Supports:
- 3 Fast RT lights
- 3 Object Ambient Lights
Requirements:
- Base = RGB texture
- Bump = RGB normal map
EndArtNotes
Art Notes
• Describe maximum number of lights supported
• Describe textures needed
• Any special instructions
– Artist editable variables
– Special scene geometry
– Etc.
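As an illustration of how an art tool plug-in might pull these instructions out of a shader file, here is a minimal sketch. It assumes only what the slides show, namely that the notes sit between `StartArtNotes` and `EndArtNotes` lines; the function name `extract_art_notes` and the sample text are hypothetical, not the actual plug-in code.

```python
def extract_art_notes(shader_text):
    """Return the non-empty lines found between StartArtNotes and EndArtNotes."""
    notes = []
    inside = False
    for line in shader_text.splitlines():
        token = line.strip()
        if token == "StartArtNotes":
            inside = True
        elif token == "EndArtNotes":
            inside = False
        elif inside and token:
            notes.append(token)
    return notes


# Hypothetical shader file fragment, modeled on the Artist Notes slide.
SAMPLE = """StartArtNotes
Supports:
- 3 Fast RT lights
- 3 Object Ambient Lights
Requirements:
- Base = RGB texture
- Bump = RGB normal map
EndArtNotes
Texture tBase 2D DXT1("Base", RGB, Box)
"""

for note in extract_art_notes(SAMPLE):
    print(note)
```

Everything outside the notes block, such as the `Texture` line, is ignored, so the tool can show the notes without understanding the rest of the format.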
Texture Placement
Texture tBase 2D DXT1("Base", RGB, Box)
DefTexture tBase Trilinear
Texture tBump 2D RGBX("Bump", RGB, Box)
DefTexture tBump Trilinear
Texture Declaration
• How to preprocess textures
– Output format
– Input Format
– Mipmap Generation
– Separate RGB and alpha textures
Texture tBase            2D DXT1 ("Base", RGB, KaiserGamma)
Texture tFurShellTexture 2D RGBA ("T1", RGB, Kaiser, "T2", GRAY, Kaiser)
Texture tBump            2D DXT5 ("Bump", HEIGHT, Box, "Opacity", GRAY, Box)
Texture tAnisoLookup     2D RGBA ("T5", RGB, Box, "T6", GRAY, Box)
Texture tNormCube        CM RGBX ("Base3", RGB, Box)
Texture tEnvMap          CMAuto // Engine generated cubemap
Texture tWater           Renderable("WaterReflection")
Artist Editable Variables
Vector vColor(.7, .7, .7, 0.0) Editable(color)
Vector vReflectionColor (1, 1, 1, 1.0) Editable(Color)
Float vSpecularExp (16.0) Editable(Slider, 0, 512)
Variables
• Standard types: Float, Vector, Matrix
• Can be bound to render state
• Can be bound to engine state
• Constant
• Can be artist editable (from within tool)
Matrix wvp(MATRIX_WVP)
Vector osCamPos(CAMERA_POSITION, OBJECT_SPACE)
Vector osLightPos(LIGHT_POSITION, OBJECT_SPACE, 0)
Vector time(0.0, 0.0, 0.0, 0.0) AppUpdate("Time")
Vector furFadeScaleBias(0.5, 0.5, 0.0, 0.0)
Float furHeight(4.0) EDITABLE
Vertex and Index Buffers
• How to preprocess vertex/index buffers
• One IB per unique stream map
StartStream sPosNorm (Normal)
    float3 POSITION0 Position
    float3 NORMAL0 VertexNormal
EndStream
StartStream sTexCoords (Normal)
    float2 TEX0 UV0 // BaseTexU, BaseTexV
    float3 TEX1 Tangent0
EndStream
StartStream sFurFins (FurFins) // Different geometry than above 2 streams
    float3 POSITION0 Position
    float3 NORMAL0 VertexNormal
    float4 TEX0 FinFaceData0 // FinTexU, FinTexV, BaseTexUVDist, RandOffset
    float3 TEX2 FaceNormal
EndStream
StreamMap smBasePass (sPosNorm, sTexCoords)
StreamMap smShellPass (sPosNorm, sTexCoords)
StreamMap smFinsPass (sFurFins)
Sub-Shaders
• Single shader file with multiple sub-shaders
– Fallbacks
– LOD
– Split-screens
– Different number of lights
• Controlled via a property string
• For each unique set of properties, the first sub-shader in the file (top to bottom) that validates is used
• Can contain multiple passes
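The selection rule above can be sketched as follows. This is a guess at the logic rather than the library's actual code: it assumes sub-shaders are grouped by their full set of property strings, and it stands in for hardware validation with a caller-supplied `validates` function. The sub-shader names are hypothetical.

```python
def select_subshaders(subshaders, validates):
    """Pick, for each unique property set, the first sub-shader that validates.

    subshaders: list of (name, [property strings]) in file order, top to bottom.
    validates:  callable name -> bool, a stand-in for the hardware caps check.
    """
    chosen = {}
    for name, properties in subshaders:
        key = frozenset(properties)
        if key not in chosen and validates(name):
            chosen[key] = name  # later sub-shaders with the same properties are fallbacks
    return chosen


# Hypothetical file contents: a ps.2.0 version first, then a ps.1.4 fallback.
SUBSHADERS = [
    ("Normal_ps20", ["Normal"]),
    ("Normal_ps14", ["Normal"]),
]

# Pretend the hardware cannot run the ps.2.0 sub-shader.
selected = select_subshaders(SUBSHADERS, lambda name: name != "Normal_ps20")
print(selected[frozenset(["Normal"])])
```

Because file order decides priority, the shader author lists the most capable version first and the cheapest fallback last.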
Sub-Shader Example
//----------------------------------------------------
StartShader // For vs.2.0 and ps.2.0 minimum
    Property "Normal" // Can be anything: "Wire frame", "Invincible", "One Light", etc.
    StartPass
        …
    EndPass
EndShader
//----------------------------------------------------
StartShader // For vs.1.1 and ps.1.4
    Property "Normal"
    StartPass
        …
    EndPass
    StartPass
        …
    EndPass
EndShader
Falling Back
• Determine which shaders validate before loading vertex buffers, index buffers and textures.
• Vertex/Index Buffers
– May need single stream vertex buffers for older hardware
– All potential vertex streams defined in shader
– Load only those required for shaders which validate
– Extra data stored on disk, but optimal for runtime
• Textures
– Many fallback shaders won’t require all the defined textures
– Only load textures required for validated shaders
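A minimal sketch of the "load only what validated shaders need" step, assuming each pass records the streams and textures it references. The data layout and all names (`resources_to_load`, `HighEnd`, `Fallback`, `sSingle`) are hypothetical and only illustrate the idea.

```python
def resources_to_load(subshaders, selected):
    """Return the union of streams and textures used by the selected sub-shaders."""
    streams, textures = set(), set()
    for name in selected:
        for shader_pass in subshaders[name]:
            streams.update(shader_pass["streams"])
            textures.update(shader_pass["textures"])
    return streams, textures


# Hypothetical shader file: a multi-stream version and a single-stream fallback.
SUBSHADERS = {
    "HighEnd": [{"streams": ["sPosNorm", "sTexCoords"], "textures": ["tBase", "tBump"]}],
    "Fallback": [{"streams": ["sSingle"], "textures": ["tBase"]}],
}

# Only the fallback validated on this hardware, so tBump is never loaded.
streams, textures = resources_to_load(SUBSHADERS, ["Fallback"])
print(sorted(streams), sorted(textures))
```

All streams and textures still exist on disk; the saving is in what gets read and uploaded at load time.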
Pass
• Each pass has a Stream Mapping, unique set of render state and textures, and vertex and pixel shader code.
StartPass
    //Set Stream Map
    //Set Textures
    //Set Render State
    //Set Vertex Shader Constants
    //Vertex Shader Code
    //Set Pixel Shader Constants
    //Pixel Shader Code
EndPass
• Within a sub-shader, state is sticky between passes.
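"Sticky" state can be pictured as each pass overriding only what it changes, with everything else carried over from the previous pass. The sketch below assumes the first pass starts from some default state; the function name and the state values are illustrative, not taken from the library.

```python
def resolve_pass_states(default_state, passes):
    """Return the full state each pass renders with when state is sticky."""
    current = dict(default_state)
    resolved = []
    for overrides in passes:
        current.update(overrides)       # a pass lists only the state it changes
        resolved.append(dict(current))  # snapshot of the state for this pass
    return resolved


DEFAULTS = {"Blend": "FALSE", "Cull": "CW"}
PASSES = [
    {"Blend": "TRUE"},  # pass 1 turns blending on
    {"Cull": "CCW"},    # pass 2 changes culling only; blending stays on
]

states = resolve_pass_states(DEFAULTS, PASSES)
print(states[1])
```

The second pass inherits `Blend TRUE` from the first without restating it, which is what keeps multi-pass sub-shaders short.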
Common Graphics Concepts
• Graphics hardware is very similar despite differences in APIs
• Hardware is functionally identical:
– Setting texture state
– Vertex/index buffers, stream maps
– Draw calls
– Alpha blending & testing
– Shader constant setup
– Z state
– Stencil state
– Renderable textures
– etc.
Common State Keywords
Clipping TRUE
Cull CW
FillMode Solid
ShadeMode Gouraud
SetTexture 0 NULL CoordIndex(0) Transform(0) Linear LODBias(0.0) Clamp(www) Border(0x00000000)
SetBlender 0 Color(SelectArg1, Diffuse, Diffuse) Alpha(SelectArg1, Diffuse, Diffuse)
Fog FALSE Table(None) Vertex(None) Color(0) Start(0.0) End(1.0) Density(1.0)
AlphaTest FALSE
Blend FALSE Src(One) Dest(Zero) Op(Add)
ColorWriteEnable (R, G, B, A)
MultiSampleAntiAlias TRUE
MultiSampleMask 0xffffffff
DitherEnable FALSE
Z TRUE Write(TRUE) Func(LessEqual) Bias(0.0) SlopeScale(0.0)
Stencil FALSE Pass(Keep) Fail(Keep) ZFail(Keep) Func(Always) Ref(0xFFFFFFFF)
StencilCCW FALSE Pass(Keep) Fail(Keep) ZFail(Keep) Func(Always)
Shader Languages
• DirectX fixed function
• DirectX assembly shaders
• DirectX HLSL
• OpenGL fixed function
• OpenGL assembly shaders (ARB_vertex_program, ARB_fragment_program)
• The OpenGL Shading Language (ARB_shading_language_100)
• GameCube
• PS2
D3D ASM Shaders
• VS/PS code embedded
• Can also come from an external file.
VsConst 0 mWvp // Matrix takes up 4 constants
VsConst 4 vTimeConst
VsConst 5 (0.0, 0.1, 2.0, 5.0)
StartVertexShader
    vs.1.1
    dcl_position  v0
    dcl_color0    v5
    dcl_texcoord0 v7
    m4x4 oPos, v0, c0 // Transform position
    mov oT0, v7       // Base texture coordinates
    mov oD0, v5       // Pass vertex light to PS
EndVertexShader
HLSL Support Design Goals
• Build on our existing framework
• Avoid explicit constant declarations
• Allow HLSL include files to reference common functions
– Includes can contain their own variables and textures
• Hidden from outside the shader library
Basic HLSL Shader
• New HLSL token
• Constants by name
Matrix mWvp(MATRIX_WVP)
Vector vTime AppUpdate("Time")
D3D ASM version:
VsConst 0 mWvp // Takes up 4 constants
VsConst 4 vTime
VsConst 5 (0.0, 0.1, 2.0, 5.0)
StartVertexShader
    vs.1.1
    dcl_position  v0
    dcl_color0    v5
    dcl_texcoord0 v7
    m4x4 oPos, v0, c0 // Transform
    mov oT0, v7       // Base tex coords
    mov oD0, v5       // Vertex light
EndVertexShader
HLSL version:
StartVertexShader(HLSL)
    float4x4 mWvp;
    float4 vTime;
    struct VS_OUTPUT
    {
        float4 Pos     : POSITION;
        float4 Diffuse : COLOR0;
        float2 TCoord0 : TEXCOORD0;
    };
    VS_OUTPUT main (float4 aPosition : POSITION,
                    float4 aDiffuse : COLOR0, float2 aTC0 : TEXCOORD0)
    {
        VS_OUTPUT outV = (VS_OUTPUT) 0;
        outV.Pos = mul (mWvp, aPosition); // Transform position
        outV.TCoord0 = aTC0;              // Pass texture coordinates
        outV.Diffuse = aDiffuse;          // Pass vertex light
        return outV;
    }
EndVertexShader
HLSL Pixel Shader
Texture tBaseTexture 2D DXT1("Base", RGB, KaiserGamma)
StartPixelShader(HLSL)
sampler tBaseTexture;
struct PsInput
{
        float2 texCoord    : TEXCOORD0;
        float3 vertexLight : COLOR0;
    };
    float4 main (PsInput i) : COLOR
    {
        // Sample base texture
        float3 cBase = tex2D (tBaseTexture, i.texCoord);
        float4 o;
        o.rgb = cBase * i.vertexLight; // Final lighting
        o.a = 1.0f;
        return o;
    }
EndPixelShader
Matching Names
• D3DXCompileShader() returns a constant
table when a shader is compiled.
• ID3DXConstantTable has a member function
GetConstantDesc() which allows you to get:
– Name
– Register Index
– Type/Size
• Our shader library matches names to registers
to send to SetPixelShaderConstantF()
and SetVertexShaderConstantF()
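The name-matching step can be sketched independently of D3D. Below, a list of `(name, register index, register count)` tuples stands in for what `GetConstantDesc()` reports, and the result is the list of `(start register, float values)` pairs one would hand to `SetVertexShaderConstantF()` or `SetPixelShaderConstantF()`. The function name `bind_constants` and the sample data are assumptions made for illustration.

```python
def bind_constants(constant_table, variables):
    """Match library variables to compiler-assigned registers by name.

    constant_table: list of (name, register_index, register_count) tuples.
    variables:      dict of name -> flat list of floats, 4 floats per register.
    Returns a list of (start_register, values) uploads.
    """
    uploads = []
    for name, reg_index, reg_count in constant_table:
        if name not in variables:
            continue  # not a variable the shader file declared, skip it
        values = variables[name]
        expected = reg_count * 4
        if len(values) != expected:
            raise ValueError(f"{name}: expected {expected} floats, got {len(values)}")
        uploads.append((reg_index, values))
    return uploads


# Hypothetical table for the basic HLSL vertex shader: a 4x4 matrix and one vector.
TABLE = [("mWvp", 0, 4), ("vTime", 4, 1)]
VARIABLES = {
    "mWvp": [1.0, 0.0, 0.0, 0.0,
             0.0, 1.0, 0.0, 0.0,
             0.0, 0.0, 1.0, 0.0,
             0.0, 0.0, 0.0, 1.0],
    "vTime": [0.0, 0.0, 0.0, 0.0],
}

for start_register, values in bind_constants(TABLE, VARIABLES):
    print(start_register, len(values) // 4)
```

The shader author never writes a register number; the compiler chooses them and the library looks them up after compilation.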
Handling HLSL Includes
• Didn’t use HLSL’s #include
• Variables and textures in includes need to be interpreted by shader library
• Our parser concatenates the include files with the embedded shader code
• Shader author needs no knowledge about the contents of the include file other than the function declaration
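The concatenation step might look roughly like this. It is a sketch under two assumptions that the slides do not state outright: that includes are emitted in the order they are listed, and that the include token `this` refers to the shader file's own global `StartHLSL` block. The function and variable names are hypothetical.

```python
def concatenate_shader(global_hlsl, includes, include_sources, embedded):
    """Build the single HLSL string that is handed to the compiler.

    global_hlsl:     text of the shader file's own StartHLSL block
    includes:        ordered names from the VsInclude/PsInclude lines
    include_sources: dict of include file name -> HLSL text of that include
    embedded:        the code between StartVertexShader(HLSL) and EndVertexShader
    """
    parts = []
    for name in includes:
        # Assumption: "this" means the current file's global HLSL block.
        parts.append(global_hlsl if name == "this" else include_sources[name])
    parts.append(embedded)
    return "\n".join(parts)


code = concatenate_shader(
    global_hlsl="#define SI_SKINNING_MAX_BONES 40",
    includes=["this", "SiSkinning.shl"],
    include_sources={"SiSkinning.shl": "float4x4 mSiWorld[SI_SKINNING_MAX_BONES];"},
    embedded="float4x4 mVP;",
)
print(code)
```

Listing `this` first puts the `#define` ahead of the include that depends on it, which is why ordering matters.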
Using an Include file
StartHLSL
    #define SI_SKINNING_MAX_BONES 40
EndHLSL
…
VsInclude this
VsInclude "SiSkinning.shl"
StartVertexShader(HLSL)
    float4x4 mVP;
    struct VsInput
    {
        float4 pos     : POSITION0;
        float4 weights : BLENDWEIGHT0;
        int4   indices : BLENDINDICES0;
        float3 normal  : NORMAL0;
    };
    …
    VsOutput main (VsInput i)
    {
        // Skin position
        float4 pos = SiSkin4x4 (i.pos, i.weights, i.indices);
        o.pos = mul (pos, mVP);
        …
    }
EndVertexShader
HLSL Include Example
#replicate ($i, 0, 100, 1)
Matrix mSiWorld$i AppUpdate(world$imat)
#endreplicate
StartHLSL
    float4x4 mSiWorld[SI_SKINNING_MAX_BONES];
    float4 SiSkin4x4 (float4 aVec, float4 aWeights,
                      float4 aIndices)
    {
        float4 vec = (float4)0;
        for (int bone = 0; bone < 4; bone++)
            vec += aWeights[bone] * mul (aVec, mSiWorld[aIndices[bone]]);
        return vec;
    }
    ...
EndHLSL
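The `#replicate` directive is not explained on the slide, so the following is only one plausible reading: `($i, start, end, step)` repeats the enclosed lines, replacing `$i` with each counter value. Whether the end value is inclusive is unknown; this sketch treats it as exclusive. The function name `expand_replicate` is hypothetical.

```python
import re

# Matches a line such as: #replicate ($i, 0, 100, 1)
REPLICATE = re.compile(
    r"#replicate\s*\(\s*(\$\w+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)"
)


def expand_replicate(text):
    """Expand #replicate blocks by plain textual substitution of the counter."""
    out, body, header = [], [], None
    for line in text.splitlines():
        match = REPLICATE.match(line.strip())
        if match:
            header = match.groups()
            body = []
        elif line.strip() == "#endreplicate" and header:
            var = header[0]
            start, end, step = (int(v) for v in header[1:])
            for value in range(start, end, step):  # assumption: end is exclusive
                out.extend(b.replace(var, str(value)) for b in body)
            header = None
        elif header:
            body.append(line)  # inside a block: collect, do not emit yet
        else:
            out.append(line)
    return "\n".join(out)


SOURCE = """#replicate ($i, 0, 3, 1)
Matrix mSiWorld$i AppUpdate(world$imat)
#endreplicate
StartHLSL"""

print(expand_replicate(SOURCE))
```

With a range of 0 to 3 this yields `mSiWorld0`, `mSiWorld1` and `mSiWorld2`, one declared matrix per bone, without the include author writing each line by hand.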
Concatenation of Files
[Diagram: the shader file below and its include file pass through the Shader Library, which feeds the D3DX Compiler]
StartHLSL
    #define SI_SKINNING_MAX_BONES 40
EndHLSL
Matrix mVP mWvp(MATRIX_WVP)
VsInclude "SiSkinning.shl"
StartVertexShader(HLSL)
    float4x4 mVP;
    struct VsInput
    {
        float4 pos     : POSITION0;
        float4 weights : BLENDWEIGHT0;
        int4   indices : BLENDINDICES0;
        float3 normal  : NORMAL0;
    };
    VsOutput main (VsInput i)
    {
        float4 pos = SiSkin4x4 (i.pos, i.weights, i.indices);
        o.pos = mul (pos, mVP);
    }
EndVertexShader
Debugging
• Concatenated files means line numbers
will not be accurate
• Added a special tag to dump out the
concatenated code:
StartPixelShader(HLSL) HLSLDebugOutput("dbg.hlsl")
• Outputs the concatenated file to the given file name
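Dumping the concatenated code is the approach the slides describe. As a complementary illustration of why the line numbers drift, the sketch below maps a line number in the concatenated code back to the piece it came from. This helper is not part of the library described here; `locate_line` and the sample parts are hypothetical.

```python
def locate_line(parts, line_number):
    """Map a 1-based line in the concatenated code to (source name, local line).

    parts: ordered list of (source name, text) that were joined with newlines.
           Assumes no text ends with a trailing newline.
    """
    remaining = line_number
    for source, text in parts:
        count = len(text.splitlines())
        if remaining <= count:
            return source, remaining
        remaining -= count
    raise ValueError("line number is past the end of the concatenated code")


PARTS = [
    ("SiSkinning.shl", "line a\nline b\nline c"),
    ("embedded vertex shader", "line d\nline e"),
]

# A compiler error reported at line 4 is really line 1 of the embedded shader.
print(locate_line(PARTS, 4))
```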
Textures in HLSL Includes
• Including an HLSL pixel shader also
requires implicitly binding textures to
texture stages
• In non-HLSL pixel shaders, we had:
SetTexture 0 tBaseTexture Trilinear
• Since we can’t specify the stage to bind
the texture to explicitly:
DefTexture tBaseTexture Trilinear
• HLSL compiler returns table for matching
our DefTextures names with stages
Texture Lookup in an Include
StartArtNotes
* Anisotropic Strand Lighting map on T2 (24 bit)
EndArtNotes
Texture tSiStrandLighting 2D DXT1("T2", RGB, Box)
DefTexture tSiStrandLighting Linear Clamp(cc)
StartHLSL
sampler tSiStrandLighting;
struct SiStrandPair
{
float diffuse;
float specular;
};
// Compute Wolfgang Heidrich's Anisotropic lighting
SiStrandPair SiComputeStrandLight (float3 normal,
float3 light, float3 view, float3 dirAniso)
{
SiStrandPair sPair;
Texture Lookup Continued
    float LdA = dot(light, dirAniso);
    float VdA = dot(view, dirAniso);
    float2 fnLookup = tex2D(tSiStrandLighting,
                            (float2(LdA, VdA) * 0.5) + (float2)0.5);
    float spec = fnLookup.y * fnLookup.y;
    float diff = fnLookup.x;
    float selfShadow = saturate(dot(normal, light));
    sPair.diffuse = diff * selfShadow;
    sPair.specular = spec * selfShadow;
    return(sPair);
}
EndHLSL
Default Shader
• Defines all default state for your engine within a
shader file.
• You don’t need to rely on D3D’s or OpenGL’s
default state since you override it with what is
most useful to your app. This also helps reduce
overall state change.
• Necessary for cross-API development, since
different APIs have different default state
• Your shaders ultimately have less redundancy
since you can rely on the defaults you set.
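One way to picture the default shader's role in cutting redundant state changes: anything a shader does not mention falls back to the engine's own defaults, and only values that differ from what the device currently holds are sent. This is a sketch of that idea, not the library's code; the function name and the state values are illustrative.

```python
def state_changes(default_state, shader_state, device_state):
    """Return only the state settings that actually need to be sent to the API."""
    wanted = dict(default_state)
    wanted.update(shader_state)  # unspecified state falls back to the defaults
    return {key: value for key, value in wanted.items()
            if device_state.get(key) != value}


DEFAULTS = {"Z": "TRUE", "Blend": "FALSE", "AlphaTest": "FALSE"}
SHADER = {"Blend": "TRUE"}   # the shader lists only what differs from the defaults
DEVICE = {"Z": "TRUE", "Blend": "FALSE", "AlphaTest": "TRUE"}  # left by a previous shader

print(state_changes(DEFAULTS, SHADER, DEVICE))
```

Here `AlphaTest` is restored to the default even though the shader never mentions it, and `Z` is not touched at all.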
A Full Example:
Environment Mapped Bumps
StartArtNotes
Supports:
- 3 Fast RT lights
- 3 Object Ambient Lights
- Per-Pixel Specular Exponent
- Gloss Map
- Environment map cube map
- Bump map
Requirements:
- Color = Set Editable Color (vBaseColor)
- Bump = RGB normal map
- Gloss = GRAY Texture (Gloss Map)
- SpecularExp = GRAY Texture (Specular Exponent Map)
- T2 = RGB Cube Map (Environment Map)
EndArtNotes
StartMisc
    Animation Skinned(40)
    NumFastRTLights 3
EndMisc
Example Continued
// TEXTURES
Texture tGloss 2D GRAY("Gloss", GRAY, Box)
Texture tSpec  2D GRAY("SpecularExp", GRAY, Box)
Texture tEnv   CM RGB("T2", RGB, Box)
Texture tBump  2D RGB("Bump", RGB, Box)
DefTexture tBump  Trilinear
DefTexture tSpec  Trilinear
DefTexture tGloss Trilinear
DefTexture tEnv   Trilinear
// VARIABLES
Matrix mVP(VP)
Vector worldCamPos(CameraPosition, WorldSpace)
Vector vBaseColor(.8, .8, .8, 0) Editable(color)
Example Continued
// STREAMS
StartStream s1 Normal
    float3 POSITION     Position
    float4 BLENDWEIGHT  BlendWeight
    ubyte4 BLENDINDICES BlendIndex
    float3 NORMAL       Normal
    float3 TANGENT0     Tangent("Bump")
    float3 BINORMAL0    Binormal("Bump")
    float2 TEX0         UV("Bump")
EndStream
StreamMap sm1(s1)
// Global HLSL block
StartHLSL
    #define SI_SKINNING_MAX_BONES 40
    #define NUM_OBJECT_AMBIENT_LIGHTS 3
    #define SPECULAR_K_MIN 16
    #define SPECULAR_K_MAX 256
EndHLSL
Example Continued
StartShader "NormalFastRT1"
Property "Normal"
Property "FastRT1"
StartPass "Pass1"
SetStreamMap sm1
VsInclude this
VsInclude "SiRTLight.shl"(VS)
VsInclude "SiObjAmbLight.shl"
VsInclude "SiSkinning.shl"
VsInclude "SiMath.shl"(Misc)
StartVertexShader(HLSL)
float4x4 mVP;
float3 worldCamPos;
Example Continued
struct VsInput
{
    float4 pos      : POSITION0;
    float4 weights  : BLENDWEIGHT0;
    float4 indices  : BLENDINDICES0;
    float3 normal   : NORMAL0;
    float3 tangent  : TANGENT0;
    float3 binormal : BINORMAL0;
    float2 texCoord : TEXCOORD0;
};
struct VsOutput
{
    float4 pos            : POSITION0;
    float2 texCoord       : TEXCOORD0;
    float3 lightVec0TS    : TEXCOORD1;
    float3 lightSpacePos0 : TEXCOORD2;
    float3 viewVecTS      : TEXCOORD3;
    float3 invNormal      : TEXCOORD4;
    float3 invTangent     : TEXCOORD5;
    float3 invBinormal    : TEXCOORD6;
    float3 vertexLight    : COLOR0;
};
Example Continued
VsOutput main (VsInput i)
{
    VsOutput o;
    o.texCoord = i.texCoord;
    float3x3 mTangent = 0;
    // Skin
    float4 pos = SiSkin4x4 (i.pos, i.weights, i.indices);
    o.pos = mul (pos, mVP);
    mTangent[0] = SiSkin3x3 (i.tangent, i.weights, i.indices);
    mTangent[1] = SiSkin3x3 (i.binormal, i.weights, i.indices);
    mTangent[2] = SiSkin3x3 (i.normal, i.weights, i.indices);
    // Invert tangent space
    float3x3 mInvTangent = transpose(mTangent);
    o.invTangent = mInvTangent[0];
    o.invBinormal = mInvTangent[1];
    o.invNormal = mInvTangent[2];
    // Compute View Vector
    float3 viewVec = worldCamPos - i.pos;
    viewVec = normalize (viewVec);
    o.viewVecTS = mul (mTangent, viewVec);
Example Continued
    // Compute ambient lighting
    float3 vertexLight = 0.0f;
    for (int idx = 0; idx < NUM_OBJECT_AMBIENT_LIGHTS; idx++)
    {
        vertexLight += SiComputeObjectAmbientLight (pos, mTangent[2], idx);
    }
    o.vertexLight = vertexLight;
    // Compute runtime light vectors for RT1
    o.lightSpacePos0 = SiComputeRTLightSpacePosition (pos, 0);
    float3 lightVec0 = SiComputeRTLightVectorNormalized (pos, 0);
    o.lightVec0TS = mul (mTangent, lightVec0);
    return o;
}
EndVertexShader
Example Continued
PsInclude this
PsInclude "SiRTLight.shl"(PS)
PsInclude "SiMath.shl"(Misc)
StartPixelShader(HLSL)
sampler tBump;
sampler tGloss;
sampler tEnv;
sampler tSpec;
float4 vBaseColor;
struct PsInput
{
    float2 texCoord       : TEXCOORD0;
    float3 lightVec0TS    : TEXCOORD1;
    float3 lightSpacePos0 : TEXCOORD2;
    float3 viewVecTS      : TEXCOORD3;
    float3 invNormal      : TEXCOORD4;
    float3 invTangent     : TEXCOORD5;
    float3 invBinormal    : TEXCOORD6;
    float3 vertexLight    : COLOR0;
};
Example Continued
float4 main (PsInput i) : COLOR
{
// Create arrays of light vectors and positions
#define NUM_RT_LIGHTS 1
float3 vLightVec[NUM_RT_LIGHTS] = {i.lightVec0TS};
float3 vLightPos[NUM_RT_LIGHTS] = {i.lightSpacePos0};
// Sample normal map
float3 vNormal = tex2D (tBump, i.texCoord);
vNormal = SiConvertColorToVector (vNormal);
// Compute reflection vector
float3 reflectionVec = SiReflect (i.viewVecTS, vNormal);
// Sample Exponent and Gloss Map
float exponent = tex2D (tSpec, i.texCoord);
float gloss = tex2D (tGloss, i.texCoord);
Example Continued
    // Loop over runtime lights computing light contributions
    float3 diffuse = i.vertexLight * vNormal.z;
    float3 specular = 0;
    for (int idx = 0; idx < NUM_RT_LIGHTS; idx++)
    {
        float3 colorIntensity = SiComputeRTLightColorIntensity (vLightPos[idx], idx);
        float diffuseNdotL = SiDot3Clamp (vNormal, vLightVec[idx]);
        diffuse += colorIntensity * diffuseNdotL;
        float specularRdotL = SiComputeSpecular (reflectionVec,
                                                 vLightVec[idx], exponent,
                                                 SPECULAR_K_MIN, SPECULAR_K_MAX);
        specular += colorIntensity * specularRdotL;
    }
Example Continued
    // Rotate reflection vector to object space
    float3x3 mInvTangent = {i.invTangent, i.invBinormal, i.invNormal};
    float3 reflectionVecOS = mul (mInvTangent, reflectionVec);
    // Sample environment map
    float3 cEnv = texCUBE (tEnv, reflectionVecOS);
    // Scale env map by fresnel and add to specular contribution
    float fresnel = SiComputeFresnelApprox (vNormal, i.viewVecTS);
    float3 specularEnv = cEnv * fresnel * gloss; // float3, so the env map keeps its color
    specular *= gloss;
    // Compute final color
    float4 o;
    o.rgb = (vBaseColor * diffuse) + (specular + specularEnv);
    o.a = 0.0;
    return o;
}
EndPixelShader
EndPass
EndShader
Summary
• Why a shader library is needed
• Goals of designing a shader library
• Case study: ATI’s demo shader format
• Modifications for HLSL (it wasn’t that painful)
Questions?
[email protected]