CLR: Under the Hood
Brad Abrams
Program Manager
http://blogs.msdn.com/brada
Health Warning
• This talk dives deep!
• Examines internal data structures
• They will change!
• Internals knowledge not needed to write .NET apps
Quick Refresher: The Unmanaged World
[Diagram: two unmanaged toolchains]
• C++ source → Compiler → .obj → Linker → .exe → Loader → Hardware Platform
• VB6 source → Compiler → PCode/x86 .exe → VBRun → Hardware Platform
Quick Refresher: The CLR
[Diagram: compilation and execution]
• Compilation: Source Code → Language Compiler → Assembly (Code (IL) + Metadata)
• Execution: Assembly → JIT Compiler → Native Code, at installation or the first time each method is called
Agenda
1. Compilation: csc.exe /debug helloAtlanta.cs
2. Packaging and Metadata: helloAtlanta.exe
3. Loading and Layout (mscorlib.dll)
4. Execution (CLR)
5. Runtime Services
Q&A
1. Compilation
HelloAtlanta.cs
using System;
class Output {
    private int year;
    public Output(int year) {
        this.year = year;
    }
    public void SayHello() {
        Console.WriteLine("Hello, it’s the year " + year);
    }
}
class HelloAtlanta {
    static void Main(string[] args) {
        new Output(2004).SayHello();
        new Output(2005).SayHello();
    }
}
2. Packaging: Assemblies
[Diagram: layout of helloAtlanta.exe]
• PE header
• Manifest: assembly, files used, type exports, resources
• Type metadata
• Type CIL
• Resources
• Strings/Blobs
2. Metadata
IL – Intermediate Language
• IL – The language for execution
• Independent of CPU and platform
  – Created by Microsoft, external commercial and academic language/compiler writers
• Stack based virtual machine:
1+2+3-4
IL_0001: ldc.i4.1
IL_0002: ldc.i4.2
IL_0003: add
IL_0004: ldc.i4.3
IL_0005: add
IL_0006: ldc.i4.4
IL_0007: sub
Evaluation Stack: (empty before the first instruction)
Evaluation stack after each instruction (top of stack listed first):
Offset    Instruction   Evaluation Stack
IL_0001   ldc.i4.1      0000 0001
IL_0002   ldc.i4.2      0000 0002, 0000 0001
IL_0003   add           0000 0003
IL_0004   ldc.i4.3      0000 0003, 0000 0003
IL_0005   add           0000 0006
IL_0006   ldc.i4.4      0000 0004, 0000 0006
IL_0007   sub           0000 0002
1+2+3–4=2
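The stack walk for 1+2+3-4 can be reproduced with a few lines of C#. This is only a toy sketch under stated assumptions: the `ToyStackMachine` class and its `Run` method are hypothetical names made up for illustration, the code handles just the three instruction kinds shown on the slide, and it interprets the instructions, whereas the CLR JIT-compiles IL to native code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy interpreter: it models only the instruction kinds used
// on the slide (ldc.i4.N, add, sub). It is not how the CLR executes IL.
static class ToyStackMachine
{
    // Runs the instructions in order and returns the value left on the stack.
    public static int Run(string[] instructions)
    {
        Stack<int> stack = new Stack<int>();
        foreach (string op in instructions)
        {
            if (op.StartsWith("ldc.i4."))
            {
                // ldc.i4.N pushes the small integer constant N.
                stack.Push(int.Parse(op.Substring("ldc.i4.".Length)));
            }
            else if (op == "add" || op == "sub")
            {
                // Binary operators pop the right operand first, then the left.
                int right = stack.Pop();
                int left = stack.Pop();
                stack.Push(op == "add" ? left + right : left - right);
            }
            else
            {
                throw new NotSupportedException("Unknown instruction: " + op);
            }
            Console.WriteLine("{0,-10} top of stack = {1}", op, stack.Peek());
        }
        return stack.Pop();
    }
}

static class Program
{
    static void Main()
    {
        string[] il = { "ldc.i4.1", "ldc.i4.2", "add", "ldc.i4.3", "add", "ldc.i4.4", "sub" };
        Console.WriteLine("Result: " + ToyStackMachine.Run(il)); // Result: 2
    }
}
```

The value on top of the stack after each step matches the frames above: 1, 2, 3, 3, 6, 4, 2.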
2. Metadata
IL – Verification
• When processing IL, the runtime verifies the IL to make sure it’s “safe”
  – Every IL construct is called with the correct number of stack parameters
  – Every method is called with the correct type and number of parameters
• Helps prevent buffer overflows and underruns
• Helps prevent security holes
• Safety allows multiple managed applications in the same process
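To make the "correct number of stack parameters" rule concrete, here is a heavily simplified sketch. The `ToyVerifier` class is hypothetical and checks stack depth only, for the three instruction kinds from the 1+2+3-4 example. The real verifier also tracks the type of every stack slot, local and argument, which this sketch does not attempt.

```csharp
using System;

// Hypothetical, simplified check: stack depth only, three instruction kinds.
static class ToyVerifier
{
    // Returns true if no instruction would pop more values than are available.
    public static bool HasValidStackDepth(string[] instructions)
    {
        int depth = 0;
        foreach (string op in instructions)
        {
            if (op.StartsWith("ldc.i4."))
            {
                depth++;                      // pushes one value
            }
            else if (op == "add" || op == "sub")
            {
                if (depth < 2) return false;  // needs two operands on the stack
                depth--;                      // pops two values, pushes one
            }
            else
            {
                return false;                 // instruction not modeled here
            }
        }
        return true;
    }
}

static class Program
{
    static void Main()
    {
        string[] good = { "ldc.i4.1", "ldc.i4.2", "add" };
        string[] bad = { "ldc.i4.1", "add" };  // add with only one operand
        Console.WriteLine(ToyVerifier.HasValidStackDepth(good)); // True
        Console.WriteLine(ToyVerifier.HasValidStackDepth(bad));  // False
    }
}
```

Rejecting the second sequence before it runs is the point: the stack underflow is caught by inspection, not at execution time.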
3. Loading and Layout
Startup logic
• OS hands the image to a CLR shim (mscoree.dll)
• The shim starts the runtime
• Runtime performs the following:
  – Loads the assembly into memory and sets up the metadata (MD) reader
  – Resolves immediate dependencies
  – Slurps metadata, creates internal data structures
  – Starts execution at the assembly entry point
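Two of the facts the loader works from, the entry point and the immediate dependencies, are recorded in the assembly and can be read back through reflection. The sketch below only observes them from managed code and does not show the loader's internal data structures. The `LoadInfo` class is a hypothetical helper, while `Assembly.EntryPoint` and `Assembly.GetReferencedAssemblies` are standard reflection members.

```csharp
using System;
using System.Reflection;

// Hypothetical helper that reports what an assembly records about
// its entry point and the assemblies it references.
static class LoadInfo
{
    public static string Describe(Assembly assembly)
    {
        // EntryPoint is null for assemblies without a Main method (most DLLs).
        MethodInfo entry = assembly.EntryPoint;
        string text = assembly.GetName().Name + ", entry point: "
            + (entry == null ? "(none)" : entry.Name);
        foreach (AssemblyName dependency in assembly.GetReferencedAssemblies())
        {
            text += Environment.NewLine + "  references " + dependency.Name;
        }
        return text;
    }
}

static class Program
{
    static void Main()
    {
        // Output depends on the runtime and on how this file is compiled.
        Console.WriteLine(LoadInfo.Describe(Assembly.GetExecutingAssembly()));
    }
}
```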
4. Execution
Invocation
[Diagram: invoking Output.SayHello()]
• IL in Main: newobj instance void Output::.ctor(int32), then call instance void Output::SayHello()
• Objects in GC heap: the Output instance holds its MethodTable pointer (03790118) and the field value 2004
• Private execution engine memory: MethodTable (runtime info) at 03790118, with these method slots:
  78abed08 System.Object.ToString()
  78a7fe50 System.Object.Equals(System.Object)
  78a74de8 System.Object.GetHashCode()
  78abed58 System.Object.Finalize()
  03690ceb Output..ctor(Int32)
  03690cfb Output.SayHello()
• The call through slot [03690cfb] goes via Prestub Dispatch to the JIT Compiler, which emits the SayHello() native code: push edi, push esi, push ebx, …
4. Execution
Invocation – JIT output
Output.SayHello()
    push edi
    push esi
    mov  edi,ecx
    mov  esi,dword ptr ds:[01AF201Ch]
    mov  ecx,788F34E8h
    call FD34FF55 // JIT_DbgIsJustMyCode
    mov  edx,eax
    mov  eax,dword ptr [edi+4]
    mov  dword ptr [edx+4],eax
    mov  ecx,esi
    call 7530FA52
    mov  esi,eax
    cmp  dword ptr ds:[01AF1070h],0
    jne  0000000C
    mov  ecx,1
    call 752E8A85 // System.String.Concat
    mov  ecx,dword ptr ds:[01AF1070h]
    mov  edx,esi
    call dword ptr ds:[037B0010h] // System.Console.WriteLine
    pop  esi
    pop  edi
    ret
4. Execution
JIT Optimizations
• Register Allocation
– locals, temps, evaluation stack
• Loop unroll
• Dead code elimination
    #define SOMETHING 0
    if (SOMETHING > 10)
        a = x; // dead code statement
• Constant and Copy propagation
• Processor specific code generation
4. Execution
JIT Optimizations
• Range check elimination
    // Range check will be eliminated
    for (int i = 0; i < myArray.Length; i++)
    {
        Console.WriteLine(myArray[i].ToString());
    }
    // Range check will NOT be eliminated
    for (int i = 0; i < myArray.Length - y; i++)
    {
        Console.WriteLine(myArray[i + x].ToString());
    }
5. Runtime services
A look at the Garbage Collector
• Reference tracking (tracing) GC
• Large object heap – objects of 85,000 bytes or more
• Generational (three gen)
• Mark sweep and compact
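The generations and the large object heap can be observed from managed code with `GC.GetGeneration` and `GC.MaxGeneration`, both standard members of `System.GC`. The sketch below assumes the Microsoft CLR. The `GcDemo` class is a hypothetical helper, and the exact numbers printed can differ between runtimes and GC configurations.

```csharp
using System;

// Hypothetical helper that reports which generation the GC assigns to objects.
static class GcDemo
{
    public static int GenerationOfNewSmallObject()
    {
        return GC.GetGeneration(new object());
    }

    public static int GenerationOfNewLargeArray()
    {
        // 100,000 bytes is above the large-object threshold, so the array
        // is allocated on the large object heap, which the CLR reports as
        // the oldest generation.
        return GC.GetGeneration(new byte[100000]);
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine("Oldest generation:  " + GC.MaxGeneration);
        Console.WriteLine("New small object:   " + GcDemo.GenerationOfNewSmallObject());
        Console.WriteLine("New large array:    " + GcDemo.GenerationOfNewLargeArray());

        // An object that is still referenced survives a collection and is
        // normally promoted to an older generation.
        object survivor = new object();
        GC.Collect();
        Console.WriteLine("Survivor after GC:  " + GC.GetGeneration(survivor));
        GC.KeepAlive(survivor);
    }
}
```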
More Information
• My Blog: http://blogs.msdn.com/brada
• The SLAR: Common Language Infrastructure Annotated Standard
• Inside MS .NET IL Assembler
• Shared Source CLI Essentials
• Applied Microsoft .NET Framework Programming
• Compiling for the .NET CLR
Questions?
BACKUP
New runtime features for V2.0
• Generics
• 64 bit (Itanium and x86-64)
• ReflectionOnly context
• Delegate Relaxation
• Lightweight Code Generation
• NGen/NGen services
• Stub based dispatch
• MDbg
• Edit and Continue
• BCL Enhancements
Generational GC
Generation 1
Generation 0
New Heap Begins with New Generation
Accessible References Keep Objects Alive
Preserves / Compacts Referenced Objects
Objects Left Merge with Older Generation
New Allocations Rebuild New Generation
Generational GC
Generation 2
Generation 0
• Generations Dynamically Tuned
– CPU Cache Size
– Acceptable Fragmentation
• Older Generations are Larger / More Stable
– Require Collection Less Often
– Are More Expensive to Collect
– Can Have References to Newer Objects
What are Generics?
• Type checking, no boxing, no downcasts
• Increased sharing (typed collections)
• Instantiated at run-time, not compile-time
• Work for both reference and value types
• Code shared for reference types
• Exact run-time type information
What are Generics?
Generic Type Declaration
    public class List<T>
    {
        public void Add(T item) { … }
    }
Generic Method
    public static void Swap<T>(ref T ref1, ref T ref2){ … }
Type Parameter
    public class Dictionary<K,V> { }
Type Argument
    Dictionary<string,int> map = new Dictionary<string,int>();
Constraints
    public class List<T> where T : IComparable, new()
Open Constructed Type
    List<T>
Closed Constructed Type
    List<string>
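The terms on this slide can be tried out in a short program. This is a sketch for illustration, not the talk's own sample: `GenericsDemo` and `Larger` are hypothetical names, and the body of `Swap` is one obvious way to fill in the elided `{ … }`. `Type.ContainsGenericParameters` is a standard reflection property that distinguishes open from closed types. `typeof(List<>)` is the generic type definition, used here as a stand-in for the open type `List<T>`.

```csharp
using System;
using System.Collections.Generic;

static class GenericsDemo
{
    // Generic method: T is the type parameter.
    public static void Swap<T>(ref T ref1, ref T ref2)
    {
        T temp = ref1;
        ref1 = ref2;
        ref2 = temp;
    }

    // Constraints: T must implement IComparable and have a public
    // parameterless constructor. new() is listed only to mirror the
    // slide's syntax; the body does not construct a T.
    public static T Larger<T>(T a, T b) where T : IComparable, new()
    {
        return a.CompareTo(b) >= 0 ? a : b;
    }
}

static class Program
{
    static void Main()
    {
        int x = 1, y = 2;
        GenericsDemo.Swap<int>(ref x, ref y);            // int is the type argument
        Console.WriteLine(x + ", " + y);                 // 2, 1
        Console.WriteLine(GenericsDemo.Larger(3, 7));    // 7

        // Open type (still has an unbound T) versus closed constructed type.
        Console.WriteLine(typeof(List<>).ContainsGenericParameters);       // True
        Console.WriteLine(typeof(List<string>).ContainsGenericParameters); // False
    }
}
```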
C# Generics
public class List<T>
{
    private T[] elements;
    private int count;
    public void Add(T element) {
        if (count == elements.Length) Resize(count * 2);
        elements[count++] = element;
    }
    public T this[int index] {
        get { return elements[index]; }
        set { elements[index] = value; }
    }
    public int Count {
        get { return count; }
    }
}

List<int> intList = new List<int>();
intList.Add(1);        // No boxing
intList.Add(2);        // No boxing
intList.Add("Three");  // Compile-time error
int i = intList[0];    // No cast required
Bits & bytes under the hood
• New metadata tables
Table                    Number   Columns
GenericParam             0x2a     Number, Flags, Owner (TypeOrMethodDef), Name (string heap)
MethodSpec               0x2b     Method (MethodDefOrRef), Instantiation (Blob heap)
GenericParamConstraint   0x2c     Owner, Constraint (TypeDefOrRef)
• Flags defines variance and special constraints
• GenericParamConstraint defines type constraints
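Reflection surfaces much of what these tables record, so the columns can be inspected without a metadata dumper. In the sketch below the `Pair<K, V>` type and the `GenericParamDump` helper are hypothetical, made up for illustration. `GenericParameterPosition`, `GenericParameterAttributes` and `GetGenericParameterConstraints` are standard members of `System.Type`, corresponding roughly to the Number and Flags columns and to the GenericParamConstraint rows.

```csharp
using System;
using System.Reflection;

// Hypothetical generic type used only for this illustration.
class Pair<K, V> where V : IComparable, new()
{
}

static class GenericParamDump
{
    // Formats one generic parameter as "position: name [flags] : constraints".
    public static string Describe(Type genericParameter)
    {
        string text = genericParameter.GenericParameterPosition + ": "
            + genericParameter.Name
            + " [" + genericParameter.GenericParameterAttributes + "]";
        foreach (Type constraint in genericParameter.GetGenericParameterConstraints())
        {
            text += " : " + constraint.Name;
        }
        return text;
    }
}

static class Program
{
    static void Main()
    {
        // typeof(Pair<,>) is the generic type definition, so its generic
        // arguments are the type parameters K and V themselves.
        foreach (Type parameter in typeof(Pair<,>).GetGenericArguments())
        {
            Console.WriteLine(GenericParamDump.Describe(parameter));
        }
    }
}
```

For `V`, the flags include `DefaultConstructorConstraint` (from `new()`), and `IComparable` appears as a type constraint.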
Implementation choices
• Code sharing
  – Modula-3 and ML use code sharing
  – Type identity can be a problem
  – Runtime type checking required
• Code specialization
  – C++ templates use specialization
  – Code bloat can be a problem
4. Execution
JIT Optimizations
• JIT Inlining (method example)
Class Fib
    Shared Sub NextFib(ByRef i As Integer, ByRef j As Integer)
        Dim k As Integer = i + j
        i = j : j = k
    End Sub
    Shared Sub Main()
        Dim fib1 As Integer = 1 : Dim fib2 As Integer = 1
        Dim n As Integer
        For n = 3 To 36
            NextFib(fib1, fib2)
        Next n
    End Sub
End Class
4. Execution
JIT No inline == SLOW
Prolog:
    push ebp
    mov  ebp,esp
    sub  esp,0Ch
    push esi
Zero vars:
    xor  eax,eax
    mov  [ebp-4],eax
    mov  [ebp-8],eax
    xor  esi,esi
    nop
Init vars:
    mov  dword ptr [ebp-4],1
    mov  dword ptr [ebp-8],1
    mov  esi,3
Call NextFib:
    lea  ecx,[ebp-4]
    lea  edx,[ebp-8]
    call [003E50D4h] // NextFib
    nop
    nop
n++:
    add  esi,1
    jno  00000009
Overflow Handler:
    xor  ecx,ecx
    call 764618ED
Next n:
    cmp  esi,24h
    jle  FFFFFFE3
Epilog:
    nop
    nop
    pop  esi
    mov  esp,ebp
    pop  ebp
    ret
4. Execution
NextFib() method
    push ebp
    mov  ebp,esp
    sub  esp,0Ch
    push edi
    push esi
    push ebx
    mov  edi,ecx
    mov  esi,edx
    xor  ebx,ebx
    nop
    mov  eax,[edi]
    add  eax,[esi]
    jno  00000009
    xor  ecx,ecx
    call 764618B7
    mov  ebx,eax
    mov  eax,[esi]
    mov  [edi],eax
    mov  [esi],ebx
    nop
    pop  ebx
    pop  esi
    pop  edi
    mov  esp,ebp
    pop  ebp
    ret
4. Execution
JIT Inlining == FAST!
Prolog:
    sub  esp,8
    push edi
    push esi
Zero vars:
    xor  eax,eax
    mov  [esp+8],eax
    mov  [esp+0C],eax
Init vars:
    mov  [esp+8],1
    mov  [esp+0C],1
    mov  edi,3
Calc next:
    lea  esi,[esp+8]
    lea  ecx,[esp+0C]
    mov  eax,[esi]
    mov  edx,[ecx]
    add  eax,edx
    jo   16
    mov  [esi],edx
    mov  [ecx],eax
Next n:
    add  edi,1
    jo   0D
    cmp  edi,24h
    jle  FFFFFFE4
Epilog:
    pop  esi
    pop  edi
    add  esp,8
    ret
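To compare the two code shapes on your own machine, the VB sample can be rendered in C# with one copy of the method marked `MethodImplOptions.NoInlining`, a standard attribute option that asks the JIT to always emit a real call. This is a sketch under assumptions: the C# translation and the `Run` helper are not from the talk, and whether the unannotated method is actually inlined depends on the runtime version and build settings.

```csharp
using System;
using System.Runtime.CompilerServices;

static class Fib
{
    // Same logic as the VB NextFib. Left unannotated, so the JIT may inline it.
    public static void NextFib(ref int i, ref int j)
    {
        int k = checked(i + j);   // VB integer arithmetic is overflow-checked by default
        i = j;
        j = k;
    }

    // Identical body, but NoInlining forces a real call at every call site.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void NextFibNoInline(ref int i, ref int j)
    {
        int k = checked(i + j);
        i = j;
        j = k;
    }

    // Mirrors Main from the VB sample: For n = 3 To 36.
    public static int Run(bool allowInlining)
    {
        int fib1 = 1, fib2 = 1;
        for (int n = 3; n <= 36; n++)
        {
            if (allowInlining) NextFib(ref fib1, ref fib2);
            else NextFibNoInline(ref fib1, ref fib2);
        }
        return fib2;
    }
}

static class Program
{
    static void Main()
    {
        // Both variants compute the same value; only the generated code differs.
        Console.WriteLine(Fib.Run(true));   // 14930352
        Console.WriteLine(Fib.Run(false));  // 14930352
    }
}
```

Viewing the disassembly of `Run` in a debugger with optimizations enabled is one way to check which shape the JIT produced.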