.NET Frameworks week 5 December 4, 2003

.NET Frameworks
week 5
December 4, 2003
Agenda – December 4, 2003
Other .NET Languages  Class Exercise
How the .NET CLR provides Multiple Language Support
PE File Format  DUMPBIN  Class Exercise
bootstrapping the CLR .NET
How JIT compilation works
Packaging code in .NET
Assemblies
using the ILDASM  Class Exercise
Modules and assemblies
Assembly Identification
Strong names
The Global Assembly Cache  Class Exercise
Homework
Course Schedule
Week 5 - Thursday, 12/4/2003
Topics
– Bootstrapping the Common Language Runtime (CLR)
– Modules and Assemblies
– .NET Packaging
– Public and Private Keys
– Shared Assemblies
– The Global Assembly Cache (GAC)
– Strong Names
– .NET Frameworks Language Overview
– Frameworks Class Library
Course Schedule
Week 6 - Monday, 12/8/2003
Topics
– ASP.NET
– Programming .NET Web Form Applications
– Introduction to Web Services
– Web Security
Week 7 - Thursday, 12/11/2003
Topics
– Web Services Class Project
Week 8 - Monday, 12/15/2003
Topics
– Web Services Class Project (continued)
Homework Questions ?
• Tab Controls
• Setting Focus to TextBoxes
Course Goals
Course Goals #1
Provide an overview of the .NET Architecture and
its Major Components
– Programming Languages
– ADO.NET
– ASP.NET
– Web Services
– XML Integration
Course Goals #2
Understand the .NET Frameworks as an Object
Oriented, Strongly Typed, Computing Environment
Course Goals #3
Work with various .NET Framework components
– Common Language Runtime (CLR)
– .NET Framework Class Library (FCL)
– Assemblies
– Strong Names
– The Global Assembly Cache (GAC)
– ADO.NET
– ASP.NET
– Web Services
Course Goals #4
Develop a basic fluency programming in C#
Where Are We After 4 Weeks?
• Learned some C#
• Worked with Windows Forms and ADO.NET
• Spent some time on Classes, Types and
Objects
• Now we're going to look into the .NET
portion of .NET, Strong Names and Shared
Assemblies Tonight
• On to Web Forms and then Web Services
Tonight’s Class Goals
• Other .NET Languages
• How the .NET CLR provides Multiple Language Support
• PE File Format  DUMPBIN
• bootstrapping the CLR .NET
• How JIT compilation works
• Packaging code in .NET
• Assemblies
• using the ILDASM
• Modules and assemblies
• Assembly Identification
• Strong names
• The Global Assembly Cache
Other .NET Languages
Other languages
• The .NET Framework supports a host of language
environments
– Visual Basic.NET
– Visual C++.NET
– JScript.NET
– Fujitsu COBOL.NET
– Python, Perl, Eiffel….
• Let’s do a quick comparison of C# with VB.NET
Language Comparison
Feature: Declaring variables
– C#
    int x;
    int x = 10;
– VB.NET
    Dim x As Integer
    Public x As Integer = 10
– JScript
    var x : int;
    var x : int = 10;
    var x = 10;
Feature: Comments
– C#
    // comment
    /* multiline
       comment */
– VB.NET
    ' comment
    x = 1 ' comment
    Rem comment
– JScript
    // comment
    /* multiline
       comment */
Feature: Assignment statements
– C#
    nVal = 7;
– VB.NET
    nVal = 7
– JScript
    nVal = 7;
Feature: Conditional statements
– C#
    if (nCnt <= nMax) {
        nTotal += nCnt;
        nCnt++;
    }
    else {
        nTotal += nCnt;
        nCnt--;
    }
– VB.NET
    If nCnt <= nMax Then
        nTotal += nCnt
        nCnt += 1
    Else
        nTotal += nCnt
        nCnt -= 1
    End If
– JScript
    if (nCnt <= nMax) {
        nTotal += nCnt;
        nCnt++;
    }
    else {
        nTotal += nCnt;
        nCnt--;
    }
Language Comparison (continued)
Feature: For loop
– C#
    for (int i = 1; i <= 10; i++)
        Console.WriteLine("The number is {0}", i);
– VB.NET
    For n = 1 To 10
        MsgBox("The number is " & n)
    Next
– JScript
    for (var n = 0; n < 10; n++) {
        print("The number is " + n) }
Feature: While loop
– C#
    while (n < 100)
        n++;
– VB.NET
    While n < 100
        n += 1
    End While
– JScript
    while (n < 100)
        n++;
Feature: Select statement
– C#
    switch(n)
    {
        case 0:
            Console.WriteLine("Zero");
            break;
        case 1:
            Console.WriteLine("One");
            break;
        case 2:
            Console.WriteLine("Two");
            break;
        default:
            Console.WriteLine("?");
            break;
    }
– VB.NET
    Select Case n
        Case 0
            MsgBox ("Zero")
        Case 1
            MsgBox ("One")
        Case 2
            MsgBox ("Two")
        Case Else
            MsgBox ("Default")
    End Select
– JScript
    switch(int(n)) {
        case 0 :
            print("Zero")
            break
        case 1 :
            print("One")
            break
        case 2 :
            print("Two")
            break
        default : print("Default")
    }
Language Comparison (continued)
Keyword                  C#           VB.NET       JScript
Classes                  class        Class        class
Current object           this         Me           this
Namespaces               namespace    Namespace    package
Accessing a namespace    using        Imports      import
Visual Basic
Class Exercise
or
How about a visual basic hello world
visual studio program (i.e. a form, some
text, a few buttons, change the colors,
display a messagebox, be creative...)?
Class Exercise
• Take 30 minutes
• Use Visual Studio (except choose Visual
Basic.NET as the language)
• See how it feels!
• Be prepared to discuss what you liked and
what you didn't like
How does the .NET CLR (Common
Language Runtime) support multiple
compliant languages ?
• CTS – Common Type System
• CLS – Common Language Specification
• FCL – .NET Framework Class Library
CTS
Common Type System
Common Type System (CTS)
• The CLR is heavily dependent on Type Information
• Because of this, Microsoft has created a formal
specification, the CTS.
– It defines how types are defined
– It defines how types behave
• Every .NET language must abide by this spec.
– This ensures that common types have the same
conceptual semantics across various .NET languages.
Common Type System (CTS)
• The CTS states that all types must ultimately
inherit from a predefined base type, System.Object
• The CTS states that a type can contain zero or
more members
• Members can be:
– Field : a data variable that is part of an object’s state.
– Method : a function that performs an operation on an
object
– Property : to the caller, this looks like a field, but to the
implementer, it looks like a method (or two)
– Event : a notification method between objects.
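The member kinds listed above can be seen side by side in a small C# type. This is only an illustrative sketch: `Thermostat` and its members are hypothetical names, not part of the course material.

```csharp
using System;

// Hypothetical type used only to illustrate the four CTS member kinds.
public class Thermostat
{
    // Field: a data variable that is part of the object's state.
    private double _target;

    // Event: a notification that other objects can subscribe to.
    public event EventHandler TargetChanged;

    // Property: looks like a field to the caller, but is implemented
    // as a pair of methods (get and set).
    public double Target
    {
        get { return _target; }
        set
        {
            _target = value;
            if (TargetChanged != null) TargetChanged(this, EventArgs.Empty);
        }
    }

    // Method: a function that performs an operation on the object.
    public bool IsAbove(double reading) { return reading > _target; }
}

public static class CtsDemo
{
    public static void Main()
    {
        Thermostat t = new Thermostat();
        int notifications = 0;
        t.TargetChanged += (sender, e) => notifications++;
        t.Target = 21.5;

        Console.WriteLine(t.IsAbove(25.0));
        Console.WriteLine(notifications);
        // Every type ultimately derives from System.Object.
        Console.WriteLine(typeof(Thermostat).BaseType == typeof(object));
        // int -> System.ValueType -> System.Object
        Console.WriteLine(typeof(int).BaseType.BaseType == typeof(object));
    }
}
```

The last two lines use reflection to check the "everything inherits from System.Object" rule for both a class and a value type.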
Common Type System (CTS)
• The CTS also specifies type visibility rules and rules for
access to members of a type.
• The following options are allowed for methods/fields:
– Private : the method is accessible only by other
members of the class
– Protected : the method is accessible by derived types
as well
– Assembly : the method is callable by any code in the
same executable unit.
– Protected and assembly / Protected or assembly
– Public : the method is accessible by any code in any executable unit
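As a rough sketch of how these CTS access levels surface in C#, the hypothetical class below declares one method per level and uses reflection to report the level the runtime recorded. Note that `private protected` (CTS "protected and assembly") only became expressible in C# 7.2, well after this course; the class and method names are assumptions for illustration.

```csharp
using System;
using System.Reflection;

// Hypothetical class with one method per CTS accessibility level.
public class Account
{
    private void A() { }             // CTS Private
    protected void B() { }           // CTS Family (protected)
    internal void C() { }            // CTS Assembly
    protected internal void D() { }  // CTS Family or Assembly
    private protected void E() { }   // CTS Family and Assembly (C# 7.2+)
    public void F() { }              // CTS Public
}

public static class AccessDemo
{
    // Maps the reflection flags back to the CTS terminology.
    public static string Describe(MethodBase m)
    {
        if (m.IsPrivate) return "private";
        if (m.IsFamily) return "family";
        if (m.IsAssembly) return "assembly";
        if (m.IsFamilyOrAssembly) return "family or assembly";
        if (m.IsFamilyAndAssembly) return "family and assembly";
        return m.IsPublic ? "public" : "unknown";
    }

    public static void Main()
    {
        BindingFlags all = BindingFlags.Instance | BindingFlags.Public |
                           BindingFlags.NonPublic | BindingFlags.DeclaredOnly;
        foreach (MethodInfo m in typeof(Account).GetMethods(all))
            Console.WriteLine(m.Name + ": " + Describe(m));
    }
}
```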
CLS
Common Language Specification
CLS
Common Language Specification
• Language integration is a great goal, but is very hard to
do.
• Languages are intrinsically different, especially if they
have a legacy behind them (e.g. VB, C++)
• The CLS is the minimum set of features that compilers
must support if they are to target the CLR
• As a programmer, you can build in checks for CLS
compliance if you expect your type to be used from
other languages.
CLS (contd.)
CLS in context:
CLR/CTS
VB
C#
CLS
C++
CLS Compliance
• Examples of things that would not be CLS compliant:
– Unsigned integers
– Case sensitive overloads of names
• The compliance rules apply only to parts of the type that are visible
outside the defining executable unit.
• You can catch non-compliance at compile time by using the
CLSCompliant(true) attribute
– E.g.
• [assembly:CLSCompliant(true)] sets it on for the whole
executable unit.
• Can turn it off on individual methods with
[CLSCompliant(false)]
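A minimal sketch of the attribute usage described above, assuming a hypothetical `Counter` class. The assembly-level attribute turns checking on; the unsigned property is explicitly opted out so the compiler does not warn about it, and a CLS-compliant alternative is offered alongside.

```csharp
using System;

// Turn CLS compliance checking on for the whole executable unit.
[assembly: CLSCompliant(true)]

// Hypothetical type for illustration.
public class Counter
{
    private uint _count;  // private members are not checked

    // A publicly visible unsigned integer is not CLS-compliant,
    // so this member is marked as such.
    [CLSCompliant(false)]
    public uint RawCount { get { return _count; } }

    // CLS-compliant alternative for callers in other languages.
    public long Count { get { return _count; } }

    public void Increment() { _count++; }
}

public static class ClsDemo
{
    public static void Main()
    {
        Counter c = new Counter();
        c.Increment();
        Console.WriteLine(c.Count);
    }
}
```

Removing the `[CLSCompliant(false)]` line should make the C# compiler report a CLS compliance warning for `RawCount`.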
FCL
.NET Framework Class Library
.NET Framework Class Library
• The .NET Framework includes classes, interfaces,
and value types that expedite and optimize the
development process and provide access to
system functionality.
• To facilitate interoperability between languages,
the .NET Framework types are CLS-compliant
and can therefore be used from any programming
language whose compiler conforms to the
common language specification (CLS).
.NET Framework Class Library
The .NET Framework types are the foundation on which
.NET applications, components, and controls are built. The
.NET Framework includes types that perform the following
functions:
– Represent base data types and exceptions.
– Encapsulate data structures.
– Perform I/O.
– Access information about loaded types.
– Invoke .NET Framework security checks.
– Provide data access, rich client-side GUI, and server-controlled, client-side GUI.
.NET Framework Class Library
• The .NET Framework provides a rich set of
interfaces, as well as abstract and concrete (non-abstract) classes.
• You can use the concrete classes as is or, in many
cases, derive your own classes from them.
• To use the functionality of an interface, you can
either create a class that implements the interface
or derive a class from one of the .NET Framework
classes that implements the interface.
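Both options can be sketched briefly. `Celsius` implements the framework's `IComparable` interface directly, while `CountingList` derives from the concrete `ArrayList` class and inherits its interface implementations. Both type names are hypothetical.

```csharp
using System;
using System.Collections;

// Option 1: create a class that implements the interface.
public class Celsius : IComparable
{
    public readonly double Degrees;
    public Celsius(double degrees) { Degrees = degrees; }

    public int CompareTo(object obj)
    {
        Celsius other = obj as Celsius;
        if (other == null) throw new ArgumentException("Expected a Celsius value.");
        return Degrees.CompareTo(other.Degrees);
    }
}

// Option 2: derive from a concrete framework class (ArrayList already
// implements IList, ICollection and IEnumerable).
public class CountingList : ArrayList
{
    public int Adds;
    public override int Add(object value) { Adds++; return base.Add(value); }
}

public static class FclDemo
{
    public static void Main()
    {
        CountingList list = new CountingList();
        list.Add(new Celsius(30));
        list.Add(new Celsius(10));
        list.Sort();  // uses Celsius.CompareTo through IComparable
        Console.WriteLine(((Celsius)list[0]).Degrees);
        Console.WriteLine(list.Adds);
    }
}
```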
What Happens When You
Compile?
What happens when you compile
• The respective language compiler processes
your code into intermediate language (IL)
code.
C# Code -> C# Compiler -> IL and related data
VB.NET Code -> VB Compiler -> IL and related data
COBOL Code -> COBOL Compiler -> IL and related data
.NET Standardization
• Microsoft’s version of IL is known as MSIL (Microsoft
intermediate language)
• A word on standards
– Right from day one, the intent has been to standardize the .NET
framework and the C# language
– To this end, Microsoft has submitted the .NET Framework and
C# to ECMA (until 1994 known as the European
Computer Manufacturers Association).
• http://msdn.microsoft.com/net/ecma/default.asp
• CLI (Common Language Infrastructure) includes the CLR,
FCL and IL
• IL is referred to as CIL.
– A hyperlinked version of the C# spec can be found at:
• http://www.jaggersoft.com/csharp_standard/index.htm
What’s in an executable, anyway?
• Three types of executables:
– Windows executable
– Console executable
– Dynamic Link Library (dll)
• Executables are in PE File Format
• Dumpbin is a traditional tool for looking inside an exe
(or a dll).
• ILDASM is a .NET tool for looking inside a .NET
executable
Portable Executable File Format
A bit about the PE File Format
• Microsoft introduced the Portable Executable File format, more commonly known
as the PE format, as part of the original Win32 specifications.
• The term "Portable Executable" was chosen because the intent was
to have a common file format for all flavors of Windows, on all
supported CPUs. To a large extent, this goal has been achieved
with the same format used on Windows NT and descendants,
Windows 95 and descendants, and Windows CE.
• PE files are derived from the earlier Common Object File Format
(COFF) found on VAX/VMS. This makes sense since much of the
original Windows NT team came from Digital Equipment
Corporation. It was natural for these developers to use existing
code to quickly bootstrap the new Windows NT platform.
A bit about the PE File Format
• OBJ files emitted by Microsoft compilers use the COFF format.
You can get an idea of how old the COFF format is by looking at
some of its fields, which use octal encoding! COFF OBJ files have
many data structures and enumerations in common with PE files,
and I'll mention some of them as I go along.
• The addition of 64-bit Windows required just a few modifications
to the PE format. This new format is called PE32+. No new fields
were added, and only one field in the PE format was deleted. The
remaining changes are simply the widening of certain fields from
32 bits to 64 bits. In most of these cases, you can write code that
simply works with both 32 and 64-bit PE files. The Windows
header files have the magic pixie dust to make the differences
invisible to most C++-based code.
A bit about the PE File Format
• The distinction between EXE and DLL files is entirely one of
semantics. They both use the exact same PE format. The only
difference is a single bit that indicates if the file should be treated
as an EXE or as a DLL. Even the DLL file extension is artificial.
You can have DLLs with entirely different extensions—for
instance .OCX controls and Control Panel applets (.CPL files) are
DLLs.
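The "single bit" can be read directly from the file headers. The sketch below assumes the standard PE layout (the offset of the PE signature is stored at 0x3C in the DOS header, and the COFF Characteristics field sits 22 bytes past the signature) and a little-endian host; `PeInspector` is a hypothetical helper, not a framework class.

```csharp
using System;
using System.IO;

public static class PeInspector
{
    private const int LfanewOffset = 0x3C;       // DOS header field: offset of "PE\0\0"
    private const ushort ImageFileDll = 0x2000;  // IMAGE_FILE_DLL bit in Characteristics

    // Returns false if the bytes do not look like a PE file;
    // otherwise sets isDll from the Characteristics flag.
    public static bool TryReadKind(byte[] image, out bool isDll)
    {
        isDll = false;
        if (image == null || image.Length < 0x40) return false;
        if (image[0] != (byte)'M' || image[1] != (byte)'Z') return false;

        int pe = BitConverter.ToInt32(image, LfanewOffset);
        if (pe < 0 || pe > image.Length - 24) return false;
        if (image[pe] != (byte)'P' || image[pe + 1] != (byte)'E' ||
            image[pe + 2] != 0 || image[pe + 3] != 0) return false;

        // Signature (4 bytes) + 18 bytes of COFF header precede Characteristics.
        ushort characteristics = BitConverter.ToUInt16(image, pe + 22);
        isDll = (characteristics & ImageFileDll) != 0;
        return true;
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1) { Console.WriteLine("usage: PeInspector <file>"); return; }
        bool isDll;
        if (TryReadKind(File.ReadAllBytes(args[0]), out isDll))
            Console.WriteLine(isDll ? "PE file marked as DLL" : "PE file marked as EXE");
        else
            Console.WriteLine("Not a PE file");
    }
}
```

Running it against an .OCX or .CPL file is a quick way to confirm that those are DLLs regardless of extension.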
The Portable Executable (PE) file
format.
• Since the days of Windows NT 3.5, MS has used the PE
file format as the layout format of their executable files.
DOS MZ Header
PE Header
COFF Header
Executable Code (.text, .idata)
The PE file format (continued)
• The advantage of the PE format is that it
maps out almost identically in memory at
runtime.
• The .text section has machine instructions
• The .idata section (Import data) contains
references to other dlls.
DUMPBIN
DUMPBIN Reference
• The Microsoft COFF Binary File Dumper
(DUMPBIN.EXE) displays information about 32-bit
Common Object File Format (COFF) binary files.
• You can use DUMPBIN to examine COFF object files,
standard libraries of COFF objects, executable files, and
dynamic-link libraries (DLLs).
• Note DUMPBIN runs only from the command line.
C:\Program Files\Microsoft Visual Studio .NET\Vc7\bin\dumpbin.exe
DUMPBIN command line
• To run DUMPBIN, use the following syntax:
DUMPBIN [options] files...
• Specify one or more binary files, along with any options
required to control the information. DUMPBIN displays
the information to standard output. You can either
redirect it to a file or use the /OUT option to specify a file
name for the output.
• When you run DUMPBIN on a file without specifying an
option, DUMPBIN displays the /SUMMARY output.
• When you type the command dumpbin without any other
command-line input, DUMPBIN displays a usage
statement that summarizes its options.
Class Exercise
1. Create a directory C:\Demo
2. Download Demo.zip to C:\Demo
3. Unzip all files to C:\ directory (this will put
the files into the C:\Demo directory tree)
4. Open a console window
5. Set Default to C:\Demo (cd c:\demo)
6. Type:
dumpbin /headers helloworld.exe | more
Managed .NET executables also use the PE file format.
The structure of a .NET PE file
• We will keep referring to this as we delve deeper into
assemblies.
• The MZ header ensures that this code does not crash on a
DOS box.
DOS MZ Header
PE/COFF Header
CLR Header
IL
Metadata
Delving further into a PE file
MZ Header
PE/COFF Header
CLR Header
IL
Metadata
– Definition Tables
– Reference Tables
– Manifest Tables
bootstrapping the CLR .NET
bootstrapping the CLR .NET
• Every managed exe/dll has one 6-byte piece of x86
code in its .text section
– JMP _CorExeMain or
– JMP _CorDllMain
• When the exe runs, the first instruction it encounters is
this JMP instruction
• The loader looks for _CorExeMain in the .idata section,
and finds that it is in MSCorEE.dll
• MSCorEE.dll gets loaded into the process’s address
space, the loader gets the address of _CorExeMain and
fixes the JMP instruction to reference the function.
Starting up the .NET CLR
• The process’s main thread jumps to _CorExeMain and
starts executing
• _CorExeMain initializes the CLR, and looks at the PE
file’s CLR header to determine the managed entry point to
execute.
• The IL code for the entry point is Just-In-Time (JIT)
compiled to native instructions, and the CLR jumps to the
native code.
• At this point, the managed application’s code is running.
• Library files (dlls) follow a similar process with
_CorDllMain.
How JIT compilation works
How JIT compilation works
public class App
{
public static void Main()
{
System.Console.WriteLine("Hello, ");
System.Console.WriteLine("C# World");
}
}
Just before the Main method executes, the CLR detects all
types that are referenced by Main’s code. (This is possible
because of all the metadata that is built into the type .)
How JIT compilation works
• This causes the CLR to allocate an internal data
structure that is used to manage access to the
referenced type.
• This internal data structure has an entry for each
method defined by the type.
• Before the first call, each method entry points to an
undocumented stub function.(Richter calls this
JITCompiler)
• When Console.WriteLine is invoked for the first
time, it hits the JITCompiler function
How JIT compilation works
• Inside JITCompiler,
– Inside the dll that implements Console, lookup the method
WriteLine in the metadata.
– From the metadata, get the IL (Microsoft intermediate
language) for this method
– Allocate a block of memory to hold the native code.
– Verify and compile the IL, put the resulting native code into
the allocated memory
– Modify the internal data structure to point to the allocated
block of memory
– Jump into it and start executing
• The next time WriteLine is invoked, code executes immediately.
• Code is JIT-compiled again each time the application is run – the
native code is not cached between executions.
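The stub-and-patch idea can be modeled in a few lines. This is a toy model only, not how the CLR is implemented: `ToyMethodTable` is a hypothetical class in which each slot starts out pointing at a stub that "compiles" the method on first call and then overwrites the slot.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a per-type method table with compile-on-first-call stubs.
public class ToyMethodTable
{
    private readonly Dictionary<string, Func<int, int>> _slots =
        new Dictionary<string, Func<int, int>>();

    public int CompileCount;  // how many times the "JIT compiler" ran

    // 'compile' stands in for "look up the IL, verify it, compile it".
    public void Define(string name, Func<Func<int, int>> compile)
    {
        _slots[name] = arg =>
        {
            CompileCount++;
            Func<int, int> native = compile();
            _slots[name] = native;  // patch the table entry
            return native(arg);     // jump into the compiled code
        };
    }

    public int Call(string name, int arg) { return _slots[name](arg); }
}

public static class JitDemo
{
    public static void Main()
    {
        ToyMethodTable table = new ToyMethodTable();
        table.Define("Double", () => x => x * 2);

        Console.WriteLine(table.Call("Double", 21));  // first call goes through the stub
        Console.WriteLine(table.Call("Double", 4));   // later calls run the patched entry
        Console.WriteLine(table.CompileCount);
    }
}
```

Because the table lives only in memory, a fresh process starts with stubs again, which mirrors the lack of caching between runs.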
How JIT compilation is made
more efficient
• A JIT compiler can compile to the exact hardware
you are running on, not the lowest common
denominator (x86)
• A JIT compiler can optimize for multiple CPUs.
• Since JIT happens at runtime, the CLR could use
branch prediction to compile sections of IL while
the application runs
• If you are still doubtful, you could use ngen.exe
(that ships with the Framework SDK) to
precompile code – generally not a good idea
IL and verification
• The IL (Microsoft intermediate language) execution
engine is stack based
– Simple to implement
• Since IL is abstract and carries complete metadata, the
CLR can perform code verification
– No memory is read from without having been
written to.
– Every method is called with the right number/types
of parameters.
– Array/string boundaries are not overwritten.
– Return values are used properly.
Reverse engineering IL code
• Since all .NET executables/dlls are rich in metadata,
and contain fairly high level IL, they can be reverse
engineered
– Using ILDASM to look at IL code is one way
– Anakrino – a piece of freeware, does a remarkable
job (http://www.saurik.com/net/exemplar/)
• If you are concerned about intellectual property, you
can
– Use an IL obfuscator
– Write unmanaged code and interop with a wrapper.
Packaging code in .NET
Packaging code in .NET
• The .NET Framework has some clear-cut goals in terms of
deployment/packaging
– Reduce, or better, eliminate ‘DLL Hell’
• Differing versions of dlls with the same name
overwriting one another, causing application
instability.
– Reduce installation complexity
• Having dozens of locations to copy files to, and
dozens of registry entries to create makes installation
complex, and uninstallation brittle.
– The packaging of code should facilitate security – i.e. let
the system decide what a piece of code can do.
Assemblies
Assemblies
• An assembly is the basic unit of deployment in .NET.
• Every .exe or .dll you build is an assembly.
• Every assembly follows the PE file format.
• Unlike traditional exe or dll files, an assembly can
consist of more than one file.
– These files are called modules
Definition Tables
• ModuleDef – information that identifies the module
• TypeDef – one entry per type defined in the module
– Each entry has the type’s name, base type, flags
and pointers to entries in other tables for methods,
fields, properties and events
• MethodDef – one entry per method defined in the
module
– Each entry has the method’s name, flags, signature
and offset
• FieldDef – One entry per field defined in the module
– Each entry has field name, flags and type
Definition Tables (continued)
• ParamDef – one entry per parameter
defined in the module
– Each entry has name, flags
• PropertyDef - one entry per property
defined in the module
– Each entry includes name, flags and type
• EventDef - one entry per event defined in
the module
– Each entry includes name and flags
Reference Tables
• AssemblyRef – Has one entry for each assembly
referenced by the module
– Contains complete assembly identification info.
• ModuleRef – One entry per PE module that implements
types referenced by this module
– Contains module’s filename and extension
• TypeRef – One entry for each type referenced by the
module.
– Contains type name and module reference
– If the type is implemented in the same module, the
module reference indicates a ModuleDef entry, else
indicates a ModuleRef entry
Reference Tables (continued)
• MemberRef – One entry per member (field,
method, property or event) referenced by
the module.
– Each entry has the member’s name and
signature and points to the TypeRef entry to the
type that defines the member.
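Reflection exposes much of what these tables hold. The sketch below walks the currently executing assembly and prints rough analogues of the ModuleDef, TypeDef, MethodDef and AssemblyRef entries; the labels in the output are only a loose mapping onto the table names, and `MetadataDump` is a hypothetical class name.

```csharp
using System;
using System.Reflection;

public static class MetadataDump
{
    private const BindingFlags Declared =
        BindingFlags.Instance | BindingFlags.Static |
        BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    // Number of methods the type itself defines (inherited ones excluded).
    public static int CountDeclaredMethods(Type t)
    {
        return t.GetMethods(Declared).Length;
    }

    public static void Main()
    {
        Assembly asm = Assembly.GetExecutingAssembly();
        Console.WriteLine("ModuleDef: " + asm.ManifestModule.Name);

        foreach (Type t in asm.GetTypes())
        {
            Console.WriteLine("TypeDef: " + t.FullName);
            foreach (MethodInfo m in t.GetMethods(Declared))
                Console.WriteLine("  MethodDef: " + m.Name);
        }

        foreach (AssemblyName r in asm.GetReferencedAssemblies())
            Console.WriteLine("AssemblyRef: " + r.FullName);
    }
}
```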
MSIL Disassembler
(Ildasm.exe)
MSIL Disassembler (Ildasm.exe)
• The MSIL Disassembler (Ildasm.exe) is included with the .NET
Framework SDK.
• The Ildasm.exe parses any .NET Framework .exe or .dll assembly,
and shows the information in human-readable format.
• Ildasm.exe shows more than just the Microsoft intermediate
language (MSIL) code — it also displays namespaces and types,
including their interfaces.
• You can use Ildasm.exe to examine native .NET Framework
assemblies, such as Mscorlib.dll, as well as .NET Framework
assemblies provided by others or created yourself.
• Most .NET Framework developers will find Ildasm.exe
indispensable.
"C:\Program Files\Microsoft Visual Studio .NET\
FrameworkSDK\Bin\ildasm.exe"
ILDASM
IL Assembly
Code
Let’s analyze the IL code
• First, the class
.class public auto ansi beforefieldinit App
extends [mscorlib]System.Object
{
} // end of class App
• .class - this defines a class
• public – the class is accessible from the outside
• auto – the CLR performs autolayout of this class at
runtime
• ansi – use ANSI string buffers between
managed/unmanaged boundaries
Analyzing the IL (continued)
• beforefieldinit - Initialize the class any time
before first static field access
• extends [mscorlib]System.Object – this
class derives from the root System.Object
class
Analyzing the IL (contd.)
• The Main method
.method private hidebysig static void Main() cil managed
{
  .entrypoint
  // Code size 21 (0x15)
  .maxstack 1
  IL_0000: ldstr "Hello, "
  IL_0005: call void [mscorlib]System.Console::WriteLine(string)
  IL_000a: ldstr "C# World"
  IL_000f: call void [mscorlib]System.Console::WriteLine(string)
  IL_0014: ret
} // end of method App::Main
Analyzing the IL (continued)
• .method – this defines a method
• private – this method is accessible only from within its own class
• hidebysig – this method hides any method with the same
signature higher up in the class hierarchy
• static – this method can execute without an instance of the
class
• cil managed – this is code that defers to the CLR for
runtime decisions
• .entrypoint – this method is the entry point for the
program.
• .maxstack – the maximum number of stack slots required
is 1 (IL is a stack based language)
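To see the same instruction sequence from the producing side, the sketch below emits the five IL instructions at runtime with `System.Reflection.Emit.DynamicMethod` (an API added in .NET 2.0, so not available to the 2003 class) and then invokes the result. `EmitDemo` and `BuildHello` are hypothetical names.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class EmitDemo
{
    // Builds a delegate whose body is the IL shown for App::Main.
    public static Action BuildHello()
    {
        DynamicMethod dm = new DynamicMethod("Hello", typeof(void), Type.EmptyTypes);
        ILGenerator il = dm.GetILGenerator();
        MethodInfo writeLine =
            typeof(Console).GetMethod("WriteLine", new Type[] { typeof(string) });

        il.Emit(OpCodes.Ldstr, "Hello, ");   // push a string reference
        il.Emit(OpCodes.Call, writeLine);    // pops it and calls WriteLine
        il.Emit(OpCodes.Ldstr, "C# World");
        il.Emit(OpCodes.Call, writeLine);
        il.Emit(OpCodes.Ret);

        return (Action)dm.CreateDelegate(typeof(Action));
    }

    public static void Main()
    {
        BuildHello()();
    }
}
```

At no point is more than one value on the evaluation stack, which matches the `.maxstack 1` directive.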
Class Exercise
Modules and Assemblies
Modules and Assemblies
• Individual module files are not much use unless they
are part of an assembly.
• Assemblies have the following characteristics:
– An assembly is the smallest unit of deployment
– An assembly defines the reusable types
– An assembly has a version number
– An assembly can have security information
associated with it.
• In order for a PE file to be an assembly, it needs to
have a manifest metadata table.
Manifest Tables
• AssemblyDef – contains a single entry identifying the
assembly
– Name, Version, Culture, Flags, Hash Algorithm and
publisher’s public key.
• FileDef – contains one entry for each file contained in
the assembly
– File name, extension, hash value, flags
• ManifestResourceDef – one entry per resource that is
part of the assembly.
• ExportedTypesDef – one entry for each public type
exported by the assembly
• Can see the manifest using ILDASM
Why are multimodule assemblies
useful ?
• You can move commonly used types into one module,
and rarely used types into another module and build
them into an assembly.
• If the assembly is downloaded from the Internet, the
modules are downloaded only if needed.
• Important: Creating multi-module assemblies has to
be done by hand. VS.NET does not give you a way to
do this.
Assembly Identification
• All assemblies are identified via a 4 part naming
scheme:
1. Assembly Name
2. Assembly Culture
3. Assembly Version
4. Assembly Publisher Information
• This information gets built into the metadata for the
assembly
• Additional information like trademark, product
name, copyright etc. can also be associated with the
assembly version resource field.
Assembly Name
• The assembly’s name is the same as the filename,
without the path and extension.
• The compiler automatically does this.
• You can see the assembly name in the manifest
using ILDASM
.assembly hello
• You can change the assembly name to be different
from the filename, but it is not recommended
Assembly Culture
• Assemblies include culture or locale information as part of
their identification
• Cultures are identified as a string “<pt>-<st>” where
<pt> is the primary tag and <st> is the secondary tag.
– E.g. “en-US” for US English, “de-CH” for Swiss
German
• Typically, you would not assign a culture when you build,
because the code is not culture specific. In this case, the
assembly is culture neutral.
• If you do need to build culture-specific info, put those
resources into culture specific satellite assemblies. Satellite
assemblies should not contain code, and should not be
linked to the main assembly – use Reflection instead.
Assembly Version
• This is the version that is stored in the
AssemblyDef manifest metadata table.(Remember
that ?). Referred to as AssemblyVersion
– Four part version number:
• major.minor.build.revision
• E.g. 1.0.3300.0
• There are two other versions
– AssemblyFileVersion – stored in the Win32
version resource. Ignored by the CLR
– AssemblyInformationalVersion –
exists to indicate the version of the product that
includes the assembly. Ignored by the CLR
Version rules
• You have to at least specify major
• If you specify major and minor, you can specify an
asterisk (*) for build. This will cause build to be equal to
the number of days since January 1, 2000 local time, and
revision to be equal to the number of seconds since
midnight local time, divided by 2.
• If you specify major, minor, and build, you can specify
an asterisk for revision. This will cause revision to be
equal to the number of seconds since midnight local time,
divided by 2.
• Examples
– 1.0.34.56
– 1.2.*
– 1.0.25.*
– 1 (same as 1.0.0.0)
– 1.4 (same as 1.4.0.0)
– Empty (defaults to 0.0.0.0)
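The asterisk defaults can be reproduced with simple date arithmetic. The sketch below assumes the compiler's documented behavior (build is days since January 1, 2000; revision is seconds since local midnight, halved so it fits the 16-bit range); `VersionDefaults` is a hypothetical helper, not a framework API.

```csharp
using System;

public static class VersionDefaults
{
    private static readonly DateTime Epoch = new DateTime(2000, 1, 1);

    // Days elapsed since January 1, 2000 (local time).
    public static int Build(DateTime localNow)
    {
        return (localNow.Date - Epoch).Days;
    }

    // Seconds since local midnight, divided by 2.
    public static int Revision(DateTime localNow)
    {
        return (int)(localNow.TimeOfDay.TotalSeconds / 2);
    }

    // What "major.minor.*" would expand to at the given build time.
    public static Version Expand(int major, int minor, DateTime localNow)
    {
        return new Version(major, minor, Build(localNow), Revision(localNow));
    }

    public static void Main()
    {
        // A build made at 6:30 PM on the night of this class.
        DateTime classNight = new DateTime(2003, 12, 4, 18, 30, 0);
        Console.WriteLine(Expand(1, 0, classNight));  // prints 1.0.1433.33300
    }
}
```

Because both values change with every build, two builds of the same source get different assembly versions.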
How to specify an assembly’s
identity and version info
• Use custom attributes available in the
System.Reflection namespace.
• Set the attributes at the assembly level
• Examples
using System.Reflection;
…
[assembly:AssemblyCulture("en-US")]
[assembly:AssemblyVersion("1.2.3.4")]
[assembly:AssemblyCompany("Joe's Software Shack")]
VS.NET and AssemblyInfo.cs
• VS.NET automatically creates a file to hold
all the assembly information
– AssemblyInfo.cs
• You can change the default values in here to
suit your needs
• One insidious default value:
[assembly: AssemblyVersion("1.0.*")]
– What does this do ?
– If you care about the build numbers and/or revisions,
change this.
Simple Application Deployment
• Assemblies don’t dictate a special means of
packaging.
• Easiest way to deploy is to copy all the files to the
target machine. (xcopy deployment)
• Since the assembly has all the dependent assembly
information in its metadata, the application will
just run, and look for other user defined
assemblies in the same folder.
• No registry entries to create/clean up.
• Assemblies deployed in this manner are called
privately deployed assemblies.
Deploying in subdirectories
• What if you wanted to deploy dependent
assemblies in their own subdirectories ?
• .NET allows you to do this via configuration files.
• Configuration files in .NET are XML files
• Application config files for executable
applications (EXEs) are always named
<appname>.exe.config and are placed in the
application’s base directory.
Example application config file
<?xml version="1.0" encoding = "utf-8" ?>
<configuration>
<runtime>
<assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
<probing privatePath="bar" />
</assemblyBinding>
</runtime>
</configuration>
App.exe.config
• The previous example says that if the dependent
assembly is not found in the base directory, look in
a subdirectory called “bar”
• Good idea to cut and paste these files from
existing config files, and edit them to fit your
situation.
• privatePath can contain multiple semicolon-delimited paths.
• If you are doing this with VStudio, name the file
App.config and add it to your project. The IDE
takes care of generating the <application name>.exe.config
file in the appropriate bin directory
Probing and privatePath
• Probing is a fancy word for searching
• If your application is looking for an assembly
called MyAsm.dll, the probe paths are as follows:
– <AppBaseDir>\MyAsm.dll
– <AppBaseDir>\MyAsm\MyAsm.dll
– <AppBaseDir>\privatePath1\MyAsm.dll
– <AppBaseDir>\privatePath1\MyAsm\MyAsm.dll
– …
• For satellite assemblies , the probing starts at
<AppBaseDir>\<culture>\
– <AppBaseDir>\en-US\MyAsm.dll
– <AppBaseDir>\en-US\MyAsm\MyAsm.dll
– …
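The probe order listed above can be sketched as a function that generates candidate paths. This is a simplified model under stated assumptions: it only produces .dll candidates in the order shown on the slide, ignores the GAC and any codebase hints, and `ProbeSketch` is a hypothetical helper rather than anything the runtime exposes.

```csharp
using System;
using System.Collections.Generic;

public static class ProbeSketch
{
    // appBase: application base directory; privatePath: semicolon-delimited
    // subdirectories from the config file (may be null); culture: e.g. "en-US"
    // for satellite assemblies, or null for culture-neutral assemblies.
    public static List<string> Candidates(
        string appBase, string asmName, string privatePath, string culture)
    {
        List<string> dirs = new List<string>();
        dirs.Add("");  // the base directory itself is probed first
        if (!string.IsNullOrEmpty(privatePath))
        {
            foreach (string p in privatePath.Split(';'))
                if (p.Trim().Length > 0) dirs.Add(p.Trim() + "\\");
        }

        string culturePart = string.IsNullOrEmpty(culture) ? "" : culture + "\\";
        List<string> result = new List<string>();
        foreach (string dir in dirs)
        {
            string root = appBase + "\\" + dir + culturePart;
            result.Add(root + asmName + ".dll");
            result.Add(root + asmName + "\\" + asmName + ".dll");
        }
        return result;
    }

    public static void Main()
    {
        foreach (string path in Candidates(@"C:\App", "MyAsm", "bar;plugins", null))
            Console.WriteLine(path);
    }
}
```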
Probing
• privatePath can only specify subdirectories.
Siblings and parents/ancestors cannot be used.
• Application specific config settings can always be
overridden by the administrator in the machine
wide Machine.config file located at
– C:\Windows\Microsoft.NET\Framework\version\CONFIG
• Config files can be created by a GUI tool – the
Microsoft.NET Framework Configuration Tool (in
Control Panel->Administrative Tools)
– Under Application, select Add an Application To Configure
– Easier to just cut and paste and edit by hand
Shared Assemblies
Shared Assemblies
• With private deployment, every application
gets its own copy of dependent assemblies.
• We also need a way for multiple
applications to share assemblies.
– E.g. the assemblies that ship with the .NET
Framework that implement the FCL
– Exception classes that you may implement that
are shared by various applications on a machine
• Assemblies that are shared by multiple
applications are called shared assemblies.
The problems with sharing
• Shared libraries are not new, but have had their
host of problems, mostly related to versioning
• Version 1 of a piece of code is overwritten by
Version 2, where code has been modified.
– Now any application built with Version 1 of the code is
no longer guaranteed to work.
– Mfc42.dll is a prime example
• COM tried fixing this by immutable interfaces and
registry entries, but this was easily abused.
• Led to DLL Hell, and the common (and largely
accurate) notion that a Windows NT installation
degrades over time.
The crux of the problem
• How do you fix bugs and add features to a
file and also guarantee that dependent
applications do not break ?
• Can you run various versions of the shared
files side-by-side ?
• Can un-installing a shared file be made
clean ?
The .NET solution
• .NET supports two kinds of assemblies
– Strongly named assemblies
– Weakly named assemblies (Richter’s term)
• Both types of assemblies are structurally identical
– PE files as described earlier
• Strongly named assemblies have some more
information in the metadata and CLR header to
help in identification.
• Both strongly and weakly named assemblies can
be privately deployed.
• Only strongly named assemblies support global
(or shared) deployment.
A brief digression…
• Most crypto-systems fall into one of 2 categories
– Symmetric key encryption
– Asymmetric key encryption
• Symmetric key encryption uses the same key to
encrypt and decrypt
– Symmetric keys are easy to implement.
– Safety depends on key length
– Key exchange can be compromised (Remember, both
sender and receiver need to know the same key).
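A short sketch of symmetric encryption using the framework's `Aes` class (an assumption of this example; the slides do not name an algorithm). The same key and initialization vector are used on both sides, which is exactly why they must be exchanged safely.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SymmetricDemo
{
    public static byte[] Encrypt(byte[] plain, byte[] key, byte[] iv)
    {
        using (Aes aes = Aes.Create())
        using (ICryptoTransform enc = aes.CreateEncryptor(key, iv))
            return enc.TransformFinalBlock(plain, 0, plain.Length);
    }

    public static byte[] Decrypt(byte[] cipher, byte[] key, byte[] iv)
    {
        using (Aes aes = Aes.Create())
        using (ICryptoTransform dec = aes.CreateDecryptor(key, iv))
            return dec.TransformFinalBlock(cipher, 0, cipher.Length);
    }

    public static void Main()
    {
        byte[] key, iv;
        using (Aes aes = Aes.Create())
        {
            key = aes.Key;  // random key shared by sender and receiver
            iv = aes.IV;
        }

        byte[] cipher = Encrypt(Encoding.UTF8.GetBytes("meet at noon"), key, iv);
        Console.WriteLine(Encoding.UTF8.GetString(Decrypt(cipher, key, iv)));
    }
}
```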
Asymmetric Key Encryption
• The key is in 2 parts
– A public key which is well known
– A private key, which only one party knows.
• Message encryption
– If you (sender) want to send an encrypted message to a
receiver, you encrypt the message with the
receiver’s public key (which is well known).
– The receiver can decrypt the message with the private key
(which only the receiver knows).
– Messages intended for the receiver can only be read by the
receiver
• The key exchange problem goes away
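The encrypt-with-public, decrypt-with-private flow can be sketched with the framework's `RSA` class. RSA with OAEP padding is an assumption of this example (the slides do not name an algorithm), and `AsymmetricDemo` is a hypothetical class name. Only the public half of the receiver's key ever reaches the sender.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class AsymmetricDemo
{
    // The sender only needs the receiver's public key.
    public static byte[] EncryptFor(RSAParameters receiverPublicKey, byte[] message)
    {
        using (RSA rsa = RSA.Create())
        {
            rsa.ImportParameters(receiverPublicKey);
            return rsa.Encrypt(message, RSAEncryptionPadding.OaepSHA1);
        }
    }

    public static void Main()
    {
        using (RSA receiver = RSA.Create())
        {
            // false = export the public part only
            RSAParameters publicKey = receiver.ExportParameters(false);

            byte[] cipher = EncryptFor(publicKey, Encoding.UTF8.GetBytes("meet at noon"));

            // Only the holder of the private key can decrypt.
            byte[] plain = receiver.Decrypt(cipher, RSAEncryptionPadding.OaepSHA1);
            Console.WriteLine(Encoding.UTF8.GetString(plain));
        }
    }
}
```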
Digital Signing
• Asymmetric key encryption has an interesting
side-effect
– If the sender signs a message with his/her
private key (which only the sender knows), the
receiver can verify the signature with the
sender’s public key (which is well known).
– This serves not to protect the message, but to
confirm the sender’s identity.
– Known as a digital signature.
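Signing and verification can be sketched with the same `RSA` class. The choice of SHA-256 with PKCS#1 padding is an assumption of this example, and `SigningDemo` is a hypothetical name. The message itself stays readable; the signature only proves who produced it and that it was not altered.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SigningDemo
{
    // Requires the private key, so only the sender can do this.
    public static byte[] Sign(RSA senderKey, byte[] data)
    {
        return senderKey.SignData(data, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
    }

    // Requires only the sender's public key.
    public static bool Verify(RSAParameters senderPublicKey, byte[] data, byte[] signature)
    {
        using (RSA rsa = RSA.Create())
        {
            rsa.ImportParameters(senderPublicKey);
            return rsa.VerifyData(data, signature,
                                  HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
        }
    }

    public static void Main()
    {
        using (RSA sender = RSA.Create())
        {
            byte[] message = Encoding.UTF8.GetBytes("release build 42");
            byte[] signature = Sign(sender, message);
            RSAParameters publicKey = sender.ExportParameters(false);

            Console.WriteLine(Verify(publicKey, message, signature));

            message[0] ^= 1;  // tamper with the message
            Console.WriteLine(Verify(publicKey, message, signature));
        }
    }
}
```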
Strong names
Strong names
• Strong names are digital signatures that are added to an
assembly.
• A strongly named assembly has 4 attributes that uniquely
identify it.
1. Filename (without the extension)
2. Version (M.m.B.R)
3. Culture
4. Public key token
• We’ve already seen the first 3.
• With asymmetric key encryption, one of the disadvantages is
that the keys are fairly large. A public key token is a one-way
hash of the public key that is much smaller.
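As a sketch of how the token relates to the key: the token is the last 8 bytes of the SHA-1 hash of the public key, in reverse order. The helper names below are made up. Main compares the computed value with what the runtime reports for its own core library, assuming that library carries a public key.

```csharp
using System;
using System.Reflection;
using System.Security.Cryptography;

public static class PublicKeyTokenDemo
{
    // Last 8 bytes of SHA-1(publicKey), reversed.
    public static byte[] ComputeToken(byte[] publicKey)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(publicKey);
            byte[] token = new byte[8];
            for (int i = 0; i < 8; i++)
            {
                token[i] = hash[hash.Length - 1 - i];
            }
            return token;
        }
    }

    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }

    public static void Main()
    {
        AssemblyName core = typeof(object).Assembly.GetName();
        byte[] publicKey = core.GetPublicKey();
        if (publicKey == null || publicKey.Length == 0)
        {
            Console.WriteLine("This assembly carries no public key.");
            return;
        }

        // The two lines printed here are expected to match.
        Console.WriteLine(ToHex(ComputeToken(publicKey)));
        Console.WriteLine(ToHex(core.GetPublicKeyToken()));
    }
}
```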
Strong names (continued)
• Example
– “MyAsm, Version=1.2.3.4, Culture=neutral,
PublicKeyToken=b77a5c561934e089”
• Any company that wants to uniquely mark its
assemblies must get a public/private key pair.
• In practice, no two companies will have the same
public/private key pair (similar to GUIDs).
• A strongly named assembly is signed with the
publisher’s private key.
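The example display name can be taken apart with System.Reflection.AssemblyName. A small sketch; MyAsm is the made-up assembly from the slide and does not need to exist for the string to be parsed.

```csharp
using System;
using System.Reflection;

public static class DisplayNameDemo
{
    public static string TokenHex(AssemblyName name)
    {
        byte[] token = name.GetPublicKeyToken() ?? new byte[0];
        return BitConverter.ToString(token).Replace("-", "").ToLowerInvariant();
    }

    public static void Main()
    {
        // Parsing the string does not load any assembly.
        AssemblyName name = new AssemblyName(
            "MyAsm, Version=1.2.3.4, Culture=neutral, PublicKeyToken=b77a5c561934e089");

        Console.WriteLine(name.Name);      // the filename part, without extension
        Console.WriteLine(name.Version);   // Major.Minor.Build.Revision
        // A neutral culture is reported as an empty culture name.
        Console.WriteLine(string.IsNullOrEmpty(name.CultureName) ? "neutral" : name.CultureName);
        Console.WriteLine(TokenHex(name));
    }
}
```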
SN.exe
• The .NET Framework SDK provides the
Strong Name Utility (SN.exe) to create and
use key pairs.
• To generate a public/private key pair:
sn -k MyPublicPrivateKeys.keys
– Notice that this is a binary file.
• To extract the public key portion:
sn -p MyPublicPrivateKeys.keys MyPublicKey.Key
• To view the public key and public key token:
sn -tp MyPublicKey.Key
Some caveats
• sn does not validate the input file.
sn -tp junk.txt
The above will print out values that look like a
public key and token.
• Larger companies may use key containers rather
than SN to generate key pairs.
• The public key is 128 bytes.
• The public key token is 8 bytes.
Signing an assembly
• To sign an assembly, use the AssemblyKeyFile
attribute.
[assembly: AssemblyKeyFile("MyPublicPrivateKeys.keys")]
• Ildasm now shows you a public key in the
manifest assembly def.
• Any other assembly that is now built with a
reference to the signed assembly will have the
referenced assembly’s public key token in the
assembly ref.
What happens when you sign an assembly?
• When you build a strongly named assembly, the
assembly’s FileDef manifest metadata table includes a list
of all files that make up the assembly.
• As each file is added, the file’s contents are hashed and
this hash value is stored along with the file’s name in the
FileDef table.
• After the PE file containing the manifest is built, the entire
contents are hashed (with the SHA-1 algorithm).
• The hash value is digitally signed with the publisher’s
private key and stored in a reserved section of the PE file.
• The publisher’s public key is embedded into the
AssemblyDef manifest metadata table.
Signing an assembly
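[Figure: MyAsm.dll, a PE file containing IL, metadata, the manifest (with the public key) and the CLR header. The PE file is hashed, the hash value is signed with the private key, and the resulting RSA digital signature is stored in the PE file alongside the public key.]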
Some other facts on signing
• sn -Tp MyAsm.dll
– Can be used to display the public key and public key
token of a signed assembly
• sn -v MyAsm.dll
– Can be used to verify the strong name signature of an assembly
• Only the PE file containing the manifest has the
public key stored in it.
The Global Assembly Cache
• Once we have a shared assembly, we need
to put it in some well-known place where the
CLR can find it and load it.
• This well-known location is the Global
Assembly Cache (or the GAC).
• To look at the GAC through Windows
Explorer, navigate to
C:\Windows\Assembly
or its equivalent.
The GAC
• This view of the GAC is possible due to an
Explorer shell extension (ShFusion.dll) that
gets installed when you install the .NET
Framework.
• You can see the assembly name, version,
culture and public key token.
• You can right click and check out the
properties on the assembly, including
version information
The GAC
• To see the real structure of the GAC, open up a command
window and navigate to it.
– C:\Windows\assembly\GAC
• Note that under each assembly are folders with names that
include both the assembly version and the public key token
• The format of the directory name is
<Version>_<Culture>_<PublicKeyToken>
• The GAC can hold multiple versions of an assembly, and
by default applications bind only to the version they were
built with
• The public key token deals with the case where assemblies
published by different publishers have the same name.
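The directory naming scheme can be sketched as a small helper. GacFolderName is a hypothetical name and this is not how the CLR itself locates the folder. It only shows how the three parts of the format fit together, with a neutral culture leaving the middle part empty.

```csharp
using System;
using System.Reflection;

public static class GacFolderDemo
{
    // Builds <Version>_<Culture>_<PublicKeyToken> from an assembly identity.
    public static string GacFolderName(AssemblyName name)
    {
        string culture = string.IsNullOrEmpty(name.CultureName) ? "" : name.CultureName;
        byte[] tokenBytes = name.GetPublicKeyToken() ?? new byte[0];
        string token = BitConverter.ToString(tokenBytes).Replace("-", "").ToLowerInvariant();
        return name.Version + "_" + culture + "_" + token;
    }

    public static void Main()
    {
        AssemblyName name = new AssemblyName(
            "MyAsm, Version=1.2.3.4, Culture=neutral, PublicKeyToken=b77a5c561934e089");

        // The folder sits under C:\Windows\assembly\GAC\<assembly name>\
        Console.WriteLine(name.Name);
        Console.WriteLine(GacFolderName(name));
    }
}
```

Because version and token are both part of the folder name, two versions of MyAsm, or two different publishers' MyAsm, end up in separate folders.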
Installing assemblies into the GAC
• Once the assembly has a strong name, it can be
installed in the GAC in one of two ways:
– Drag and drop using the Windows shell extension.
– Using the Global Assembly Cache utility
(GACUTIL.exe) on the command line.
– Both techniques will flag you with an error if you try
to install an assembly that has not been strongly
named.
– If you use the Windows shell extension, uninstalling is
simply a matter of deleting from Windows Explorer.
GACUTIL
• Allows you to view and manipulate the contents of the
global assembly cache.
• The following command inserts the file mydll.dll into the
global assembly cache.
gacutil /i mydll.dll
• The following command removes the assembly hello from
the global assembly cache.
gacutil /u hello
– Note that the previous command might remove more than one
assembly from the assembly cache because the assembly name is
not fully specified. For example, if both version 1.0.0.0 and 3.2.2.1
of hello are installed in the cache, the command gacutil /u hello
removes both of the assemblies. Also, note that the file extension
(.dll) is not used
GACUTIL (contd.)
• Use the following example to avoid removing
more than one assembly. This command removes
only the hello assembly that matches the fully
specified version number, culture, and public key token.
gacutil /u hello,Version=1.0.0.1,Culture="de",PublicKeyToken=45e343aae32233ca
• The following command lists the contents of the
global assembly cache.
gacutil /l
Strong Named Assembly
Class Exercise
Assemblies that reference strongly named assemblies
• Whenever you build an assembly, the assembly will have a
reference to other strongly named assemblies. Why ?
– Because System.Object is defined in MSCorLib.dll
• When compiling references to other strongly named
assemblies, you should not have to specify a full path all
the way into the GAC in your /r: option.
• For this reason the .NET Framework installs two copies of
the Microsoft assemblies – one set in the GAC (to be
loaded at runtime) and one set in the CLR directory that
the compiler uses (to be used for compile time references.)
What does strong naming buy us ?
• When a strongly named assembly is installed in
the GAC, the following steps take place
– The system hashes the contents of the file containing
the manifest, and obtains a hash value (H1)
– The system extracts the publisher’s public key, and uses
it to decrypt the RSA digital signature embedded in the
PE file, obtaining the original hash value (H0)
– If H0 is not the same as H1, the file has been tampered
with.
What does strong naming buy us ?
• When a strongly named assembly is installed in
the GAC, the following steps take place
(continued)
– In addition, for a multi-module assembly, the system
hashes the contents of the assembly’s other files and
compares them with the hash values stored in the
manifest file’s FileDef table. If any hash value does not
match, the assembly has been tampered with.
– If the assembly has been tampered with, it will not
install into the GAC
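The per-file check for a multi-module assembly can be sketched as below. In-memory byte arrays stand in for the module files and a dictionary stands in for the FileDef table. All names are hypothetical, and this is an illustration of the comparison, not the CLR's own code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class FileHashCheckDemo
{
    public static byte[] Sha1(byte[] contents)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            return sha1.ComputeHash(contents);
        }
    }

    // Returns the names of files that are missing or whose current hash
    // differs from the hash recorded when the assembly was built.
    public static List<string> FindTampered(
        IDictionary<string, byte[]> recordedHashes,
        IDictionary<string, byte[]> currentContents)
    {
        List<string> tampered = new List<string>();
        foreach (KeyValuePair<string, byte[]> entry in recordedHashes)
        {
            byte[] contents;
            if (!currentContents.TryGetValue(entry.Key, out contents) ||
                !Sha1(contents).SequenceEqual(entry.Value))
            {
                tampered.Add(entry.Key);
            }
        }
        return tampered;
    }

    public static void Main()
    {
        Dictionary<string, byte[]> files = new Dictionary<string, byte[]>
        {
            { "a.netmodule", Encoding.UTF8.GetBytes("module A") },
            { "b.netmodule", Encoding.UTF8.GetBytes("module B") }
        };

        // Build time: record one hash per file, as the manifest does.
        Dictionary<string, byte[]> recorded = files.ToDictionary(f => f.Key, f => Sha1(f.Value));

        // Later: someone changes one module.
        files["b.netmodule"] = Encoding.UTF8.GetBytes("module B, patched");

        foreach (string name in FindTampered(recorded, files))
        {
            Console.WriteLine("hash mismatch: " + name);
        }
    }
}
```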
Binding to an assembly
• When an app needs to bind to an assembly, it uses the
following info in the AssemblyRef table
– Name, version, culture, public key token
to locate the assembly in the GAC.
• If it is found there, it is loaded. We know it has
not been tampered with, and is the same version of the
assembly that the app was built against.
• If it is not found in the GAC, the CLR looks in the app’s base
directory and probes, as described earlier.
• If it is still not found, a FileNotFoundException is thrown.
An important caveat
• Strong naming only guarantees that the file
was not tampered with after being signed.
• A malicious party could change the
assembly and re-sign it (with their own
private key.)
• The publisher’s identity cannot be known
from the key alone – you need to use
Authenticode to do this.
Internet Deployment
• Strongly named assemblies can be deployed at a
URL.
• The application configuration file for an
application using such an assembly can use a
CODEBASE directive to specify location, version,
culture, name and public key token of the
assembly.
• The assembly runs out of an Internet download
cache.
• The CLR compares the hash values every time the
assembly is loaded – a potential performance hit.
Using CODEBASE
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <codeBase version="0.0.0.0"
                  href="http://localhost/asslydep/math.dll" />
        <assemblyIdentity name="math"
                          publicKeyToken="05171503206bc67f"
                          culture="neutral" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
Binding to an Assembly
1. CLR applies all the configuration policies.
2. If the required assembly is already loaded,
use it.
3. Is there a match in the GAC ?
If so, load that file and use it
4. Is there a CODEBASE hint, and does it
match the reference ?
If so, load that file and use it
Binding to an Assembly (continued)
5. Can the file be found by probing ?
a. Look in the application’s base directory, and a
subdirectory with the same name as the
assembly.
b. Probe the directories specified by privatePath
c. Continue probing through culture-dependent
folders.
6. If it is still not found, throw a
System.IO.FileNotFoundException
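Step 5 can be sketched as a function that lists candidate paths in the order given above. This is a simplification for illustration: CandidatePaths is a hypothetical helper, it considers only .dll files, and it follows the slide's order, not necessarily the CLR's exact probing algorithm.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ProbingDemo
{
    public static List<string> CandidatePaths(
        string appBase, string assemblyName, IEnumerable<string> privatePaths, string culture)
    {
        List<string> paths = new List<string>();
        string file = assemblyName + ".dll";

        // a. Application base, then a subdirectory named after the assembly.
        paths.Add(Path.Combine(appBase, file));
        paths.Add(Path.Combine(appBase, assemblyName, file));

        // b. Each directory listed in privatePath.
        foreach (string dir in privatePaths)
        {
            paths.Add(Path.Combine(appBase, dir, file));
            paths.Add(Path.Combine(appBase, dir, assemblyName, file));
        }

        // c. Culture-dependent folders, if the reference names a culture.
        if (!string.IsNullOrEmpty(culture))
        {
            paths.Add(Path.Combine(appBase, culture, file));
            paths.Add(Path.Combine(appBase, culture, assemblyName, file));
        }
        return paths;
    }

    public static void Main()
    {
        // "bin" and "de" are example values for privatePath and culture.
        foreach (string path in CandidatePaths("app", "math", new[] { "bin" }, "de"))
        {
            Console.WriteLine(path);
        }
    }
}
```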
Some observations
• The GAC is the first place the CLR looks
– If you have the assembly both in the GAC and
in the application base directory, the one in the
GAC will be used
• When you use a codebase with a URL, the
assembly is run from a download directory.
– If you do not change the version, the cache will
need to be cleared to force a download.
To find where an assembly is running from
• You can reflect on the Type object to figure out
where the assembly that defines the type was
loaded from
typeof(<typename>).Module.FullyQualifiedName
• This returns a string with the full path.
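A short usage sketch of the expression above. LoadedFrom is a made-up wrapper, and the exact paths printed depend on the machine and runtime.

```csharp
using System;

public static class WhereLoadedDemo
{
    // Full path of the module that defines the given type.
    public static string LoadedFrom(Type type)
    {
        return type.Module.FullyQualifiedName;
    }

    public static void Main()
    {
        // A framework type: shows where the runtime's core library was loaded from.
        Console.WriteLine(LoadedFrom(typeof(string)));

        // A type from this program: shows where the application itself is running from.
        Console.WriteLine(LoadedFrom(typeof(WhereLoadedDemo)));
    }
}
```

Running this with an assembly installed both in the GAC and in the application directory is a quick way to confirm which copy was actually bound.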
Homework
Monday December 8, 2003
GuestBook Database with a
Visual Basic.NET Windows Form