Evolving C++ onto the CLI Environment
Integrating a Static and Dynamic Programming Model
Stanley B. Lippman
[email protected]
How Should We Think about
Programming Languages?

I used to think about them under the conceit of Aristotelian
and Copernican cosmology, with these forming two poles
of a continuum …

Aristotelian languages impose a top-down vision independent of
the actual machine technology of the day …
CLU, Scheme, Self

Copernican languages emerge from a bottom-up discovery of
the actual machine technology of the day …
FORTRAN, C/C++/Java

And you could then cluster the other languages – CLOS,
Smalltalk, Ada – along the continuum …
How Should We Expect to be Thought
of as Programmers?

Under this conceit, one would expect

the Copernican languages to be widely used, but
the Aristotelian languages to be widely admired.
the Copernican languages to come out of industry [labor], but
the Aristotelian languages to come from the academy [thought].
the Copernican programmer to be considered less intelligent and
therefore less worthy of respect, and
the Aristotelian programmer to be … well, you get the idea …
Admittedly, this only gets us so far … but I see nothing in
the history of programming to invalidate the conceit.
Let’s Step Back a Moment

One of the cool things about computer science is that it
is a very young field – therefore, we can hold it in our
mind’s eye – that is, if we constrain it to the digital era:

Forget Leibniz, although he was a genius and a visionary, and
deserved better from Newton and the Royal Society.
Forget Babbage, although he came very, very close, and
Forget Ada Lovelace, although she provides an attractive
alternative to the image of the programmer as a soulless nerd.
So, what was the language used by the ENIAC computer,
which I am going to start from?
There Was No Language
Oddly enough, the idea of an independent
program – let alone the idea of a programming
language – wasn’t part of the original invention of
the computer.
Rather, cables were plugged into one
configuration for that formulation or reconfigured
for this formulation, and so on.
This is where the term tight coupling originated … 
Programs Dwell in a
Computational Environment

The modern computing era began without the concept of
either a program or a programming language.

Of course, a transcription of a mathematical formula into
a format within the computer was necessary, but

it was not thought of as a symbolic program notation …
there was no concept of inventing a language …
there was no such entity thought of as a computer programmer …

The immediately intractable problems were hardware –

could the vacuum tubes persist long enough to maintain a
computation?
the math was done in base 10 …
Von Neumann Hijacked the EDVAC

The successor to ENIAC was the EDVAC.
Von Neumann arranged to be taken on as a consultant to the ENIAC
project…. He now became an avid supporter of the Moore School’s
work, legitimizing it in the eyes of the scientific establishment, and
he was helpful in its getting the EDVAC contract.
On June 30, 1945, a 101-page document arrived at the Moore
School from von Neumann….Von Neumann’s paper on EDVAC was
replete with references to neurons and other parts of the human
nervous system, comparing them to the automatic computer… The
machine he described had a stored, programmable memory.

Again, the ENIAC computer had no concept of a stored program, let
alone a programming language. It cost $486,804.22. We’re told it

solved problems in 15 seconds that would have required several weeks’
work by a trained man.
solved in 2 hours a problem which would have taken 100 trained men a
year to solve manually.
Programs and Program Languages
A Dialectic with the Computational Environment

The introduction of the program solved logistical
bottlenecks of the pre-existing computational
environment …

Trade-off of decoupling the processing from the program:

Faster, ‘automatic’ loading of the program, versus
the need to invent and implement a software abstraction layer …
in this case, a loader …
This decoupling has been accelerating …
The Evolution of Complex Structure …

Software did not begin as software – it was a hard-wired
configuration – a dance without a separate dance notation.

This evolved into a reproducible program bit map that could be
loaded and flushed from memory. A purely numeric representation.


The first abstraction level, in a sense, was the use of hexadecimal over
binary …
The assembler formed a nucleus of a symbolic representation of a
program but still at the level of individual instructions that could be
grouped by function. A mnemonic representation.

This was controversial and Grace Hopper reports that it was resisted by a
portion of the very small number of programmers who felt it was getting
too far from the machine.
At each stage, more software complexity is introduced between the
program representation and the machine.
The Invention of Programming Languages
Created a Tradition of Incivility …

FORTRAN, of course, was a proof of concept that we could program
at a higher level of abstraction and still generate efficient code …
▪ They eliminated aspects of the design if they proved too difficult to
compile … this in itself was proof against its purity of design …

FORTRAN also began the language wars …
▪ The disappointed ALGOL team described it as graffiti written on a
bathroom wall
▪ They attributed its success to the 800-lb gorilla that was IBM

These kinds of battles have never ceased – they’ve just
changed as the languages have. Why?
A Darwinian Way to Think about
Programming Languages

Programming languages are a response to a
particular computational environment. A language

facilitates expression within a current environment …
improves on one or a set of existing program solutions …
provides a vocabulary and shared point of view … a community
All Languages Become Extinct …
❑ As the computational environment changes, the more
specialized the language to the previous computational
environment, the less adaptive it proves in the new
environment …
❑ But the historical accumulation of structure seems to
overwhelm these efforts.
❑ The conditions that give rise to a language lead to its
eventual extinction …
There are many more extinct
than active programming languages
All Languages Compete for Scarce Resources …
❑ Although a language is not an organism, there is a
continual struggle for survival among its population …
▪ There is a competition for finite budgetary resources to feed new
projects and sustain existing ones …
▪ There is competition to reproduce in the minds of a new
generation of programmers.
Language wars are virtually bloody both in tooth and claw
All Languages Resist Extinction …
❑ Typically, the language leaders at some point cease
resisting and attempt to readapt the language to the
changing environment …
▪ this however may backfire … emphasizing its current
maladaptation to the new environment …
▪ The population of a language constricts when it fails to reproduce
in the minds of the new members of the community
▪ Dropping below a certain threshold, it no longer has the critical
mass to command finite budgetary resources
▪ It is relegated to unique niche environments – the deserts and
swamps of the software development landscape …
So, Where Is This Leading Us?
1. The Common Language Infrastructure (CLI) is a major
consolidation of thoughts about a software abstraction
layer between the program and the Operating System.

This is not in itself new – Smalltalk carried its own environment,
and Java targets its own virtual machine.
What is new is its inclusiveness: it supports over 30 languages ...
It rightly diminishes the focus on languages …
This is what we look at briefly in the next section.
So, Where Is This Leading Us?
2. C++/CLI is an adaptation of ISO-C++ to the dynamic
programming object model of the CLI. It follows a
tradition of C++ adaptations:

C with Classes (~1979) (ADT)
Object-Oriented Programming (~1984) (OO)
Generic Programming (~1991) (Templates)
Dynamic Programming (~2005) (CLI)

This is what we look at in the last section of this talk.
The CLI Changes
the Computational Environment
Therefore, it puts existing languages at risk, and provides an
opportunity for new languages to thrive.
A Note on Terminology

I will be speaking of both the CLI and CLR as kind of counterpoints
of one another. Here is their formal relationship …
1. Common Language Infrastructure (CLI)
This is an ECMA/ISO platform-independent standard. It
represents the abstract facilities/architecture.
2. Common Language Runtime (CLR)
This is the Windows Operating System implementation of the
CLI. This is what we mean when we speak of .NET …
The CLI/CLR Provides a VES

A Virtual Execution System (VES) provides an environment for
executing managed code

It provides a software layer between the managed code and the
native operating system.

It is responsible for loading and running programs …

It provides the services needed to execute managed code and data
…

Garbage collection, for example, is an aspect of the VES, not of a
particular language …
Common Language Runtime
Architectural Overview
[Diagram labels: Frameworks; Base Classes; Security; Execution Support; IL to native code compilers; GC, stack walk, code manager; Class loader and layout]
The CLI/CLR Provides a CIL

Each CLI language is compiled down into a Common Intermediate
Language (CIL) based on a stack program model.

All tools ideally work off of the CIL, and are therefore shared across
all languages … browsers, debuggers, and so on.
▪ New tools target the CIL, not a language …

Metadata is generated in parallel describing both the program and
its environment … this allows automation of much previously
manual ‘plumbing’ …

An extensive object-oriented Base Class Library (BCL) framework is
shared across all CLI languages …
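
The stack program model can be pictured with a toy evaluator. What follows is a minimal sketch in ISO C++, not actual CIL: the names `Op`, `Instr`, `run`, and `demo_stack_program` are invented here for illustration, and the three operations only loosely echo CIL's load-constant, add, and multiply instructions.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical mini instruction set (an assumption for illustration only).
enum class Op { LoadConst, Add, Mul };

struct Instr {
    Op op;
    std::int32_t operand;  // used only by LoadConst
};

// Evaluate a straight-line instruction sequence on an operand stack:
// operands are pushed, and each operator pops its inputs and pushes its result.
std::int32_t run(const std::vector<Instr>& code) {
    std::vector<std::int32_t> stack;
    auto pop = [&stack]() {
        if (stack.empty()) throw std::runtime_error("stack underflow");
        std::int32_t v = stack.back();
        stack.pop_back();
        return v;
    };
    for (const Instr& in : code) {
        switch (in.op) {
        case Op::LoadConst:
            stack.push_back(in.operand);
            break;
        case Op::Add: {
            std::int32_t rhs = pop();
            std::int32_t lhs = pop();
            stack.push_back(lhs + rhs);
            break;
        }
        case Op::Mul: {
            std::int32_t rhs = pop();
            std::int32_t lhs = pop();
            stack.push_back(lhs * rhs);
            break;
        }
        }
    }
    if (stack.size() != 1) throw std::runtime_error("ill-formed program");
    return stack.back();
}

// (2 + 3) * 4, written the way a stack machine sees it.
std::int32_t demo_stack_program() {
    return run({{Op::LoadConst, 2}, {Op::LoadConst, 3}, {Op::Add, 0},
                {Op::LoadConst, 4}, {Op::Mul, 0}});
}
```

Because every source language lowers to the same kind of instruction stream, a tool written against the stream (a debugger, a browser) does not need to know which language produced it.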
Metadata
The Lifeblood of the CLI
[Diagram labels: Metadata (and code); Source Code; Compiler; Other Compiler; Debugger; Profiler; Designers; Reflection; Type Browser; Schema Generator; Proxy Generator; Serialization (e.g. SOAP); XML encoding (WSDL)]
The CLI Provides a CTS
❑ The Common Language Infrastructure (CLI) defines a
Common Type System (CTS) over which all CLI languages
are built.
❑ A unified type system rooted in an Object base class.
▪ All types and literal values have an underlying class
representation
▪ All types are guaranteed to be a kind of Object and
share a common set of operations
▪ All types can be converted to an instance of type
Object.
▪ All types have an associated Type class that provides
runtime reflection support.
The CLI Provides a CTS
❑ The Common Language Infrastructure (CLI) defines a
Common Type System (CTS) over which all CLI languages
are built.
❑ Separation of class types based on
behavior/design characteristics:
▪ Reference class is polymorphic: supports OO
design
▪ Value class is blittable: supports small, efficient
independent types
▪ Interface class is abstract: supports defining
families of services
The CLI Provides a CTS
❑ A secondary set of types including numeric types,
a delegate type, an event type, an array type, an
enum type …
❑ A single class inheritance model with support for
multiple interface inheritance
Each CLI language generally exposes these to the
programmer as built-in language facilities … this
is a first-order design aspect of building a CLI
language.
The CLI Provides a CLS
❑ The Common Language Specification (CLS) defines
a set of restrictions on the Common Type System
(CTS) to ensure interoperability among CLI
languages.
❑ These rules apply to
▪ Types that are visible in assemblies other than those in
which they are defined.
▪ Members that are accessible outside the assembly.
❑ CLS-compliant code is guaranteed to be both
consumable and inheritable by all CLI languages.
The canonical example of a CLS constraint is to prohibit
unsigned integral values as part of the public interface …
Adapting C++ to this
New Environment
C++/CLI represents a tuple …

C++
The first term, C++, refers of course to the C++
programming language invented by Bjarne
Stroustrup at Bell Laboratories.
It supports a static object model that is optimized
for the speed and size of its executables.
It does not support run-time modification of the
program other than, of course, heap allocation.
It allows unlimited access to the underlying
machine, but very little access to the types active
in the running program and no real access to the
associated infrastructure of that program.
C++/CLI represents a tuple …

CLI
The third term, CLI, refers to the Common Language
Infrastructure, a multi-tiered architecture supporting a
dynamic component programming model.
In many ways, this represents a complete reversal of the
C++ object model.
A runtime software layer, the virtual execution system, runs
between the program and the underlying operating system.
Access to the underlying machine is fairly constrained.
Access to the types active in the executing program and the
associated program infrastructure – both as discovery and
construction – is supported.
C++/CLI represents a tuple …

/
The second term, slash (/), represents a binding
between C++ and the CLI.
So, a first approximation of an answer as to what
is C++/CLI is to say that it is a binding of the
static C++ object model with the dynamic
component object model of the CLI.
The design of this binding is the focus of the rest
of this talk.
The Design of C++/CLI
The Architectural Underpinning
The 3 Elements of a CLI Language

There are three aspects in the design of a CLI language that
hold across all languages.
1. A mapping of language-level syntax to the underlying Common
Type System.
2. A choice of the level of detail of the underlying CLI
infrastructure to expose to the direct manipulation of the programmer.
3. A choice of what additional functionality to provide over that
supported directly by the CLI.
Item #1 is largely the same across all CLI languages. Items
#2 and #3 are what distinguish one CLI language from
another.
I like to think of these three items as representing
coordinates positioning each language in a three-dimensional
design space supported by the CLI.
Mapping to the Common Type System

This design aspect is common to all CLI languages – the
syntax of course varies. So, for example,

public abstract class Shape {…} // C#
public ref class Shape abstract {…}; // C++/CLI

Shape s = new Cube(); // C#
Shape^ s = gcnew Cube; // C++/CLI

represent the C# and C++/CLI support for defining an
abstract CLI reference class and allocating a derived instance
on the CLI heap.

Our choice of syntax is based on an attempt to closely
integrate the CLI class types with that of ISO-C++.
The C++/CLI Types
ref class R abstract {};
value class V{};
interface class I{};
ref class R2 : R, I {};
enum class E : short { e1, e2 };
delegate void D( signature );
event D handler;
array< T, dim > a;
Level of Detail

The second design aspect reflects the level of
detail of the underlying CLR implementation
model to incorporate into the language.

How does one go about determining this?


What are the kinds of problems the language is likely to
be tasked to solve?

What are the kinds of programmers the language is likely
to attract and be used by?
Let’s look at an example: the issue of value types
occurring on the managed heap.
Value Types on the Managed Heap

Value types can find themselves on the managed
heap in a number of circumstances:

Implicit boxing, when
we assign an object of a value type to an Object, or
we invoke through a value type a virtual method that it does not
override.

When a value type serves as a member of a reference
class type.

When a value type is being stored as the element type of a
CLI array.

The design question a CLI language has to ask is,
should we allow the programmer to manipulate the address
of a value type of this sort?
What Are the Issues?

Any object located on the managed heap is
subject to relocation during the compaction phase
of a sweep of the garbage collector.

Any pointers to that object must be tracked and
updated by the runtime; the programmer has no
way to manually track it herself.

Therefore, if we were to allow the programmer to
take the address of a value type potentially
resident on the managed heap, we would need to
introduce a tracking form of pointer in addition to
the existing native pointer.
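
To make the relocation problem concrete, here is a minimal sketch in ISO C++ of a toy compacting heap. It is an illustration under stated assumptions, not how the CLR is implemented: `ToyHeap`, `Handle`, `compact`, and `demo_tracking` are hypothetical names, and the "tracking" is simulated with a table of positions that the heap rewrites whenever it moves an object.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Marks the position of an object that did not survive compaction.
constexpr std::size_t kDead = static_cast<std::size_t>(-1);

// Toy "managed heap": objects are ints; compact() slides live objects
// toward the front, so their positions (their "addresses") change.
class ToyHeap {
public:
    using Handle = std::size_t;  // index into the table of tracked positions

    Handle allocate(int value) {
        cells_.push_back(Cell{value, true});
        tracked_.push_back(cells_.size() - 1);
        return tracked_.size() - 1;
    }
    void release(Handle h) { cells_[tracked_[h]].live = false; }

    std::size_t position(Handle h) const { return tracked_[h]; }
    int& deref(Handle h) { return cells_[tracked_[h]].value; }

    // Move live cells to the front and update every tracked position.
    void compact() {
        std::vector<Cell> packed;
        std::vector<std::size_t> moved_to(cells_.size(), kDead);
        for (std::size_t i = 0; i < cells_.size(); ++i) {
            if (cells_[i].live) {
                moved_to[i] = packed.size();
                packed.push_back(cells_[i]);
            }
        }
        for (std::size_t& pos : tracked_) {
            if (pos != kDead) pos = moved_to[pos];
        }
        cells_ = std::move(packed);
    }

private:
    struct Cell { int value; bool live; };
    std::vector<Cell> cells_;
    std::vector<std::size_t> tracked_;
};

bool demo_tracking() {
    ToyHeap heap;
    ToyHeap::Handle a = heap.allocate(10);
    ToyHeap::Handle b = heap.allocate(20);
    std::size_t raw_b = heap.position(b);  // plays the untracked native pointer
    heap.release(a);
    heap.compact();                        // b's object is relocated
    return heap.deref(b) == 20             // the tracked reference still works
        && heap.position(b) != raw_b;      // the remembered raw position is stale
}
```

The remembered raw position plays the role of a native pointer: nothing updates it when the object moves, which is why a separate, runtime-visible form of pointer is needed.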
What Are the Trade-Offs?
Simplicity and Safety on the One Hand

Directly introducing support in the language for
one or a family of tracking pointers makes it a
more complicated language.


By not supporting this, we expand the available pool of
programmers to hire from by requiring less sophistication.
Allowing the programmer access to these
ephemeral value types increases the possibility of
programmer error – she may purposely or by
accident do bad things to the memory.

By not supporting this, we create a potentially safer
runtime environment.
What Are the Trade-Offs?
Efficiency and Flexibility on the Other Hand

Each time we assign the same Object with a value
type, a new boxing of the value occurs …


Allowing access to the boxed value type allows in-place
update, which may provide significant performance gains …
Without a form of tracking pointer, we cannot
iterate over a CLI array using pointer arithmetic.
This means that the CLI array cannot participate
in the STL iterator pattern and work with the
generic algorithms.

Allowing access to the boxed value type allows significant
design flexibility.
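
The iterator pattern referred to here is the ISO C++ convention that a pair of pointers into an array is already a valid iterator range. The sketch below shows that convention with a native array only; the helper names `sum_range`, `contains`, and `demo_pointer_iteration` are made up for this illustration, and nothing in it touches the CLI.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>

// A [first, last) pair of pointers is all the generic algorithms require.
int sum_range(const int* first, const int* last) {
    return std::accumulate(first, last, 0);
}

bool contains(const int* first, const int* last, int value) {
    return std::find(first, last, value) != last;
}

int demo_pointer_iteration() {
    int ia[] = {1, 1, 2, 3, 5, 8};
    const std::size_t count = sizeof(ia) / sizeof(ia[0]);
    const int* begin = &ia[0];         // address of the first element
    const int* end = begin + count;    // pointer arithmetic yields one-past-the-end
    assert(contains(begin, end, 5));
    return sum_range(begin, end);
}
```

For a CLI array to take part in the same pattern, the language has to offer something that behaves like `begin` and `end` here while remaining valid when the garbage collector moves the array.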
The Level of Detail
Reflects the Target Programmer

We chose to provide a collection of addressing
modes that handle value types on the managed
heap.

int ival = 1024;

int^ boxedi = ival;

array<int>^ ia = gcnew array<int>{1,1,2,3,5,8};

interior_ptr<int> begin = &ia[0];

value struct smallInt { int m_ival; … } si;

pin_ptr<int> ppi = &si.m_ival;
A Language Layer over the CLI


A third design aspect is a language-specific layer
of functionality over that directly supported by
the CLI.

This requires a mapping between the language-level
support and the underlying CLI …

Or it may be handled by tagging a type with a language-specific
attribute discoverable at run-time …

In some cases, it just isn’t possible …

value types are blitted …

virtual function resolution within ctors/dtors …
So this is a compromise between what we might
wish to do, and what we find ourselves able to do.
Three General Categories of Extension
1. A form of Resource Acquisition Is Initialization (RAII) for
reference types. In particular, to provide an automated
facility for deterministic finalization of garbage-collected
types that hold scarce resources.
2. A form of deep-copy semantics associated with the C++
copy constructor and copy assignment operator – this
could not be extended to value types.
3. Direct support of C++ templates for CLI types in addition to
the CLI generic support.
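
For readers less familiar with the second item, this is what deep-copy semantics look like in ISO C++. It is a minimal sketch with a hypothetical `Buffer` class; the point is only that a user-written copy constructor and copy assignment operator run on every copy, which is what a bitwise-copied (blitted) value type cannot offer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical class that owns heap storage and copies it deeply.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Copy constructor: allocate fresh storage and copy the elements.
    Buffer(const Buffer& rhs) : size_(rhs.size_), data_(new int[rhs.size_]) {
        std::copy(rhs.data_, rhs.data_ + rhs.size_, data_);
    }

    // Copy assignment: build the new storage first, then release the old.
    Buffer& operator=(const Buffer& rhs) {
        if (this != &rhs) {
            int* fresh = new int[rhs.size_];
            std::copy(rhs.data_, rhs.data_ + rhs.size_, fresh);
            delete[] data_;
            data_ = fresh;
            size_ = rhs.size_;
        }
        return *this;
    }

    ~Buffer() { delete[] data_; }

    int& operator[](std::size_t i) { return data_[i]; }

private:
    std::size_t size_;
    int* data_;
};

bool demo_deep_copy() {
    Buffer a(4);
    a[0] = 7;
    Buffer b = a;   // copy constructor: b gets its own storage
    b[0] = 99;      // does not disturb a
    Buffer c(1);
    c = a;          // copy assignment: c now holds its own copy of a
    return a[0] == 7 && b[0] == 99 && c[0] == 7;
}
```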
Non-Deterministic Finalization

Before the memory associated with an object is
reclaimed by the garbage collector, an associated
Finalize() method, if present, is invoked.

You can think of this method as a kind of super-destructor
since it is not tied to the program
lifetime of the object.

We refer to this as finalization. The timing of just
when or even whether a Finalize() method is
invoked is undefined.

This is what is meant when we say that garbage
collection exhibits non-deterministic finalization.
The Problem of Scarce Resources

Non-deterministic finalization works well with dynamic memory
management. When available memory gets sufficiently scarce, the
garbage collector kicks in and things pretty much just work.

Non-deterministic finalization does not work well, however, when
an object maintains a critical resource such as a database
connection, a lock of some sort, or perhaps native heap memory.

In this case, we would like to release the resource as soon as it is no
longer needed. The solution currently supported by the CLI is for a
class to free the resources in its implementation of the Dispose()
method of the IDisposable interface.

The problem here is that Dispose() requires an explicit invocation, and
therefore is liable not to be invoked.
Automating Disposal …

A fundamental design pattern in C++ is spoken of as
resource acquisition is initialization.




That is, a class acquires resources within its constructor.
Conversely, a class frees its resources within its destructor.
This is managed automatically within the lifetime of the class
object.
This is what we would like to do with reference types in
terms of the freeing of scarce resources:


Use the destructor to encapsulate the necessary code for the
freeing of any resources associated with the class.
Have the destructor invoked automatically tied with the lifetime of
the class object.
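
In ISO C++ the pattern looks like the following. This is a minimal sketch: `Connection` and the counter `g_open_connections` are hypothetical stand-ins for a real scarce resource such as a database connection or a lock.

```cpp
#include <cassert>

// Hypothetical scarce resource, modeled as a count of open connections.
int g_open_connections = 0;

class Connection {
public:
    Connection() { ++g_open_connections; }    // acquire in the constructor
    ~Connection() { --g_open_connections; }   // release in the destructor

    // Copying would mean two owners of one resource, so it is disabled.
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
};

bool demo_raii() {
    int during = 0;
    {
        Connection c;                   // resource acquired here
        during = g_open_connections;    // 1 while c is in scope
    }                                   // ~Connection() runs here, automatically
    return during == 1 && g_open_connections == 0;
}
```

No explicit release call appears anywhere in `demo_raii`; the scope does the work, which is exactly the property Dispose() lacks on its own.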
A Two-Step Solution … Step 1
1. Mapping the Destructor to Dispose()


The CLI has no notion of the class destructor for a reference type.
So the destructor has to be mapped into something else in the
underlying implementation.
Internally, then, the compiler does the following transformations:



the class has its base class list extended to inherit from the
IDisposable interface.
the destructor is transformed into the Dispose() method of
IDisposable.
That gets us halfway to our goal. We still need a way to
automate the invocation of the destructor.
A Two-Step Solution … Step 2
2. Mapping the object to a lifetime

A special stack-based notation for a reference type is supported;
that is, one in which its lifetime is associated within the scope of
its declaration.

Internally, the compiler transforms the notation to allocate the
reference object on the managed heap.

With the termination of the scope, the compiler inserts an
invocation of the Dispose() method – the user-defined destructor.

Reclamation of the actual memory associated with the object
remains under the control of the garbage collector.
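
The two steps can be approximated in ISO C++ to show the shape of the rewriting. This is a sketch under stated assumptions rather than the compiler's actual output: `IDisposable` here is a hand-written stand-in for the CLI interface, `Resource`, `g_log`, and `scope_lowered` are hypothetical names, `new`/`delete` stand in for allocation on the managed heap, and a catch-all handler emulates the cleanup path.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hand-written stand-in for the CLI's IDisposable interface (assumption).
struct IDisposable {
    virtual void Dispose() = 0;
    virtual ~IDisposable() = default;
};

std::vector<std::string> g_log;  // records the order of events

// Step 1: the code the user wrote in the destructor ends up in Dispose().
class Resource : public IDisposable {
public:
    Resource() { g_log.push_back("acquire"); }
    void Dispose() override { g_log.push_back("dispose"); }
    void foo() { g_log.push_back("foo"); }
};

// Step 2: a stack-style declaration followed by a call to foo() is
// rewritten, roughly, into heap allocation plus a guaranteed Dispose().
void scope_lowered() {
    Resource* r = new Resource;  // stands in for allocation on the managed heap
    try {
        r->foo();
    } catch (...) {
        r->Dispose();            // also dispose when leaving via an exception
        delete r;
        throw;
    }
    r->Dispose();                // inserted at the end of the scope
    delete r;                    // stand-in only: the GC reclaims memory later
}
```

Disposal is deterministic and tied to the scope, while the last line of each path is the part the garbage collector performs on its own schedule.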
An Example
ref class Wrapper {
    Native *pn;
public:
    Wrapper( int val ) { pn = new Native( val ); } // RAII: acquire in the constructor
    ~Wrapper() { delete pn; }   // destructor: mapped to Dispose()
    void foo();
protected:
    !Wrapper() { delete pn; }   // finalizer: invoked by the garbage collector
};

void f1() {
    Wrapper^ w1 = gcnew Wrapper( 1024 );
    Wrapper w2( 2048 ); // no ^ token !
    w1->foo(); w2.foo();
    // …
    // w2 is disposed of here
    // w1 will be finalized at some point
}
The CLI Represents
a Language Framework
Programming Languages are over-emphasized,
much as national identity …
Language as a Unit of Deployment

A language is often used as a vehicle for the
deployment of a new programming model – that
is, of a new paradigm.

It tends to demonstrably improve on existing models
that have run into some bottleneck of scale.

Or it supports a new model either of technology or
abstraction.

These languages tend to be pure – that is, to provide
support for their own program model only.

This makes the language both simpler and more elegant.

It requires a relinquishment of the past
In Like a Lion, Out Like a Lamb

When there is a reinvention of the dominant
program model, there is also a programming
language extinction.

The current generation of languages has no vocabulary to
directly express the new model.

Adding that vocabulary compromises the elegance of the
original purity of design.

A pure language moves from a youthful development
community to an acknowledged design influence.

This passionate sweeping in and hangdog slinking out of
programming languages has taken its toll socially on the
professional programmer class.

This is not really working, imo. What kinds of solutions
suggest themselves?
Where the CLI Comes in

However, there is a possible language model we
can glean from C++.

What has been surprisingly successful for C++ has been
its ability to support multiple program models.

What has been less successful is the absence of a
unifying architecture and crafted boundaries.

Well, perhaps what we need is a conscious design
– a deliberate mosaic of component paradigms
using a common type system and virtual machine
model.

Oh, this is where the CLI comes in …
A Language Framework

The CLI seems to offer the glimmerings of a
framework for the design of a possible mosaic of
component language gems.

I would like to see you guys come up with a new
paradigm of how we should program – all two
thousand of them.

There is so much hard work and invention that is
required of us in the 21st century.

The university must deliver up the science;
industry will deliver up the engineering.
Questions?
Concerns?
Criticisms?
[email protected]