Parallax - A New Operating System for Scalable, Distributed, and Parallel Computing: DIME Network Architecture (DNA) for a New Generation of Many-core Computing
Slide 1
Parallax - A New Operating System for Scalable, Distributed, and Parallel Computing
DIME Network Architecture (DNA) for a New Generation of Many-core Computing
Dr. Rao Mikkilineni, Kawa Objects
Ian Seyler, Return Infinity
May 16, 2011
Private & Confidential
SMTPS11
Slide 2
Agenda
• The hardware upheaval and the von Neumann bottleneck
• Possible solution using a parallel DIME™ network computing model with telecom-grade trust
• Parallax – a potentially new Operating System (OS)
• Proof of concept demo
The history of the evolution of current OSs is filled with lessons on wasted billions (does anyone remember Multics or OS/2?), unmet expectations (who would have thought UNIX, the original System V, would vanish), surprise winners (Windows and Linux), and stealthy survivors (Mach in a Mac).
Slide 3
Many-core Servers
• SeaMicro – custom servers – 512 1.66 GHz 64-bit x86 Intel Atom cores in 10 RU; 2,048 CPUs/rack
• Calxeda – highly integrated Server-on-Chip built around a new-generation ARM processor – 480 cores
• Silicon Graphics – Altix UV – 2,048 cores and 16 TB memory per Single System Image; scales to 32,768 processor sockets providing up to 262,144 Intel Xeon cores (8 cores per socket)
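The figures on this slide can be cross-checked with a little arithmetic. The Python sketch below is not part of the original deck. The input numbers are the ones quoted above; the derived values (chassis per rack, rack units occupied, number of full-size system images) are inferences from those numbers, not vendor specifications.

```python
# Figures quoted on the slide.
SEAMICRO_CORES_PER_CHASSIS = 512
SEAMICRO_CHASSIS_HEIGHT_RU = 10
SEAMICRO_CPUS_PER_RACK = 2_048
ALTIX_SOCKETS = 32_768
ALTIX_CORES_PER_SOCKET = 8
ALTIX_CORES_PER_SSI = 2_048  # cores per Single System Image

# Derived values (assumptions: a rack is filled with identical chassis,
# and every system image is configured at its maximum size).
chassis_per_rack = SEAMICRO_CPUS_PER_RACK // SEAMICRO_CORES_PER_CHASSIS
rack_units_used = chassis_per_rack * SEAMICRO_CHASSIS_HEIGHT_RU
altix_total_cores = ALTIX_SOCKETS * ALTIX_CORES_PER_SOCKET
altix_system_images = altix_total_cores // ALTIX_CORES_PER_SSI

print(f"SeaMicro: {chassis_per_rack} chassis per rack, {rack_units_used} RU used")
print(f"Altix UV: {altix_total_cores:,} cores in {altix_system_images} system images")
```

The socket count times 8 cores per socket reproduces the 262,144-core figure, and 2,048 CPUs per rack corresponds to four 512-core chassis.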
Slide 4
Hardware Upheaval and von Neumann Bottleneck
• Up to 46,080 processing cores or 29.8 petabytes of storage per container
• No operating system provides application-centric resource management in real time
• 512-core and 480-core servers running an OS that cannot see beyond tens of cores
• Layers of management infrastructure
• Network infrastructure with complex management systems
Slide 5
Current Economics of IT
[Figure: bar chart of % of TCO over five years (axis 0.00 to 40.00), Physical vs. Virtual; labeled totals $61.2M and $31.6M]
Hardware Upheaval is not Matched by Software Innovation!!
Slide 6
SPC Element Network & von Neumann Bottleneck
• Network, storage, virtualization, application, etc. management
• Distributed Intelligent Managed Element (DIME) network: Service Regulation Executable Instructions
• Parallel FCAPS* management of the Stored Program Computing (SPC) element using a signaling channel
• Real-time application management (provisioning, monitoring and control)
• End-to-end distributed transaction response is no longer controlled by the individual node OS resource (application in a shared environment)
• Managed Intelligent Computing Element application (service component in a distributed workflow): Service Package Executable Instructions, for example "Hello World"
1. Signaling and self-management of the node
2. Workflow with DIME network management
[Figure labels: Serial Processing; Start; Stop; blocks of "...mngt code..." and "...code..."]
* Fault, Configuration, Accounting, Performance and Security (node and network)
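The idea of managing a computing element over a signaling channel that runs in parallel with the service itself can be sketched in a few lines of Python. This is a toy illustration, not Parallax or DIME code: the `Dime` class, its `signal` and `shutdown` methods, and the "start"/"stop" commands are all hypothetical names chosen for this example. A manager thread listens on a queue (standing in for the signaling channel) while a separate worker thread runs the service package.

```python
import queue
import threading


class Dime:
    """Toy managed element: a manager thread plus a separately running service."""

    def __init__(self, name, service):
        self.name = name
        self.service = service                    # the "service package" callable
        self.signals = queue.Queue()              # stands in for the signaling channel
        self.stop_requested = threading.Event()   # lets the manager ask the service to stop
        self.events = []                          # what the manager did, in order
        self.result = None
        self.fault = None
        self._worker = None
        self._manager = threading.Thread(target=self._manage)
        self._manager.start()

    def _run_service(self):
        # Runs in the worker thread; a failure is recorded instead of propagating.
        try:
            self.result = self.service(self.stop_requested)
        except Exception as exc:
            self.fault = repr(exc)

    def _manage(self):
        # Runs in the manager thread, independently of the service.
        while True:
            command = self.signals.get()
            if command == "start" and self._worker is None:
                self._worker = threading.Thread(target=self._run_service)
                self._worker.start()
                self.events.append("started")
            elif command == "stop":
                self.stop_requested.set()
                if self._worker is not None:
                    self._worker.join()
                self.events.append("stopped")
                return

    def signal(self, command):
        self.signals.put(command)

    def shutdown(self):
        self.signal("stop")
        self._manager.join()


def hello_world(stop_requested):
    return "Hello World"


def failing_service(stop_requested):
    raise ValueError("boom")


def run_demo(service):
    dime = Dime("node-1", service)
    dime.signal("start")
    dime.shutdown()
    return dime.events, dime.result, dime.fault
```

Calling `run_demo(hello_world)` starts the service, stops the element, and returns the manager's event log together with the service result. With `failing_service`, the fault is recorded by the element rather than crashing the caller, which is the kind of separation between management and execution the slide describes.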
Slide 7
DIMEs In A Multi-Core Server
Proof of Concept Features
• DIME Instantiation
• Discovery
• Workflow Orchestration
• Scaling
• Dynamic Reconfiguration
• Fault Management
[Figure: a run-time orchestrator on Linux and the DIME sub-network managers (F, C, A, P, S) connect over the network and a signaling channel to Physical Servers 1, 2 and 3. Each service consists of a Service Regulator and a Service Package; apps A and B run as MICE with I/O. Memory map legend: Parallax OS (P), Free Memory (F), Shared Memory (S).]
Slide 8
DNA In A Multi-core Server
http://youtu.be/IMXxmRSVGoI
J. von Neumann, "The General and Logical Theory of Automata", in A. H. Taub (Ed.), John von Neumann Collected Works, Vol. 5, p. 259. Chicago: University of Illinois Press (1951).
George B. Dyson, "Darwin among the Machines: The Evolution of Global Intelligence", Helix Books, Addison-Wesley Publishing Company, Inc., Reading, MA, 1997, p. 123.
Slide 9
Service Deployment
• Service Component Developer (Service Creation)
• Service Workflow Creator (Service Delivery)
• Service Control Manager (Service Assurance)
[Figure: a run-time orchestrator on Linux and the DIME sub-network managers (F, C, A, P, S) reach Nodes 1, 2 and 3 over the network. Each node hosts Worker 1 and Worker 2, with "Hello World" shown as the deployed service.]
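The deployment picture (nodes, workers, and a "Hello World" service placed by an orchestrator) can be sketched as follows. This is a hypothetical illustration in Python, not the demo's actual code: the `Orchestrator` and `Worker` classes, their method names, and the placement policy (first healthy workers in creation order) are assumptions made for this example. It touches three of the proof-of-concept features in simplified form: instantiation, discovery, and skipping a faulted worker.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    node: int
    index: int
    healthy: bool = True
    outputs: list = field(default_factory=list)

    @property
    def name(self):
        return f"Node {self.node} Worker {self.index}"

    def run(self, service):
        # Execute the service and keep its output on the worker.
        result = service()
        self.outputs.append(result)
        return result


class Orchestrator:
    def __init__(self):
        self.workers = []

    def instantiate(self, node, index):
        # Create a worker and register it with the orchestrator.
        worker = Worker(node, index)
        self.workers.append(worker)
        return worker

    def discover(self):
        # Only healthy workers are offered for placement.
        return [w for w in self.workers if w.healthy]

    def deploy(self, service, copies):
        # Place the service on the first `copies` healthy workers.
        available = self.discover()
        if len(available) < copies:
            raise RuntimeError("not enough healthy workers")
        return {w.name: w.run(service) for w in available[:copies]}


def hello_world():
    return "Hello World"


def run_demo():
    orchestrator = Orchestrator()
    for node in (1, 2, 3):
        for index in (1, 2):
            orchestrator.instantiate(node, index)
    # Simulate a fault on Node 1 Worker 1 before deployment.
    orchestrator.workers[0].healthy = False
    return orchestrator.deploy(hello_world, copies=3)
```

In this sketch `run_demo()` places three copies of the service while avoiding the faulted worker. A real orchestrator would also need to handle faults that occur after placement, which is left out here.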
Slide 10
Lessons From Biology
"The basic principle of dealing with malfunctions in nature is to make their effect as unimportant as possible and to apply correctives, if they are necessary at all, at leisure. In our dealings with artificial automata, on the other hand, we require an immediate diagnosis. Therefore, we are trying to arrange the automata in such a manner that errors will become as conspicuous as possible, and intervention and correction follow immediately."
--- John von Neumann, "The General and Logical Theory of Automata", John von Neumann Collected Works, Edited by A. H. Taub, Volume 5, p 289 (Hixon Symposium 1948)
"It's very likely that on the basis of philosophy that every error has to be caught, explained, and corrected, a system of the complexity of the living organism would not run for a millisecond."
--- von Neumann, Theory of Self-Reproducing Automata
[Figure: at the Hixon Symposium, Pasadena, California (1948)]
Slide 11
Questions?