Towards Elastic Operating Systems
Amit Gupta
Ehab Ababneh
Richard Han
Eric Keller
University of Colorado, Boulder
OS + Cloud Today
(Diagram: an OS/process on one node, managed by an ELB/cloud manager)
Resources limited:
• Thrashing
• CPUs limited
• I/O bottlenecks
  • Network
  • Storage
Present workarounds:
• Additional scripting / code changes
• Extra modules/frameworks
• Coordination
  • Synchronizing/aggregating state
Stretch Process
(Diagram: an OS/process stretched across multiple nodes)
Advantages:
• Expands available memory
• Extends the scope of multithreaded parallelism (more CPUs available)
• Mitigates I/O bottlenecks
  • Network
  • Storage
ElasticOS: Our Vision
ElasticOS: Our Goals
• “Elasticity” as an OS service
• Elasticize all resources: memory, CPU, network, …
• Single-machine abstraction: apps unaware whether they’re running on 1 machine or 1,000 machines
• Simpler parallelism
• Compatible with an existing OS (e.g., Linux)
“Stretched” Process: Unified Address Space
(Diagram: an OS/process spanning nodes via an elastic page table whose entries carry V/R bits plus a Location field)
Movable Execution Context
(Diagram: an execution context moving between OS/process instances on different nodes)
• OS handles elasticity – apps don’t change
• Partition locality across multiple nodes
  • Useful for single (and multiple) threads
• For multiple threads, seamlessly exploit network I/O and CPU parallelism
Replicate Code, Partition Data
(Diagram: code pages replicated on every node; Data 1 and Data 2 each placed on a single node)
• Unique copy of data (unlike DSM)
• Execution context follows data (unlike process migration, SSI)
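The split above can be sketched as a placement routine: code pages are read-only, so every node gets a copy, while each data page lives on exactly one node. A minimal sketch in Python; the hash-based assignment is an illustrative stand-in, not ElasticOS’s actual placement policy:

```python
# Sketch: replicate code pages on every node, partition data pages.
# The hash-based partitioning is an illustrative stand-in for a
# smarter, locality-aware placement policy.

def place_pages(code_pages, data_pages, nodes):
    # Every node gets its own copy of the (read-only) code pages.
    placement = {node: set(code_pages) for node in nodes}
    # Each data page is placed on exactly one node (unique copy).
    for page in data_pages:
        owner = nodes[hash(page) % len(nodes)]
        placement[owner].add(page)
    return placement
```

With two nodes and data pages 0, 1, 2, every node holds the code pages, and each data page appears in exactly one node’s set.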
Exploiting Elastic Locality
• We need an adaptive page-clustering algorithm
  • LRU, NSWAP, i.e., “always pull”
  • Execution follows data, i.e., “always jump”
  • Hybrid (initial): pull pages, then jump
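The hybrid strategy can be sketched as a fault-handling policy: pull remote pages on faults, but once a run of consecutive remote faults crosses a threshold, jump the execution context to the remote node instead. The threshold value and the page-ownership map below are illustrative assumptions, not parameters from the talk:

```python
# Sketch of the hybrid "pull pages, then jump" policy.
# JUMP_THRESHOLD and the page->node map are illustrative assumptions.

JUMP_THRESHOLD = 3  # consecutive remote faults before jumping

def handle_faults(fault_pages, page_location, current_node):
    """Return a list of ('pull', page) / ('jump', node) actions for a fault trace."""
    actions = []
    consecutive_remote = 0
    for page in fault_pages:
        owner = page_location[page]
        if owner == current_node:
            consecutive_remote = 0  # local fault: serviced from local swap
            continue
        consecutive_remote += 1
        if consecutive_remote >= JUMP_THRESHOLD:
            actions.append(("jump", owner))
            current_node = owner            # execution now runs at the owner
            consecutive_remote = 0
        else:
            actions.append(("pull", page))
            page_location[page] = current_node  # page migrated here
    return actions
```

A thread on node A faulting repeatedly on node B’s pages first pulls a couple of pages, then jumps to B, after which B’s pages are local.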
Status and Future Work
• Complete our initial prototype
• Improve our page-placement algorithm
• Improve context-jump efficiency
• Investigate fault-tolerance issues

Contact: [email protected]

Thank You
Questions?
Algorithm Performance (1)
Algorithm Performance (2)
Page Placement: Multinode Adaptive LRU
(Diagram: pages are pulled first; once the jump threshold is reached, the execution context jumps to the remote node’s memory, CPUs, and swap)
Locality in a Single Thread
(Diagram: temporal locality keeps a single thread’s working set within one node’s memory, CPUs, and swap)
Locality across Multiple Threads
(Diagram: each thread’s working set maps onto a different node’s memory, CPUs, and swap)
Unlike DSM…
Exploiting Elastic Locality
• Assumptions
  • Replicate code pages, place data pages (vs. DSM)
• We need an adaptive page-clustering algorithm
  • LRU, NSWAP
  • Ours (initial): pull pages, then jump
Replicate Code, Distribute Data
(Diagram: code pages replicated across nodes; the execution context jumps between nodes as it accesses Data 1 and Data 2)
• Unique copy of data (vs. DSM)
• Execution context follows data (vs. process migration)
Benefits
• OS handles elasticity – apps don’t change
• Partition locality across multiple nodes
  • Useful for single (and multiple) threads
• For multiple threads, seamlessly exploit network I/O and CPU parallelism
Benefits
• OS handles elasticity
• Application ideally runs unmodified
• Application is naturally partitioned …
  • By page-access locality
  • By seamlessly exploiting multithreaded parallelism
  • By intelligent page placement
How should we place pages?
Execution Context Jumping
A single-thread example
(Diagram: over time, the process’s execution context jumps between the address spaces on Node 1 and Node 2)
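A context jump moves the thread’s state, not the data: the registers and stack are serialized, shipped to the node that owns the pages, and resumed there. A toy simulation; the ExecutionContext fields are invented for illustration, not the prototype’s actual representation:

```python
import pickle

# Toy model of a context jump: instead of pulling remote pages,
# serialize the thread's execution state and resume it on the node
# that owns the data. Field names here are illustrative.

class ExecutionContext:
    def __init__(self, pc, registers, stack):
        self.pc = pc                # program counter
        self.registers = registers  # register-file snapshot
        self.stack = stack          # stack contents

def jump(ctx, dest_node):
    """Serialize the context and 'resume' it on dest_node."""
    wire = pickle.dumps(ctx)      # bytes that would cross the network
    resumed = pickle.loads(wire)  # deserialized on the destination
    return dest_node, resumed
```

The resumed context is byte-for-byte the same thread state, now co-located with the remote node’s pages.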
“Stretch” a Process: Unified Address Space
(Diagram: one process spanning the address spaces of Node 1 and Node 2, with page-table entries extended with V/R bits and an IP address)
Operating Systems Today
(Diagram: the OS limits a process’s resources (memory, disks, CPUs) to one node)
Cloud Applications at Scale
(Diagram: a load balancer fields more queries while a cloud manager provisions more resources; processes operate on partitioned data under a framework, e.g., MapReduce)
Our Findings
• Important tradeoff: data page pulls vs. execution context jumps
• Latency cost is realistic
• Worst case for our algorithm (“always pull”, i.e., NSWAP) still yields marginal improvements
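The pull-vs-jump tradeoff comes down to comparing latencies: pulling pays a page transfer per remote access, while a jump pays a one-time context-transfer cost. A back-of-the-envelope model; both latency numbers are made-up illustrations, not measurements from our prototype:

```python
# Back-of-the-envelope model of the pull-vs-jump tradeoff.
# Latency numbers are illustrative assumptions, not measurements.

PULL_LATENCY_US = 200   # one page pulled over the network
JUMP_LATENCY_US = 500   # shipping the execution context

def cheaper_to_jump(expected_remote_accesses):
    """Jumping pays off once cumulative pull cost exceeds the jump cost."""
    pull_cost = expected_remote_accesses * PULL_LATENCY_US
    return pull_cost > JUMP_LATENCY_US
```

Under these numbers, two expected remote accesses favor pulling; three or more favor a jump. Estimating the expected run length of remote accesses is exactly what the adaptive algorithm must do.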
Advantages
• Natural groupings: threads & pages
• Align resources with inherent parallelism
• Leverage existing mechanisms for synchronization
“Stretch” a Process: Unified Address Space
A “stretched” process = a collection of pages + other resources { across several machines }
(Diagram: each node’s memory, swap, and CPUs unified under one page table whose entries carry V/R bits and an IP address)
Execution Context Follows Data
• Replicate code pages
  • Read-only => no consistency burden
• Smartly distribute data pages
• Execution context can jump
  • Moves toward data
  • *Converse also allowed*
Elasticity in Cloud Apps Today
(Diagram: input data split into partitions D1…Dx, each processed on a node with its own memory, disk, and CPUs, producing output data)
(Diagram: a load balancer spreads input queries across nodes holding partitions D1…Dy, producing output data)
Goals: Elasticity Dimensions
Extend elasticity to:
• Memory
• CPU
• I/O
  • Network
  • Storage
Thank You
Bang Head Here!
Stretching a Thread
Overlapping Elastic Processes
*Code Follows Data*
Application Locality
Multinode Adaptive LRU
Open Topics
• Fault tolerance
• Stack handling
• Dynamic locking
• Linked libraries
Elastic Page Table

Virtual Addr | Phy. Addr | Valid | Node (IP addr) | Location
A            | B         | 1     | Localhost      | Local mem
C            | D         | 0     | Localhost      | Swap space
E            | F         | 1     | 128.138.60.1   | Remote mem
G            | H         | 0     | 128.138.60.1   | Remote swap
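The elastic page table extends each entry with a node field, so a page fault can resolve to local memory, local swap, remote memory, or remote swap. A sketch mirroring the four entries above; the dictionary layout is illustrative, not a real page-table encoding:

```python
# Sketch of an elastic page-table lookup mirroring the table above.
# The dictionary layout is illustrative; a real implementation would
# pack these fields into page-table entries.

LOCALHOST = "Localhost"

def classify(entry):
    """Map an entry's (valid bit, node) to where the page lives."""
    valid, node = entry["valid"], entry["node"]
    if node == LOCALHOST:
        return "local mem" if valid else "local swap"
    return "remote mem" if valid else "remote swap"

page_table = {
    "A": {"phys": "B", "valid": 1, "node": LOCALHOST},
    "C": {"phys": "D", "valid": 0, "node": LOCALHOST},
    "E": {"phys": "F", "valid": 1, "node": "128.138.60.1"},
    "G": {"phys": "H", "valid": 0, "node": "128.138.60.1"},
}
```

The four cases drive the fault handler: local swap triggers an ordinary swap-in, while the two remote cases trigger the pull-vs-jump decision.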
“Stretch” a Process
Move beyond the resource boundaries of ONE machine:
• CPU
• Memory
• Network, I/O
(Diagram: input data partitions D1 and D2 processed on separate nodes, each with its own memory, disk, and CPUs, producing output data)
(Diagram: data partitions D1 and D2 held in memory on separate nodes, each with its own CPUs and disk)
Reinventing the Elasticity Wheel