Exokernel: An Operating System Architecture
for Application-Level Resource Management
Dawson Engler, Frans Kaashoek,
James O’Toole
MIT Laboratory for Computer Science
Function of Traditional Kernel
• Provides abstraction(s) of the hardware
– Processes
– Virtual Memory
– File System
• Provides Protection
– Hardware
– Kernel Itself
– Users From Each Other
Motivation: A Database
• I/O Abstraction: Cooked I/O
– Operating System buffers I/O
• Database Requirement
– Cannot tell a Database user that transaction has
committed until log pages have hit the surface of
the disk
– Database may need to sequence writes
– Database better at predicting future I/O
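The commit requirement above can be illustrated with a small sketch. It uses Python rather than a real database: the log record is appended, and the commit is reported only after os.fsync has asked the operating system to push its buffered data out to the device. The helper name commit_transaction and the one-record-per-line log format are hypothetical, chosen only for this illustration.

```python
import os


def commit_transaction(log_path, record):
    """Append one log record and return only after it has been flushed.

    Hypothetical helper: a real database would batch records and
    sequence its writes far more carefully than this.
    """
    # O_APPEND keeps every log write at the end of the file.
    fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, record + b"\n")
        # Without this call the record may still sit in the OS buffer cache.
        os.fsync(fd)
    finally:
        os.close(fd)
    return True  # only now is it safe to tell the user "committed"
```

With cooked (buffered) I/O, the write alone returns as soon as the data is in the kernel's buffers, which is why the explicit flush is needed before acknowledging the commit.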
The Ever Shrinking Kernel
• Linux, Windows – VM, FS, ...
• Microkernels – Fewer Abstractions: remove FS
– Mach
– L4
• Virtual Machines (VMM is between OS and hardware) -- Virtualization
– DISCO
– Xen
• ExoKernel -- Multiplexing
– Aegis
– XOK
Exokernel Architecture
[Figure labels: Environments, Request, Revoke]
Securely Expose Hardware
• Hardware:
– Disks, Physical Memory, TLB, Frame Buffer, Network Access
• Less Tangible Resources:
– CPU Time Slices
– Interrupts, Exceptions, Cross Domain Calls
– DMA
– Privileged Instructions
• Exokernel Exports (read-only):
– Freelists, cached TLB entries, disk arm positions
Exokernel Functions
• Resource Allocation (Inter-environment)
– Grant (or not) Resource Requests (Policy <- SysAd)
– Process Release (Dealloc) Requests
– Revoke Resources
• Visible Revocation (May get to choose which to free)
• Abort
• Note: Usually some resources are exempt: page table memory
– Track Resource Ownership
• Guard all resource usage or binding points
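The bookkeeping listed above (grant, release, revoke, track ownership, guard) can be sketched as a toy ownership table. This is an illustrative model only, not Aegis or XOK code; the class name OwnershipTable and its method names are hypothetical, and allocation policy is reduced to "grant if free".

```python
class OwnershipTable:
    """Toy model: which environment owns which resource."""

    def __init__(self, resources):
        self.owner = {r: None for r in resources}  # None means free

    def owns(self, env, resource):
        return env is not None and self.owner.get(resource) == env

    def grant(self, env, resource):
        """Grant a request only if the resource exists and is free."""
        if self.owner.get(resource, "missing") is not None:
            return False
        self.owner[resource] = env
        return True

    def release(self, env, resource):
        """Process a dealloc request; only the owner may release."""
        if not self.owns(env, resource):
            return False
        self.owner[resource] = None
        return True

    def revoke(self, resource):
        """Kernel-initiated: take the resource back, return the old owner."""
        previous = self.owner.get(resource)
        if resource in self.owner:
            self.owner[resource] = None
        return previous

    def guard(self, env, resource):
        """Check performed at every use or binding point."""
        if not self.owns(env, resource):
            raise PermissionError(f"{env} does not own {resource}")
```

Note that the table records only ownership; what an environment does with a resource it owns is left to its library OS.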
Resource Allocation
• Allocation (almost always explicit)
– Alloc system call
• Deallocation
– Dealloc System Call
– Visible Revocation
• E.g.: Loss of the CPU when a time slice expires:
– Library OS must save required processor state
– Abort Protocol
• Break all existing secure bindings
• Library OS gets a Repossession Exception – includes a
Repossession Vector
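The two revocation paths above can be sketched as follows: the kernel first asks the library OS to give pages back (visible revocation, so the library OS chooses the victims), and falls back to the abort protocol if it does not comply. All names here (LibOS, revoke_pages, the set of (environment, page) bindings) are hypothetical, and the repossession vector is modeled simply as the list of pages taken by force.

```python
class LibOS:
    """Hypothetical library OS that may or may not cooperate."""

    def __init__(self, name, cooperative=True):
        self.name = name
        self.cooperative = cooperative
        self.pages = set()      # the library OS's own view of its pages
        self.repossessed = []   # filled in by repossession exceptions

    def on_revoke_request(self, count):
        """Visible revocation: the library OS picks which pages to free."""
        if not self.cooperative:
            return set()
        victims = set(sorted(self.pages)[:count])
        self.pages -= victims
        return victims

    def on_repossession(self, vector):
        """Repossession exception: learn which pages were taken by force."""
        self.repossessed.extend(vector)
        self.pages -= set(vector)


def revoke_pages(libos, bindings, count):
    """Ask first; if too few pages come back, run the abort protocol."""
    # Only count pages the kernel's own records say this environment holds.
    freed = {p for p in libos.on_revoke_request(count)
             if (libos.name, p) in bindings}
    for page in freed:
        bindings.discard((libos.name, page))
    shortfall = count - len(freed)
    if shortfall > 0:
        # Abort: the kernel chooses, breaks the bindings, then reports.
        held = sorted(p for (env, p) in bindings if env == libos.name)
        vector = held[:shortfall]
        for page in vector:
            bindings.discard((libos.name, page))
        libos.on_repossession(vector)
        freed |= set(vector)
    return freed
```

A cooperative library OS never sees a repossession exception in this model; an uncooperative one loses the pages anyway and is told which ones afterwards.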
Secure Bindings
• Break up protection into bind and access
• Can be implemented in:
– Hardware
• TLB
• Frame Buffer Ownership Tag
– Software
• STLB
– Downloading Code into ExoKernel
• Dynamic Packet Filter
Examples
• Physical Page
– Bind: Get Exokernel to Load Mapping into TLB
• Page allocation
– Exokernel grants self-authenticating capability (R/W)
– LibOS stores capability in Page Table
– Passes Capability, Mapping on TLB write request
– Access: LibOS/Application code uses TLB
• Network Access
– Bind: Download DPF (Dynamic Packet Filter)
– Access: Exokernel Runs DPF on every incoming pkt
• Sends packets to correct Environment
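The network example above splits into a bind step (install a filter) and an access step (run the filters on each incoming packet). The sketch below is a simplification under stated assumptions: real DPF filters are written in a declarative language and compiled, whereas here a filter is just a dictionary of required header fields, packets are dictionaries, and the rule that an identical filter cannot be bound by a second environment is an assumption of this example. The class name PacketDemux is hypothetical.

```python
class PacketDemux:
    """Toy packet demultiplexer keyed on declarative field filters."""

    def __init__(self):
        self.filters = []  # list of (environment, required field values)

    def bind(self, env, fields):
        """Bind step: install a filter for an environment.

        Assumption of this sketch: refuse a filter identical to one
        already owned by a different environment.
        """
        for owner, existing in self.filters:
            if existing == fields and owner != env:
                return False
        self.filters.append((env, dict(fields)))
        return True

    def deliver(self, packet):
        """Access step: run on every incoming packet.

        Returns the environment whose filter matches first, or None.
        """
        for env, fields in self.filters:
            if all(packet.get(k) == v for k, v in fields.items()):
                return env
        return None
```

Because the filters are data rather than arbitrary code, a kernel can compare and combine them, which is the property the slides rely on when they say individual DPF code can be merged.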
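[Figure: page allocation and TLB binding example on MIPS. The application (emacs) runs:
m = malloc(3000);
...
strcpy(m, "The Ever Shrinking Kernel");
Labels by layer. Library OS: page table (Virtual 17, Physical 2, CAP), read-only freelist, "Req Alloc 2". ExoKernel: STLB, read-write freelist, "Check". Hardware: TLB "Miss", physical pages 0 to 5.]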
Downloading Code
• Advantages:
– Avoid Kernel Crossing
– Executed when environment is not scheduled
• Allowed because execution time is bounded
• Specification
– High Level Language
• Individual DPF code can be merged
• Safety by Language
– C
• Application Specific Handlers
– Dynamic Message Vectoring
– Message Initiation
• Protection: SFI (Software Fault Isolation, i.e. sandboxing); Infinite Loops??
TLB Miss in Aegis
1. Aegis checks if mapping is in STLB. If so, load into TLB.
2. If the virtual address is one of the pinned pages, Aegis
loads the mapping into the TLB.
3. Environment checks its page tables for segmentation
fault. If not, use page tables to get physical page and
associated capability.
4. Aegis checks the capability. If valid, loads mapping into
TLB.
5. Control returned to the environment.
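The five steps above can be sketched as a small simulation. Several details are assumptions made for illustration and are not stated in the slides: capabilities are modeled as an HMAC over the page number and rights under a kernel-private key, a validated mapping is also cached in the STLB, the TLB is not tagged per environment, and the names Aegis, Environment, make_capability and capability_valid are hypothetical.

```python
import hashlib
import hmac

KERNEL_SECRET = b"demo-only-secret"  # assumption: a kernel-private key


def make_capability(page, rights):
    """Issue a capability for (page, rights); modeled as an HMAC."""
    msg = f"{page}:{rights}".encode()
    return hmac.new(KERNEL_SECRET, msg, hashlib.sha256).digest()


def capability_valid(cap, page, rights):
    return hmac.compare_digest(cap, make_capability(page, rights))


class Environment:
    """Library OS side: owns the page table."""

    def __init__(self):
        self.page_table = {}  # vpage -> (ppage, rights, capability)

    def map(self, vpage, ppage, rights, cap):
        self.page_table[vpage] = (ppage, rights, cap)

    def lookup(self, vpage):
        return self.page_table.get(vpage)


class Aegis:
    """Kernel side: hardware TLB, software TLB, pinned pages."""

    def __init__(self, pinned=None):
        self.tlb = {}
        self.stlb = {}
        self.pinned = dict(pinned or {})  # vpage -> ppage

    def tlb_miss(self, env, vpage):
        """Resolve a miss and report which step handled it."""
        if vpage in self.stlb:                      # step 1
            self.tlb[vpage] = self.stlb[vpage]
            return "stlb"
        if vpage in self.pinned:                    # step 2
            self.tlb[vpage] = self.pinned[vpage]
            return "pinned"
        entry = env.lookup(vpage)                   # step 3
        if entry is None:
            return "segfault"
        ppage, rights, cap = entry
        if not capability_valid(cap, ppage, rights):  # step 4
            return "bad-capability"
        self.tlb[vpage] = ppage
        self.stlb[vpage] = ppage  # assumption: cache for later misses
        return "page-table"       # step 5: control returns to the caller
```

The point of the split is visible in the code: the environment decides what its page table contains, while the kernel only checks that the presented capability really covers the physical page before loading the mapping.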
Protected Control Transfer
• Two Properties; Use Registers to Pass Messages
– Operation is Atomic
– No overwrite of environment-visible registers
• Acall
– Donate remainder of Current Timeslice
• Scall
– Donate all timeslices
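The difference between the two call types is only in what happens to the caller's time slices, which a toy scheduler can show. This is a sketch under assumptions: the CPU is modeled as a repeating vector of slice owners, a slice has a fixed length in arbitrary ticks, and the sreturn method (handing the slices back) is a modeling choice of this example. The class name Cpu and all method names are hypothetical.

```python
class Cpu:
    """Toy model of time-slice donation during protected control transfer."""

    def __init__(self, slice_owners, slice_len=10):
        self.owners = list(slice_owners)  # repeating vector of slice owners
        self.slice_len = slice_len
        self.donations = {}               # donor -> callee, set by scall

    def acall(self, caller, callee, used):
        """Asynchronous call: callee gets only the rest of this slice."""
        return callee, self.slice_len - used

    def scall(self, caller, callee, used):
        """Synchronous call: callee also gets the caller's later slices."""
        self.donations[caller] = callee
        return callee, self.slice_len - used

    def sreturn(self, caller):
        """Assumed return path: the caller gets its slices back."""
        self.donations.pop(caller, None)

    def runner(self, slot):
        """Who actually runs in a given slot, after donations."""
        owner = self.owners[slot % len(self.owners)]
        return self.donations.get(owner, owner)
```

For example, with owners A, B, A, B, an acall from A leaves A running in its next slot, while an scall from A hands that slot to the callee until the return.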
Microbenchmarks
IPC Performance ExOS vs. Ultrix
Performance Summary
• Microbenchmarks: 10X
• Cheetah web server (XOK) 8X
Persistent Storage
• Disk Block Shadowing
• Disk Block tag
• Low level metadata language
• Untrusted Deterministic Function
[Figure: Persistent storage. Labels: emacs, ExOS Library OS, PhD Thesis, XOK, Disk, crash.]
Conclusions
• Microbenchmarks and number of kernel crossings are not
critical
• Power (E.g. downloaded code) is critical factor
• Top Down vs. Bottom Up
• Encourages Innovation
– Writing an OS is like writing a compiler
– Operating System is Untrusted
– Untrusted Code Evolves Faster than Trusted
… and Caveats
• Hardware Specific: MIPS vs. 486
• Persistent Storage is Complex
• MultiCPU and scalability??
• Are all of the DISCO tricks available here??
Additional References
• Application Performance and Flexibility on
Exokernel Systems, Frans Kaashoek, Dawson
Engler, Gregory Ganger et al
• Pdos.csail.mit.edu/exo/exo-slides/sld001.htm
Overriding Abstractions
• OS Extensions
• How to override generic abstractions
implemented in a protected kernel with better
application-specific abstractions in user space
• Even if possible, won’t be efficient