CS136, Advanced Architecture
Virtual Machines
Outline
• Virtual Machines
• Xen VM: Design and Performance
• Conclusion
Introduction to Virtual Machines
• VMs developed in late 1960s
– Remained important in mainframe computing over the years
– Largely ignored in single-user computers of 1980s and 1990s
• Recently regained popularity due to:
– Increasing importance of isolation and security in modern systems
– Failures in security and reliability of standard operating systems
– Sharing of a single computer among many unrelated users
– Dramatic increases in raw speed of processors, making VM overhead more acceptable
What Is a Virtual Machine (VM)?
• Broadest definition:
– Any abstraction that provides a Turing-complete and standardized programming interface
– Examples: x86 ISA; Java bytecode; even Python and Perl
– As level gets higher, utility of definition gets lower
• Better definition:
– An abstract machine that provides a standardized interface similar to a hardware ISA, but at least partly under control of software that provides added features
– Best to distinguish “true” VM from emulators (although Java VM is entirely emulated)
• Often, VM is partly supported in hardware, with minimal software control
– E.g., give multiple virtual x86s on one real one, similar to the way virtual memory gives illusion of more memory than reality
System Virtual Machines
• “(Operating) System Virtual Machines” provide complete system-level environment at binary ISA
– Assumes ISA always matches native hardware
– E.g., IBM VM/370, VMware ESX Server, and Xen
• Presents illusion that VM users have an entire private computer, including copy of OS
• Single machine runs multiple VMs, and can support multiple (and different) OSes
– On conventional platform, single OS “owns” all HW resources
– With VM, multiple OSes all share HW resources
• Underlying HW platform is the host; its resources are shared among guest VMs
Virtual Machine Monitors (VMMs)
• Virtual machine monitor (VMM) or hypervisor is software that supports VMs
• VMM determines how to map virtual resources to physical ones
• Physical resource may be time-shared, partitioned, or emulated in software
• VMM much smaller than a traditional OS
– Isolation portion of a VMM is only about 10,000 lines of code
VMM Overhead
• Depends on workload
• User-level CPU-bound programs (e.g., SPEC) have near-zero virtualization overhead
– Run at native speeds since OS rarely invoked
• I/O-intensive workloads are OS-intensive
– Execute many system calls and privileged instructions
– Can result in high virtualization overhead
• Goal for system VMs:
– Run almost all instructions directly on native hardware
• But if I/O-intensive workload is also I/O-bound:
– Processor utilization is low (since CPU waits for I/O)
– Processor virtualization can be hidden in I/O costs
– So virtualization overhead is low
Important Uses of VMs
1. Multiple OSes
• No more dual boot!
• Can even transfer data (e.g., cut-and-paste) between VMs
2. Protection
• Crash or intrusion in one OS doesn’t affect others
• Easy to replace failed OS with fresh, clean one
3. Software Management
• VMs can run complete SW stack, even old OSes like DOS
• Run legacy OS, stable current, test release on same HW
4. Hardware Management
• Independent SW stacks can share HW
» Run application on own OS (helps dependability)
• Migrate running VM to different computer
» To balance load or to evacuate from failing HW
Virtual Machine Monitor Requirements
• VM Monitor
– Presents SW interface to guest software
– Isolates guests’ states from each other
– Protects itself from guest software (including guest OSes)
• Guest software should behave exactly as if running on native HW
– Except for performance-related behavior or limitations of fixed resources shared by multiple VMs
– Hard to achieve perfection in real system
• Guest software shouldn’t be able to change allocation of real system resources directly
• Hence, VMM must control everything, even though the guest VM and OS currently running are only temporarily using it
– Access to privileged state, address translation, I/O, exceptions and interrupts, …
Virtual Machine Monitor Requirements (continued)
• VMM must be at higher privilege level than guest VM, which generally runs in user mode
⇒ Execution of privileged instructions handled by VMM
• E.g., timer or I/O interrupt:
– VMM suspends currently running guest
– Saves its state
– Handles interrupt
» Possibly handles it internally, possibly delivers it to a guest
– Decides which guest to run next
– Loads its state
– Guest VMs that want a timer are given a virtual one
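To make this sequence concrete, here is a minimal C sketch of that interrupt path; all types and helper names (guest_t, pick_next_guest, and so on) are hypothetical, not taken from any real VMM.

```c
/* A minimal sketch (not a real VMM) of the interrupt path described above:
 * suspend the current guest, handle the interrupt, pick the next guest,
 * and resume it.  All names and types here are hypothetical. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NGUESTS 2
#define TIMER_VECTOR 32

typedef struct {
    uint64_t pc, flags;     /* saved guest program counter and flags   */
    bool wants_timer;       /* guest asked for a (virtual) timer       */
    bool timer_pending;     /* virtual timer tick awaiting delivery    */
} guest_t;

static guest_t guest[NGUESTS] = { { .wants_timer = true }, { 0 } };
static int current = 0;

static void save_guest_state(guest_t *g)  { /* copy HW registers into g */ (void)g; }
static void load_guest_state(guest_t *g)  { /* restore g into HW regs   */ (void)g; }
static int  pick_next_guest(void)         { return (current + 1) % NGUESTS; }

/* Called by the VMM whenever a physical interrupt arrives. */
void vmm_handle_interrupt(int vector) {
    save_guest_state(&guest[current]);          /* suspend running guest */

    if (vector == TIMER_VECTOR) {
        /* Handle internally: give each interested guest a *virtual* timer
         * tick instead of exposing the physical timer directly.          */
        for (int i = 0; i < NGUESTS; i++)
            if (guest[i].wants_timer)
                guest[i].timer_pending = true;
    } else {
        printf("deliver vector %d to owning guest\n", vector);
    }

    current = pick_next_guest();                /* choose next guest     */
    load_guest_state(&guest[current]);          /* and resume it         */
}

int main(void) {
    vmm_handle_interrupt(TIMER_VECTOR);
    printf("guest 0 timer pending: %d\n", guest[0].timer_pending);
    return 0;
}
```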
Hardware Requirements
Hardware needs roughly same as paged virtual memory:
1. At least 2 processor modes, system and user
2. Privileged subset of instructions
• Available only in system mode
• Trap if executed in user mode
• All system resources controllable only via these instructions
ISA Support for Virtual Machines
• If ISA designers plan for VMs, easy to limit:
– What instructions VMM must handle
– How long it takes to emulate them
• Because chip makers ignored VM technology, ISA designers didn’t “plan ahead”
– Including 80x86 and most RISC architectures
• Guest system must see only virtual resources
– Guest OS runs in user mode on top of VMM
– If guest tries to touch HW-related resource, must trap to VMM
» Requires HW support to initiate trap
» VMM must then insert emulated information
– If HW built wrong, guest will see or change privileged stuff
» VMM must then modify guest’s binary code
ISA Impact on Virtual Machines
• Consider x86 PUSHF/POPF instructions
– Push flags register onto stack or pop it back
– Flags contain condition codes (good to be able to save/restore) but also the interrupt-enable flag (IF)
• Pushing flags isn’t privileged
– Thus, guest OS can read IF and discover it’s not the way it was set
» VMM isn’t invisible any more
• Popping flags in user mode ignores IF
– VMM now doesn’t know what guest wants IF to be
– Should trap to VMM
• Possible solution: modify code, replacing PUSHF/POPF with special interrupting instructions
– But now guest can read its own code and detect VMM
Hardware Support for Virtualization
• Old “correct” implementation: trap on every PUSHF/POPF so VMM can fix up results
– Very expensive, since PUSHF/POPF are used frequently
• Alternative: IF shouldn’t be in same place as condition codes
– PUSHF/POPF can be unprivileged
– IF manipulation is now very rare
• Pentium has even better solution
– In user mode, VIF (“Virtual Interrupt Flag”) holds what guest wants IF to be
– PUSHF/POPF manipulate VIF instead of IF
– Host can now control real IF; guest sees virtual one
– Basic idea can be extended to many similar “OS-only” flags and registers
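A rough C sketch of the virtual-IF idea, with the hardware behavior modeled in software purely for illustration; vcpu_t and the function names are invented, not part of any real architecture definition.

```c
/* Illustrative sketch of the virtual-IF idea above: guest POPF updates a
 * per-guest virtual flag, the real IF stays under host control, and the
 * VMM consults the virtual flag before delivering a virtual interrupt. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define X86_IF (1u << 9)          /* interrupt-enable bit in EFLAGS */

typedef struct {
    uint32_t eflags;              /* guest-visible flags (condition codes) */
    bool     vif;                 /* virtual interrupt-enable flag         */
} vcpu_t;

/* What a guest-mode POPF effectively does under this scheme. */
void guest_popf(vcpu_t *v, uint32_t popped) {
    v->eflags = popped & ~X86_IF;        /* condition codes pass through   */
    v->vif    = (popped & X86_IF) != 0;  /* IF lands in the virtual flag   */
    /* The real IF is untouched: only the host/VMM changes it. */
}

/* What a guest-mode PUSHF effectively returns. */
uint32_t guest_pushf(const vcpu_t *v) {
    return v->eflags | (v->vif ? X86_IF : 0);  /* guest sees its own IF */
}

/* The VMM checks the virtual flag before injecting a virtual interrupt. */
bool can_deliver_virtual_irq(const vcpu_t *v) { return v->vif; }

int main(void) {
    vcpu_t v = {0};
    guest_popf(&v, X86_IF | 0x1);                 /* guest "enables IF"   */
    printf("guest sees IF=%d, deliverable=%d\n",
           (guest_pushf(&v) & X86_IF) != 0, can_deliver_virtual_irq(&v));
    return 0;
}
```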
Impact of VMs on Virtual Memory
• Each guest manages own page tables
– How to make this work?
• VMM separates real and physical memory
– Real memory is intermediate level between virtual and physical
– Some call the three levels virtual, physical, and machine memory
– Guest maps virtual to real memory via its page tables
» VMM page tables map real to physical
• VMM maintains shadow page table that maps directly from guest virtual address space to HW physical address space
– Rather than pay extra level of indirection on every memory access
– VMM must trap any attempt by guest OS to change its page table or to access the page table pointer
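The composition the shadow table performs can be sketched in a few lines of C; the single-level tables and names below are a deliberate simplification for illustration, not how real multi-level page tables are structured.

```c
/* A toy sketch of the shadow-page-table idea above: compose the guest's
 * virtual->real mapping with the VMM's real->physical mapping into one
 * direct virtual->physical table that the hardware actually walks. */
#include <stdint.h>
#include <stdio.h>

#define NPAGES   16
#define INVALID  UINT32_MAX

typedef struct {
    uint32_t guest_pt[NPAGES];   /* guest OS: virtual page  -> real page     */
    uint32_t vmm_pt[NPAGES];     /* VMM:      real page     -> physical page */
    uint32_t shadow_pt[NPAGES];  /* shadow:   virtual page  -> physical page */
} vm_memory_t;

/* Called whenever the VMM traps a guest page-table update. */
void update_shadow(vm_memory_t *m, uint32_t vpage) {
    uint32_t rpage = m->guest_pt[vpage];
    m->shadow_pt[vpage] = (rpage == INVALID) ? INVALID : m->vmm_pt[rpage];
}

int main(void) {
    vm_memory_t m;
    for (int i = 0; i < NPAGES; i++)
        m.guest_pt[i] = m.vmm_pt[i] = m.shadow_pt[i] = INVALID;

    m.vmm_pt[5] = 42;        /* VMM backs guest "real" page 5 with frame 42  */
    m.guest_pt[3] = 5;       /* guest maps its virtual page 3 to real page 5 */
    update_shadow(&m, 3);    /* trapped write: refresh the shadow entry      */

    printf("virtual page 3 -> physical frame %u\n", m.shadow_pt[3]);
    return 0;
}
```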
ISA Support for VMs & Virtual Memory
• IBM 370 architecture added additional level of indirection that was managed by VMM
– Guest OS kept page tables as before, so shadow pages were unnecessary
• To virtualize software TLB, VMM manages real one and has copy of contents for each guest VM
– Any instruction that accesses TLB must trap
• Hardware TLB still managed by hardware
– Must flush on VM switch unless PID tags available
• HW or SW TLBs with PID tags can mix entries from different VMs
– Avoids flushing TLB on VM switch (see the sketch below)
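A toy C sketch of the tag check described above, assuming a small fully associative TLB; the entry layout and names are invented for illustration.

```c
/* Sketch of a PID/ASID-tagged TLB: each entry carries the ID of the VM
 * that owns it, so a lookup only hits on entries tagged with the currently
 * running VM, and a VM switch just changes the current tag (no flush). */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TLB_ENTRIES 8

typedef struct {
    bool     valid;
    uint8_t  vm_id;      /* tag: which VM installed this translation */
    uint32_t vpage;      /* virtual page number                      */
    uint32_t pframe;     /* physical frame number                    */
} tlb_entry_t;

static tlb_entry_t tlb[TLB_ENTRIES];
static uint8_t current_vm = 0;       /* updated on VM switch, no flush  */

bool tlb_lookup(uint32_t vpage, uint32_t *pframe) {
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].vm_id == current_vm &&
            tlb[i].vpage == vpage) {
            *pframe = tlb[i].pframe;
            return true;             /* hit: tag matched the running VM */
        }
    return false;                    /* miss: mismatched tags never hit */
}

int main(void) {
    tlb[0] = (tlb_entry_t){ .valid = true, .vm_id = 0, .vpage = 7, .pframe = 99 };
    tlb[1] = (tlb_entry_t){ .valid = true, .vm_id = 1, .vpage = 7, .pframe = 13 };

    uint32_t f;
    printf("VM0 page 7 -> %s\n", tlb_lookup(7, &f) ? "hit" : "miss");
    current_vm = 1;                  /* "VM switch": just change the tag */
    printf("VM1 page 7 -> frame %u\n", tlb_lookup(7, &f) ? f : 0);
    return 0;
}
```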
Impact of I/O on Virtual Machines
• Most difficult part of virtualization:
– Increasing number of I/O devices attached to computer
– Increasing diversity of I/O device types
– Sharing real device among multiple VMs
– Supporting myriad of device drivers, especially with differing guest OSes
• Give each VM generic versions of each type of I/O device, and let VMM handle real I/O
– Drawback: slower than giving VM direct access
• Method for mapping virtual I/O device to physical depends on type:
– Disks partitioned by VMM to create virtual disks for guests
– Network interfaces shared between VMs in short time slices
» VMM tracks messages for virtual network addresses
» Routes to proper guest (see the sketch below)
– USB might be directly attached to VM
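A minimal C sketch of that routing step: the VMM forwards each incoming frame to the guest that owns the destination virtual address. The guest MAC addresses and names below are made up for illustration.

```c
/* Toy sketch of virtual-NIC demultiplexing: the VMM keeps a table of
 * virtual MAC addresses and routes each incoming frame to the guest
 * that owns the destination address. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NGUESTS 3

static const uint8_t guest_mac[NGUESTS][6] = {
    {0x02,0,0,0,0,0x01}, {0x02,0,0,0,0,0x02}, {0x02,0,0,0,0,0x03},
};

/* Return which guest should receive a frame, or -1 if no guest owns it. */
int route_frame(const uint8_t *frame) {
    const uint8_t *dst_mac = frame;          /* Ethernet dest = first 6 bytes */
    for (int g = 0; g < NGUESTS; g++)
        if (memcmp(dst_mac, guest_mac[g], 6) == 0)
            return g;                        /* deliver to this guest's vNIC  */
    return -1;
}

int main(void) {
    uint8_t frame[64] = {0x02,0,0,0,0,0x02};  /* frame addressed to guest 1 */
    printf("frame routed to guest %d\n", route_frame(frame));
    return 0;
}
```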
Example: Xen VM
• Xen: open-source system VMM for 80x86 ISA
– Project started at University of Cambridge, GNU license
• Original vision of a VM is running an unmodified OS
– Significant wasted effort just to keep guest OS happy
• “Paravirtualization”: small modifications to guest OS to simplify virtualization
Three examples of paravirtualization in Xen (a sketch of the second follows this list):
1. To avoid flushing TLB when invoking VMM, Xen is mapped into upper 64 MB of address space of each VM
2. Guest OS allowed to allocate pages; Xen just checks that it doesn’t violate protection restrictions
3. To protect guest OS from user programs in VM, Xen takes advantage of 80x86’s four protection levels
– Most x86 OSes keep everything at privilege level 0 or 3
– Xen VMM runs at highest level (0)
– Guest OS runs at next level (1)
– Applications run at lowest level (3)
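A schematic sketch of the paravirtualization idea in example 2, where the modified guest asks the VMM to install page-table entries instead of writing them itself and the VMM validates each request. This mirrors the idea behind Xen's approach but is not the real Xen hypercall interface; every name here is made up.

```c
/* Schematic sketch of paravirtualized page-table updates: instead of
 * writing PTEs directly (which would have to trap), the guest makes an
 * explicit "hypercall" and the VMM validates and installs the entry. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define NPTES 16

static uint32_t machine_pt[NPTES];        /* the real page table, VMM-owned */
static bool     frame_owned_by_guest[64]; /* protection info kept by VMM    */

/* "Hypercall": guest requests that PTE idx map to a physical frame. */
int hypercall_pt_update(unsigned idx, uint32_t frame) {
    if (idx >= NPTES || frame >= 64 || !frame_owned_by_guest[frame])
        return -1;                 /* reject: would violate protection      */
    machine_pt[idx] = frame;       /* VMM installs the entry on its behalf  */
    return 0;
}

int main(void) {
    frame_owned_by_guest[7] = true;               /* VMM granted frame 7   */
    printf("map to owned frame:   %d\n", hypercall_pt_update(3, 7));
    printf("map to foreign frame: %d\n", hypercall_pt_update(4, 9));
    return 0;
}
```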
Xen Changes for Paravirtualization
• Port of Linux to Xen changed about 3,000 lines, or about 1% of the 80x86-specific code
– Doesn’t affect application binary interfaces (ABI/API) of guest OS
• OSes supported in Xen 2.0:

OS           Runs as host OS   Runs as guest OS
Linux 2.4    Yes               Yes
Linux 2.6    Yes               Yes
NetBSD 2.0   No                Yes
NetBSD 3.0   Yes               Yes
Plan 9       No                Yes
FreeBSD 5    No                Yes
http://wiki.xensource.com/xenwiki/OSCompatibility
Xen and I/O
• To simplify I/O, privileged VMs assigned to each hardware I/O device: “driver domains”
– Xen jargon: “domains” = virtual machines
• Driver domains run physical device drivers
– Interrupts still handled by VMM before being sent to appropriate driver domain
• Regular VMs (“guest domains”) run simple virtual device drivers
– Communicate with physical device drivers in driver domains to access physical I/O hardware
• Data sent between guest and driver domains by page remapping (see the sketch below)
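A conceptual C sketch of this split front-end/back-end communication: the guest domain publishes requests that refer to whole pages into a shared ring, and the driver domain consumes them, with the page itself handed over by remapping rather than copying. This only illustrates the idea; it is not Xen's actual grant-table or I/O-ring API, and all names are made up.

```c
/* Conceptual sketch of split-driver I/O: a shared request ring between a
 * guest-domain front end and a driver-domain back end, where data pages
 * are transferred by remapping instead of copying. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define RING_SIZE 4
#define PAGE_SIZE 4096

typedef struct {
    uint32_t page_id;                 /* which guest page holds the data */
    uint32_t length;                  /* bytes valid in that page        */
} io_request_t;

typedef struct {
    io_request_t req[RING_SIZE];
    unsigned prod, cons;              /* producer (guest) / consumer (driver) */
} shared_ring_t;

static shared_ring_t ring;
static uint8_t pages[8][PAGE_SIZE];   /* stand-in for remappable memory pages */

/* Front end (guest domain): publish a request describing a page. */
void frontend_send(uint32_t page_id, uint32_t len) {
    ring.req[ring.prod % RING_SIZE] = (io_request_t){ page_id, len };
    ring.prod++;                      /* a real ring would also notify the peer */
}

/* Back end (driver domain): consume requests and hand pages to real driver. */
void backend_poll(void) {
    while (ring.cons < ring.prod) {
        io_request_t r = ring.req[ring.cons % RING_SIZE];
        /* The page itself is transferred by remapping, not by copying. */
        printf("driver domain got page %u (%u bytes): %s\n",
               r.page_id, r.length, (char *)pages[r.page_id]);
        ring.cons++;
    }
}

int main(void) {
    strcpy((char *)pages[2], "packet data");
    frontend_send(2, 12);
    backend_poll();
    return 0;
}
```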
Xen Performance
• Performance relative to native Linux for Xen, for 6 benchmarks (from Xen developers):
[Bar chart: Xen’s performance is 92%–100% of native Linux across SPEC INT2000, Linux build time, PostgreSQL information retrieval, PostgreSQL OLTP, dbench, and SPECweb99]
• But are these user-level CPU-bound programs? I/O-intensive workloads? I/O-intensive workloads that are also I/O-bound?
Xen Performance, Part II
• Subsequent study noticed the Xen experiments were based on a single Ethernet network interface card (NIC), and that single NIC was the performance bottleneck
[Line chart: receive throughput (Mbits/sec, 0–2500) versus number of network interface cards (1–4), for native Linux, Xen with privileged driver VM (“driver domain”), and Xen with guest VM + driver VM]
Xen Performance, Part III
[Bar chart: instructions, L2 misses, I-TLB misses, and D-TLB misses, each relative to Xen with privileged driver domain, for native Linux, Xen with privileged driver VM only, and Xen with guest VM + driver VM]
1. > 2X instructions for guest VM + driver VM
2. > 4X L2 cache misses
3. 12X – 24X Data TLB misses
Xen Performance, Part IV
1. > 2X instructions: caused by page remapping and transfer between driver and guest VMs, and by communication over the channel between the 2 VMs
2. 4X L2 cache misses: Linux uses a zero-copy network interface that depends on ability of NIC to do DMA from different locations in memory
– Since Xen doesn’t support “gather” DMA in its virtual network interface, it can’t do true zero-copy in the guest VM
3. 12X–24X data TLB misses: 2 Linux optimizations
– Superpages for part of Linux kernel space: one 4 MB page lowers TLB misses versus 1024 4 KB pages. Not in Xen
– PTEs marked global aren’t flushed on context switch, and Linux uses them for kernel space. Not in Xen
• Future Xen may address 2 and 3, but 1 is inherent?
Conclusion
• VM monitor presents SW interface to guest software, isolates guest states, and protects itself from guest software (including guest OSes)
• Virtual machine revival:
– Overcome security flaws of large OSes
– Manage software, manage hardware
– Processor performance no longer highest priority
• Virtualization challenges for processor, virtual memory, and I/O
– Paravirtualization to cope with those difficulties
• Xen as example VMM using paravirtualization
– 2005 performance on non-I/O-bound, I/O-intensive apps: 80% of native Linux without driver VM, 34% with driver VM