Scale and Performance in the Denali Isolation Kernel


Andrew Whitaker, Marianne Shaw, and Steven D. Gribble
Presented By
Steve Rizor
Abstract

The Denali isolation kernel is an operating system architecture designed to safely multiplex
a large number of internet services on shared hardware

Allows new services to be “pushed” onto third-party infrastructures, relieving authors from
the burden of maintaining physical infrastructure

Exposes a virtual machine abstraction but does not attempt to emulate the underlying
hardware precisely

Modifies the virtual architecture to gain scale, performance, and simplicity of
implementation
Introduction
With the proliferation of Internet services comes the
need for more hardware, but dedicating one
machine to each service is usually highly inefficient.
A large fraction of web
services are infrequently
accessed, while a small
fraction is frequently
accessed.
Introduction
Why not virtualize all of the
infrequently-accessed services?
If one machine can handle
10,000 requests per hour
for one service, why can’t
one machine handle 1
request per hour for 10,000
services?
Making a Case for Isolation Kernels

Many services can already run on one machine – but there is a
need for security
 Isolation not only enables many services to run together, but also prevents them
from affecting one another
 This enables the push of new/untrusted services without the worry of harming
other services
 It also brings about an interesting experimentation infrastructure – the ability to
deploy wide-area testbeds for network research: thousands of running
subjects without the physical machines
Isolation Kernel Design Principles
An isolation kernel is a small-kernel operating system architecture targeted
at hosting multiple untrusted applications that require little data sharing.
1. Expose low-level resources rather than high-level abstractions.
• High-level abstractions entail significant complexity and typically have a wide API,
violating the security principle of economy of mechanism. They also invite “layer below”
attacks, in which an attacker gains unauthorized access to a resource by requesting it
below the layer of enforcement.
2. Prevent direct sharing by exposing only private, virtualized namespaces.
• Little direct sharing is needed across Internet services, and therefore an isolation kernel
should prevent direct sharing by confining each application to a private namespace.
Memory pages, disk blocks, and all other resources should be virtualized, eliminating
the need for a complex access control policy: the only sharing allowed is through the
virtual network.
Isolation Kernel Design Principles
3. Scalability.
• An isolation kernel designed for Internet services must be able to scale to
thousands of services on a single machine. As such, the memory footprint (including the
kernel metadata) must be minimized. Since the set of all unpopular services won’t fit in
memory, the kernel must treat memory as a cache of popular services, swapping
inactive services to disk. Because the services are unpopular, this cache will have a poor
hit rate, so swapping must be rapid to reduce cache miss penalties.
4. Modify the virtualized architecture for simplicity, scale, and performance.
• VMMs such as Disco adhere to the first two principles. They also strive to support
legacy operating systems by precisely emulating the physical hardware. For an isolation
kernel, however, deviating from the underlying physical hardware can enhance performance,
simplicity, and scalability. The drawback is that this removes support for
unmodified legacy operating systems.
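The "memory as a cache of popular services" idea from principle 3 can be sketched as follows. This is a minimal illustration, not Denali's implementation: the least-recently-used eviction policy, the `ServiceCache` class, and its `capacity` parameter are all assumptions made for the example, and "disk" is just a dictionary.

```python
from collections import OrderedDict


class ServiceCache:
    """Hypothetical model: memory holds at most `capacity` VMs; the rest live on disk."""

    def __init__(self, capacity):
        self.capacity = capacity       # number of VMs that fit in memory (assumed)
        self.resident = OrderedDict()  # vm_id -> state, least recently used first
        self.on_disk = {}              # state of swapped-out VMs
        self.hits = 0
        self.misses = 0

    def access(self, vm_id):
        """Deliver a request to vm_id, swapping it in (and a victim out) if needed."""
        if vm_id in self.resident:
            self.hits += 1
            self.resident.move_to_end(vm_id)  # mark as most recently used
            return self.resident[vm_id]
        self.misses += 1
        # Swap in from disk, or create fresh state for a VM seen for the first time.
        state = self.on_disk.pop(vm_id, {"id": vm_id})
        if len(self.resident) >= self.capacity:
            # Assumed policy: evict the least recently used VM to disk.
            victim, victim_state = self.resident.popitem(last=False)
            self.on_disk[victim] = victim_state
        self.resident[vm_id] = state
        return state


cache = ServiceCache(capacity=2)
for vm in ["a", "b", "a", "c"]:
    cache.access(vm)
print(list(cache.resident), list(cache.on_disk), cache.hits, cache.misses)
```

With many unpopular services, most accesses take the miss path, which is why the cost of the swap-in step matters more than the eviction policy itself.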
Denali Isolation Kernel
While the Denali isolation kernel
looks like a standard VMM,
its virtual machine interface is
quite different from most others.
The Denali virtual instruction set is a subset of x86, so that most virtual instructions execute
directly on the physical processor. x86 VMMs normally have to use binary rewriting and
memory protection techniques to virtualize some of the instructions. Since Denali does not
support legacy operating systems, those instructions are simply defined to have ambiguous
semantics. At worst, a VM that executes one will harm only itself. Such instructions are
rarely used, however, and none are emitted by C compilers such as gcc.
The instruction set also adds an “idle-with-timeout” instruction that relinquishes control to
another VM instead of using time in an idle loop, an instruction to terminate the VM, and
several virtual registers revealing information about the system.
Denali Isolation Kernel

Denali’s virtual machine interface also differs in that the emulated hardware
is not a representation of the physical system:
• By keeping the emulated devices static, there is no need to poll for hardware.
• Keeping the devices simple reduces the number of programmed I/O instructions used to
transmit or receive a single packet.

Denali uses a round-robin schedule across all the active VMs (those with active
threads) and uses a buffered interrupt scheme to prevent thrashing:
• VMs that voluntarily give up time via the “idle-with-timeout” instruction are given priority
once the timeout has expired.

Each Denali VM is given its own (virtualized) physical 32-bit address space:
• A VM may only access a subset of this 32-bit address space, the size and range of which is
chosen by the isolation kernel when the VM is instantiated. The kernel itself is mapped into a
portion of the address space that the VM cannot access; because of this, physical TLB
flushes can be avoided on VM/VMM crossings.
• Virtual registers are stored in a page at the beginning of a VM's (virtual) physical address
space. This page is shared between the VM and the isolation kernel, avoiding the overhead of
kernel traps for register modifications. In other respects, the virtual registers behave like
normal memory (for example, they can be paged out to disk).
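The scheduling policy described on this slide (round-robin over active VMs, with priority for a VM whose idle-with-timeout has expired) can be sketched roughly as below. This is an illustrative model under stated assumptions, not Denali's scheduler: the `Scheduler` class and its method names are hypothetical, time is counted in whole scheduling quanta, and interrupt buffering is left out.

```python
from collections import deque


class Scheduler:
    """Hypothetical round-robin scheduler with an idle-with-timeout operation."""

    def __init__(self):
        self.run_queue = deque()  # runnable VMs in round-robin order
        self.sleeping = {}        # vm_id -> quantum at which its timeout expires
        self.now = 0              # current time, in scheduling quanta (assumed unit)

    def add(self, vm_id):
        self.run_queue.append(vm_id)

    def idle_with_timeout(self, vm_id, timeout):
        """The VM yields voluntarily and leaves the run queue until its timeout expires."""
        if vm_id in self.run_queue:
            self.run_queue.remove(vm_id)
        self.sleeping[vm_id] = self.now + timeout

    def tick(self):
        """Advance one quantum and return the VM chosen to run, or None if all are idle."""
        self.now += 1
        # VMs whose timeout has expired, earliest wake-up time first.
        woken = sorted(
            (v for v, t in self.sleeping.items() if t <= self.now),
            key=lambda v: (self.sleeping[v], v),
        )
        # Insert at the front so woken VMs run before the ordinary rotation.
        for v in reversed(woken):
            del self.sleeping[v]
            self.run_queue.appendleft(v)
        if not self.run_queue:
            return None
        vm = self.run_queue.popleft()
        self.run_queue.append(vm)  # back of the queue for the next round
        return vm


s = Scheduler()
for vm in ["a", "b", "c"]:
    s.add(vm)
order = [s.tick()]
s.idle_with_timeout("b", 2)  # b gives up the CPU for two quanta
order += [s.tick(), s.tick(), s.tick()]
print(order)
```

In this model a sleeping VM consumes no quanta at all, whereas a guest spinning in an idle loop would stay in the run queue and take its full turn on every rotation.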
Benchmarks
Because a standard operating system must be modified to run on the Denali
isolation kernel, a small guest OS named Ilwaco was developed for the virtual machine
interface and used for testing.
Because of the simplification of the virtual network device, fewer programmed I/O instructions
are needed per packet. However, Denali still requires a user/kernel switch where
BSD does not. Adding a system call to BSD's packet path (forcing this
user/kernel switch) brings the BSD performance more into line with Denali.
Benchmarks
Buffering interrupt requests yields a clear performance gain.
Note the performance hit around 800 VMs due to memory demands and
excessive paging.
Benchmarks
Using the idle-with-timeout instruction, there is a large
performance gain over normal OS idle loops.
Benchmarks
Even with 800 virtual machines
running, throughput
remains high.
The effects of paging are
quite obvious – with a larger
amount of memory, the cliff
can be pushed further out.
Benchmarks
Running the Quake II Linux server on Denali, it is apparent that even with
30 servers (4 clients each), there is no change in latency or reliability. The
scheduling algorithm, combined with the idle-with-timeout instruction and
the buffered interrupts, keeps the servers running without issues.
References

Andrew Whitaker, Marianne Shaw, and Steven D. Gribble, “Scale and Performance in
the Denali Isolation Kernel”, OSDI’02.