Practical, transparent operating system support for superpages

Practical, transparent operating
system support for superpages
Juan Navarro, Sitaram Iyer, Peter
Druschel, Alan Cox
OSDI 2002
1
What’s a Superpage?
• A very large page size, much greater than
the base page size
• Supported by most computer architectures
today
• Machines that support superpages usually
have several different page sizes,
beginning with the base page and then in
increasing sizes, each a power of 2 –
today, some as large as a gigabyte.
2
Background Summary
• Virtual memory automates the movement
of a process’s address space (code and
data) between disk and primary memory.
• Virtual addresses are translated using
information stored in the page table.
• Page tables are stored in primary memory.
• Extra memory references due to page
table lookups degrade performance.
3
Translation Lookaside Buffer
• TLB (translation lookaside buffer) – faster
memory; caches portions of the page table
• If most memory references “hit” in the
TLB, the overhead of address translation
is acceptable.
• TLB coverage: the amount of memory that
can be accessed without a TLB miss, i.e.,
the memory mapped by the TLB entries.
4
The Problem
• Computer memories have increased in size
faster than TLBs.
• TLB coverage as a percentage of total
memory has decreased over the years.
– At the time this paper was written, most TLBs
covered a megabyte or less of physical memory
– Many applications have working sets that are not
completely covered by the TLB
• Result: more TLB misses, poorer
performance.
5
The Solution
• Superpages!
• Increase coverage without increasing TLB
size.
• How?
– By increasing amount of memory each TLB
entry can map
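A rough worked example of the coverage arithmetic: coverage is the number of TLB entries times the amount of memory each entry maps. The 128-entry TLB and the 8 KB and 4 MB page sizes below are illustrative assumptions, not figures taken from the slides.

```python
KB = 1024
MB = 1024 * KB

def tlb_coverage(entries: int, page_size: int) -> int:
    """Bytes reachable without a TLB miss if every entry maps one page of page_size."""
    return entries * page_size

# Hypothetical TLB with 128 entries (an assumption for illustration).
base_coverage = tlb_coverage(128, 8 * KB)    # every entry maps a base page
super_coverage = tlb_coverage(128, 4 * MB)   # every entry maps a superpage

print(base_coverage // MB, "MB with base pages")
print(super_coverage // MB, "MB with superpages")
```

The TLB itself is unchanged in both cases; only the amount of memory mapped per entry differs.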
6
Hardware-Imposed Constraints
• Must have enough contiguous free memory to
store each superpage
• Superpage addresses (physical and virtual)
must be aligned on the superpage size: e.g., a
64KB SP must start at address 0, or 64KB, or
128KB, etc.
• TLB entry only has one set of bits (R, M, etc.)
and thus can only provide coarse-grained info –
not good for efficient page management.
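The alignment constraint can be checked with simple bit arithmetic, assuming superpage sizes are powers of two. The helper names below are hypothetical and only illustrate the rule.

```python
KB = 1024

def is_aligned(addr: int, size: int) -> bool:
    """True if addr is a multiple of size (size must be a power of two)."""
    return addr & (size - 1) == 0

def align_down(addr: int, size: int) -> int:
    """Start of the size-aligned region that contains addr."""
    return addr & ~(size - 1)

def can_map_as_superpage(vaddr: int, paddr: int, size: int) -> bool:
    """Both the virtual and the physical start address must be aligned."""
    return is_aligned(vaddr, size) and is_aligned(paddr, size)

print(is_aligned(128 * KB, 64 * KB))   # a 64KB superpage may start at 128KB
print(is_aligned(96 * KB, 64 * KB))    # but not at 96KB
```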
7
Design Issues
• Issues for a superpage management
system:
– Storage allocation and fragmentation control
– Promotion
– Demotion
– Eviction
8
Issues: Frame Allocation
• When a page fault occurs, must choose a
frame for the new page
– In non-superpage systems any frame will do
– In a superpage system we may later decide to
include this page in a superpage – how does
this affect the decision?
• Possible approaches to allocation:
– Reservation based
– Relocation based
9
Reservation-based Allocation
• When a page is initially loaded choose a
superpage size and reserve aligned,
contiguous frames to hold it.
– As other pages are referenced, load them into
the previously reserved frames
• Will adjoining pages ever be needed by the
program?
10
[Figure 2: Reservation based allocation. Labels: object mapping, mapped
pages, virtual address space, superpage alignment boundaries, physical
address space, allocated frames, unused page frame reservation.]
11
Relocation-based Allocation
• Wait until a superpage is formed, then
move pages to contiguous locations
– Incurs overhead of moving pages when
superpages are created.
• Tradeoff: relocation costs versus unused
reservations (internal fragmentation)
12
Choosing a Page Size
• Regardless of whether reservation-based
or relocation-based allocation is used, a
superpage size must also be chosen.
• When a computer has several page sizes
(base page + several larger sizes), how to
choose which size to use?
• The issue: larger versus smaller
13
Choosing a Page Size
• Possibilities:
– the largest superpage size available
– a superpage size that most closely matches the
VM object the page belongs to
– a smaller size, based on memory availability.
• Tradeoff: possible performance gains from
large page versus possible loss of
contiguous physical memory space that may
be needed later
14
Large Pages?
• Large page sizes increase TLB coverage
the most, optimize I/O.
• But … they can also greatly increase the
memory requirements of a process
– Some pages are only partially filled
– Small localities = a kind of internal
fragmentation (page only partially referenced)
– If pages are not filled or have internal
fragmentation, paging traffic can actually
increase instead of decrease.
15
Small Pages?
• Small page sizes reduce internal
fragmentation (amount of wasted space in
an allocated block & the amount of
unreferenced content in a loaded page).
• But … they have all the problems that
large pages solve, plus they also have the
possibility of causing more page faults.
16
So Why Not Use Multiple Page
Sizes?
• Memory management is more complex
– Uniform page size is simple
• Multiple page sizes cause external
fragmentation
– It’s hard to maintain blocks of contiguous free
space to accommodate large superpages.
17
[Figure: external fragmentation. Superpages SP1 to SP4 occupy memory.
SP4 leaves and is replaced by SP5; SP2 leaves. No room for a large
superpage.]
18
Fragmentation Control
• Memory can become fragmented with
reservation-based approach and pages of
various sizes.
• Possible solutions:
– Page out or overwrite areas of memory that
haven’t been used recently
– Preempt unused portions of existing
reservations
19
Issues
• Issues for a superpage management
system:
– Allocation and fragmentation control
– Promotion
– Demotion
– Eviction
20
Issues: Promotion
• Initially, base pages are treated normally.
• Promote when enough pages have been loaded
to justify creating a superpage:
– Combine TLB entries into one entry
– Load remaining pages, if necessary, to fill reservation
• Promotion may be incremental
• Tradeoff: early promotion (before all base pages
have been faulted in) reduces TLB misses but
wastes memory if all pages of the superpage are
not needed; late promotion delays benefits of
greater TLB coverage.
21
Issues
• Issues for a superpage management
system:
– Allocation and fragmentation control – done
– Promotion – done
– Demotion & eviction
22
Issues: Demotion
• Reduce superpage size
– To individual base pages
– To a smaller superpage
• All or some of the base pages may have been
chosen for eviction
• Difficulty: the use (reference) and dirty bits in the TLB cover the
whole superpage, so they are less helpful than per-base-page bits.
– If the dirty bit is set, the entire superpage must be
written to disk, even if only part of it has changed.
23
Design of System Proposed by
Navarro, et al.
• The system discussed in this paper is
reservation-based.
• It supports multiple superpage sizes to reduce
internal fragmentation
– Effect on external fragmentation?
• It demotes infrequently referenced pages to
reclaim memory frames
• It is able to maintain contiguity (large blocks of
contiguous free frames) without using
compaction
24
Design Decisions in This
System
• With respect to allocation and
fragmentation
– Storage Management
– Reservation-based allocation
• Choosing a page size
– Fragmentation control
25
Storage Management
• Free space (available for reservations) is
stored on multiple lists, ordered by
superpage size
– Buddy system is used for allocation
• Partially filled reservations are kept on a
multi-list (one list for each page size) by
largest page size that can be obtained by
preempting unused portion
• Population maps track allocated portions
of reservations
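A minimal sketch of a buddy allocator over page frames, assuming memory is a single power-of-two range of frames and that a block of order k spans 2**k base frames. The class and method names are hypothetical; the slides do not give the data structures actually used.

```python
class BuddyAllocator:
    """Free frames kept in one list per order; order k means 2**k base frames."""

    def __init__(self, max_order: int):
        self.max_order = max_order
        # free[k] holds the start frame numbers of free blocks of order k.
        self.free = {k: set() for k in range(max_order + 1)}
        self.free[max_order].add(0)  # all memory starts as one free block

    def alloc(self, order: int):
        """Return the start frame of an aligned block of 2**order frames, or None."""
        k = order
        while k <= self.max_order and not self.free[k]:
            k += 1                      # look for a larger block to split
        if k > self.max_order:
            return None
        start = min(self.free[k])
        self.free[k].remove(start)
        while k > order:                # split, keeping the upper halves free
            k -= 1
            self.free[k].add(start + (1 << k))
        return start

    def release(self, start: int, order: int):
        """Free a block and coalesce it with its buddy while the buddy is free."""
        while order < self.max_order:
            buddy = start ^ (1 << order)
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free[order].add(start)
```

Because blocks are only ever split in half and merged with their buddy, every block handed out is aligned on its own size, which is the alignment a superpage needs.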
26
Frame Allocation
• A page fault triggers a decision: does the
page have an existing reservation or not?
• If not, then
– select a preferred SP size,
– locate a set of contiguous, aligned frames
– load the page into the correct (aligned) frame
– enter the mapping in the page table
– reserve the remaining frames
• Or, load the page into a previously
reserved frame & enter mapping in PT
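The fault-time decision above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: addresses are expressed as page numbers, `alloc_frames` is a hypothetical callback that returns the first frame of an aligned, contiguous block, and reservations are searched linearly.

```python
from dataclasses import dataclass, field

@dataclass
class Reservation:
    vbase: int     # first virtual page number covered (aligned)
    pbase: int     # first physical frame reserved (aligned)
    npages: int    # reservation size in base pages (power of two)
    populated: set = field(default_factory=set)  # offsets already mapped

class FaultHandler:
    def __init__(self, alloc_frames):
        self.alloc_frames = alloc_frames  # hypothetical: npages -> aligned pbase
        self.reservations = []
        self.page_table = {}              # virtual page number -> frame number

    def find_reservation(self, vpn):
        for r in self.reservations:
            if r.vbase <= vpn < r.vbase + r.npages:
                return r
        return None

    def handle_fault(self, vpn, preferred_npages):
        r = self.find_reservation(vpn)
        if r is None:
            # No reservation yet: reserve aligned, contiguous frames.
            vbase = vpn - vpn % preferred_npages
            pbase = self.alloc_frames(preferred_npages)
            r = Reservation(vbase, pbase, preferred_npages)
            self.reservations.append(r)
        # Load the page at the matching offset inside the reservation.
        offset = vpn - r.vbase
        r.populated.add(offset)
        self.page_table[vpn] = r.pbase + offset
        return self.page_table[vpn]
```

Keeping the same offset in the virtual and physical ranges is what later allows the populated pages to be promoted to a superpage without copying.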
27
Choosing a Superpage Size in the
Navarro System
• Since the decision is made early, can’t
decide based on process’s behavior.
• Base decision on the memory object type;
prefer too large to too small
– If the decision is too large, it is easy to reclaim
the unneeded space
– If the decision is too small, relocation is
needed
28
Guidelines for Choosing
Superpage Size
• For fixed-size memory objects (e.g. code
segments) reserve the largest superpage
that does not extend beyond the object.
• For dynamic-sized objects (stacks, heaps)
that grow one page at a time: allocate
extra space for growth.
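One way to express these guidelines in code. The page sizes are illustrative, and the rule used for growing objects (allow the reservation to reach past the current end, but cap it at the object's current size) is an assumption chosen for the sketch; the slides only say that extra space is allocated for growth.

```python
KB = 1024
SIZES = [8 * KB, 64 * KB, 512 * KB, 4096 * KB]  # illustrative page sizes, ascending

def preferred_size(fault_off, obj_size, sizes=SIZES, dynamic=False):
    """Pick a reservation size (bytes) for a fault at byte offset fault_off."""
    for size in reversed(sizes):                  # try the largest size first
        if dynamic:
            # Growing object (stack, heap): the region may extend past the
            # current end. Capping at the current size is an assumption.
            if size <= obj_size:
                return size
        else:
            start = fault_off - fault_off % size  # aligned region holding the fault
            if start + size <= obj_size:          # must not pass the object's end
                return size
    return sizes[0]                               # fall back to a base page

print(preferred_size(10 * KB, 600 * KB) // KB, "KB")
print(preferred_size(96 * KB, 100 * KB, dynamic=True) // KB, "KB")
```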
29
Preempting Reservations in the
Navarro System
• After a page fault, if the guidelines call for
a superpage that is too large for any
available free block:
– Reserve a smaller size superpage or
– Preempt an existing reservation that has
enough unallocated frames to satisfy the
request
• This system uses preemption wherever
possible.
30
Preemption Policy - LRA
• Which reservation is preempted if more
than one can satisfy the request?
• Choose the one “whose most recent page
allocation occurred least recently” - LRA
– Reason: spatial locality suggests that related
pages will all be accessed fairly closely
together in time (e.g., arrays, memory
mapped files). If a reservation hasn’t added
new pages recently, it’s unlikely to do so any
time soon.
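The LRA choice can be sketched as a minimum over the candidate reservations. The field names and the logical timestamps are hypothetical, and the sketch counts unallocated frames without checking that they are contiguous and aligned, which a real system would also need.

```python
from dataclasses import dataclass

@dataclass
class Reservation:
    name: str
    unallocated_frames: int  # frames reserved but not yet populated
    last_alloc_time: int     # logical time of the most recent page allocation

def pick_victim(reservations, frames_needed):
    """Least recently allocated-into reservation that can satisfy the request."""
    candidates = [r for r in reservations
                  if r.unallocated_frames >= frames_needed]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.last_alloc_time)

pool = [
    Reservation("A", unallocated_frames=12, last_alloc_time=90),
    Reservation("B", unallocated_frames=8, last_alloc_time=15),
    Reservation("C", unallocated_frames=2, last_alloc_time=3),
]
print(pick_victim(pool, 8).name)
```

Reservation C has gone longest without a new page but cannot supply 8 frames, so it is not a candidate.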
31
Fragmentation Control
• Contiguity (of storage) is a contended resource
• Memory becomes fragmented due to
– Multiple page sizes
– Wired pages (can’t be paged out)
• Result: not enough large, properly aligned
blocks of free memory.
• Navarro et al. propose several implementation
techniques to address this problem
32
Fragmentation Control in the
Navarro System*
• The “buddy allocator” (free list manager)
maintains multiple lists of free blocks,
ordered by size
– When possible, coalesce adjacent blocks of
free memory to form larger blocks.
• Modify the page replacement daemon to
include contiguity as one of the factors to
be considered.
33
Navarro System: Design
Decisions
• With respect to promotion, demotion &
eviction
– Incremental promotions
– Speculative demotions
– Paging out dirty superpages
34
Promotion & Demotion
• Navarro et al. implement incremental
promotion
– e.g., if 4 aligned pages of a 16-page
reservation become filled, promote to a mid-size superpage
• Demotion: when a base page is evicted,
its superpage is demoted.
– Speculative demotion: demote active
superpages to determine if the whole page is
still in use or just parts
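Incremental promotion can be sketched with a per-reservation population bitmap: after each fault, look for the largest aligned region that contains the new page and is fully populated. The function name and the list-of-booleans bitmap are assumptions made for illustration, and the sketch assumes each supported size divides the next.

```python
def largest_promotable(populated, offset, sizes):
    """Return (start, size) of the largest fully populated aligned region
    that contains base page `offset`, or None if there is none.

    populated: one bool per base page of the reservation
    sizes:     superpage sizes in base pages, ascending (e.g. [4, 16])
    """
    best = None
    for size in sizes:
        start = offset - offset % size
        if start + size > len(populated):
            break
        if not all(populated[start:start + size]):
            break  # any larger region contains this one, so it is not full either
        best = (start, size)
    return best

pages = [False] * 16
for p in (4, 5, 6, 7):
    pages[p] = True
print(largest_promotable(pages, 5, [4, 16]))
```

With only pages 4 to 7 populated, the 4-page region starting at 4 qualifies for promotion while the full 16-page reservation does not.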
35
Paging Out Dirty Superpages*
• If a dirty superpage is to be flushed to disk, there
is no way to tell if one page is dirty or all pages.
• Writing out the entire superpage is a huge
performance hit.
• Navarro et al.’s solution: Don’t write to clean
superpages.
– If a process tries to write to a clean SP, demote the SP.
– Repromote later if all base pages are dirty.
• They also experimented with a content hash
which could tell if a page had been changed
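A small sketch of the content-hash idea: record a digest per base page while the superpage is clean, then compare digests at flush time to find the pages that really changed. The use of SHA-1 and the 4096-byte page size are assumptions for the example; the slides only say that a content hash was tried.

```python
import hashlib

PAGE = 4096  # illustrative base page size

def page_digests(data: bytes, page_size: int = PAGE):
    """One digest per base page of the superpage contents."""
    return [hashlib.sha1(data[i:i + page_size]).digest()
            for i in range(0, len(data), page_size)]

def dirty_pages(old_digests, data: bytes, page_size: int = PAGE):
    """Indexes of base pages whose contents differ from the recorded digests."""
    new_digests = page_digests(data, page_size)
    return [i for i, (old, new) in enumerate(zip(old_digests, new_digests))
            if old != new]

superpage = bytearray(4 * PAGE)            # four clean base pages
clean = page_digests(bytes(superpage))
superpage[PAGE + 10] = 1                   # modify one byte in base page 1
print(dirty_pages(clean, bytes(superpage)))
```

Only the pages reported as changed need to be written back, at the cost of hashing the whole superpage.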
36
Goal of Superpage Management
Systems
• Good TLB coverage with minimal internal
fragmentation
• Navarro et al.’s conclusion: create the
largest superpage possible that isn’t larger
than the size of the memory object (except
for stack/heap).
• If there isn’t enough memory, preempt
existing reservations (these pages had
their chance)
37
Current Usage
• Superpages were most often used at the
time this paper was written to store
portions of the kernel and various buffers.
– Reason: the memory requirements for these
objects are static and can be known in
advance.
– Superpage size can be chosen to fit the
object.
• More likely to be implemented in clusters
and large servers than in desktop
machines.
38
Current Research
• This paper focuses on supporting
superpage use in application memory, as
opposed to kernel memory.
• An ongoing research area: memory
compaction – whenever there are idle
CPU cycles, work to establish large
contiguous blocks of free memory
– Compare to disk management
39
Summary: Potential Advantages of
Superpages
• Ideally, superpages can improve
performance
– Without increasing size of TLB (which would
be expensive and increase TLB access time)
– Without increasing base page size (which can
lead to internal fragmentation)
• Superpages allow use of small (base) and
large (super) page sizes at the same time.
40
Summary - Tradeoff
• Large superpages increase TLB coverage
• Large superpages are more likely to fragment
memory. (Why?)
• Benefits of large superpages must be weighed
against the cost of “contiguity restoration techniques”
– Pages loaded into reserved areas must be loaded at
the proper offset.
– Must be enough space for the entire superpage
– More overhead for free space management
41
Authors’ Conclusions
• Can achieve 30%-60% improvement in
performance, based on tests using an
accepted set of benchmark programs as
well as actual applications.
• Must employ contiguity restoration
techniques: demotion, preemption,
compaction
• Must be able to support a variety of page
sizes
42
Conclusion
• Superpage management can be
transparently integrated into an existing
OS (FreeBSD, in this case).
– “hooks” connect the OS to the superpage
module at critical events: page faults, page
allocation, page replacement, etc.
• Tests show this technique scales well,
according to authors.
43
Follow-up
• “Supporting superpage allocation without
additional hardware support”, Mel
Gorman, Patrick Healy, Proceedings of the
7th International Symposium on Memory
Management, 2008
44
Premise
• Fragmentation control is essential for
successful implementation of superpages.
• Navarro’s approach doesn’t always work.
• Major hindrance: “wired” pages – pages
that can’t be paged out or moved – tend to
become scattered throughout memory
• (Navarro addressed this issue; proposed
to monitor creation of kernel wired pages,
cluster them in one location.)
45
• Another problem: page replacement
processes that don’t consider superpage
structure
– They reclaim pages based on age and do not
consider contiguity. [Note: the Navarro system
does claim to take this into consideration,
for example by activating the paging daemon
whenever the system fails to satisfy a request
for a certain superpage size]
46
GPBM
• Grouping Pages By Mobility (GPBM) is a
placement policy described by Gorman &
Healy that allocates frames to pages
based on whether or not the pages can
later be relocated.
• Treats the address space as if divided into
arenas, which correspond in size to the
largest superpage.
47
Page Mobility Types
• Movable – no restrictions; can be
relocated as long as PT is updated
• Reclaimable – kernel pages that can be
added to the free list (certain kinds of
caches, for example)
• Temporary – pages that are known to be
needed for a short time; treated as
reclaimable
• Non-reclaimable – wired pages
48
How are the classes used?
• Group pages of the same type into arenas
of the same type.
• The number of movable and reclaimable
arenas has the most effect on the
number of superpages that can be
allocated.
• Contiguity-aware page replacement is
used.
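A simplified sketch of grouping by mobility: each arena is tagged with one mobility type, and a frame request is served from an arena of the matching type, claiming an untagged arena when needed. The class below is hypothetical and omits freeing and fallback between types; it is not Gorman and Healy's implementation.

```python
from enum import Enum

class Mobility(Enum):
    MOVABLE = 1
    RECLAIMABLE = 2
    TEMPORARY = 3        # treated as reclaimable
    NON_RECLAIMABLE = 4  # wired pages

class ArenaAllocator:
    def __init__(self, num_arenas: int, frames_per_arena: int):
        self.frames_per_arena = frames_per_arena
        self.arena_type = [None] * num_arenas  # None = arena not yet claimed
        self.used = [0] * num_arenas           # frames handed out per arena

    def alloc(self, mobility: Mobility):
        """Return a frame number from an arena of the same mobility type."""
        if mobility is Mobility.TEMPORARY:
            mobility = Mobility.RECLAIMABLE
        for i, t in enumerate(self.arena_type):
            if t is mobility and self.used[i] < self.frames_per_arena:
                return self._take(i)
        for i, t in enumerate(self.arena_type):
            if t is None:                      # claim a fresh arena for this type
                self.arena_type[i] = mobility
                return self._take(i)
        return None                            # no fallback in this sketch

    def _take(self, i: int) -> int:
        frame = i * self.frames_per_arena + self.used[i]
        self.used[i] += 1
        return frame
```

Because wired pages stay in their own arenas, arenas holding movable or reclaimable pages can still be emptied to form a superpage-sized block.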
49
Summary
• Superpages promise performance improvement, but
so far there is no generally accepted approach for
user-level pages.
• The reservation-based approach seems to be the most
popular
• Contiguity is the biggest problem
• Some researchers propose hardware solutions,
such as re-designing the memory controller to allow
holes in SPs, or re-designing TLB to permit SPs that
consist of non-contiguous base pages.
– To date, no hardware solutions implemented.
50