Working Set
• Pure paging will not load any page when a new process starts – the first instruction gives a page fault (demand paging)
• But programs usually have locality of reference
• The set of pages a program is currently using is its working set

1
Working Set
• Idea: keep a program’s working set in memory
  – This resulted in the working set model
• Loading many pages (the working set?) at load time is called prepaging
• If the working set does not fit in memory, thrashing results

2
The Working Set Page Replacement Algorithm

• The working set w(k,t) is the set of pages used by the k most recent memory references, as of time t
• Note: after an initial fast increase, the size of w(k,t) shows asymptotic behavior (many values of k give roughly the same working set size)

3
The Working Set Page Replacement Algorithm

The working set algorithm using virtual time

4
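The virtual-time variant above can be sketched in C. The names (`struct pte`, `ws_scan`) and the exact field layout are my own illustration, not a real kernel's; the point is the scan logic: referenced pages get their timestamp refreshed, unreferenced pages older than τ fall out of the working set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page table entry, for illustration only. */
struct pte {
    bool r;                  /* referenced bit, set by the HW */
    bool m;                  /* modified bit, set by the HW */
    unsigned long last_use;  /* virtual time of last use */
};

/* One scan of the working set algorithm using virtual time.
   Pages referenced since the last scan stay in the working set
   (timestamp refreshed); a page not referenced for more than tau
   ticks has aged out and is returned as the victim. Returns -1 if
   every page is still in the working set (a real kernel would then
   fall back to, e.g., the oldest page). */
int ws_scan(struct pte table[], size_t n,
            unsigned long now, unsigned long tau) {
    for (size_t i = 0; i < n; i++) {
        if (table[i].r) {             /* used during this tick */
            table[i].r = false;
            table[i].last_use = now;  /* still in the working set */
        } else if (now - table[i].last_use > tau) {
            return (int)i;            /* aged out: evict this one */
        }
    }
    return -1;
}
```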
The WSClock Page Replacement Algorithm

• All pages form a ring (circular list)
• Each entry holds (time of last use, R, M)
• R & M are updated by the HW
• Start where the hand points
  • If R == 1: set R = 0, move to the next page
  • If R == 0:
    • if age > τ and M == 0: reclaim the frame
    • if age > τ and M == 1: schedule a write
• When you get back to the start, either:
  • Some writes are scheduled – search until one write has finished
  • No writes are scheduled – take the first clean page, or the page the hand points at

Operation of the WSClock algorithm

5
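A simplified C sketch of one WSClock sweep follows. `struct frame`, `wsclock_evict`, and the `schedule_write` callback are hypothetical names, and the sketch cuts corners relative to the full algorithm (it does not wait for scheduled writes to complete; after a full turn it simply takes the frame under the hand).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical frame descriptor on the circular list. */
struct frame {
    bool r, m;               /* referenced / modified bits */
    unsigned long last_use;  /* virtual time of last use */
};

/* Advance the hand at most one full turn. Referenced frames get a
   second chance; old clean frames are reclaimed; old dirty frames
   get a write scheduled (here just a callback, which may be NULL).
   If the hand comes all the way around, take the current frame. */
int wsclock_evict(struct frame ring[], size_t n, size_t *hand,
                  unsigned long now, unsigned long tau,
                  void (*schedule_write)(size_t)) {
    for (size_t seen = 0; seen < n;
         seen++, *hand = (*hand + 1) % n) {
        struct frame *f = &ring[*hand];
        if (f->r) {                   /* recently used: 2nd chance */
            f->r = false;
            f->last_use = now;
            continue;
        }
        if (now - f->last_use > tau) {
            if (!f->m)                /* old and clean: reclaim */
                return (int)*hand;
            if (schedule_write)       /* old and dirty: write first */
                schedule_write(*hand);
            f->m = false;             /* modeled as instantly clean */
        }
    }
    return (int)*hand;                /* full turn: take this frame */
}
```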
Review of Page Replacement Algorithms

6
Some Issues with Paging
• Whose page should be evicted?
  – Only my pages?
  – Anyone’s pages?
• What is a good page size?
• Should we separate instructions and data?
• Could we share pages?
  – Parent and children…

7
Local versus Global Allocation Policies

• Which pages are considered?
  – All pages of all processes
  – Only those of one process
• Confusion: sometimes local & global mean
  – Local = fixed set of pages per process
  – Global = assign a number of pages to a process and dynamically update it (more or fewer)!
  – Replacement is still made “locally”!
8
Local versus Global Allocation Policies

• Original configuration
• Local page replacement
• Global page replacement

9
Local versus Global Allocation Policies

Page fault rate as a function of the number of page frames assigned

10
Load Control
• Despite good designs, the system may still thrash
• When the PFF (page fault frequency) algorithm indicates that
  – some processes need more memory
  – but no processes need less
• Solution: reduce the number of processes competing for memory
  – swap one or more to disk, divide up the pages they held
  – reconsider the degree of multiprogramming

11
Page Size (1)
Small page size
• Advantages
  – less internal fragmentation
  – better fit for various data structures, code sections
  – less unused program in memory
• Disadvantages
  – programs need many pages, hence larger page tables

12
Page Size (2)
• Overhead due to page table space and internal fragmentation:

  overhead = s·e/p + p/2

  (first term: page table space; second term: average internal fragmentation)
• Where
  – s = average process size in bytes
  – p = page size in bytes
  – e = page table entry size in bytes
• Overhead is minimized when p = √(2se)
• Example: s = 1 MB, e = 8 bytes ⇒ p = √(2 · 2²⁰ · 8) = 4096 bytes, i.e. 4 KB pages

13
Separate Instruction and Data Spaces
• One address space
• Separate I and D spaces

14
Shared Pages
Two processes sharing the same program and its page table

15
Cleaning Policy
• Need for a background process, the paging daemon
  – periodically inspects the state of memory
  – kswapd in Linux (Kernel Swap Daemon)
• When too few frames are free
  – selects pages to evict using a replacement algorithm
• It can use the same circular list (clock)
  – as the regular page replacement algorithm, but with a different pointer

16
Implementation Issues
Operating System Involvement with Paging

Four times when the OS is involved with paging:
1. Process creation – determine program size, create page table, create swap space?
2. Process execution – MMU reset for the new process, TLB (Translation Lookaside Buffer) flushed
3. Page fault time – determine the virtual address causing the fault, swap the target page out and the needed page in
4. Process termination time – release page table and pages

17
Page Fault Handling (1)
1. Hardware traps to the kernel
2. General registers saved
3. OS determines which virtual page is needed
4. OS checks validity of the address, seeks a page frame
5. If the selected frame is dirty, write it to disk

18
Page Fault Handling (2)
6. OS schedules fetch of the new page
7. Page tables updated
8. Faulting instruction backed up
9. Faulting process scheduled
10. Registers restored
11. Program continues

19
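The middle of the sequence above (steps 3–7; steps 1–2 and 8–11 live in the trap entry/exit code) can be outlined in C. Every helper here is a hypothetical stub standing in for a real kernel service; only the ordering is the point.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stubs for kernel services, for illustration only. */
static uintptr_t faulting_virtual_address(void) { return 0x1000; }
static bool address_valid(uintptr_t va)         { return va != 0; }
static int  pick_victim_frame(void)             { return 7; }
static bool frame_dirty(int f)                  { (void)f; return false; }
static void write_frame_to_disk(int f)          { (void)f; }
static void read_page_into_frame(uintptr_t va, int f) { (void)va; (void)f; }
static void update_page_tables(uintptr_t va, int f)   { (void)va; (void)f; }

/* Steps 3-7 of the page fault path. Returns the frame used, or -1
   on an invalid address (where a real kernel would raise SIGSEGV). */
int handle_page_fault(void) {
    uintptr_t va = faulting_virtual_address();  /* step 3 */
    if (!address_valid(va))                     /* step 4: validity */
        return -1;
    int frame = pick_victim_frame();            /* step 4: get frame */
    if (frame_dirty(frame))                     /* step 5 */
        write_frame_to_disk(frame);
    read_page_into_frame(va, frame);            /* step 6 */
    update_page_tables(va, frame);              /* step 7 */
    return frame;
}
```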
Instruction Backup
• An instruction causing a page fault may be partially executed
• If the PC is 1004 when the OS is notified, how can it tell whether 1004 holds the faulting instruction or one of its operands?
• Many architectures therefore copy the PC into a hidden internal register before each instruction, so the faulting instruction can be restarted

20
Locking Pages in Memory

• Virtual memory and I/O occasionally interact
• A process issues a call to read from a device into a buffer
  – while waiting for the I/O, another process starts up
  – the second process has a page fault
  – the buffer of the first process may be chosen to be paged out
• Need a way to specify that some pages are locked
  – exempted from being target pages

21
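On POSIX systems, user code can request this pinning with `mlock`/`munlock`. A small sketch (the function name `read_into_pinned_buffer` and the `memset` standing in for the device read are my own; `mlock` can fail under the `RLIMIT_MEMLOCK` limit):

```c
#include <string.h>
#include <sys/mman.h>

/* Pin buf for the duration of a (simulated) device read, then
   unpin it. Returns 0 on success, -1 if mlock failed. */
int read_into_pinned_buffer(char *buf, size_t len) {
    if (mlock(buf, len) != 0)   /* pages now exempt from eviction */
        return -1;
    memset(buf, 0, len);        /* stand-in for the actual I/O */
    return munlock(buf, len);   /* pages eligible for paging again */
}
```

Inside the kernel the same idea applies: frames involved in pending DMA are marked locked so the replacement algorithm skips them.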
Backing Store
(a) Paging to a static swap area
(b) Backing up pages dynamically

22
Separation of Policy and Mechanism

Page fault handling with an external pager
Where should the replacement policy be implemented?
23
Segmentation (1)
• One-dimensional address space with growing tables
• One table may bump into another

24
Segmentation (2)
Allows each table to grow or shrink independently

25
Segmentation (3)
Comparison of paging and segmentation

26
Implementation of Pure Segmentation
(a)–(d) Development of checkerboarding
(e) Removal of the checkerboarding by compaction

27
Segmentation and Paging
• Let’s look at some examples
  – Intel x86 architecture support
  – Linux
• HW for Segmentation
• HW for Paging

28
Segmentation in x86
• Three different types of addresses
  – Logical
  – Linear
  – Physical

Logical Address → Segmentation Unit → Linear Address → Paging Unit → Physical Address
29
Segmentation
• The logical address consists of
  – a 16-bit Segment Identifier (the Segment Selector)
  – a 32-bit Segment Offset
• The Segment Selector is used to identify the right segment and its descriptor (later)

30
Segmentation
• Segment selectors are held in special registers
  – cs – code segment
  – ss – stack segment
  – ds – data segment
  – es, fs, gs – general purpose segments

31
Segmentation

Level protection on the Pentium

32
Segmentation
• The Segment Selector indexes a table of Segment Descriptors (a Descriptor Table)
  – Global Descriptor Table (GDT, pointed to by gdtr)
  – Local Descriptor Table (LDT, pointed to by ldtr)
• The Segment Descriptor contains
  – base and limit
  – privilege levels
  – present bit
  – much more…
• Cached in hidden registers for easy access

33
Segmentation

• Pentium code segment descriptor (8 bytes)
• Data segments differ slightly

34
Segmentation
Segment Descriptor lookup: the Index field of the Segment Selector, multiplied by 8, is added to gdtr or ldtr (chosen by the TI bit) to find the Segment Descriptor in the GDT or LDT; the descriptor’s base plus the Offset of the logical address yields the Linear Address. (© Bovet, Cesati, 2005)
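The selector decoding and base+offset addition can be sketched in C. The 16-bit selector layout (index 13 bits, TI 1 bit, RPL 2 bits) follows the x86 architecture; the array-of-bases representation of the descriptor tables and the function names are my own simplification (limit and access-rights checks are omitted).

```c
#include <stdint.h>

/* Segment selector fields: sel = index(13) | TI(1) | RPL(2). */
static inline unsigned sel_index(uint16_t sel) { return sel >> 3; }
static inline unsigned sel_ti(uint16_t sel)    { return (sel >> 2) & 1; }
static inline unsigned sel_rpl(uint16_t sel)   { return sel & 3; }

/* What the segmentation unit computes: pick GDT or LDT via TI,
   index it (each descriptor is 8 bytes, here modeled as an array
   entry holding just the base), and add the offset. */
uint32_t logical_to_linear(uint16_t sel, uint32_t offset,
                           const uint32_t *gdt_bases,
                           const uint32_t *ldt_bases) {
    const uint32_t *table = sel_ti(sel) ? ldt_bases : gdt_bases;
    return table[sel_index(sel)] + offset;
}
```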
35
Segmentation in Linux
• Linux (basically) only uses four segments
  – User code and data
  – Kernel code and data
• The CPL bits of the cs register define the privilege level
  – 0 = kernel mode
  – 3 = user mode
• One GDT per processor
  – 32 segments (14 unused)
  – Contains the above plus reserved segments
• Most apps don’t use the LDT
  – Some do (Wine)

36
Regular Paging on Intel x86

Mapping of a linear address onto a physical address

37
Paging

• Alternatives
  – Extended Paging
    • One-level scheme
    • Page size 4 MB
  – Physical Address Extension (PAE)
    • Uses 36-bit physical addressing (max 64 GB)
    • A new Page Directory Pointer Table (PDPT) is introduced
    • No increase for processes (still 4 GB)!
  – Page Size Extension (PSE-36)
    • Not used in Linux

38
Paging for 64 bit Architectures
• 64 bits require (at least) 3-level paging
  – ia64: 9 + 9 + 9 + 12
  – x86_64: 9 + 9 + 9 + 9 + 12
  – ppc64: 10 + 10 + 9 + 12
• To accommodate all strategies, Linux uses 4-level paging (≥ 2.6.11)

39
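The x86_64 split above (9 + 9 + 9 + 9 + 12) can be made concrete by extracting the four table indices and the page offset from a virtual address; the function names mirror the Linux level names but are my own sketch, not kernel code.

```c
#include <stdint.h>

/* x86_64: 48-bit virtual address = 9+9+9+9 index bits + 12 offset bits. */
#define PAGE_SHIFT 12
#define INDEX_MASK 0x1ffu   /* 9 bits per level: 512 entries per table */

static inline unsigned pgd_index(uint64_t va) { return (va >> 39) & INDEX_MASK; }
static inline unsigned pud_index(uint64_t va) { return (va >> 30) & INDEX_MASK; }
static inline unsigned pmd_index(uint64_t va) { return (va >> 21) & INDEX_MASK; }
static inline unsigned pte_index(uint64_t va) { return (va >> PAGE_SHIFT) & INDEX_MASK; }
static inline unsigned page_offset(uint64_t va) { return va & ((1u << PAGE_SHIFT) - 1); }
```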
Paging in Linux
• 4-level paging
  – Page Global Directory
  – Page Upper Directory
  – Page Middle Directory
  – Page Table
• On x86-32 the Upper and Middle directory fields have size 0 (those levels are folded away)

40
Paging in Linux

The linear address is split into Global Dir | Upper Dir | Middle Dir | Table | Offset fields; cr3 points to the Page Global Directory, and each field indexes the next level (Page Upper Directory, Page Middle Directory, Page Table) until the Offset selects the byte within the page in memory.

41
Cache Considerations
• Keep data in the cache!
  – Place frequently used fields of structs at the beginning of the memory area (same cache line)
  – Try to distribute large structures uniformly with respect to cache lines
• Requires knowledge of the cache line size
• Cache synchronization (coherence) is (usually) done in HW

42
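The first bullet can be illustrated with a struct layout in C11. The 64-byte line size, the `struct conn` example, and its fields are assumptions for the sketch (real code should query the line size, e.g. via `sysconf(_SC_LEVEL1_DCACHE_LINESIZE)` on Linux):

```c
#include <stddef.h>

#define CACHE_LINE 64   /* assumed L1 line size, common on x86 */

/* Hot fields first so they share the first cache line; the rarely
   used statistics are pushed onto the next line by the alignment
   specifier, so touching the hot path never drags them in. */
struct conn {
    /* --- hot: touched on every packet --- */
    void *next;
    unsigned state;
    unsigned flags;
    /* --- cold: touched only when reporting --- */
    _Alignas(CACHE_LINE) unsigned long packets_total;
    unsigned long bytes_total;
};
```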