Working Set
• Pure paging will not load any page when a new process starts – the first instruction gives a page fault (demand paging)
• But programs usually have locality of reference
• The set of pages a program is currently using is its working set

1
Working Set
• Idea: keep a program’s working set in memory
  – This resulted in the working set model
• Loading many pages (the working set?) at load time is called prepaging
• If the working set does not fit in memory, thrashing results

2
The Working Set Page Replacement Algorithm

• The working set w(k,t) is the set of pages used by the k most recent memory references, as of time t
• Note: after an initial fast increase, the size of w(k,t) shows asymptotic behavior (many values of k give roughly the same working set size)

3
The Working Set Page Replacement Algorithm

The working set algorithm using virtual time

4
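The virtual-time variant above can be sketched in C. The names (`struct pte`, `ws_scan`) and the exact field layout are my own illustration, not a real kernel's; the point is the scan logic: referenced pages get their timestamp refreshed, unreferenced pages older than τ fall out of the working set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page table entry, for illustration only. */
struct pte {
    bool r;                  /* referenced bit, set by the HW */
    bool m;                  /* modified bit, set by the HW */
    unsigned long last_use;  /* virtual time of last use */
};

/* One scan of the working set algorithm using virtual time.
   Pages referenced since the last scan stay in the working set
   (timestamp refreshed); a page not referenced for more than tau
   ticks has aged out and is returned as the victim. Returns -1 if
   every page is still in the working set (a real kernel would then
   fall back to, e.g., the oldest page). */
int ws_scan(struct pte table[], size_t n,
            unsigned long now, unsigned long tau) {
    for (size_t i = 0; i < n; i++) {
        if (table[i].r) {             /* used during this tick */
            table[i].r = false;
            table[i].last_use = now;  /* still in the working set */
        } else if (now - table[i].last_use > tau) {
            return (int)i;            /* aged out: evict this one */
        }
    }
    return -1;
}
```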
The WSClock Page Replacement Algorithm

• All pages form a ring (circular list)
• Each entry holds (time of last use, R, M)
• R & M are updated by the HW
• Start where the hand points
  • If R == 1: set R = 0, move to the next page
  • If R == 0:
    • if age > τ and M == 0: reclaim the frame
    • if age > τ and M == 1: schedule a write
• When you get back to the start, either:
  • Some writes are scheduled – search until one write has finished
  • No writes are scheduled – take the first clean page, or the page the hand points at

Operation of the WSClock algorithm

5
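A simplified C sketch of one WSClock sweep follows. `struct frame`, `wsclock_evict`, and the `schedule_write` callback are hypothetical names, and the sketch cuts corners relative to the full algorithm (it does not wait for scheduled writes to complete; after a full turn it simply takes the frame under the hand).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical frame descriptor on the circular list. */
struct frame {
    bool r, m;               /* referenced / modified bits */
    unsigned long last_use;  /* virtual time of last use */
};

/* Advance the hand at most one full turn. Referenced frames get a
   second chance; old clean frames are reclaimed; old dirty frames
   get a write scheduled (here just a callback, which may be NULL).
   If the hand comes all the way around, take the current frame. */
int wsclock_evict(struct frame ring[], size_t n, size_t *hand,
                  unsigned long now, unsigned long tau,
                  void (*schedule_write)(size_t)) {
    for (size_t seen = 0; seen < n;
         seen++, *hand = (*hand + 1) % n) {
        struct frame *f = &ring[*hand];
        if (f->r) {                   /* recently used: 2nd chance */
            f->r = false;
            f->last_use = now;
            continue;
        }
        if (now - f->last_use > tau) {
            if (!f->m)                /* old and clean: reclaim */
                return (int)*hand;
            if (schedule_write)       /* old and dirty: write first */
                schedule_write(*hand);
            f->m = false;             /* modeled as instantly clean */
        }
    }
    return (int)*hand;                /* full turn: take this frame */
}
```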
Review of Page Replacement Algorithms

6
Some Issues with Paging
• Whose page should be evicted?
  – Only my pages?
  – Anyone’s pages?
• What is a good page size?
• Should we separate instructions and data?
• Could we share pages?
  – Parent and children…

7
Local versus Global Allocation Policies

• Which pages are considered?
  – All pages of all processes
  – Only those of one process
• Confusion: sometimes local & global mean
  – Local = fixed set of pages per process
  – Global = assign a number of pages to a process and dynamically update it (more or fewer)!
  – Replacement is still made “locally”!
8
Local versus Global Allocation Policies

• Original configuration
• Local page replacement
• Global page replacement

9
Local versus Global Allocation Policies

Page fault rate as a function of the number of page frames assigned

10
Load Control
• Despite good designs, the system may still thrash
• When the PFF (page fault frequency) algorithm indicates that
  – some processes need more memory
  – but no processes need less
• Solution: reduce the number of processes competing for memory
  – swap one or more to disk, divide up the pages they held
  – reconsider the degree of multiprogramming

11
Page Size (1)
Small page size
• Advantages
  – less internal fragmentation
  – better fit for various data structures, code sections
  – less unused program in memory
• Disadvantages
  – programs need many pages, hence larger page tables

12
Page Size (2)
• Overhead due to page table space and internal fragmentation:

  overhead = s·e/p + p/2

  (first term: page table space; second term: average internal fragmentation)
• Where
  – s = average process size in bytes
  – p = page size in bytes
  – e = page table entry size in bytes
• Overhead is minimized when p = √(2se)
• Example: s = 1 MB, e = 8 bytes ⇒ p = √(2 · 2²⁰ · 8) = 4096 bytes, i.e. 4 KB pages

13
Separate Instruction and Data Spaces
• One address space
• Separate I and D spaces

14
Shared Pages
Two processes sharing the same program and its page table

15
Cleaning Policy
• Need for a background process, the paging daemon
  – periodically inspects the state of memory
  – kswapd in Linux (Kernel Swap Daemon)
• When too few frames are free
  – selects pages to evict using a replacement algorithm
• It can use the same circular list (clock)
  – as the regular page replacement algorithm, but with a different pointer

16
Implementation Issues
Operating System Involvement with Paging

Four times when the OS is involved with paging:
1. Process creation – determine program size, create page table, create swap space?
2. Process execution – MMU reset for the new process, TLB (Translation Lookaside Buffer) flushed
3. Page fault time – determine the virtual address causing the fault, swap the target page out and the needed page in
4. Process termination time – release page table and pages

17
Page Fault Handling (1)
1. Hardware traps to the kernel
2. General registers saved
3. OS determines which virtual page is needed
4. OS checks validity of the address, seeks a page frame
5. If the selected frame is dirty, write it to disk

18
Page Fault Handling (2)
6. OS schedules fetch of the new page
7. Page tables updated
8. Faulting instruction backed up
9. Faulting process scheduled
10. Registers restored
11. Program continues

19
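The middle of the sequence above (steps 3–7; steps 1–2 and 8–11 live in the trap entry/exit code) can be outlined in C. Every helper here is a hypothetical stub standing in for a real kernel service; only the ordering is the point.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stubs for kernel services, for illustration only. */
static uintptr_t faulting_virtual_address(void) { return 0x1000; }
static bool address_valid(uintptr_t va)         { return va != 0; }
static int  pick_victim_frame(void)             { return 7; }
static bool frame_dirty(int f)                  { (void)f; return false; }
static void write_frame_to_disk(int f)          { (void)f; }
static void read_page_into_frame(uintptr_t va, int f) { (void)va; (void)f; }
static void update_page_tables(uintptr_t va, int f)   { (void)va; (void)f; }

/* Steps 3-7 of the page fault path. Returns the frame used, or -1
   on an invalid address (where a real kernel would raise SIGSEGV). */
int handle_page_fault(void) {
    uintptr_t va = faulting_virtual_address();  /* step 3 */
    if (!address_valid(va))                     /* step 4: validity */
        return -1;
    int frame = pick_victim_frame();            /* step 4: get frame */
    if (frame_dirty(frame))                     /* step 5 */
        write_frame_to_disk(frame);
    read_page_into_frame(va, frame);            /* step 6 */
    update_page_tables(va, frame);              /* step 7 */
    return frame;
}
```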
Instruction Backup
• An instruction causing a page fault may be partially executed
• If the PC is 1004 when the OS is notified, how can it tell whether 1004 holds the faulting instruction or one of its operands?
• Many architectures therefore copy the PC into a hidden internal register before each instruction, so the faulting instruction can be restarted

20
Locking Pages in Memory

• Virtual memory and I/O occasionally interact
• A process issues a call to read from a device into a buffer
  – while waiting for the I/O, another process starts up
  – the second process has a page fault
  – the buffer of the first process may be chosen to be paged out
• Need a way to specify that some pages are locked
  – exempted from being target pages

21
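On POSIX systems, user code can request this pinning with `mlock`/`munlock`. A small sketch (the function name `read_into_pinned_buffer` and the `memset` standing in for the device read are my own; `mlock` can fail under the `RLIMIT_MEMLOCK` limit):

```c
#include <string.h>
#include <sys/mman.h>

/* Pin buf for the duration of a (simulated) device read, then
   unpin it. Returns 0 on success, -1 if mlock failed. */
int read_into_pinned_buffer(char *buf, size_t len) {
    if (mlock(buf, len) != 0)   /* pages now exempt from eviction */
        return -1;
    memset(buf, 0, len);        /* stand-in for the actual I/O */
    return munlock(buf, len);   /* pages eligible for paging again */
}
```

Inside the kernel the same idea applies: frames involved in pending DMA are marked locked so the replacement algorithm skips them.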
Backing Store
(a) Paging to a static swap area
(b) Backing up pages dynamically

22
Separation of Policy and Mechanism

Page fault handling with an external pager
Where should the replacement policy be implemented?
23
Segmentation (1)
• One-dimensional address space with growing tables
• One table may bump into another

24
Segmentation (2)
Allows each table to grow or shrink independently

25
Segmentation (3)
Comparison of paging and segmentation

26
Implementation of Pure Segmentation
(a)–(d) Development of checkerboarding
(e) Removal of the checkerboarding by compaction

27
Segmentation and Paging
• Let’s look at some examples
  – Intel x86 architecture support
  – Linux
• HW for Segmentation
• HW for Paging

28
Segmentation in x86
• Three different types of addresses
  – Logical
  – Linear
  – Physical

Logical Address → Segmentation Unit → Linear Address → Paging Unit → Physical Address
29
Segmentation
• The logical address consists of
  – a 16-bit Segment Identifier (the Segment Selector)
  – a 32-bit Segment Offset
• The Segment Selector is used to identify the right segment and its descriptor (later)

30
Segmentation
• Segment selectors are held in special registers
  – cs – code segment
  – ss – stack segment
  – ds – data segment
  – es, fs, gs – general purpose segments

31
Segmentation

Level protection on the Pentium

32
Segmentation
• The Segment Selector indexes a table of Segment Descriptors (a Descriptor Table)
  – Global Descriptor Table (GDT, pointed to by gdtr)
  – Local Descriptor Table (LDT, pointed to by ldtr)
• The Segment Descriptor contains
  – base and limit
  – privilege levels
  – present bit
  – much more…
• Cached in hidden registers for easy access

33
Segmentation

• Pentium code segment descriptor (8 bytes)
• Data segments differ slightly

34
Segmentation
Segment Descriptor lookup: the Index field of the Segment Selector, multiplied by 8, is added to gdtr or ldtr (chosen by the TI bit) to find the Segment Descriptor in the GDT or LDT; the descriptor’s base plus the Offset of the logical address yields the Linear Address. (© Bovet, Cesati, 2005)
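The selector decoding and base+offset addition can be sketched in C. The 16-bit selector layout (index 13 bits, TI 1 bit, RPL 2 bits) follows the x86 architecture; the array-of-bases representation of the descriptor tables and the function names are my own simplification (limit and access-rights checks are omitted).

```c
#include <stdint.h>

/* Segment selector fields: sel = index(13) | TI(1) | RPL(2). */
static inline unsigned sel_index(uint16_t sel) { return sel >> 3; }
static inline unsigned sel_ti(uint16_t sel)    { return (sel >> 2) & 1; }
static inline unsigned sel_rpl(uint16_t sel)   { return sel & 3; }

/* What the segmentation unit computes: pick GDT or LDT via TI,
   index it (each descriptor is 8 bytes, here modeled as an array
   entry holding just the base), and add the offset. */
uint32_t logical_to_linear(uint16_t sel, uint32_t offset,
                           const uint32_t *gdt_bases,
                           const uint32_t *ldt_bases) {
    const uint32_t *table = sel_ti(sel) ? ldt_bases : gdt_bases;
    return table[sel_index(sel)] + offset;
}
```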
35
Segmentation in Linux
• Linux (basically) only uses four segments
  – User code and data
  – Kernel code and data
• The CPL bits of the cs register define the privilege level
  – 0 = kernel mode
  – 3 = user mode
• One GDT per processor
  – 32 segments (14 unused)
  – Contains the above plus reserved segments
• Most apps don’t use the LDT
  – Some do (Wine)

36
Regular Paging on Intel x86

Mapping of a linear address onto a physical address

37
Paging

• Alternatives
  – Extended Paging
    • One-level scheme
    • Page size 4 MB
  – Physical Address Extension (PAE)
    • Uses 36-bit physical addressing (max 64 GB)
    • A new Page Directory Pointer Table (PDPT) is introduced
    • No increase for processes (still 4 GB)!
  – Page Size Extension (PSE-36)
    • Not used in Linux

38
Paging for 64 bit Architectures
• 64 bits require (at least) 3-level paging
  – ia64: 9 + 9 + 9 + 12
  – x86_64: 9 + 9 + 9 + 9 + 12
  – ppc64: 10 + 10 + 9 + 12
• To accommodate all strategies, Linux uses 4-level paging (≥ 2.6.11)

39
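The x86_64 split above (9 + 9 + 9 + 9 + 12) can be made concrete by extracting the four table indices and the page offset from a virtual address; the function names mirror the Linux level names but are my own sketch, not kernel code.

```c
#include <stdint.h>

/* x86_64: 48-bit virtual address = 9+9+9+9 index bits + 12 offset bits. */
#define PAGE_SHIFT 12
#define INDEX_MASK 0x1ffu   /* 9 bits per level: 512 entries per table */

static inline unsigned pgd_index(uint64_t va) { return (va >> 39) & INDEX_MASK; }
static inline unsigned pud_index(uint64_t va) { return (va >> 30) & INDEX_MASK; }
static inline unsigned pmd_index(uint64_t va) { return (va >> 21) & INDEX_MASK; }
static inline unsigned pte_index(uint64_t va) { return (va >> PAGE_SHIFT) & INDEX_MASK; }
static inline unsigned page_offset(uint64_t va) { return va & ((1u << PAGE_SHIFT) - 1); }
```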
Paging in Linux
• 4-level paging
  – Page Global Directory
  – Page Upper Directory
  – Page Middle Directory
  – Page Table
• On x86-32 the Upper and Middle directory fields have size 0 (those levels are folded away)

40
Paging in Linux

The linear address is split into Global Dir | Upper Dir | Middle Dir | Table | Offset fields; cr3 points to the Page Global Directory, and each field indexes the next level (Page Upper Directory, Page Middle Directory, Page Table) until the Offset selects the byte within the page in memory.

41
Cache Considerations
• Keep data in the cache!
  – Place frequently used fields of structs at the beginning of the memory area (same cache line)
  – Try to distribute large structures uniformly with respect to cache lines
• Requires knowledge of the cache line size
• Cache synchronization (coherence) is (usually) done in HW

42
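The first bullet can be illustrated with a struct layout in C11. The 64-byte line size, the `struct conn` example, and its fields are assumptions for the sketch (real code should query the line size, e.g. via `sysconf(_SC_LEVEL1_DCACHE_LINESIZE)` on Linux):

```c
#include <stddef.h>

#define CACHE_LINE 64   /* assumed L1 line size, common on x86 */

/* Hot fields first so they share the first cache line; the rarely
   used statistics are pushed onto the next line by the alignment
   specifier, so touching the hot path never drags them in. */
struct conn {
    /* --- hot: touched on every packet --- */
    void *next;
    unsigned state;
    unsigned flags;
    /* --- cold: touched only when reporting --- */
    _Alignas(CACHE_LINE) unsigned long packets_total;
    unsigned long bytes_total;
};
```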