Operating Systems
CMPSCI 377
Lecture 18: I/O Systems & Storage
Emery Berger
University of Massachusetts Amherst

Last Time

File Systems Implementation

- How disks work
- How to organize data (files) on disks
  - Data structures
  - Placement of files on disk

Today

- I/O Systems
- Storage

I/O Devices

- Many different kinds of I/O devices
- Software that controls them: device drivers

Connection to CPU

- Devices connect to ports or a system bus:
  - Allows devices to communicate with the CPU
  - Typically shared by multiple devices
- Two ways of communicating with the CPU:
  - Ports (programmed I/O)
  - Direct Memory Access (DMA)

Ports (a.k.a. Programmed I/O)

- Device port = 4 registers:
  - Status: indicates device busy, data ready, error
  - Control: indicates command to perform
  - Data-in: read by host to get input from device
  - Data-out: written by CPU to send data to device
- Controller receives commands from the bus, translates them into actions, and reads/writes data onto the bus

Polling

- CPU:
  - Busy-wait until status = idle
  - Set command register & data-out (for output)
  - Set status = command-ready
- Controller: sets status = busy
- Controller then:
  - Reads command register, performs command
  - Places data in data-in (for input)
  - Changes status to idle or error
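
A minimal sketch of this handshake in C, assuming a made-up memory-mapped device – the base address, status codes, and command values are all hypothetical:

    #include <stdint.h>

    /* Hypothetical memory-mapped device registers (addresses invented). */
    #define DEV_BASE 0xFFFF8000u
    #define STATUS   (*(volatile uint8_t *)(DEV_BASE + 0)) /* busy/ready/error  */
    #define CONTROL  (*(volatile uint8_t *)(DEV_BASE + 1)) /* command to do     */
    #define DATA_OUT (*(volatile uint8_t *)(DEV_BASE + 2)) /* CPU -> device     */

    enum { ST_IDLE, ST_BUSY, ST_CMD_READY, ST_ERROR };
    enum { CMD_WRITE_BYTE = 1 };

    /* Write one byte to the device using the polling protocol above. */
    int poll_write(uint8_t byte)
    {
        while (STATUS != ST_IDLE)   /* busy-wait until device is idle      */
            ;
        DATA_OUT = byte;            /* place data in data-out              */
        CONTROL  = CMD_WRITE_BYTE;  /* set command register                */
        STATUS   = ST_CMD_READY;    /* announce: command ready             */
        while (STATUS == ST_BUSY)   /* controller: busy, then idle/error   */
            ;
        return STATUS == ST_ERROR ? -1 : 0;
    }

Note how the CPU spins in two loops: this is exactly the busy waiting the next slide eliminates.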

Interrupts

- Avoid busy waiting
- Device interrupts the CPU when the I/O operation completes
- On interrupt:
  - Determine which device caused the interrupt
  - If the last command to the device was an input operation, retrieve the data from the register
  - Initiate the next operation for the device
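
Those three steps map directly onto a handler. A toy sketch in C, where the device table, fields, and I/O helpers are hypothetical stand-ins:

    #include <stdint.h>

    /* Toy device state; everything here is illustrative, not a real API. */
    enum cmd { CMD_NONE, CMD_INPUT, CMD_OUTPUT };

    struct device {
        enum cmd last_command;
        uint8_t  buffer[256];
        int      count;
        int      pending;    /* queued requests */
    };

    static struct device devices[16];              /* indexed by IRQ number */

    static uint8_t read_data_in(struct device *d) { (void)d; return 0; } /* stub */
    static void start_next_request(struct device *d) { d->pending--; }   /* stub */

    /* Runs when a device signals I/O completion. */
    void io_interrupt_handler(int irq)
    {
        struct device *dev = &devices[irq];        /* which device interrupted? */

        if (dev->last_command == CMD_INPUT &&      /* input op: retrieve data   */
            dev->count < (int)sizeof dev->buffer)
            dev->buffer[dev->count++] = read_data_in(dev);

        if (dev->pending > 0)                      /* initiate next operation   */
            start_next_request(dev);
    }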

DMA

- Ports ("programmed I/O"):
  - Fine for small amounts of data at low speed
  - Too expensive for large data transfers!
- Solution: Direct Memory Access (DMA)
  - Device & CPU communicate through RAM
  - CPU hands the controller pointers to source, destination, & # of bytes
  - DMA controller interrupts the CPU when the entire transfer is complete
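
A sketch of what "pointers to source, destination, & # of bytes" looks like from the CPU side, again with an invented register layout and address:

    #include <stdint.h>

    /* Hypothetical DMA controller registers (layout and address invented). */
    struct dma_regs {
        volatile uint64_t src;      /* source address      */
        volatile uint64_t dst;      /* destination address */
        volatile uint32_t nbytes;   /* transfer length     */
        volatile uint32_t start;    /* write 1 to begin    */
    };

    #define DMA ((struct dma_regs *)0xFFFF9000u)

    /* Hand the whole transfer to the controller; the CPU is free to run
       other work until the controller raises a completion interrupt. */
    void dma_start(const void *src, void *dst, uint32_t nbytes)
    {
        DMA->src    = (uintptr_t)src;
        DMA->dst    = (uintptr_t)dst;
        DMA->nbytes = nbytes;
        DMA->start  = 1;   /* no polling: completion arrives as an interrupt */
    }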

I/O – Outline

I/O Systems:
- I/O hardware basics
- Services provided by the OS
- How the OS implements these services

Low-Level Device Characteristics

- Transfer unit: character, block
- Access method: sequential, random
- Timing: synchronous, asynchronous
- Sharing: dedicated, sharable (by multiple threads/processes)
- Speed: latency, seek time, transfer rate, delay
- I/O direction: read-only, write-only, read-write
- Examples: terminal, CD-ROM, modem, keyboard, tape, graphics controller

Application I/O Interface

High-level abstractions hide device details:

- Block devices (read, write, seek)
  - Also memory-mapped
  - File abstraction
- Character-stream devices (get, put)
  - Keyboard, printers, etc.
- Network devices (socket connections)

But how do we hide huge differences in latency?

Blocking & Non-Blocking

- Blocking: wait until the operation completes
- Non-blocking: returns immediately (or after a timeout) with whatever data is available (none, some, or all)
  - e.g., the select call
- Note: asynchronous calls also return immediately, but signal completion later via a callback or interrupt
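
The "select call" here is POSIX select(2): wait on one or more descriptors with a timeout instead of blocking indefinitely. A small self-contained example:

    #include <stdio.h>
    #include <sys/select.h>
    #include <unistd.h>

    /* Wait up to 2 seconds for stdin to become readable, then read
       whatever is available instead of blocking indefinitely. */
    int main(void)
    {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(STDIN_FILENO, &readfds);

        struct timeval timeout = { .tv_sec = 2, .tv_usec = 0 };

        int ready = select(STDIN_FILENO + 1, &readfds, NULL, NULL, &timeout);
        if (ready > 0 && FD_ISSET(STDIN_FILENO, &readfds)) {
            char buf[128];
            ssize_t n = read(STDIN_FILENO, buf, sizeof buf); /* data is ready */
            printf("read %zd bytes\n", n);
        } else if (ready == 0) {
            printf("timed out: no data available\n");        /* none arrived */
        }
        return 0;
    }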

I/O – Outline

I/O Systems:
- I/O hardware basics
- Services provided by the OS
- How the OS implements these services

OS Services

- Buffering
- Caching

I/O Buffering

- Buffer = memory area that stores data before transfer from/to the CPU
  - A disk buffer stores a block as it is read from disk
  - The block is transferred over the bus by the DMA controller into a buffer in physical memory
  - The DMA controller interrupts the CPU when the transfer is complete

Why Buffering?

- Speed mismatches between device & CPU
  - Compute contents of display in a buffer (slow), send buffer to screen (fast) = double buffering
- Different data transfer sizes
  - ftp brings a file over the network one packet at a time, stores it to disk one block at a time
- Minimizes the time a user process blocks on a write
  - Write = copy data to kernel buffer, return to user
  - Kernel writes from the buffer to disk at a later time

Caching

- Keep recently used disk blocks in main memory after the I/O call completes
- read(diskAddress): check memory first, then read from disk
- write(diskAddress): if in memory, update the value; otherwise, read the block in and update it in place
- Provide "synchronization" operations to commit writes to disk: flush(), msync()
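
A toy direct-mapped buffer cache that follows those read/write rules; disk_read/disk_write are stubs standing in for real device I/O, and the sizes are illustrative:

    #include <stdbool.h>
    #include <string.h>

    #define BLOCK_SIZE  4096
    #define CACHE_SLOTS 64

    /* Stubs standing in for real device I/O. */
    static void disk_read(long blockno, char *buf)        { (void)blockno; (void)buf; }
    static void disk_write(long blockno, const char *buf) { (void)blockno; (void)buf; }

    struct slot {
        bool valid;      /* does this slot hold a block?          */
        bool dirty;      /* modified since read? (write-back)     */
        long blockno;    /* which disk block lives here           */
        char data[BLOCK_SIZE];
    };

    static struct slot cache[CACHE_SLOTS];

    /* Fetch the slot for blockno, filling it from disk on a miss. */
    static struct slot *lookup(long blockno)
    {
        struct slot *s = &cache[blockno % CACHE_SLOTS];
        if (!s->valid || s->blockno != blockno) {   /* miss            */
            if (s->valid && s->dirty)
                disk_write(s->blockno, s->data);    /* evict old block */
            disk_read(blockno, s->data);            /* read block in   */
            s->valid   = true;
            s->dirty   = false;
            s->blockno = blockno;
        }
        return s;
    }

    /* read(diskAddress): check memory first, then read from disk. */
    void cache_read(long blockno, char *out)
    {
        memcpy(out, lookup(blockno)->data, BLOCK_SIZE);
    }

    /* write(diskAddress): update in memory; commit to disk later. */
    void cache_write(long blockno, const char *in)
    {
        struct slot *s = lookup(blockno);
        memcpy(s->data, in, BLOCK_SIZE);
        s->dirty = true;   /* write-back: committed later, e.g. by flush() */
    }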

Cache Write Policies

- Trade-off between speed & reliability
- Write-through:
  - Write to all levels of memory simultaneously (the memory holding the block & the disk)
  - High reliability
- Write-back:
  - Write only to memory (commit to disk later)
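
In the cache sketch above, this trade-off is the dirty flag: write-back defers disk_write until eviction or an explicit flush, while write-through would instead call disk_write inside cache_write on every update.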

Full Example: Read Operation

- User process requests a read from a device
- OS checks whether the data is already in a buffer. If not:
  - OS tells the device driver to perform the input
  - Device driver tells the DMA controller what to do, then blocks
  - DMA controller transfers the data to the kernel buffer
  - DMA controller interrupts the CPU when the transfer is complete
- OS transfers the data to the user process & places the process in the ready queue
- Process continues at the point after the system call

I/O Summary

- I/O is expensive for several reasons:
  - Slow devices & slow communication links
  - Contention from multiple processes
  - Supported via system calls & interrupt handling (slow)
- Approaches to improving performance:
  - Caching – reduces data copying
  - Reducing interrupt frequency via large data transfers
  - DMA controllers – offload computation from the CPU

Storage

Goal: improve the performance of storage systems (i.e., the disk)

- Scheduling
- Interleaving
- Read-ahead

Disk Operations

Disks are SLOW!

- Latency:
  - 1 disk seek = 40,000,000 cycles
  - Seek – position the head over the track/cylinder
  - Rotational delay – time for the sector to rotate underneath the head
- Bandwidth:
  - Transfer time – move bytes from disk to memory

Calculating Disk Operations Time

Read/write n bytes:

    I/O time = seek + rotational delay + transfer(n)

- Can't reduce transfer time
- Can we reduce latency?
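
For concreteness, with illustrative numbers (not from the slides): a 5 ms average seek, a 7200 RPM disk (half a rotation ≈ 4.2 ms average rotational delay), and a 100 MB/s transfer rate give, for a 4 KB read:

    I/O time ≈ 5 ms + 4.2 ms + 0.04 ms ≈ 9.2 ms

Nearly all of that is latency, which is what the following techniques attack.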

Reducing Latency

Minimize seek time & rotational latency:

- Smaller disks
- Faster disks
- Sector size tradeoff
- Layout
- Scheduling

Disk Head Scheduling

- Change the order of disk accesses
  - Reduce the length & number of seeks
- Algorithms:
  - FCFS
  - SSTF
  - SCAN, C-SCAN
  - LOOK, C-LOOK

FCFS

- First-come, first-served
- Example: disk tracks (65, 40, 18, 78); assume the head starts at track 50
- Total seek time?
  - 15 + 25 + 22 + 60 = 122

SSTF

- Shortest-seek-time-first (like SJF)
- Example: disk tracks (65, 40, 18, 78); assume the head starts at track 50
- Total seek time?
  - 10 + 22 + 47 + 13 = 92
- Is this optimal?

SCAN

- Always move back and forth across the disk
  - a.k.a. the elevator algorithm
- Example: disk tracks (65, 40, 18, 78); assume the head starts at track 50, moving forward (highest track = 100)
- Total seek time?
  - 15 + 13 + 22 + 60 + 22 = 132
- When is this good?

C-SCAN

- Circular SCAN: go back to the start of the disk after reaching the end
- Example: disk tracks (65, 40, 18, 78); assume the head starts at track 50, moving forward
- Total seek time?
  - 15 + 13 + 22 + 100 + 18 + 22 = 190
- Can be expensive, but more fair

LOOK variants

- Instead of going to the end of the disk, go to the last request in each direction
- SCAN, C-SCAN → LOOK, C-LOOK

Exercises

- 5000 blocks (first = 0, last = 4999); head currently at 143, moving forward
- Request queue: (86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130)
- For FCFS, SSTF, SCAN, LOOK, C-SCAN, and C-LOOK, find the service sequence & total seek time

Solutions

- FCFS: 143, 86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130 – distance: 7,081
- SSTF: 143, 130, 86, 913, 948, 1022, 1470, 1509, 1750, 1774 – distance: 1,745
- SCAN: 143, 913, 948, 1022, 1470, 1509, 1750, 1774, 4999, 130, 86 – distance: 9,769
- LOOK: 143, 913, 948, 1022, 1470, 1509, 1750, 1774, 130, 86 – distance: 3,319
- C-SCAN: 143, 913, 948, 1022, 1470, 1509, 1750, 1774, 4999, 0, 86, 130 – distance: 9,985
- C-LOOK: 143, 913, 948, 1022, 1470, 1509, 1750, 1774, 86, 130 – distance: 3,363
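
These totals can be checked mechanically. A self-contained C sketch that reproduces all six numbers for this exercise:

    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Exercise setup: tracks 0..4999, head at 143, moving forward. */
    #define N 9
    #define MAX_TRACK 4999
    static const int req[N] = { 86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130 };

    static int cmp(const void *a, const void *b)
    {
        return *(const int *)a - *(const int *)b;
    }

    /* FCFS: service requests in arrival order. */
    static int fcfs(int head)
    {
        int d = 0;
        for (int i = 0; i < N; i++) { d += abs(req[i] - head); head = req[i]; }
        return d;
    }

    /* SSTF: repeatedly service the closest outstanding request. */
    static int sstf(int head)
    {
        int left[N], n = N, d = 0;
        for (int i = 0; i < N; i++) left[i] = req[i];
        while (n > 0) {
            int best = 0;
            for (int i = 1; i < n; i++)
                if (abs(left[i] - head) < abs(left[best] - head)) best = i;
            d += abs(left[best] - head);
            head = left[best];
            left[best] = left[--n];        /* drop the serviced request */
        }
        return d;
    }

    /* SCAN/LOOK family.  to_edges: travel to track 4999 (and 0, if
       circular) like SCAN/C-SCAN; otherwise stop at the last request in
       each direction like LOOK/C-LOOK.  circular: jump back to the
       bottom instead of reversing.  Assumes requests on both sides. */
    static int sweep(int head, bool circular, bool to_edges)
    {
        int s[N];
        for (int i = 0; i < N; i++) s[i] = req[i];
        qsort(s, N, sizeof s[0], cmp);

        int k = 0;                              /* first request >= head */
        while (k < N && s[k] < head) k++;

        int d = 0, pos = head;
        int top = to_edges ? MAX_TRACK : s[N - 1];
        if (top > pos) { d += top - pos; pos = top; }  /* upward sweep   */
        if (k == 0) return d;                   /* nothing below start   */

        if (circular) {
            int bottom = to_edges ? 0 : s[0];
            d += pos - bottom;                  /* return travel counts  */
            d += s[k - 1] - bottom;             /* sweep up to last one  */
        } else {
            d += pos - s[0];                    /* reverse down to lowest */
        }
        return d;
    }

    int main(void)
    {
        printf("FCFS:   %d\n", fcfs(143));                /* 7081 */
        printf("SSTF:   %d\n", sstf(143));                /* 1745 */
        printf("SCAN:   %d\n", sweep(143, false, true));  /* 9769 */
        printf("LOOK:   %d\n", sweep(143, false, false)); /* 3319 */
        printf("C-SCAN: %d\n", sweep(143, true,  true));  /* 9985 */
        printf("C-LOOK: %d\n", sweep(143, true,  false)); /* 3363 */
        return 0;
    }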

Storage – Outline

Improving the performance of storage systems (i.e., the disk):

- Scheduling
- Interleaving
- Read-ahead
- Tertiary storage

Disk Interleaving

- Contiguous allocation requires an OS response before the disk spins past the next block
- Instead of physically contiguous allocation, interleave blocks
  - Interleaving factor is chosen relative to OS speed & rotational speed
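
A sketch of how such a layout can be computed; the function and the 8-sector example are illustrative (real controllers did this in firmware):

    #include <stdio.h>

    /* Lay out n logical sectors around a track so that consecutive
       logical sectors sit k physical slots apart, giving the OS time
       to respond before the next block passes under the head. */
    static void interleave_layout(int n, int k, int layout[])
    {
        for (int i = 0; i < n; i++)
            layout[i] = -1;
        int slot = 0;
        for (int logical = 0; logical < n; logical++) {
            while (layout[slot] != -1)      /* skip slots already filled */
                slot = (slot + 1) % n;
            layout[slot] = logical;
            slot = (slot + k) % n;          /* jump ahead by the factor  */
        }
    }

    int main(void)
    {
        int layout[8];
        interleave_layout(8, 2, layout);    /* 2:1 interleave, 8 sectors */
        for (int i = 0; i < 8; i++)         /* prints 0 4 1 5 2 6 3 7    */
            printf("%d ", layout[i]);
        printf("\n");
        return 0;
    }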

Read-Ahead

- Reduce seeks by reading blocks from disk ahead of the user's request
  - Place them in a buffer on the disk controller
- Similar to pre-paging
  - Easier to predict future accesses on disk?

RAID

- One way to really improve throughput: add disks
  - Data spread across 100 disks = 100x improvement in bandwidth
  - But: reliability drops by a factor of 100
- Solution – RAID: Redundant Array of Inexpensive Disks
  - Replicate, but use some disks/blocks for checking

RAID Levels

- RAID-1: Mirroring
  - Just copy disks = 2x disks, ½ of them for checking
- RAID-2: Add error-correcting checks
  - Interleave disk blocks with ECC codes (parity, XOR)
  - 10 disks require 4 check disks
  - Same performance as level 1
- RAID-4: Striping data
  - Spread blocks across disks
  - Improves read performance, but impairs writes
- RAID-5: Striping data & check info
  - Removes the bottleneck on the check disks
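
The check information in the striped levels is XOR parity. A minimal sketch of computing parity and rebuilding a lost block; the block count and sizes are illustrative:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define NDATA 4    /* data blocks per stripe (illustrative) */
    #define BSIZE 8    /* bytes per block (illustrative)        */

    /* Parity block = XOR of all data blocks in the stripe. */
    static void compute_parity(uint8_t data[NDATA][BSIZE], uint8_t parity[BSIZE])
    {
        for (int j = 0; j < BSIZE; j++) {
            parity[j] = 0;
            for (int i = 0; i < NDATA; i++)
                parity[j] ^= data[i][j];
        }
    }

    /* Rebuild one lost block by XOR-ing parity with the survivors. */
    static void rebuild(uint8_t data[NDATA][BSIZE],
                        const uint8_t parity[BSIZE], int lost)
    {
        for (int j = 0; j < BSIZE; j++) {
            uint8_t x = parity[j];
            for (int i = 0; i < NDATA; i++)
                if (i != lost)
                    x ^= data[i][j];
            data[lost][j] = x;   /* XOR cancels all but the lost block */
        }
    }

    int main(void)
    {
        uint8_t data[NDATA][BSIZE] = { {1,2,3}, {4,5,6}, {7,8,9}, {10,11,12} };
        uint8_t parity[BSIZE];
        compute_parity(data, parity);
        memset(data[2], 0, BSIZE);                 /* "lose" disk 2     */
        rebuild(data, parity, 2);
        printf("%d %d %d\n", data[2][0], data[2][1], data[2][2]); /* 7 8 9 */
        return 0;
    }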

Storage – Summary

- The OS can improve storage system performance via:
  - Scheduling
  - Interleaving
  - Read-ahead
- Adding intelligent hardware (RAID) improves performance & reliability