Operating Systems
CMPSCI 377
Lecture 18: I/O Systems & Storage
Emery Berger
University of Massachusetts Amherst
Last Time
File Systems Implementation
How disks work
How to organize data (files) on disks
Data structures
Placement of files on disk
Today
I/O Systems
Storage
I/O Devices
Many different kinds of I/O devices
Software that controls them: device drivers
Connection to CPU
Devices connect to ports or a system bus:
Allows devices to communicate with the CPU
Typically shared by multiple devices
Two ways of communicating with the CPU:
Ports (programmed I/O)
Direct Memory Access (DMA)
Ports (a.k.a. Programmed I/O)
Device port = 4 registers:
Status indicates device busy, data ready, or error
Control indicates the command to perform
Data-in is read by the host to get input from the device
Data-out is written by the host to send output to the device
The controller receives commands from the bus, translates them into actions, and reads/writes data onto the bus
Polling
CPU:
Busy-waits until status = idle
Sets the command register and data-out (for output)
Sets status = command-ready (the controller then sets status = busy)
Controller:
Reads the command register and performs the command
Places data in data-in (for input)
Changes state to idle or error
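To make the handshake concrete, here is a minimal C sketch of programmed I/O, assuming a hypothetical memory-mapped device; the register struct, status bits, and command bits are all invented for illustration.

```c
#include <stdint.h>

/* Hypothetical memory-mapped device registers (layout and bits invented). */
typedef struct {
    volatile uint32_t status;   /* busy / error bits                     */
    volatile uint32_t control;  /* command to perform, command-ready bit */
    volatile uint32_t data_in;  /* read by host to get input             */
    volatile uint32_t data_out; /* written by host to send output        */
} dev_regs;

#define STATUS_BUSY  0x1
#define STATUS_ERROR 0x2
#define CMD_WRITE    0x2
#define CMD_READY    0x1

/* Programmed-I/O write of one byte, following the steps above. */
static int pio_write_byte(dev_regs *dev, uint8_t byte) {
    while (dev->status & STATUS_BUSY)   /* 1. busy-wait until idle          */
        ;
    dev->data_out = byte;               /* 2. place data in data-out        */
    dev->control  = CMD_WRITE;          /* 3. set the command register      */
    dev->control |= CMD_READY;          /* 4. command-ready; the controller */
                                        /*    now sets status = busy        */
    while (dev->status & STATUS_BUSY)   /* 5. wait until the command is     */
        ;                               /*    done (idle or error)          */
    return (dev->status & STATUS_ERROR) ? -1 : 0;
}
```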
Interrupts
Avoids busy-waiting
The device interrupts the CPU when an I/O operation completes
On interrupt:
Determine which device caused the interrupt
If the last command to the device was an input operation, retrieve the data from the register
Initiate the next operation for the device
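A hedged sketch of that interrupt path; the device table, the irq_source() query, and the per-device hooks are hypothetical stand-ins for real kernel machinery.

```c
#include <stdint.h>

#define MAX_DEVICES 8

/* Hypothetical per-device state; every name here is illustrative. */
struct device {
    uint8_t (*read_data_in)(struct device *); /* fetch the data-in register */
    void    (*start_next)(struct device *);   /* kick off the next request  */
    int       last_op_was_input;
    uint8_t   latest_byte;
};

extern struct device devices[MAX_DEVICES];
extern int irq_source(void);  /* ask the interrupt controller which device */

/* Interrupt handler following the three steps above. */
void io_interrupt_handler(void) {
    struct device *dev = &devices[irq_source()];   /* 1. which device?      */
    if (dev->last_op_was_input)                    /* 2. input op? retrieve */
        dev->latest_byte = dev->read_data_in(dev); /*    data from register */
    dev->start_next(dev);                          /* 3. start next op      */
}
```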
DMA
Ports (“programmed I/O”):
Fine for small amounts of data and low-speed devices
Too expensive for large data transfers!
Solution: Direct Memory Access (DMA)
Device & CPU communicate through RAM
Pointers to source, destination, & number of bytes
The DMA controller interrupts the CPU when the entire transfer is complete
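A minimal sketch of programming such a transfer, assuming a hypothetical DMA controller whose register layout and control bits are invented for illustration:

```c
#include <stdint.h>

/* Hypothetical DMA controller registers (layout and bits invented). */
typedef struct {
    volatile uint64_t src;      /* pointer to the source         */
    volatile uint64_t dst;      /* pointer to the destination    */
    volatile uint32_t nbytes;   /* number of bytes to transfer   */
    volatile uint32_t control;  /* start / interrupt-enable bits */
} dma_regs;

#define DMA_START      0x1
#define DMA_IRQ_ENABLE 0x2

/* Program the transfer and return; the CPU is free to run other work
 * until the controller raises its completion interrupt. */
void dma_start_transfer(dma_regs *dma, void *src, void *dst, uint32_t n) {
    dma->src     = (uint64_t)(uintptr_t)src;
    dma->dst     = (uint64_t)(uintptr_t)dst;
    dma->nbytes  = n;
    dma->control = DMA_START | DMA_IRQ_ENABLE;
}
```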
I/O – Outline
I/O Systems
I/O hardware basics
Services provided by OS
How OS implements these services
Low-Level Device Characteristics
Transfer unit: character, block
Access method: sequential, random
Timing: synchronous, asynchronous
Sharing: dedicated, sharable (by multiple threads/processes)
Speed: latency, seek time, transfer rate, delay
I/O direction: read-only, write-only, read-write
Examples: terminal, CD-ROM, modem, keyboard, tape, graphics controller
Application I/O Interface
High-level abstractions hide device details:
Block devices (read, write, seek); also memory-mapped via the file abstraction
Character-stream devices (get, put): keyboards, printers, etc.
Network devices (socket connections)
But how do we hide huge differences in latency?
Blocking & Non-Blocking
Blocking: wait until the operation completes
Non-blocking: returns immediately (or after a timeout) with whatever data is available (none, some, or all)
The select() call reports which descriptors are ready
Note: asynchronous I/O also returns immediately, but signals completion later via a callback or interrupt
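These behaviors map directly onto standard POSIX calls; a small sketch (error handling omitted):

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/select.h>

/* Non-blocking read: returns immediately with whatever is available
 * (read() yields -1/EAGAIN if no data is ready). */
ssize_t read_nonblocking(int fd, char *buf, size_t len) {
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);
    return read(fd, buf, len);
}

/* The select call: block (with a timeout) until fd has data.
 * Returns >0 if ready, 0 on timeout, -1 on error. */
int wait_until_readable(int fd, int timeout_secs) {
    fd_set readfds;
    struct timeval tv = { .tv_sec = timeout_secs, .tv_usec = 0 };
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    return select(fd + 1, &readfds, NULL, NULL, &tv);
}
```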
I/O – Outline
I/O Systems
I/O hardware basics
Services provided by OS
How OS implements these services
OS Services
Buffering
Caching
I/O Buffering
Buffer = memory area to store data before transfer from/to the CPU
A disk buffer stores a block when it is read from disk
The block is transferred over the bus by the DMA controller into a buffer in physical memory
The DMA controller interrupts the CPU when the transfer is complete
Why Buffering?
Speed mismatches between device and CPU
Different data transfer sizes:
Compute the contents of a display in a buffer (slow), then send the buffer to the screen (fast) = double buffering
ftp brings a file over the network one packet at a time, but stores it to disk one block at a time
Minimizes the time a user process blocks on a write:
Write = copy the data to a kernel buffer, then return to the user
The kernel writes from the buffer to disk at a later time
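A small POSIX sketch of that last point: write() typically returns as soon as the data sits in a kernel buffer, and fsync() forces the buffered data out to disk.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    int fd = open("out.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    const char *msg = "hello, buffered world\n";

    /* Returns once the bytes are copied into a kernel buffer;
     * the actual disk write usually happens later. */
    write(fd, msg, strlen(msg));

    /* Force the kernel to commit the buffered data to disk now. */
    fsync(fd);

    close(fd);
    return 0;
}
```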
Caching
Keep recently-used disk blocks in main memory after the I/O call completes
read(diskAddress): check memory first, then read from disk on a miss
write(diskAddress): if the block is in memory, update the value; otherwise, read the block in and update it in place
Provide “synchronization” operations to commit writes to disk: flush(), msync()
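A toy sketch of the read path, assuming a direct-mapped in-memory block cache and a hypothetical disk_read() driver hook:

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE  4096
#define CACHE_SLOTS 64

/* Toy direct-mapped block cache; the structure is illustrative. */
struct cache_entry {
    uint64_t addr;              /* disk address of the cached block */
    int      valid;
    char     data[BLOCK_SIZE];
};
static struct cache_entry cache[CACHE_SLOTS];

extern void disk_read(uint64_t addr, char *buf);  /* assumed driver hook */

/* read(diskAddress): check memory first, then read from disk. */
void cached_read(uint64_t addr, char *buf) {
    struct cache_entry *e = &cache[addr % CACHE_SLOTS];
    if (!(e->valid && e->addr == addr)) {  /* miss: fetch from disk */
        disk_read(addr, e->data);
        e->addr  = addr;
        e->valid = 1;
    }
    memcpy(buf, e->data, BLOCK_SIZE);      /* hit (or freshly filled) */
}
```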
Cache Write Policies
Trade-off between speed & reliability
Write-through:
Write to all levels of memory simultaneously (the memory containing the block & the disk)
High reliability
Write-back:
Write only to memory (committing to disk later)
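A minimal sketch contrasting the two policies, with an illustrative cache-entry type and a hypothetical disk_write() hook:

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4096

/* Illustrative cache entry; the dirty bit supports write-back. */
struct entry { uint64_t addr; int dirty; char data[BLOCK_SIZE]; };

extern void disk_write(uint64_t addr, const char *buf);  /* assumed hook */

/* Write-through: update the cached copy AND the disk immediately. */
void write_through(struct entry *e, const char *buf) {
    memcpy(e->data, buf, BLOCK_SIZE);
    disk_write(e->addr, e->data);    /* high reliability, slower writes */
}

/* Write-back: update only the cached copy; commit on flush/eviction. */
void write_back(struct entry *e, const char *buf) {
    memcpy(e->data, buf, BLOCK_SIZE);
    e->dirty = 1;                    /* fast, but lost if we crash first */
}
```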
Full Example: Read Operation
A user process requests a read from a device
The OS checks whether the data is in a buffer; if not:
The OS tells the device driver to perform the input
The device driver tells the DMA controller what to do, then blocks
The DMA controller transfers the data to a kernel buffer
The DMA controller interrupts the CPU when the transfer is complete
The OS transfers the data to the user process and places the process in the ready queue
The process continues at the point after the system call
I/O Summary
I/O is expensive for several reasons:
Slow devices & slow communication links
Contention from multiple processes
Supported via system calls, interrupt handling (slow)
Approaches to improving performance:
Caching – reduces data copying
Reduce interrupt frequency via large data transfers
DMA controllers – offload computation from CPU
Storage
Goal: improve the performance of storage systems (i.e., the disk)
Scheduling
Interleaving
Read-ahead
Disk Operations
Disks are SLOW!
Latency:
1 disk seek ≈ 40,000,000 CPU cycles
Seek – position the head over the track/cylinder
Rotational delay – time for the sector to rotate underneath the head
Bandwidth:
Transfer time – time to move the bytes from disk to memory
Calculating Disk Operations Time
Read/write n bytes:
I/O time = seek + rotational delay + transfer(n)
Can’t reduce transfer time
Can we reduce latency?
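For scale, here is the formula evaluated with illustrative (assumed) numbers: an 8 ms average seek, a 7200 RPM spindle (about 4.17 ms average rotational delay), and a 100 MB/s transfer rate. The seek and rotational terms dwarf the transfer term for a 4 KB block, which is why the following slides focus on reducing latency.

```c
#include <stdio.h>

int main(void) {
    /* Illustrative numbers only, not any particular disk. */
    double seek_ms = 8.0;
    double rot_ms  = 0.5 * (60000.0 / 7200.0);  /* half a rotation */
    double rate_bytes_per_ms = 100e6 / 1000.0;  /* 100 MB/s        */
    double n = 4096.0;                          /* one 4 KB block  */

    double transfer_ms = n / rate_bytes_per_ms;

    /* I/O time = seek + rotational delay + transfer(n) */
    printf("I/O time = %.2f + %.2f + %.3f = %.2f ms\n",
           seek_ms, rot_ms, transfer_ms,
           seek_ms + rot_ms + transfer_ms);
    return 0;
}
```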
Reducing Latency
Minimize seek time & rotational latency:
Smaller disks
Faster disks
Sector size tradeoff
Layout
Scheduling
Disk Head Scheduling
Change order of disk accesses
Reduce length & number of seeks
Algorithms
FCFS
SSTF
SCAN, C-SCAN
LOOK, C-LOOK
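A minimal C sketch of the first two policies, computing total head movement over a request queue; run on the example from the next slides, it reproduces the totals shown there (122 for FCFS, 92 for SSTF).

```c
#include <stdio.h>
#include <stdlib.h>

/* Total head movement when serving requests in arrival order (FCFS). */
static int fcfs_distance(int head, const int *req, int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += abs(req[i] - head);
        head = req[i];
    }
    return total;
}

/* SSTF: repeatedly serve the pending request closest to the head. */
static int sstf_distance(int head, int *req, int n) {
    int total = 0;
    for (int served = 0; served < n; served++) {
        int best = served;
        for (int i = served + 1; i < n; i++)
            if (abs(req[i] - head) < abs(req[best] - head))
                best = i;
        int tmp = req[served]; req[served] = req[best]; req[best] = tmp;
        total += abs(req[served] - head);
        head = req[served];
    }
    return total;
}

int main(void) {
    int a[] = { 65, 40, 18, 78 };  /* the example on the next slides */
    int b[] = { 65, 40, 18, 78 };
    printf("FCFS: %d\n", fcfs_distance(50, a, 4));  /* prints 122 */
    printf("SSTF: %d\n", sstf_distance(50, b, 4));  /* prints 92  */
    return 0;
}
```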
FCFS
First-come, first-served
Example: disk tracks – (65, 40, 18, 78); assume the head starts at track 50
(figure: the four request tracks marked across the disk surface, served in arrival order)
Total seek time? 15 + 25 + 22 + 60 = 122
SSTF
Shortest-seek-time first (like SJF)
Example: disk tracks – (65, 40, 18, 78); assume the head starts at track 50
(figure: the four request tracks marked across the disk surface, served nearest-first)
Total seek time? 10 + 22 + 47 + 13 = 92
Is this optimal?
SCAN
Always move back and forth across the disk
a.k.a. the elevator algorithm
Example: disk tracks – (65, 40, 18, 78); assume the head starts at track 50, moving forwards
(figure: the head sweeps up to the end of the disk, then reverses)
Total seek time (assuming the last track is 100)? 15 + 13 + 22 + 60 + 22 = 132
When is this good?
C-SCAN
Circular SCAN: go back to the start of the disk after reaching the end
Example: disk tracks – (65, 40, 18, 78); assume the head starts at track 50, moving forwards
(figure: the head sweeps to the end of the disk, jumps back to track 0, and sweeps forwards again)
Total seek time (assuming the last track is 100)? 15 + 13 + 22 + 100 + 18 + 22 = 190
Can be expensive, but more fair
LOOK variants
Instead of going to the end of the disk, go only as far as the last request in each direction
SCAN, C-SCAN → LOOK, C-LOOK
Exercises
5,000 blocks; head currently at 143, moving forwards
Request queue: (86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130)
FCFS
SSTF
SCAN
LOOK
C-SCAN
C-LOOK
Find sequence & total seek time
First = 0, last = 4999
Solutions
FCFS
143, 86, 1470, 913, 1774, 948, 1509, 1022, 1750, 130 – distance: 7,081
SSTF
143, 130, 86, 913, 948, 1022, 1470, 1509, 1750, 1774 – distance: 1,745
SCAN
143, 913, 948, 1022, 1470, 1509, 1750, 1774, 4999, 130, 86 – distance: 9,769
LOOK
143, 913, 948, 1022, 1470, 1509, 1750, 1774, 130, 86 – distance: 3,319
C-SCAN
143, 913, 948, 1022, 1470, 1509, 1750, 1774, 4999, 0, 86, 130 – distance: 9,985
C-LOOK
143, 913, 948, 1022, 1470, 1509, 1750, 1774, 86, 130 – distance: 3,363
Storage – Outline
Improving performance of storage systems (i.e., the disk)
Scheduling
Interleaving
Read-ahead
Tertiary storage
Disk Interleaving
Contiguous allocation requires an OS response before the disk spins past the next block
Instead of physically contiguous allocation, interleave blocks
Choose the interleave factor relative to the OS's speed and the disk's rotational speed
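A sketch of one possible logical-to-physical mapping, assuming the interleave factor divides the sectors per track; the formula is illustrative rather than any particular disk's layout.

```c
#include <stdio.h>

/* Place logical block i on every `interleave`-th sector of the track
 * (valid when interleave divides sectors evenly). */
static int interleaved_sector(int logical, int sectors, int interleave) {
    return (logical * interleave) % sectors
         + (logical * interleave) / sectors;
}

int main(void) {
    /* 8 sectors/track, interleave factor 2:
     * logical blocks 0..7 land on sectors 0 2 4 6 1 3 5 7,
     * giving the OS a one-sector gap to respond between reads. */
    for (int i = 0; i < 8; i++)
        printf("%d ", interleaved_sector(i, 8, 2));
    printf("\n");
    return 0;
}
```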
Read-Ahead
Reduce seeks by reading blocks from disk ahead of the user's request
Place them in a buffer on the disk controller
Similar to pre-paging
Easier to predict future accesses on disk?
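On POSIX systems an application can also ask for read-ahead explicitly; a small sketch using posix_fadvise():

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_fadvise on some systems */
#include <fcntl.h>

/* Hint that the file will be read sequentially (so the kernel can read
 * ahead aggressively) and ask it to prefetch the first range now. */
int enable_readahead(int fd, off_t prefetch_bytes) {
    posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    return posix_fadvise(fd, 0, prefetch_bytes, POSIX_FADV_WILLNEED);
}
```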
RAID
One way to really improve throughput: add disks
Data spread across 100 disks = 100x improvement in bandwidth
But: reliability drops by a factor of 100
Solution – RAID: Redundant Array of Inexpensive Disks
Replicate data, but use some disks/blocks for checking
RAID Levels
RAID-1: Mirroring
Just copy disks = 2x disks, ½ of them for checking
RAID-2: Add error-correcting checks
Interleave disk blocks with ECC codes (parity, XOR)
10 data disks require 4 check disks
Same performance as level 1
RAID-4: Striping data
Spread blocks across disks
Improves read performance, but impairs writes
RAID-5: Striping data & check info
Removes the bottleneck on the check disks
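A minimal sketch of the parity arithmetic behind the striped check info in RAID-4/5: the check block is the XOR of the data blocks, so any single lost block can be rebuilt from the survivors. Disk counts and block size here are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

#define NDATA  4      /* data disks; one more disk holds parity */
#define STRIPE 4096   /* bytes per block                        */

/* Compute the parity block as the XOR of the data blocks. */
void compute_parity(uint8_t data[NDATA][STRIPE], uint8_t parity[STRIPE]) {
    for (size_t i = 0; i < STRIPE; i++) {
        uint8_t p = 0;
        for (int d = 0; d < NDATA; d++)
            p ^= data[d][i];
        parity[i] = p;
    }
}

/* Rebuild a failed disk's block: XOR the parity with the survivors. */
void reconstruct(uint8_t data[NDATA][STRIPE], const uint8_t parity[STRIPE],
                 int failed) {
    for (size_t i = 0; i < STRIPE; i++) {
        uint8_t p = parity[i];
        for (int d = 0; d < NDATA; d++)
            if (d != failed)
                p ^= data[d][i];
        data[failed][i] = p;
    }
}
```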
Storage – Summary
The OS can improve storage system performance:
Scheduling
Interleaving
Read-ahead
Adding intelligent hardware (RAID) improves both performance & reliability