Transcript RAID

Redundant Array of Independent Disks


Many systems today need to store many
terabytes of data.
We don't want to use a single, large disk:
- too expensive
- failures could be catastrophic

We would prefer to use many smaller disks.



RAID is a storage technology.
It was first defined by David Patterson, Garth A.
Gibson, and Randy Katz at the University of
California, Berkeley in 1987.
It is the organization of multiple disks into a
large, high-performance logical disk.


An array of multiple disks accessed
in parallel will give greater throughput than a
single disk.
Redundant data on multiple disks
provides fault tolerance.


Striping
Redundancy


Take file data and map it to different disks
Allows for reading data in parallel
Example with four disks: file data block 0 -> Disk 0, block 1 -> Disk 1,
block 2 -> Disk 2, block 3 -> Disk 3
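
The block-to-disk mapping above can be sketched as a round-robin calculation. This is a minimal illustration, not how any particular controller does it; the function name `stripe_location` and the "stripe row" idea (which row of blocks on the disk) are assumptions added for the example.

```python
def stripe_location(block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, stripe row), round-robin."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return block % num_disks, block // num_disks


# With four disks, blocks 0..3 land on disks 0..3 and block 4 wraps to disk 0.
layout = [stripe_location(b, 4) for b in range(6)]
print(layout)
```

Because consecutive blocks sit on different disks, a read of blocks 0 to 3 can be served by all four disks at once.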


In engineering, redundancy is the duplication
of critical components or functions of a
system with the intention of increasing
reliability of the system, usually in the case of
a backup or fail-safe.
Data redundancy occurs in systems in which
the same data is repeated on
two or more disks.




A number of standard schemes have evolved
which are referred to as levels.
There were five RAID levels originally
conceived.
Other kinds have been proposed in the literature.
Levels 2 and 4 are rarely used in commercial products.



RAID 0
Break a file into blocks of data
Stripe the blocks across disks in the system
Provides no redundancy or error detection
- important to consider because more disks means a
lower Mean Time To Failure (MTTF) for the array as a whole
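
To see why more disks lower the MTTF, here is a rough estimate under a simplifying assumption that is not stated in the slides: disks fail independently at a constant rate, and the array is lost as soon as any one disk fails (no redundancy). Under that assumption the array MTTF is the single-disk MTTF divided by the number of disks. The function name and the 100,000-hour figure are hypothetical.

```python
def array_mttf_hours(disk_mttf_hours: float, num_disks: int) -> float:
    """Approximate MTTF of a striped array with no redundancy.

    Assumes independent disks with a constant failure rate, so the
    failure rates add up and the array MTTF shrinks by a factor of N.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mttf_hours / num_disks


# Hypothetical numbers: ten disks rated at 100,000 hours each.
print(array_mttf_hours(100_000, 10))
```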




RAID 1
A complete file is stored on a single disk
A second disk contains an exact copy of the file
Provides complete redundancy of data
Most expensive RAID implementation
- requires twice as much storage space
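
A toy model of mirroring may make the trade-off concrete. It is only a sketch: the `Mirror` class is hypothetical, each "disk" is an in-memory dictionary, and resynchronizing a replaced disk is left out.

```python
class Mirror:
    """Toy mirrored pair: each 'disk' maps a block number to its bytes."""

    def __init__(self) -> None:
        self.disks = [{}, {}]
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to each working disk, so both hold a full copy.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Serve the read from the first disk that has not failed.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]
        raise IOError("both copies are lost")


m = Mirror()
m.write(0, b"hello")
m.failed[0] = True      # simulate losing the first disk
print(m.read(0))        # the data is still readable from the second disk
```

Every block is stored twice, which is where the doubled storage cost comes from.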



RAID 2 implements bit striping with ECC
An error correction code (Hamming code) allows
for correction of a single-bit error.
RAID 2 is not as efficient as other RAID levels and is
not generally used.
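
As an illustration of single-bit correction, here is a sketch of the classic Hamming(7,4) code: 4 data bits plus 3 parity bits. The slides only say "Hamming code", so the choice of the (7,4) variant and the function names are assumptions. In RAID 2 the bits of a codeword would be spread across separate drives; this example works on one codeword in memory.

```python
def hamming74_encode(d: list[int]) -> list[int]:
    """Encode 4 data bits as a 7-bit codeword (parity at positions 1, 2, 4)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(code: list[int]) -> list[int]:
    """Return a copy of the codeword with at most one flipped bit repaired."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4   # 1-based position of the bad bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return c


word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1                        # flip one bit
print(hamming74_correct(damaged) == word)
```

Three parity bits for every four data bits is a large overhead, which is one way to see why this scheme compares poorly with a single parity drive.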




RAID 3
Data is striped so each sequential byte is on a
different drive.
Parity is calculated across corresponding
bytes and stored on a dedicated parity drive.
It requires only one disk for parity data.
RAID 3 suffers from a write bottleneck, because every
write must also update the single parity drive.
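
The parity calculation can be sketched with XOR, which is the usual choice for a single parity drive. The sample data and the helper name `xor_parity` are made up for illustration; each byte string stands in for the contents of one data drive.

```python
from functools import reduce


def xor_parity(chunks: list[bytes]) -> bytes:
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


data = [b"RAID", b"disk", b"3!!!"]     # contents of three data drives
parity = xor_parity(data)              # contents of the parity drive

# Pretend drive 1 is lost: XOR the survivors with the parity to rebuild it.
rebuilt = xor_parity([data[0], data[2], parity])
print(rebuilt)
```

The same XOR that produces the parity also recovers any one missing drive, which is why a single extra disk is enough.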



RAID 4
Similar to RAID 3, but
it stripes data in much larger blocks
or segments.
Rarely used commercially.

RAID 5
Parity is distributed across all the disks to avoid the
parity-drive bottleneck.

Best of all worlds
- read and write performance close to that of RAID Level 1
- requires as much disk space as Levels 3 and 4
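
One way to picture distributed parity is to rotate the parity block by one disk per stripe. The rotation below is only one possible layout, chosen for the sketch; real implementations use several different rotation schemes, and the function names are hypothetical.

```python
def raid5_parity_disk(stripe: int, num_disks: int) -> int:
    """Disk that holds parity for a stripe, moving one disk left per stripe."""
    return (num_disks - 1 - stripe) % num_disks


def raid5_layout(num_stripes: int, num_disks: int) -> list[list[str]]:
    """Build a table of stripes; 'P' marks parity, 'Dn' marks data block n."""
    rows = []
    block = 0
    for stripe in range(num_stripes):
        parity_disk = raid5_parity_disk(stripe, num_disks)
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


for row in raid5_layout(4, 4):
    print(" ".join(row))
```

Since each stripe keeps its parity on a different disk, parity updates are spread over all the disks instead of queuing on one.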


Combine two levels and get the advantages
from both.
Examples: 0+1, 1+0, 0+3, 3+0, 0+5, 5+0, 1+5,
and 5+1.
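
As a small illustration of combining levels, the sketch below places a logical block in a 1+0 arrangement: blocks are striped across mirrored pairs, and both disks of the chosen pair hold a copy. The disk numbering (pair k uses disks 2k and 2k+1) and the function name are assumptions made for the example.

```python
def raid10_disks(block: int, num_pairs: int) -> tuple[int, int]:
    """Return the two physical disks that hold a logical block in 1+0."""
    if num_pairs < 1:
        raise ValueError("num_pairs must be at least 1")
    pair = block % num_pairs           # striping step: pick a mirrored pair
    return 2 * pair, 2 * pair + 1      # mirroring step: both members get a copy


# Four disks arranged as two mirrored pairs.
print([raid10_disks(b, 2) for b in range(4)])
```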





Today, RAID is found everywhere:
In operating system software.
In stand-alone controllers providing advanced
data integrity in high-end storage area
networks.
In laptops, as well as desktops, workstations,
servers, and external enclosures with a larger
number of hard disk drives.
RAID is even included in TV set-top boxes and
personal storage devices.