Enhanced Availability With RAID

CC5493/7493

RAID

• Redundant Array of Independent Disks
• RAID is implemented to improve:
  – IO throughput (speed)
  – Availability of a file system

RAID Implementation

• Software – often criticized as not being a true RAID implementation.

• Hardware – A special RAID controller is required.

RAID: Stripe

• "Stripe" takes on two meanings within the context of a RAID system:
  – Stripe width (number of independent drives)
  – Stripe size (storage block size)
• Both stripe width and stripe size are adjusted to enhance IO throughput.

RAID Stripe Width

• Stripe width refers to the number of disks used in parallel for IO transfers to and from the array.

RAID Stripe Size

• Stripe size refers to the size of the storage units organized on the disk surface.
• The stripe size is adjusted to optimize the speed of the IO transfers.
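
To make the two stripe parameters concrete, here is a minimal sketch of how a logical byte offset could be mapped onto a striped array. It assumes simple round-robin placement with no parity; the function name `locate_byte` and its parameters are hypothetical, and real controllers may lay data out differently.

```python
def locate_byte(offset: int, stripe_width: int, stripe_size: int) -> tuple[int, int]:
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes round-robin striping: stripe unit 0 goes to disk 0, unit 1 to
    disk 1, and so on, wrapping around after stripe_width disks.
    """
    if offset < 0 or stripe_width < 1 or stripe_size < 1:
        raise ValueError("offset must be >= 0; width and size must be >= 1")
    # Which stripe unit the byte falls in, and where inside that unit.
    unit, within_unit = divmod(offset, stripe_size)
    # Units are dealt out across the disks row by row.
    row, disk = divmod(unit, stripe_width)
    return disk, row * stripe_size + within_unit
```

With a stripe width of 4 and a 64 KiB stripe size, `locate_byte(65536, 4, 65536)` lands on disk 1 at offset 0, so consecutive stripe units can be transferred by different disks in parallel.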

Common RAID Types

• RAID-0
• RAID-1
• RAID-1+0, RAID-0+1
• RAID-5
• RAID-6

RAID-0

• AKA disk striping
• Does not provide redundancy
• Degrades data availability: the array's mean time to failure (MTTF) is lower than that of a single disk
• Improves IO throughput (average IO transfer rate improves)
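
A rough sketch of why striping lowers the mean time to failure. It assumes the disks fail independently at a constant rate (an assumption not stated in the slides). Under that model the array is lost as soon as any one disk fails, so its MTTF is approximately the single-disk MTTF divided by the number of disks. The function name and the figures used are illustrative only.

```python
def raid0_mttf(disk_mttf_hours: float, n_disks: int) -> float:
    """Approximate MTTF of a RAID-0 array of identical disks.

    Assumes independent failures at a constant rate, so the failure
    rates of the individual disks simply add up.
    """
    if n_disks < 1:
        raise ValueError("n_disks must be >= 1")
    return disk_mttf_hours / n_disks
```

For example, four disks each rated at a hypothetical 1,000,000 hours give `raid0_mttf(1_000_000, 4)`, which is 250,000 hours for the array as a whole.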

RAID-0

• Ideal for temporary storage requiring fast data access.

– Engineering/scientific calculations on large data volumes, as long as the data on the array is a redundant temporary copy.

RAID-1

• AKA mirroring
• Requires two independent disk devices
  – The first disk stores the data
  – The second disk is an image of the first
  – Can double the overall read throughput

• width = 1

RAID-1 Advantages

• Improves data availability.

• A dual-channel controller allows two simultaneous read operations.

• Allows for error detection on read.

• One drive can be serviced while the other remains available.

• Fault tolerance is one drive: the array survives the failure of either disk.

RAID-1 Disadvantages

• Writes have a slight performance penalty compared to no RAID.

• Doubles the cost of storage.

• Storage efficiency = 50%

RAID-1

• Ideal for data that is read more often than written: – Some database information that is not updated often.

– Web Server information (lots of reads, few writes)

RAID-1+0

• Enhances IO throughput and data availability.

• Requires 2(n+1) separate disk devices, where n = 1, 2, 3, 4, …
  – Minimum of 4 disks required (n=1)

• Width = 2

RAID-1+0

• Width = 4

RAID-1+0

• RAID-1+0 has a higher fault tolerance than RAID-0, RAID-1, and RAID-5.

• Storage efficiency is 50%

RAID-0+1

• Requires the same hardware as RAID-1+0, but is less fault tolerant.

• However, RAID-0+1 offers better read throughput than RAID-1+0.

RAID-0+1

• Consists of duplicate (mirrored) RAID-0 arrays, which allows simultaneous reads.
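
The difference in fault tolerance between RAID-1+0 and RAID-0+1 can be illustrated by enumerating two-disk failures on a four-disk array. This is a sketch under stated assumptions: the disk numbering and grouping are hypothetical, and a RAID-0+1 stripe set is treated as unusable as soon as any one of its members fails.

```python
from itertools import combinations

DISKS = range(4)
# Hypothetical layouts for the same four disks.
RAID10_MIRROR_PAIRS = [{0, 1}, {2, 3}]  # each pair is a mirror; the pairs are striped
RAID01_STRIPE_SETS = [{0, 1}, {2, 3}]   # each set is a RAID-0 stripe; the sets are mirrored


def raid10_survives(failed: set) -> bool:
    # Data is lost only if both disks of the same mirror pair have failed.
    return all(not pair <= failed for pair in RAID10_MIRROR_PAIRS)


def raid01_survives(failed: set) -> bool:
    # At least one stripe set must have no failed member at all.
    return any(not (stripe & failed) for stripe in RAID01_STRIPE_SETS)


def count_survivable(survives, n_failures: int = 2) -> int:
    """Count the failure combinations of the given size that the array survives."""
    return sum(survives(set(combo)) for combo in combinations(DISKS, n_failures))
```

Of the six possible two-disk failures, this model has RAID-1+0 surviving four and RAID-0+1 surviving two. Both layouts survive any single-disk failure.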

RAID-5

• RAID-5 enhances
  – IO data throughput
  – Data availability
• Parity information enhances availability
• Requires a minimum of 3 independent disk devices.

Parity Information

• Based on the logical exclusive-or operation.
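
A minimal sketch of exclusive-or parity. The parity block is the byte-wise XOR of the data blocks, and any single missing block can be recovered by XOR-ing the remaining blocks with the parity. The helper name `xor_blocks` and the sample bytes are illustrative and not taken from the slides.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must be the same length")
    # zip(*blocks) yields one tuple of bytes per position; XOR each tuple together.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks on three disks, parity on a fourth.
d0, d1, d2 = b"\x0f\xf0\xaa", b"\x33\x55\x01", b"\xc3\x3c\xff"
parity = xor_blocks(d0, d1, d2)

# If the disk holding d1 fails, its contents follow from the survivors.
rebuilt_d1 = xor_blocks(d0, d2, parity)
```

Because XOR is its own inverse, the same function serves for both computing parity and rebuilding a lost block.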

RAID-5 Configuration

• Stripe Width = 4

RAID-5

• The most common implementation of RAID.

• Ideal for a disk-server providing general storage.

• A good balance between reliability and speed.

• Often implemented using high quality disk drives (SCSI, 15k rpm, high MTTF)

RAID-5 Limitations

• Overhead occurs during writes due to the parity calculation and parity write.

• Storage efficiency is not 100% due to the parity storage requirements.
  – Storage efficiency = (n-1)/n, where n = number of drives.
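
The write overhead can be sketched with the usual read-modify-write cycle for a small write. The controller reads the old data block and the old parity block, computes the new parity as old parity XOR old data XOR new data, then writes both blocks back. That makes two reads and two writes for one logical write. The function names below are hypothetical, and a given controller may instead recompute parity from the full stripe.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Parity after overwriting one data block, without reading the other data blocks.

    XOR-ing out the old data removes its contribution to the parity;
    XOR-ing in the new data adds the replacement's contribution.
    """
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)
```

The result matches the parity that would be obtained by XOR-ing every data block in the stripe again, but only the changed block and the parity block need to be touched.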

RAID-5 (S)ATA Limitations

• Large capacity (S)ATA drives are more likely to contain bad blocks.

• After a disk failure, bad blocks on the surviving drives can make it impossible to rebuild the array.
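
A back-of-the-envelope sketch of the rebuild risk. A RAID-5 rebuild must read every surviving drive in full, so the chance of hitting at least one unreadable block grows with capacity. The unrecoverable read error rate of one per 10^14 bits used below is an assumed, illustrative figure that does not come from the slides, and errors are assumed to be independent.

```python
import math


def rebuild_failure_probability(surviving_disks: int, disk_bytes: int,
                                ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one unrecoverable read error during a full rebuild.

    ure_per_bit is an assumed error rate; check the drive's datasheet for
    the real value. Errors are modelled as independent per bit.
    """
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^bits, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))
```

Under these assumptions, a four-disk array of 2 TB drives (three survivors to read) gives a probability of roughly 0.38, which is why a second level of parity becomes attractive for large drives.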

RAID-6

• Contains two sets of parity.

• Tolerates two simultaneous disk failures.

• A better solution for (S)ATA arrays where each disk has a large capacity (multiple TB).

• Stripe Width = 6

RAID-6

• Higher availability at the cost of greater IO overhead due to complex parity calculations and storage.

• Storage efficiency = (n-2)/n
• Becoming more popular for large storage capacity (S)ATA arrays
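
The storage efficiency figures quoted for each RAID level can be collected in one small helper. This is a sketch: the function name and level strings are hypothetical, and the minimum of four disks for RAID-6 is an assumption added here, not a figure from the slides.

```python
from fractions import Fraction


def storage_efficiency(level: str, n: int) -> Fraction:
    """Usable fraction of raw capacity for an array of n identical disks."""
    if level == "RAID-0":
        return Fraction(1)              # no redundancy, all capacity usable
    if level in ("RAID-1", "RAID-1+0", "RAID-0+1"):
        return Fraction(1, 2)           # every block is stored twice
    if level == "RAID-5":
        if n < 3:
            raise ValueError("RAID-5 needs at least 3 disks")
        return Fraction(n - 1, n)       # one disk's worth of parity
    if level == "RAID-6":
        if n < 4:                       # assumed minimum, not from the slides
            raise ValueError("RAID-6 needs at least 4 disks")
        return Fraction(n - 2, n)       # two disks' worth of parity
    raise ValueError(f"unknown RAID level: {level}")
```

For example, `storage_efficiency("RAID-5", 4)` is 3/4 and `storage_efficiency("RAID-6", 6)` is 2/3, so the parity overhead shrinks as more disks are added.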

RAID-6 Disadvantages

• More expensive to implement due to the extra parity information
• Slower write operations compared to RAID-5

RAID Disk Swapping

• Hot Swap
• Warm Swap
• Cold Swap

Hot Swap

• The ability to swap out a failed disk from a RAID array without an interruption of service from the array.

• Performance is reduced while the replacement disk is being rebuilt.

Warm Swap

• The array is not accessible while a drive is being serviced, but the system does not need to be shut down.

Cold Swap

• The system must be shut down to service the array.

Spare Disk: Hot Spare

• Some RAID controllers can be configured to immediately recover from a disk failure if a hot-spare disk is connected to the controller at all times.

RAID Disk Failure and Performance

• When a failed disk is replaced in an array, there is a performance hit as the new disk must be re-populated with the required data for the complete array.

RAID Summary

• RAID-0: for temporary storage only
• RAID-1: ideal for disk services that provide mostly read operations, like database services and web services.
• RAID-5: general-purpose disk server
• RAID-6: for very large data requirement environments (multiple terabytes).

RAID Summary

• RAID-1+0: general-purpose disk server where RAID-5 and RAID-6 are not adequate.
  – Better fault tolerance
  – More IO throughput

Other?

• RAID 1+1: mirror a mirrored RAID-1
  – Triples the cost of storage.
  – Excellent fault tolerance.
  – Excellent read throughput.
  – Writes will suffer.