Automatic RAID Construction
Ba-Quy Vuong (Bryan) and Yiying Zhang
Department of Computer Sciences
University of Wisconsin-Madison
Outline
• Introduction
• Architecture and Design
• Implementation
• Performance Evaluation
• Conclusion & Future Work
What is RAID?
• Redundant Arrays of Inexpensive Disks
• Purposes:
– Reliability
– Performance
RAID Implementation
• Hardware RAID:
– Using dedicated hardware to control the disk array
– Host independent
• Software RAID:
– Using a software layer sitting above the disk drivers to control the disk array
– Host dependent
Problems with Software RAID
• There are many ways to build RAID systems, including:
– Different checksum-based schemes
– Different parity-based schemes
• Not flexible: Each RAID level requires a specific RAID driver
• Not robust: Writing a new RAID driver is time-consuming and may introduce many bugs
Our Solution: Automatic RAID Construction
• Approach:
– A way to describe checksum and parity-based schemes
– Mapping the specified scheme to a RAID driver
• Advantages:
– Flexibility
– Robustness
Architecture
• Design Consideration
– Parity on top of Checksum
– Checksum on top of Parity
Architecture
• Example:
– 3-disk RAID 5
– Mirroring
checksum
Automatic Parity
• Goal: Allow any parity scheme
• Two data structures
– Layout matrix: How blocks are laid out
• The whole matrix corresponds to a stripe
• Each row corresponds to one strip
• Zeros mean data blocks, ones mean parity blocks
• Number of columns is the number of disks
4-disk RAID 5:
0 0 0 1
0 0 1 0
0 1 0 0
1 0 0 0
4-disk RAID 4:
0 0 0 1
4-disk RAID 0+1:
0 1 0 1
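To make the layout matrix concrete, here is a minimal Python sketch of how a driver might use it to locate a data block. The three matrices are the ones from the slide. The function names and the row-major numbering of data blocks within a stripe are assumptions made for illustration, not details confirmed by the slides.

```python
# Layout matrices from the slide: 0 = data block, 1 = parity block.
RAID5_LAYOUT = [
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]
RAID4_LAYOUT = [[0, 0, 0, 1]]
RAID01_LAYOUT = [[0, 1, 0, 1]]


def block_positions(layout):
    """Return (data, parity): lists of (row, disk) cells in row-major order."""
    data, parity = [], []
    for row, cells in enumerate(layout):
        for disk, is_parity in enumerate(cells):
            (parity if is_parity else data).append((row, disk))
    return data, parity


def locate_data_block(layout, logical_block):
    """Map a logical data block number to (stripe, row, disk).

    Assumption: data blocks are numbered row by row within a stripe,
    and stripes follow one another on every disk.
    """
    data, _ = block_positions(layout)
    stripe, offset = divmod(logical_block, len(data))
    row, disk = data[offset]
    return stripe, row, disk
```

For example, `locate_data_block(RAID5_LAYOUT, 5)` skips the parity cell in the second row and lands on the fourth disk of that row.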
Automatic Parity
• Two data structures
– Parity matrix: Which data blocks contribute to each parity block
• #rows: #parity blocks in one stripe
• #columns: #data blocks in one stripe
• If the element at row i, column j is one, data block j is used to calculate parity block i
4-disk RAID 5:
1 1 1 0 0 0 0 0 0 0 0 0
0 0 0 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 1 1 1
4-disk RAID 4:
1 1 1
4-disk RAID 0+1:
1 0
0 1
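The parity matrix can be read as a recipe for both computing and rebuilding blocks. The sketch below assumes XOR as the parity operation, which the slides do not state. All function names are hypothetical. Under that assumption, a row with a single one (as in RAID 0+1) reduces to a plain copy, i.e. a mirror.

```python
# Parity matrices from the slide: row i lists the data blocks behind parity block i.
RAID5_PARITY = [
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
]
RAID4_PARITY = [[1, 1, 1]]
RAID01_PARITY = [[1, 0], [0, 1]]


def xor_blocks(blocks, size):
    """XOR equally sized byte blocks together (assumed parity operation)."""
    out = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def compute_parity(parity_matrix, data_blocks):
    """Compute every parity block of one stripe."""
    size = len(data_blocks[0])
    return [
        xor_blocks([data_blocks[j] for j, used in enumerate(row) if used], size)
        for row in parity_matrix
    ]


def rebuild_data_block(parity_matrix, data_blocks, parity_blocks, lost):
    """Rebuild data block `lost` from the first parity row that covers it."""
    for i, row in enumerate(parity_matrix):
        if row[lost]:
            others = [data_blocks[j] for j, used in enumerate(row)
                      if used and j != lost]
            return xor_blocks(others + [parity_blocks[i]], len(parity_blocks[i]))
    raise ValueError("no parity block covers this data block")
```

Because the code only consults the matrix, switching from RAID 4 to RAID 5 or to mirroring means passing a different matrix rather than writing a new routine.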
Automatic Checksum - Goals
• Checksum over data and parity blocks
• Flexible number of blocks as a checksum unit
• Flexible checksum size
• Flexible functions
• Flexible locations
Automatic Checksum - Design
• User specified parameters:
– # of blocks as a checksum unit
– Checksum size for each block
– Checksum function
• Example:
– 3 blocks as a checksum unit
– 1 block for checksums
• One more level of mapping
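The extra level of mapping in the example (3 blocks per checksum unit, 1 block for checksums) can be sketched as simple address arithmetic. The placement of the checksum block directly after the blocks it covers is an assumption; the slides list checksum location as configurable. `map_block` is a hypothetical name.

```python
def map_block(block, unit_blocks=3, checksum_blocks=1):
    """Map a block number seen above the checksum layer to the layer below.

    Returns (physical_block, checksum_block, slot), where `slot` is the
    position of the block inside its checksum unit.
    Assumption: each group is laid out as `unit_blocks` payload blocks
    followed by `checksum_blocks` checksum blocks.
    """
    group, slot = divmod(block, unit_blocks)
    base = group * (unit_blocks + checksum_blocks)
    return base + slot, base + unit_blocks, slot
```

With the default parameters, blocks 0 to 2 keep their numbers and share checksum block 3, while block 3 moves to physical block 4.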
Implementation
• RAID driver is implemented as a device driver in Linux
• Checksums and parities are specified by users
• Checksum functions
– Provided: sum, hash-based
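As an illustration of the two kinds of checksum function named above, here is a small Python sketch with a configurable checksum size. The slides only say "sum" and "hash-based", so the use of SHA-1, the byte order, and the function names are all assumptions.

```python
import hashlib


def sum_checksum(block, size=4):
    """Sum of all bytes, truncated to `size` bytes (big-endian)."""
    return (sum(block) % (1 << (8 * size))).to_bytes(size, "big")


def hash_checksum(block, size=4):
    """First `size` bytes of a SHA-1 digest (SHA-1 is an assumed choice)."""
    return hashlib.sha1(block).digest()[:size]


def verify(block, stored, func):
    """Recompute the checksum with `func` and compare it to the stored value."""
    return func(block, len(stored)) == stored
```

A read path could call `verify` on every block it returns and fall back to parity reconstruction when the check fails.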
Implementation
• Memory-based version
– Uses each memory chunk as a disk
– Easy to build and debug
– No significant effect on the overall code
• Disk-based version
– Uses real disks
– Communicates with disk drivers through the bio structure
– Synchronization problems due to asynchronous I/O
Performance Evaluation: Setup
• Host: VMware, Fedora 8, Intel Core 2 Duo 2.2 GHz, 1 GB RAM
• Memory-based
• Simulating disk delay
– Each low-level disk read: 15ms
– Each low-level disk write: 17ms
• Simulating disk failure
– Unable to read (20%)
– Read inconsistency (20%)
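A memory-backed disk with injected delay and failures, as described in this setup, might look like the sketch below. It treats the 20% figures as per-read probabilities, which is one possible reading of the slide. It also accumulates simulated time instead of sleeping. The class name, the 4096-byte block size, and the corruption pattern are hypothetical.

```python
import random

READ_DELAY_MS = 15   # per low-level read, from the slide
WRITE_DELAY_MS = 17  # per low-level write, from the slide


class ReadError(Exception):
    """Raised when a simulated read fails outright."""


class SimulatedDisk:
    def __init__(self, num_blocks, block_size=4096,
                 fail_rate=0.2, corrupt_rate=0.2, seed=None):
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.fail_rate = fail_rate        # assumed: probability per read
        self.corrupt_rate = corrupt_rate  # assumed: probability per read
        self.rng = random.Random(seed)
        self.elapsed_ms = 0               # simulated time, no real sleeping

    def write(self, index, data):
        self.elapsed_ms += WRITE_DELAY_MS
        self.blocks[index] = bytes(data)

    def read(self, index):
        self.elapsed_ms += READ_DELAY_MS
        roll = self.rng.random()
        if roll < self.fail_rate:
            raise ReadError(f"block {index} is unreadable")
        data = self.blocks[index]
        if roll < self.fail_rate + self.corrupt_rate:
            # Read inconsistency: return the block with its first byte flipped.
            return bytes([data[0] ^ 0xFF]) + data[1:]
        return data
```

Setting both rates to zero gives a reliable disk, which is convenient for measuring the delay model alone.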
Performance Evaluation: Settings
• Evaluation settings
– With and without reconstruction
– Different layouts, parity logics, and checksum functions
– Different workloads
• Systems:
– System 1: 4-disk no parity, no checksum
– System 2: 4-disk RAID 0+1 with hash-based checksum
– System 3: 4-disk RAID 0+1 with sum checksum
– System 4: 4-disk RAID 5 with hash-based checksum
– System 5: 4-disk RAID 5 with sum checksum
• Workload:
– Reading and writing 30 KB files
– mkfs, mount
Performance: Avg Read and Write Time
Performance: Deviation of Read & Write
Performance: Reconstruction
Performance: Timeline
Conclusion
• Why automatic RAID?
– Flexible vs. fixed RAID drivers
– Robustness
• Approach
– Automatic Parity with two matrices
– Automatic Checksum with user-defined parameters
• Lessons learned
– Performance is a big issue
– Disk-based RAID is much harder to implement than memory-based RAID
Future Work
• Complete the disk-based version
• Improve the performance
• Check for input correctness
• Extend the parity and checksum layers to handle more schemes
Questions & Answers