X-RAY: A Non-Invasive Exclusive
Caching Mechanism for RAIDs
Lakshmi N. Bairavasundaram
Muthian Sivathanu
Andrea C. Arpaci-Dusseau
Remzi H. Arpaci-Dusseau
ADvanced Systems Laboratory
Computer Sciences Department
University of Wisconsin – Madison
Introduction
Caching in modern systems
Multiple levels
Storage: 2-level hierarchy
Level 1: File system (FS) cache
Software-managed
Main memory of host/client
LRU-like cache replacement
Level 2: RAID cache
Firmware-managed
Memory inside RAID system
Usually LRU replacement
[Figure: Application and file system cache in the Host, above the RAID cache inside the RAID]
Introduction – contd.
Cache placement on read
Read Block no. 10
Replace LRU block
[Figure: an LRU cache holding blocks 39, 23, ..., 45 (LRU to MRU); reading block 10 replaces the LRU block and places 10 at the MRU end]
Introduction – contd.
Cache placement on read
Read Block no. 10
Replace LRU block
2 levels of LRU
Redundant contents
[Figure: Read Block no. 10 passes through the FS cache and the RAID cache; each is an LRU list and each places block 10 at its MRU end, so the block ends up in both caches]
Introduction – contd.
2 levels of LRU
Redundant contents
Goal:
Exclusive caching
[Figure: FS cache and RAID cache shown as two LRU lists]
Improved RAID Caching
Multi-Queue (Zhou et al. 2001)
Add frequency component to cache policy
Not strictly exclusive!
DEMOTE (Wong and Wilkes 2002)
Change interface to disk
File system issues “cache place” command
Has perfect information and hence perfectly exclusive caches
Interface changes – difficult to deploy
Ideal RAID Cache
Exclusive caching
File system and RAID caches should have different contents
Global LRU
Known to work well
RAID cache should be a victim cache
No interface changes
[Figure: a global LRU list spanning the RAID cache (LRU end) and the FS cache (MRU end); labels: Block Read, Victim Block]
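The victim-cache idea on this slide can be illustrated with a small sketch. It assumes the lower level is told exactly which block the FS cache evicts (the perfect information that a DEMOTE-style interface provides, which X-RAY instead has to infer). The class name `ExclusiveTwoLevel` and its methods are hypothetical and only for illustration.

```python
from collections import OrderedDict

class ExclusiveTwoLevel:
    """Toy two-level cache: an FS cache on top of a RAID victim cache.

    Assumption: the RAID level learns every FS-cache eviction directly.
    """

    def __init__(self, fs_size, raid_size):
        self.fs = OrderedDict()    # LRU entry first, MRU entry last
        self.raid = OrderedDict()  # holds only victims of the FS cache
        self.fs_size = fs_size
        self.raid_size = raid_size

    def read(self, block):
        """Return where the read was served from: 'fs', 'raid' or 'disk'."""
        if block in self.fs:
            self.fs.move_to_end(block)          # refresh recency on a hit
            return "fs"
        source = "raid" if block in self.raid else "disk"
        self.raid.pop(block, None)              # exclusive: block moves up
        self.fs[block] = True
        if len(self.fs) > self.fs_size:
            victim, _ = self.fs.popitem(last=False)
            self.raid[victim] = True            # victim drops to the RAID cache
            if len(self.raid) > self.raid_size:
                self.raid.popitem(last=False)   # oldest victim leaves entirely
        return source

cache = ExclusiveTwoLevel(fs_size=2, raid_size=2)
sources = [cache.read(b) for b in (1, 2, 3, 4, 1)]
```

After these five reads the two caches hold disjoint blocks, and together they behave like one LRU list of four blocks.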
X-RAY
Observes disk traffic
Reads and writes to data and metadata
Builds a model of the FS cache
Uses semantic knowledge
Predicts size and contents of FS cache
Identifies set of exclusive blocks
Recent victims of the FS cache
Reads blocks from disk into cache
Result
A nearly exclusive cache without interface changes
[Figure: Host containing the file system cache; RAID containing X-RAY, its model of the FS cache, and the RAID cache]
Talk Outline
Introduction
File Systems
Information and Inferences
X-RAY Cache Design
Results
Conclusion
File System Operation
Applications perform file reads and writes
File system (Unix)
Translates file accesses to disk block requests
Metadata
To maintain application data on disk and manage disk blocks
Periodically written to disk
Examples: inodes, bitmap blocks
File System Operation
Inode
Pointers to data blocks
File access information
Latest access time
[Figure: a file's inode holding pointers to the file's data blocks]
File System Operation
File access
Use inode to obtain pointers to disk data blocks
Read corresponding blocks from disk if they are not in FS cache
Update the access time information in inode
Metadata updates
Periodically check for “dirty” inodes and write to disk
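The steps above determine what the disk actually gets to see. A minimal sketch, assuming an unbounded FS cache and a hypothetical `ToyFileSystem` class (none of these names come from the talk): data-block reads appear only on cache misses, while access-time updates reach the disk later, when dirty inodes are flushed.

```python
class ToyFileSystem:
    """Illustrative model of the file access path; not a real file system."""

    def __init__(self):
        self.inodes = {}        # inode number -> {"blocks", "atime", "dirty"}
        self.cache = set()      # cached block numbers (unbounded, for simplicity)
        self.disk_traffic = []  # requests the disk would observe

    def create(self, ino, blocks):
        self.inodes[ino] = {"blocks": list(blocks), "atime": 0, "dirty": False}

    def read_file(self, ino, now):
        inode = self.inodes[ino]
        for block in inode["blocks"]:
            if block not in self.cache:          # only misses reach the disk
                self.disk_traffic.append(("read", block))
                self.cache.add(block)
        inode["atime"] = now                     # updated in memory only
        inode["dirty"] = True

    def flush_inodes(self):
        """Periodic metadata update: write every dirty inode to disk."""
        for ino, inode in self.inodes.items():
            if inode["dirty"]:
                self.disk_traffic.append(("inode_write", ino, inode["atime"]))
                inode["dirty"] = False

fs = ToyFileSystem()
fs.create(23, [100, 101])
fs.read_file(23, now=5)   # misses: two disk reads
fs.read_file(23, now=6)   # hits: nothing reaches the disk
fs.flush_inodes()         # the disk sees only the latest access time
```

The second access never appears as a read, but its timestamp is visible in the later inode write.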
The Problem
To observe disk traffic and infer the contents of FS cache
Why difficult?
FS cache size changes over time
Shares main memory with virtual memory system
Disk cannot observe all FS-level accesses
[Figure: the FS cache and the RAID's FS cache model both hold blocks 10, 11, 12 (LRU to MRU). A read of block 10 hits in the FS cache and never reaches the disk; a read of block 13 misses and causes a disk read. The FS cache becomes 12, 10, 13 while the model becomes 11, 12, 13.]
Key observation
We need information about accesses that hit in FS cache
File system maintains access information in inodes
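The divergence in this example can be reproduced with a short simulation. This is a sketch under the slide's assumptions (both the FS cache and a naive disk-side model use LRU with three blocks); the helper names `lru_access` and `simulate` are hypothetical.

```python
from collections import OrderedDict

def lru_access(cache, block, capacity):
    """Touch `block` in an LRU cache (LRU entry first). Return True on a hit."""
    if block in cache:
        cache.move_to_end(block)
        return True
    cache[block] = True
    if len(cache) > capacity:
        cache.popitem(last=False)   # evict the LRU entry
    return False

def simulate(initial, accesses, capacity):
    """Run accesses against the real FS cache and a naive disk-side model."""
    fs_cache = OrderedDict((b, True) for b in initial)
    model = OrderedDict((b, True) for b in initial)
    for block in accesses:
        hit = lru_access(fs_cache, block, capacity)
        if not hit:
            # Only misses generate disk reads, so only they update the model.
            lru_access(model, block, capacity)
    return list(fs_cache), list(model)

fs_cache, model = simulate([10, 11, 12], [10, 13], capacity=3)
print(fs_cache)  # [12, 10, 13]
print(model)     # [11, 12, 13]
```

The model wrongly believes block 11 is cached and block 10 is not, because it never saw the hit on block 10.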
Information
Obtain information from observing disk traffic
Knowledge of file system structures and operations
File system maintains time of last access in inodes
Periodic inode writes
Assuming whole file access, all blocks are in FS cache
Assume file system cache policy is LRU
Inferences
Read for data block
Block will be placed in file system cache (MRU block)
Read for previously read data block
Block became victim in file system cache
Blocks with an earlier access time should also be victims
Inode write: new access time, no disk read observed
All blocks belonging to file are in FS cache
Other blocks with later access time should also be present
Design
Recency list (R-list)
List of data blocks ordered by access time
Cache Begin (CB) pointer
Divides R-list into inclusive and exclusive regions
RAID Cache contents
Subset of blocks in exclusive region
[Figure: R-list of (block number, access time) entries from LRU to MRU: A,1; B,1; C,2; D,3; E,3; F,5. The CB pointer separates the exclusive region (blocks the RAID should cache) from the inclusive region (blocks expected to be in FS cache).]
Disk Read
Read Block ‘D’ ; time = 6
[Figure, step 1: R-list (LRU to MRU) A,1 B,1 C,2 | CB | D,3 E,3 F,4; D lies in the inclusive region]
[Figure, step 2: CB moves past D and E: A,1 B,1 C,2 D,3 E,3 | CB | F,4]
[Figure, step 3: D moves to the MRU end with access time 6: A,1 B,1 C,2 E,3 | CB | F,4 D,6]
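The disk-read update in this example can be sketched in a few lines. This is an illustrative reading of the slide, not the authors' implementation: the `RList` class, its method names, and the choice to treat blocks with an access time equal to the re-read block's as victims too (as the figure does for E) are assumptions.

```python
class RList:
    """Recency list: [block, access_time] entries, LRU first, plus a CB index."""

    def __init__(self, entries, cb):
        self.entries = [list(e) for e in entries]
        self.cb = cb  # index of the first entry in the inclusive region

    def index_of(self, block):
        for i, (b, _) in enumerate(self.entries):
            if b == block:
                return i
        return None

    def on_disk_read(self, block, now):
        """Update the model when the disk observes a read of `block`."""
        i = self.index_of(block)
        if i is not None:
            old_time = self.entries[i][1]
            if i >= self.cb:
                # The model expected the block to be in the FS cache, yet it
                # was read from disk: it and every block at least as old must
                # have been evicted, so CB moves past all of them.
                self.cb = sum(1 for _, t in self.entries if t <= old_time)
            del self.entries[i]
            self.cb -= 1  # the removed entry sat in the exclusive region
        self.entries.append([block, now])  # the block is now the MRU entry

    def exclusive(self):
        return [b for b, _ in self.entries[:self.cb]]

    def inclusive(self):
        return [b for b, _ in self.entries[self.cb:]]

r = RList([("A", 1), ("B", 1), ("C", 2), ("D", 3), ("E", 3), ("F", 4)], cb=3)
r.on_disk_read("D", now=6)
print(r.exclusive())  # ['A', 'B', 'C', 'E']
print(r.inclusive())  # ['F', 'D']
```

The sketch relies on the list staying sorted by access time, which holds as long as `now` never goes backwards.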
Inode Write – Access time change
Inode “23” : access time = 6
Semantic knowledge
Inode “23” == blocks D & E
Blocks D, E : access time = 6
[Figure, before the write: R-list (LRU to MRU) A,1 B,1 C,2 D,3 E,4 F,5 G,7]
[Figure, after the write: D and E take access time 6 and move up the list: A,1 B,1 C,2 F,5 | CB | D,6 E,6 G,7]
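A sketch of the inode-write update follows. It assumes the caller has already checked that the access time changed and that no disk read for the file's blocks was observed, and it assumes CB initially sits just before G (the slides do not make the starting position fully clear). The function name `on_inode_write` is hypothetical.

```python
def on_inode_write(entries, cb, file_blocks, atime):
    """Apply a new inode access time to the R-list.

    entries: list of (block, access_time), LRU first
    cb: index of the first entry in the inclusive region
    file_blocks: blocks belonging to the inode (semantic knowledge)
    Returns the updated (entries, cb).
    """
    inclusive_before = {b for b, _ in entries[cb:]}
    # The file was accessed at `atime` without any disk read, so its blocks
    # were served from the FS cache: give them the new access time.
    updated = [(b, atime if b in file_blocks else t) for b, t in entries]
    updated.sort(key=lambda e: e[1])  # stable sort keeps ties in list order
    # Blocks accessed at or after `atime` should also still be cached, so CB
    # moves to the first such entry (it never moves toward the MRU end here).
    for i, (b, t) in enumerate(updated):
        if t >= atime or b in inclusive_before:
            return updated, i
    return updated, len(updated)

before = [("A", 1), ("B", 1), ("C", 2), ("D", 3), ("E", 4), ("F", 5), ("G", 7)]
after, cb = on_inode_write(before, cb=6, file_blocks={"D", "E"}, atime=6)
exclusive = [b for b, _ in after[:cb]]
inclusive = [b for b, _ in after[cb:]]
print(exclusive)  # ['A', 'B', 'C', 'F']
print(inclusive)  # ['D', 'E', 'G']
```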
X-RAY Cache
RAID Cache (size = 2 blocks)
Keep track of additions to window in exclusive region
Read newly-added blocks from disk
Replace blocks no longer in the window
Additional disk bandwidth
Idle time, extra internal bandwidth, freeblock scheduling
[Figure: R-list A,1 B,1 C,2 F,5 | CB | D,6 E,6 G,7, with a 2-block RAID cache window inside the exclusive region]
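The window bookkeeping can be sketched as a set difference. This assumes the window consists of the exclusive-region blocks closest to CB (the most recent victims), which is one plausible reading of the slide; `plan_cache_update` is a hypothetical helper name.

```python
def plan_cache_update(exclusive, cached, size):
    """Decide which blocks to fetch and which to drop.

    exclusive: exclusive-region blocks, LRU first (the last one is next to CB)
    cached: set of blocks currently in the RAID cache
    size: RAID cache capacity in blocks
    """
    # Assumed policy: keep the `size` most recent victims of the FS cache.
    window = exclusive[-size:] if size > 0 else []
    to_fetch = [b for b in window if b not in cached]   # read during idle time
    to_evict = sorted(cached - set(window))             # no longer in the window
    return window, to_fetch, to_evict

# Before the inode write in the previous example, the exclusive region ended
# in E, F, so a 2-block cache would hold {E, F}. After the write it ends in C, F.
window, to_fetch, to_evict = plan_cache_update(
    ["A", "B", "C", "F"], cached={"E", "F"}, size=2
)
print(window)    # ['C', 'F']
print(to_fetch)  # ['C']
print(to_evict)  # ['E']
```

Fetching block C costs an extra disk read, which is why the slide lists idle time, extra internal bandwidth, and freeblock scheduling as sources of the needed bandwidth.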
Talk Outline
Introduction
File Systems
Information and Inferences
X-RAY Cache Design
Results
Tracking FS Cache Contents
RAID Cache Performance
Conclusion
Results – Tracking
Accurate size and content prediction
Highly responsive to FS cache size changes
Tolerates changes in inode write interval
Partial file reads
X-RAY performs well if percentage of partially accessed files is < 40% (typical traces have less than 30%)
Results – Cache Performance
Performs better than LRU and Multi-Queue
Close to DEMOTE, in spite of imperfect information
Hit rate advantage translates to lower read latency
Additional Results
File system cache policy is not LRU
Clock, 2Q
X-RAY performs nearly as well as before
It performs better than both LRU and Multi-Queue
Idle time requirements
X-RAY reads blocks into cache only during idle time
It performs well if the available idle time is greater than one-third of the actual idle time observed in the trace
More in the paper …
Conclusion
Easy deployment is an important goal in developing technology
Higher-level systems maintain various pieces of information about data they manage
Provide low-level systems with basic semantic knowledge
Semantic intelligence for managing RAID caches
Avoid interface changes – use non-invasive mechanisms
Use access information in metadata to track file system cache contents and cache exclusive blocks
In spite of imperfect information, X-RAY performs nearly as well as changing the interface
Semantically-smart Disk Systems
Availability, security and performance improvements
Questions?
ADvanced Systems Laboratory (ADSL)
Computer Sciences, University of Wisconsin-Madison
http://www.cs.wisc.edu/adsl