Optimization of LFS with
Slack Space Recycling and
Lazy Indirect Block Update
Yongseok Oh <[email protected]>
The 3rd Annual Haifa Experimental Systems Conference
May 24, 2010
Haifa, Israel
Yongseok Oh and Donghee Lee (University of Seoul)
Jongmoo Choi (Dankook University)
Eunsam Kim and Sam H. Noh (Hongik University)
SYSTOR2010, Haifa Israel
Outline
Introduction
Slack Space Recycling and Lazy Indirect Block Update
Implementation of SSR-LFS
Performance Evaluation
Conclusions
Log-structured File System
[Figure: blocks A–D collected in the segment buffer and written sequentially to storage]
LFS collects small write requests and writes them sequentially to the storage device [Rosenblum et al., ACM TOCS '91]
Advantages
■ Superior write performance for random workloads
■ Fast recovery
Drawbacks
■ On-demand cleaning
■ Cascading meta-data update
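The write path above can be sketched in a few lines of Python; the block and segment names are purely illustrative, not taken from any real LFS implementation:

```python
# Minimal sketch: an LFS collects small writes in a segment buffer and
# flushes them to storage as one large sequential write.

SEGMENT_SIZE = 4  # blocks per segment (tiny, for illustration)

class SegmentBuffer:
    def __init__(self):
        self.buffer = []  # blocks waiting to be written
        self.log = []     # "storage": segments written sequentially

    def write(self, block):
        self.buffer.append(block)
        if len(self.buffer) == SEGMENT_SIZE:
            self.flush()

    def flush(self):
        if self.buffer:
            # One sequential segment write instead of many random writes
            self.log.append(tuple(self.buffer))
            self.buffer = []

lfs = SegmentBuffer()
for blk in ["A", "B", "C", "D"]:
    lfs.write(blk)
print(lfs.log)  # [('A', 'B', 'C', 'D')]
```

This is why random-write workloads perform well: the device only ever sees sequential segment-sized writes.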
Storage Devices
Hard Disk Drives
■ Mechanical components – disk head, spindle motor, and platter
■ Poor random read/write performance
Solid State Drives
■ Consist of NAND flash memory, no mechanical parts
■ High performance, low-power consumption, and shock resistance
■ Sequential writes are faster than random writes
Outline
Introduction
Slack Space Recycling and Lazy Indirect Block Update
Implementation of SSR-LFS
Performance Evaluation
Conclusions
LFS cleaning
[Figure: the cleaner copies valid blocks A, B from Segment 1 and C, D from Segment 2 into free Segment 3]
To create free segments
■ The LFS cleaner copies valid blocks to another free segment
On-demand cleaning
■ Decreases overall performance
Background cleaning
■ Does not affect foreground performance
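A minimal sketch of copy-based cleaning, with invalid blocks marked explicitly (illustrative only, not the actual cleaner code):

```python
# Copy-based LFS cleaning: valid blocks from partially invalidated
# segments are copied into a free segment; the sources become free.

def clean(segments, valid):
    """Copy the valid blocks of `segments` into one new segment.
    Returns (new_segment, number_of_freed_segments)."""
    new_segment = [b for seg in segments for b in seg if b in valid]
    return new_segment, len(segments)

seg1 = ["A", "B", "x", "x"]   # 'x' marks invalidated blocks
seg2 = ["x", "C", "x", "D"]
new_seg, freed = clean([seg1, seg2], valid={"A", "B", "C", "D"})
print(new_seg, freed)  # ['A', 'B', 'C', 'D'] 2
```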
Hole-plugging
[Figure: valid blocks A, B from Segment 1 are copied into the holes of Segment 2, leaving Segment 1 free]
Matthews et al. employed hole-plugging in LFS [Matthews et al., ACM OSR '97]
The cleaner copies valid blocks into the holes of other segments
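A hedged sketch of hole-plugging, where `None` marks a hole (all names are illustrative):

```python
# Hole-plugging: instead of writing valid blocks to a fresh segment,
# the cleaner copies them into invalid slots ("holes") of other used
# segments, so no extra free segment is consumed.

def plug_holes(source, target):
    """Move the valid blocks of `source` into the None-holes of `target`."""
    valid = [b for b in source if b is not None]
    for i, slot in enumerate(target):
        if slot is None and valid:
            target[i] = valid.pop(0)
    return target

seg1 = ["A", "B", None, None]   # segment being cleaned
seg2 = [None, "C", None, "D"]   # segment with two holes
print(plug_holes(seg1, seg2))   # ['A', 'C', 'B', 'D']
```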
Slack Space Recycling (SSR) Scheme
[Figure: new blocks E–H from the segment buffer are written directly into the slack space of Segments 1 and 2 (SSR)]
We propose the SSR scheme, which directly recycles slack space to avoid on-demand cleaning
Slack space is the invalid area in a used segment
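The recycling step can be sketched the same way — new blocks land directly in the slack slots of a used segment (an illustration of the idea, not the SSR-LFS implementation):

```python
# Slack Space Recycling: incoming blocks are written into the invalid
# (None) slots of used segments, so no separate on-demand cleaning
# pass is needed to produce a free segment first.

def ssr_write(segment, new_blocks):
    """Fill the invalid slots of a used segment with new blocks.
    Returns (filled_segment, blocks_that_did_not_fit)."""
    out = list(segment)
    leftover = list(new_blocks)
    for i, slot in enumerate(out):
        if slot is None and leftover:
            out[i] = leftover.pop(0)
    return out, leftover

seg1 = ["A", "B", None, None]          # slack space at slots 2 and 3
filled, rest = ssr_write(seg1, ["E", "F"])
print(filled, rest)  # ['A', 'B', 'E', 'F'] []
```

The trade-off is that SSR turns one sequential segment write into smaller writes into slack holes, which is why SSR-LFS falls back to cleaning when slack regions are too small (see the implementation slide).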
Cascading meta-data update
[Figure: updating a data block (A→A') relocates it, forcing new copies of the indirect block (B'), the double indirect block (C'), and the i-node (D') to be written to Segment 2]
Updating a data block incurs a cascading meta-data update
Modifying one data block consumes 4 block writes
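The cascade can be illustrated with a toy model — because nothing is updated in place, each relocation invalidates the pointer one level up, so all four blocks are rewritten:

```python
# Cascading meta-data update in an LFS: relocating the data block
# invalidates the indirect block pointing at it, relocating that
# invalidates the double indirect block, and so on up to the i-node.

def cascade(levels):
    """Return the blocks rewritten when the lowest level changes."""
    written = []
    for level in levels:       # each relocation forces the next one
        written.append(level + "'")
    return written

print(cascade(["data", "indirect", "double-indirect", "i-node"]))
# ["data'", "indirect'", "double-indirect'", "i-node'"] -> 4 blocks
```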
Lazy Indirect Block Update (LIBU) scheme
[Figure: with LIBU, updating a data block (A→A') inserts the affected indirect and double indirect blocks into the in-memory IBC (Indirect Block Cache) instead of rewriting them to a segment; the indirect map tracks their locations]
We propose the LIBU scheme to reduce cascading meta-data updates
■ LIBU uses an IBC (Indirect Block Cache) to absorb frequent updates of indirect blocks
■ An indirect map is needed to terminate the cascading meta-data update
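A rough sketch of the IBC idea, with assumed (not actual) data structures and names:

```python
# Lazy Indirect Block Update: updated indirect blocks are kept in an
# in-memory Indirect Block Cache instead of being rewritten to the
# log on every data-block update; an indirect map records where each
# indirect block currently lives, cutting the cascade short.

class IBC:
    def __init__(self):
        self.cache = {}         # indirect-block id -> pointer contents
        self.indirect_map = {}  # indirect-block id -> "IBC" or log address

    def update(self, ib_id, slot, new_addr):
        # Absorb the pointer update in memory: no log write, and no
        # cascade up to double indirect blocks or the i-node.
        self.cache.setdefault(ib_id, {})[slot] = new_addr
        self.indirect_map[ib_id] = "IBC"

ibc = IBC()
ibc.update("indirect-7", slot=3, new_addr="A'")
ibc.update("indirect-7", slot=5, new_addr="B'")
print(ibc.cache["indirect-7"])         # {3: "A'", 5: "B'"}
print(ibc.indirect_map["indirect-7"])  # IBC
```

Repeated updates to the same indirect block are coalesced in the cache, which is where the write savings come from.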
Crash Recovery
[Figure: crash recovery — at a checkpoint, the i-node map, segment usage table, indirect map, and indirect blocks in RAM are flushed to disk (consistent state); after a power failure, the last checkpoint is located and the segments written after it are scanned to rebuild this state]
For crash recovery, LFS periodically stores checkpoint information
If a power failure occurs,
■ search for the last checkpoint
■ scan all segments written after the last checkpoint
■ rebuild the i-node map, segment usage table, indirect map, and the indirect blocks in the IBC
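The roll-forward steps above can be sketched as follows, with an assumed log layout (each segment carries a list of table updates):

```python
# Crash recovery sketch: locate the last checkpoint, then scan the
# segments written after it and replay their updates to rebuild the
# in-memory tables.

def recover(log, checkpoints):
    last_cp = max(checkpoints)                # position of last checkpoint
    state = {"inode_map": {}, "seg_usage": {}, "indirect_map": {}}
    for pos in range(last_cp + 1, len(log)):  # roll forward from there
        for table, key, val in log[pos]:
            state[table][key] = val           # replay each update
    return state

log = [
    [("inode_map", 1, "s0")],                 # segment 0 (before checkpoint)
    [],                                       # segment 1 = checkpoint
    [("inode_map", 1, "s2"), ("seg_usage", 2, "live")],
]
state = recover(log, checkpoints=[1])
print(state)
# {'inode_map': {1: 's2'}, 'seg_usage': {2: 'live'}, 'indirect_map': {}}
```

Note that the pre-checkpoint update to `inode_map` is never replayed — it is already part of the checkpointed state.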
Outline
Introduction
Slack Space Recycling and Lazy Indirect Block Update
Implementation of SSR-LFS
Performance Evaluation
Conclusions
Implementation of SSR-LFS
We implemented SSR-LFS (Slack Space Recycling LFS)
■ Using FUSE (Filesystem in Userspace) framework in Linux
SSR-LFS selectively chooses between SSR and cleaning
■ When the system is idle, it performs background cleaning
■ When the system is busy, it performs SSR or on-demand cleaning
If the average slack size is too small, it selects on-demand cleaning
Otherwise, it selects SSR
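The selection policy can be sketched as below; the slack-size threshold is an assumed parameter for illustration, not a value from the paper:

```python
# SSR-LFS policy sketch: background cleaning when idle; when busy,
# recycle slack directly unless segments are too full of valid data
# for SSR to pay off.

def choose_action(system_idle, avg_slack_blocks, min_slack=8):
    if system_idle:
        return "background-cleaning"
    if avg_slack_blocks < min_slack:   # slack too small: SSR not worth it
        return "on-demand-cleaning"
    return "SSR"

print(choose_action(system_idle=True, avg_slack_blocks=2))   # background-cleaning
print(choose_action(system_idle=False, avg_slack_blocks=2))  # on-demand-cleaning
print(choose_action(system_idle=False, avg_slack_blocks=32)) # SSR
```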
[Figure: architecture of SSR-LFS — user space: SSR-LFS Core (IBC, buffer cache, i-node cache, Syncer, Cleaner, Recycler) on top of libfuse; kernel: VFS and FUSE; a write("/mnt/file") request enters through VFS]
Outline
Introduction
Slack Space Recycling and Lazy Indirect Block Update
Implementation of SSR-LFS
Performance Evaluation
Conclusions
Experimental Environment
For comparison, we used several file systems
■ Ext3-FUSE
■ Org-LFS(cleaning)
■ Org-LFS(plugging)
■ SSR-LFS
Benchmarks
■ IO TEST, which generates a random write workload
■ Postmark, which simulates the workload of a mail server
Storage Devices
■ SSD – INTEL SSDSA2SH032G1GN
■ HDD – SEAGATE ST3160815AS
[Figure: execution time (sec) vs. utilization (10–90%) on HDD and SSD for Ext3-FUSE, Org-LFS(Plugging), Org-LFS(Cleaning), and SSR-LFS]
SSR-LFS performs better than the others over a wide range of utilization
On HDD
■ SSR-LFS and Org-LFS outperform Ext3-FUSE below 85% utilization
On SSD
■ Ext3-FUSE outperforms Org-LFS because the SSD is optimized for random writes
Postmark Benchmark Result (1)
[Figure: Postmark execution time on HDD and SSD for Org-LFS, SSR-LFS, and Ext3-FUSE]
Medium file size (16KB ~ 256KB)
■ Subdirectories 1000, # of files 100,000, # of transactions 100,000
SSR-LFS outperforms other file systems on both devices
Org-LFS shows better performance than Ext3-FUSE on HDD
Ext3-FUSE shows performance comparable to Org-LFS on SSD
Postmark Benchmark Result (2)
[Figure: Postmark execution time on HDD and SSD for Org-LFS, SSR-LFS, and Ext3-FUSE, small files]
Small file size (4KB ~ 16KB)
■ Subdirectories 1000, # of files 500,000, # of transactions 200,000
Ext3-FUSE performs better than the other file systems on SSD
■ due to Ext3's meta-data optimizations, such as hash-based directories
Outline
Introduction
Slack Space Recycling and Lazy Indirect Block Update
Implementation of SSR-LFS
Performance Evaluation
Conclusions
Conclusions
SSR-LFS outperforms the original-style LFS over a wide range of utilization
Future work
■ Optimization of meta-data structures
■ Cost-based selection between cleaning and SSR
We plan to release the source code of SSR-LFS this year
■ http://embedded.uos.ac.kr
Q&A
Thank you
Backup slides
Storage devices
[Figure: throughput (MB/s) vs. request size (4K–16M), sequential and random — four panels: Intel SSD (Write), Intel SSD (Read), Seagate HDD (Write), Seagate HDD (Read)]
Measurement of FUSE overhead
[Figure: throughput (MB/s) of kernel Ext3 vs. Ext3-FUSE for seq-write, rand-write, seq-read, and rand-read on HDD and SSD]
To identify the performance penalty of the user-space implementation
Ext3-FUSE underperforms kernel Ext3 for almost all access patterns
■ due to FUSE overhead
Postmark Benchmark Result (1)
[Figure: Postmark execution time on HDD and SSD, annotated — Org-LFS performed 1302 cleanings vs. 4142 SSRs for SSR-LFS]
Medium file size (16KB ~ 256KB)
■ Subdirectories 1000, # of files 100,000, # of transactions 100,000
SSR-LFS outperforms other file systems on both devices
Org-LFS shows better performance than Ext3-FUSE on HDD
Ext3-FUSE shows performance comparable to Org-LFS on SSD
Postmark Benchmark Result (2)
[Figure: Postmark execution time on HDD and SSD, annotated with cleaning/SSR counts — 3018 cleanings, 1692 cleanings, 1451 SSRs]
Small file size (4KB ~ 16KB)
■ Subdirectories 1000, # of files 500,000, # of transactions 200,000
Ext3-FUSE performs better than the other file systems on SSD
■ due to Ext3's meta-data optimizations, such as hash-based directories
Postmark Benchmark Result (3)
[Figure: Postmark execution time on HDD and SSD for Org-LFS, SSR-LFS, and Ext3-FUSE, large files]
Large file size (not included in the paper)
■ File size 256KB ~ 1MB
■ Subdirectories 1000
■ # of files 10,000
■ # of transactions 10,000
Statistics of IO-TEST
[Figure: megabytes written and read during IO-TEST at 85% utilization — HDD and SSD panels]
Random Update Performance
IO TEST benchmark
■ 1st stage
Creates 64KB files until 90% utilization
Randomly deletes files until the target utilization
■ 2nd stage
Randomly updates up to 4GB of file capacity
Measures the elapsed time of this stage
LFS has no free segments at this point, so the time includes cleaning or SSR cost
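The two stages can be sketched as a toy simulation (sizes shrunk to a 100-block "disk"; the real run uses 64KB files and up to 4GB of random updates):

```python
# IO TEST sketch: fill to the 90% high threshold, randomly delete
# down to the target utilization, then issue random updates.

import random
random.seed(42)  # deterministic for the sketch

DISK_BLOCKS = 100
HIGH, TARGET = 0.9, 0.2

# Stage 1: create one-block files to the high threshold, then delete
# randomly until the target utilization is reached
files = list(range(int(HIGH * DISK_BLOCKS)))   # 90 files
while len(files) > TARGET * DISK_BLOCKS:
    files.pop(random.randrange(len(files)))

# Stage 2: random updates over the surviving files; in the real
# benchmark this stage is timed and includes cleaning/SSR cost,
# since the LFS starts it with no free segments
updates = [random.choice(files) for _ in range(30)]

print(len(files), len(updates))  # 20 30
```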
[Figure: disk utilization over time — files created to the 90% high threshold, randomly deleted down to the 20% desired threshold, then randomly updated]
Experimental Environment
■ OS – Ubuntu 9.10 (linux-2.6.31.4)
■ Processor – AMD 2.1GHz Opteron 1352
■ RAM – SAMSUNG 4GB DDR2-800MHz ECC
■ SSD – INTEL SSDSA2SH032G1GN
■ HDD – SEAGATE ST3160815AS
Comparison of cleaning and SSR scheme
[Figure: timeline comparison of cleaning and SSR — cleaning case: a foreground write request is delayed by on-demand cleaning (Ton-demand) before the segment write (Tseg); SSR case: the foreground write proceeds as SSR (Tssr) without a cleaning delay; in both cases background cleaning produces free segments during idle time (Tidle, Tback-ground)]
FIO Benchmark Result (1)
[Figure: FIO throughput (MB/s) for seq-write, rand-write, seq-read, and rand-read on HDD and SSD — Org-LFS, SSR-LFS, Ext3-FUSE]
To measure the performance impact of Lazy Indirect Block Update
1 worker - File size 1GB, Request size 4KB, Sync I/O mode
SSR-LFS outperforms Org-LFS and Ext3-FUSE for sequential and
random writes on both devices
FIO Benchmark Result (2)
[Figure: FIO throughput (MB/s) for seq-write, rand-write, seq-read, and rand-read on HDD and SSD with 4 workers — Org-LFS, SSR-LFS, Ext3-FUSE]
4 workers - File size 1GB, Request size 4KB, Sync mode
The LIBU scheme has a greater impact on performance when four workers are running